Version 2.2.11
Handouts:
Graph drawings on this page were made using the dot and neato programs of the open source Graphviz package from AT&T.

An undirected graph

A directed graph

Man dressing activities, for topological sort

Graph search: the graph to be searched

Graph search: the search tree

Weighted graph no. 1
Highlights:
Definition. A (directed or undirected) graph consists of a set of vertices, V, and a set of (directed or undirected) edges, E; each edge connects two vertices.
Conventionally, the vertices are represented as dots or nodes, and the edges as lines connecting them.
Some examples of graph representations of information:
Some typical graph problems:
In an undirected graph, there may be no more than one edge connecting a given pair of vertices. In an undirected graph, there may be no more than two edges connecting a given pair, one edge in each direction.
Something graph-like with more edges is often called a multigraph, but the term is vague.
Each edge connects two different vertices.
Drake allows graphs to have self loops, i.e., an edge between a vertex and itself, but these are normally not considered graphs. They may be called pseudographs.
Graphs may contain data in their vertices or in their edges or both. For example, in a city-highway graph, cities (vertices) have names, and highway segments (edges) have distances. Or the vertices and edges may be "bare".
The complexity (running time) of graph algorithms depends on the number of vertices v and the number of edges e.
Trees are a special case of graphs.
Focusing on the bare "dots and lines" kind of graph, there are two main types of representation: adjacency lists and adjacency matrices.
In both representations we assign each node an identifying number which is used as an array index to retrieve the node.
A graph can also be represented as a pair of sets. Since a graph is a set of vertices with a set of edges, it would be straightforward to use any of the implementations of the set data type to store the two sets. However, other forms of representation are typically more efficient.
We use a one-dimensional array representing the vertices, and each vertex has a adjacency list (Drake calls it neighbor list) — a list (or set, array, bit vector) of its neighbors (or their node numbers). For a directed graph, the neighbors are those on the receiving end of an edge from the vertex in question.
The space required is Θ(v + e), and the time to check whether an edge exists between two particular vertices is Θ(e).
An adjacency matrix (Drake: neighbor matrix) is a two-dimensional array of boolean values. Call it "edges" (Though maybe it's better to call it "graph" or "adjacent"). Edges[i][j] is true if there is an edge between vertices i and j (in an undirected graph), or an edge from i to j (in a directed graph).
The space required is Θ(v2), and the time to check whether an edge exists between i and j, is Θ(1).
Adjacency matrix: faster test for "does this edge exist?"
Adjacency matrices are okay for dense graphs with a static set of vertices.
Sparsely connected graphs (where e is much less than v2) are better with the adjacency list representation, since a matrix wastes storage for the edges that are not there (Θ(v2) compared to Θ(v + e)).
Dynamic (changing) graphs: inserting or deleting a vertex is awkward (at best!) with a matrix representation, although inserting or deleting an edge is not difficult.
For small graphs, either a contiguous or linked implementation should do well. For large graphs with the adjacency list representation, we should replace the adjacency lists by an efficient implementation of sets, such as balanced search trees or hash tables.
Figures 15–11 and 15–12, pp. 414–416.
java.util.Graph.Exercises, pp. 412–413.
Graph traversal, commonly presented in data structures courses and texts, is boring. The object seems to be just to visit as many nodes as possible, either all of the nodes, or at least all those connected to some starting point. If it's just to visit every node, we can easily iterate through the array of vertices (in the adjacency list representation) or through the rows, representing vertices, in the adjacency matrix representation. There is no need for anything as sophisticated as depth- or breadth-first.
Graph search is a much more interesting problem. Here we are given a starting vertex and a desired destination vertex (goal), and try to find a path connecting the start to the goal. When we reach the goal, we stop, because there is no point in visiting all those other vertices.
These are brute force searching algorithms: they search the graph exhaustively, without using any "intelligence" to eliminate unlikely paths.
Here's the general procedure for DFS or BFS. It uses a generalized COLLECTION of pending nodes, with operations ADD and REMOVE. The COLLECTION might be a stack or a queue or something else. In case the graph is cyclic, we need to avoid going around in circles, so we maintain a set of nodes already visited.
Returns true if there is a path from start to goal; otherwise returns false
def search (graph g, vertex start, vertex goal):
pending = make empty COLLECTION of vertices
visited = make empty set of vertices
node = start
ADD node to pending
while pending is not empty:
node = REMOVE a vertex from pending
if node is equal to goal:
return true
else if node is not in visited:
add node to visited
for each n in neighbors of node:
ADD n to pending
endfor
endif
endwhile
(assert: pending is empty)
return false
To make DFS: replace COLLECTION with stack, ADD with push, REMOVE with pop.
To make BFS: replace COLLECTION with queue, ADD with enqueue, REMOVE with dequeue.
Conceivably, we could have some other type of collection, maybe one that removes elements in no particular order (instead of FIFO or LIFO), and then we have a new kind of graph search. But DFS and BFS are the standard ones.
Version 1 just returns true or false, indicating whether there's a path from start to goal. But usually we want to know what this path is, if it exists. To accomplish this, in version 2 we modify the use of the variable pending: instead of storing just the pending vertices, we also store the paths taken to reach them.
Returns a path from start to goal, if there is one; otherwise returns FAILURE.
def search (graph g, vertex start, vertex goal):
pending = make empty COLLECTION of (vertex, path) pairs
visited = make empty set of vertices
node = start
path = [start]
ADD (node, path) to pending
while pending is not empty:
(node, path) = REMOVE an element from pending
if node is equal to goal:
return path
else if node is not in visited:
add node to visited
for each n in neighbors of node:
ADD (n, extend path to n) to pending
endfor
endif
endwhile
(assert: pending is empty)
return FAILURE
A cheap implementation of the "path" data type would be a queue (enqueue node to extend the path). To display the path at the end, we could repeatedly dequeue nodes and print them.
(And actually if we used a deque, which provides efficient insert and retrieve at both ends, instead of a queue, we wouldn't need to store (node, path) tuples in pending; we could just store the path and extract node as the last node in the path. A deque can easily be implemented as an array list, or a linked list with a direct pointer to the last element.)
Highlights:
Intuitively, present the graph data type as a partial order and a sequence A → B → … → N as a total order.
(Skim over the details of partial and total order)
The algorithm for topological sort will also tell us if the graph is cyclic, i.e., it does not represent a partial order.
Procedure:
def topsort (g):
repeat:
x = any vertex of g that has no predecessor
output x
remove x, and all edges leading out from x, from g
until g is empty or
all vertices remaining in g have predecessors
if g is non-empty:
report a cycle
end topsortExamples: apply topsort to
Efficiency:
Using adjacency lists and a count, for each vertex, of the number of its immediate predecessors, this algorithm can be implemented very efficiently (see Kruse and Ryba, sec. 12.4, breadth-first algorithm), namely in O(v + e) time, where v = number of vertices and e = number of edges in g.
This is apparently different from the algorithm in Drake Fig. 15-28, p. 425-426, which he says has running time Θ(v2).
Another application of topological sorting is in Kruskal's algorithm, which we will see later (?), to detect when adding an edge would create a cycle in the graph.
Double.POSITIVE_INFINITY).Inputs: a weighted graph g, a designated source vertex s.
Outputs: a list of minimum cost paths from s to each other vertex in g.
Procedure:
1. Create result as an array of length v
of distances (costs) from s to
each vertex. Initialize each element as
result[s] = 0; result[v] = infinite, if v != s.
This array is our initial estimate of mincost.
When finished, it will be an exact representation of mincost.
2. Create visited, a set of vertices that are visited.
This can be done as a bitmap set or an array
of boolean with one element per vertex.
Initialize each element as visited[v] = false.
(Implicit initialization in Drake's code.)
3. Repeat v times:
a. x = least cost unvisited vertex,
i.e., the vertex x such that visited[x] = false and
for every vertex y, result[x] <= result[y] or visited[y].
There may be more than one such vertex; take any one of them.
b. visited[x] = true
c. for EACH vertex y,
result[y] = min(result[y],
result[x] + directcost(x, y))
Comments:
Example: apply Dijkstra's algorithm to the weighted graph example …
result[i][j] = the shortest (known) distance (or path) from vertex i to vertex j.cost(i, j), where an edge exists, and infinity otherwise.result[i][j] is the minimum cost path (or its cost) from i to j going only through the vertices 0..(m − 1). (The path does not have to go through any intermediate vertices, but it cannot go through any not in {0..(m − 1)}.)
result[i][j] with the direct distance/path from i to j, through no intermediaries.The shortest path from i to j, using intermediate vertices only in {0..m}, either uses or does not use m as an intermediate vertex.
if it does not use m, then result[i][j] still has the correct value and is not updated.
if the path does go through m, then it has the form (path({0..(m − 1)}, i, m)) + (path({0..(m − 1)}, m, j)),
(where path(S, x, y) means a path from x to y that goes through no intermediate nodes except those in the set S)
i.e., it consists of a subpath from i to m and a subpath from m to j, where both subpaths are the shortest possible paths going only through vertices in {0..(m − 1)} as intermediate points.
(m cannot be an intermediate point of either subpath, because it is an endpoint of both.)
So all we need to do, to preserve the invariant, is for all vertices i and j, look at (path({0..(m − 1)}, i, m)) + (path({0..(m − 1)}, m, j)). If the cost of this path is less than the current result[i][j], then it replaces result[i][j].
We can get both subpaths from our results table, i.e., result[i][m] + result[m][j].
So (finally) the update loop is simply:
for m = 0 to v - 1:
for i = 0 to v - 1:
for j = 0 to v - 1:
newpath = path({0..(m-1)}, i, m) + path({0..(m-1)}, m, j)
= result[i][m] + result[m][j]
if newpath's cost is less than result[i][j]
then result[i][j] = newpathExample: ...
Definitions:
This is called a greedy algorithm, because its action at each step involves choosing what looks best from the restricted viewpoint of that step. Yet it leads to an overall optimal outcome!
Example: ...
(Compare with Drake's exposition)
Inputs: a weighted graph G
Output: a minimum spanning tree for G.
Procedure:
Remarks:
Sets A and B are disjoint if they have no common elements, i.e., their intersection is empty.
If a set U is divided up into n disjoint sets, S1 … Sn, so that every element of U is in exactly one of the sets, then {S1, … Sn} is called a partition of U.
Examples:
A student is either a freshman or sophomore or junior or senior, can't be more than one of these at the same time.
An animal is either a legless animal or an animal with 1 leg or with 2 legs or with 3 or with 4 or more than 4.
Now if a graph has two or more connected subgraphs—that is, all the vertices in a subgraph are connected to each other, but not to the other subgraphs—then the subgraphs are disjoint sets, because each vertex is in one and only one of them. (And in fact, even if none of the vertices are connected, each of the disjoint sets contains only a single vertex.)
In Section 14.2, Drake presents a very efficient implementation of disjoint set clusters based on up-trees. Specifically, the implementation performs these operations efficiently:
Determine whether two elements belong to the same set.
Merge (combine, make the union of) two sets.
These are exactly the set operations needed in Kruskal's algorithm!
An up-tree is a tree, where the nodes know their parent, rather than the parent knowing its children. See Fig. 14-11, p. 378.
To check whether two elements belong to the same set, follow the links from each node to its tree's root. If they have the same root, they are in the same set.
To merge (union) two sets, simply make one of them a child of the other's root. See Fig. 14-11 and 14-12.
That's the basic idea, but there are some enhancements to make it efficient.
Use an array representation in which each array element represents a node by storing the index of its parent (−1 for roots).
(Fig. 14-13)
When merging, if the two trees are of unequal height, make the shorter one become a subtree of the taller one. (Fig. 14-15)
Store the tree heights h so this can be done efficiently, in the form of (−h − 1), in the root element for each tree. (Since the root has no parent, this takes no extra space. We use −h to distinguish a height from a parent index, and subtract 1 because there is no way to distinguish −0 from just 0.) (Fig. 14-16)
To make the structure flatter, whenever we go finding the root of the tree of any node, we will make each node on the path from that node to the root become a child of the root. (Fig. 14-18)
In fact, this is best done recursively, because we cannot make the change until we've reached the root, and so have to remember which nodes were visited. (Fig. 14-19)
The result:
We can't present the analysis, but the result is very impressive.
Let log* n =
0, if n is 1;
1 + log* (log n), otherwise.
For example, using base 2 logarithms,
log*(65536)
= 1 + log*(16) (since 216 = 65536)
= 1 + 1 + log*(4) (since 24 = 16)
= 1 + 1 + 1 + log*(2) (since 22 = 4)
= 1 + 1 + 1 + 1 + log*(1) (since 21 = 2)
= 1 + 1 + 1 + 1 + 0 (base case)
= 4.
I.e., log*(65536) = 4 because log(log(log(log(65536)))) = 1: it takes only 4 applications of log to "reduce" 65536 to 1.
This function grows at an incredibly slow rate: see Fig. 14-20.
And the result of the analysis is that the amortized running time of the optimized up-tree operations is O(log*(n)), or almost O(1).
(Future improvements)
Drake says v2 because he's allowing self-loops in graphs.↩