Understanding the Depth-First Search and the Topological Sort with Python

Photo by Sébastien Marchand on Unsplash

There are two fundamental ways to search a graph: the breadth-first search (BFS) and the depth-first search (DFS). In this post, I'll explain the depth-first search, focusing on its relation to topological sorting. A topological sort is closely related to dynamic programming, which you should know if you tackle competitive programming. I used Python for the implementation. If you'd like to learn about the breadth-first search, check my other post: Understanding the Breadth-First Search with Python.

1. The algorithm of the depth-first search

In the depth-first search, we keep visiting vertices until we reach a dead-end, that is, a vertex with no unvisited neighbor. When we reach a dead-end, we step back one vertex and visit another of its unvisited neighbors, if one exists. The algorithm is as follows.

1. Apply step 2 to the starting vertex.
2. For each neighbor of the given vertex that has not been visited yet, mark it visited and re-apply step 2 to it.

Step 2 is the most important step in the depth-first search. Basically, it repeatedly visits a neighbor of the current vertex. Note that it only visits vertices that have not been visited yet. Otherwise, the search would never terminate on a graph that contains a cycle, because it would keep revisiting the same vertices.

Let's check how the algorithm works. In the figure below, we start the depth-first search from vertex A. The initial state is as follows.

Image for post

First, we visit vertex B, a neighbor of vertex A, and mark B visited. Note that when the current vertex has multiple neighbors, we visit them in alphabetical order. So we don't visit vertex G yet.

Image for post

Then we visit the neighbor of vertex B, vertex C, and mark it visited.

Image for post

We try to visit vertex C's neighbor A, but it's already visited. So we visit vertex D and mark it visited.

Image for post

We visit the neighbor of vertex D, vertex E, and mark it visited.

Image for post

Vertex E doesn't have any neighbors, so we have reached a dead-end. When the depth-first search hits a dead-end, it steps back to the vertex visited just before and visits that vertex's unvisited neighbors. In this case, we visited vertex D just before reaching the dead-end, so we visit F, the unvisited neighbor of vertex D.

Image for post

In the depth-first search, we first visit all the vertices along one path, and then move on to another path. For example, A -> B and A -> G are two paths out of A. We first visit every vertex reachable through the first path, A -> B -> C -> D -> E -> F, and then follow the second path, A -> G. We've already visited A, so we just add G to the hash table of visited vertices.

Image for post

Now we have visited all the vertices, so the depth-first search is done.
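The walkthrough above can be reproduced with a short sketch. The adjacency lists below are my reconstruction of the figure from the edges mentioned in this post (an assumption, since only the text is available here; in particular, F is assumed to have no outgoing edges). The names dfs_order and _visit are illustrative.

```python
# Adjacency lists reconstructed from the walkthrough (an assumption:
# F's outgoing edges are not described, so they are taken to be empty).
graph = {
    "A": ["B", "G"],
    "B": ["C"],
    "C": ["A", "D", "F"],
    "D": ["E", "F"],
    "E": [],
    "F": [],
    "G": ["C"],
}


def dfs_order(graph, start):
    """Return the vertices reachable from start in the order DFS first visits them."""
    # Dicts keep insertion order, so this hash table doubles as the visit log.
    visited = {start: True}
    _visit(graph, start, visited)
    return list(visited)


def _visit(graph, vertex, visited):
    # Neighbors are tried in alphabetical order, as in the walkthrough.
    for neighbor in sorted(graph[vertex]):
        if neighbor not in visited:
            visited[neighbor] = True
            _visit(graph, neighbor, visited)


print(dfs_order(graph, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```

G comes last even though it is a direct neighbor of A, because the search exhausts everything reachable through B first.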

2. Edge classification in the depth-first search

The depth-first search traces out a path through the graph that forms a tree. Applying the depth-first search to the graph from the last section gives the following tree.

Image for post

The figure above shows this characteristic tree structure. Relative to this tree, we classify the edges of the graph into the four kinds below.

  • tree edge
  • forward edge
  • back edge
  • cross edge

I'll describe these edges in detail.

Tree edges are the edges that the depth-first search actually follows. So all the edges in the earlier figure are tree edges.

The remaining three kinds are edges that are in the graph but not in the depth-first search tree. First, the edge from vertex C to vertex F is a forward edge. In the depth-first search tree, this edge goes from a node to one of its descendants that is not its direct child, so we call it a forward edge.

Image for post

Next, I'll explain back edges. The edge from vertex C to vertex A is a back edge. In the depth-first search tree, this edge goes from a node back to one of its ancestors (not necessarily its direct parent), so we call it a back edge.

Image for post

Finally, I'll explain cross edges. The edge from vertex G to vertex C is a cross edge. In the depth-first search tree, this edge goes from a node in one subtree to a node in another subtree, where neither node is an ancestor of the other, so we call it a cross edge.

Image for post
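The four kinds of edges can also be told apart in code. The following is a minimal sketch, not taken from the references: it assumes the graph is the one reconstructed from this post's walkthrough (F is assumed to have no outgoing edges), and classify_edges is an illustrative name. It records the order in which vertices are first visited and which vertices are already finished.

```python
# Graph reconstructed from the walkthrough (an assumption, see above).
graph = {
    "A": ["B", "G"],
    "B": ["C"],
    "C": ["A", "D", "F"],
    "D": ["E", "F"],
    "E": [],
    "F": [],
    "G": ["C"],
}


def classify_edges(graph, start):
    """Label every edge reachable from start as tree, back, forward, or cross."""
    order = {}        # vertex -> index at which DFS first visited it
    finished = set()  # vertices whose neighbors have all been processed
    kinds = {}        # (u, v) -> edge kind

    def visit(u):
        order[u] = len(order)
        for v in sorted(graph[u]):
            if v not in order:
                kinds[(u, v)] = "tree"     # first time we see v: DFS follows this edge
                visit(v)
            elif v not in finished:
                kinds[(u, v)] = "back"     # v is still in progress, so it is an ancestor
            elif order[u] < order[v]:
                kinds[(u, v)] = "forward"  # v was visited after u: a descendant of u
            else:
                kinds[(u, v)] = "cross"    # v was finished before u was first visited
        finished.add(u)

    visit(start)
    return kinds


kinds = classify_edges(graph, "A")
print(kinds[("C", "F")], kinds[("C", "A")], kinds[("G", "C")])  # forward back cross
```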

That's all for edge classification.

3. Relation to a topological sort

Here I'll explain the relationship between the depth-first search and a topological sort. First, I'll describe what a topological sort is, and then explain the relation.

In a topological sort, we arrange the vertices of a directed graph in a line so that every edge goes from left to right. Look at the graph below.

Image for post

We get the figure below when we apply a topological sort to this graph. I'll describe how to do it later. The figure below also shows the edges of the original graph, and you can see that they all go from left to right.

Image for post

We call the vertex order above a topological order. Let's think about what it tells us. Suppose each vertex is a step in cooking curry. Broadly speaking, the steps are cutting the ingredients, seasoning the meat, stir-frying them, and stewing. Cutting the ingredients and seasoning the meat correspond to vertices towards the left, and stir-frying and stewing correspond to vertices towards the right, because we cannot stew before cutting the ingredients. So a topological sort arranges vertices in dependency order. In the figure above, the vertices that depend on fewer other vertices appear further to the left.
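To make the "every edge goes from left to right" condition concrete, here is a small checker. The curry graph and its edges are my own illustration of the example above (an edge u -> v is assumed to mean that step u must happen before step v), and is_topological_order is a hypothetical helper name.

```python
def is_topological_order(graph, order):
    """Return True if order lists every vertex once and every edge goes left to right."""
    if len(order) != len(graph) or set(order) != set(graph):
        return False
    position = {vertex: i for i, vertex in enumerate(order)}
    return all(position[u] < position[v] for u in graph for v in graph[u])


# Assumed dependencies for the curry example: u -> v means "u before v".
curry = {
    "cut ingredients": ["stir-fry"],
    "season meat": ["stir-fry"],
    "stir-fry": ["stew"],
    "stew": [],
}

good = ["season meat", "cut ingredients", "stir-fry", "stew"]
bad = ["stew", "cut ingredients", "season meat", "stir-fry"]
print(is_topological_order(curry, good))  # True
print(is_topological_order(curry, bad))   # False
```

Note that a graph can have more than one valid topological order: cutting and seasoning can be swapped here because neither depends on the other.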

On the other hand, we cannot apply a topological sort to the following graph, or to any graph that contains this structure as a part. This is because every vertex in it depends, directly or indirectly, on itself, so no vertex can be placed first.

Image for post

We call a path that goes around the vertices and returns to its start like this a cycle. We cannot apply a topological sort to a graph that has a cycle. We call a directed graph without any cycle a directed acyclic graph, also known as a DAG. The important point is that a directed graph can be topologically sorted if and only if it is a DAG.
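Whether a directed graph is a DAG can itself be checked with the depth-first search: the graph has a cycle exactly when the search meets a vertex that is still in progress, which is the back edge from section 2. The sketch below is one common way to write that check; has_cycle and the two small graphs are illustrative and not from the original post.

```python
def has_cycle(graph):
    """Return True if the directed graph (dict of adjacency lists) contains a cycle."""
    in_progress = set()  # vertices on the current DFS path
    done = set()         # vertices whose whole subtree has been explored

    def visit(u):
        in_progress.add(u)
        for v in graph[u]:
            if v in in_progress:      # back edge: v is an ancestor of u
                return True
            if v not in done and visit(v):
                return True
        in_progress.remove(u)
        done.add(u)
        return False

    # Start a search from every vertex not yet explored, so that
    # parts of the graph unreachable from the first vertex are covered too.
    return any(visit(u) for u in graph if u not in done)


cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(has_cycle(cyclic))  # True
print(has_cycle(dag))     # False
```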

We can get a topological order by applying the depth-first search to a DAG. Look at the following DAG. It is the same graph as in the topological sort example above.

Image for post

We can draw the path of the depth-first search on this graph as the following tree structure.

Image for post

We line up the nodes of the tree from left to right in the order in which the depth-first search finishes them, that is, the order in which the search leaves each node after processing all of its neighbors:

Image for post

Then we reverse that order and draw the edges of the original graph. We get the following figure. You can see that this is a topological order.

Image for post

Now we know we can get a topological order by applying the depth-first search to a DAG.
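The procedure above (record each vertex when the search finishes it, then reverse) can be sketched as follows. The example DAG is hypothetical, since the figure's exact edges are not given in the text, and topological_sort is an illustrative name. The sketch assumes the input is a DAG and does not check for cycles.

```python
def topological_sort(graph):
    """Return a topological order of a DAG given as a dict of adjacency lists."""
    visited = set()
    finish_order = []  # vertices in the order DFS finishes them

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)  # all of u's neighbors are done, so u is finished

    # Start from every unvisited vertex so that no part of the graph is missed.
    for u in graph:
        if u not in visited:
            visit(u)
    return finish_order[::-1]  # reversed finishing order


# Hypothetical DAG for illustration.
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(topological_sort(dag))  # ['A', 'C', 'B', 'D']
```

A vertex is finished only after everything reachable from it is finished, so in the reversed list it appears before all of them, and every edge points to the right.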

4. Implementation

We can implement the depth-first search with Python as follows.

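def dfs(graph, vertex):
    # Keys are the visited vertices; values are their parents in the DFS tree.
    parents = {vertex: None}
    dfs_visit(graph, vertex, parents)
    return parents  # returned so the caller can inspect the DFS tree


def dfs_visit(graph, vertex, parents):
    # graph[vertex] is the list of neighbors of vertex.
    for n in graph[vertex]:
        if n not in parents:  # n has not been visited yet
            parents[n] = vertex
            dfs_visit(graph, n, parents)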

This line initializes the state before the depth-first search starts.

parents = {vertex: None}

We track which vertices have been visited with the keys of parents. The value stored for each key is the vertex from which that key was first visited. We call this variable "parents" because it records the parent of each node in the depth-first search tree. In other words, parents[key] returns the parent node (a vertex in the graph) of the key node (another vertex in the graph). The starting vertex has no parent, so its value is None.

dfs_visit is the main process of the depth-first search.

def dfs_visit(graph, vertex, parents):
    for n in graph[vertex]:
        if n not in parents:
            parents[n] = vertex
            dfs_visit(graph, n, parents)

Here we go through the neighbors of the given vertex and apply dfs_visit to each neighbor that is not yet marked visited. Note that graph[vertex] returns the neighbors of the vertex. When a recursive call to dfs_visit returns, control goes back to the for-loop, which applies the same operation to the remaining neighbors. After the outermost for-loop finishes, we can reconstruct the path of the depth-first search from parents.
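As a sketch of that reconstruction, the helper below walks parents backwards from a target vertex to the start. It assumes dfs returns the parents dictionary, and path_to is a hypothetical helper name. The graph is again my reconstruction of the example from section 1 (F is assumed to have no outgoing edges).

```python
def dfs(graph, vertex):
    parents = {vertex: None}
    dfs_visit(graph, vertex, parents)
    return parents  # assumption: dfs hands parents back to the caller


def dfs_visit(graph, vertex, parents):
    for n in graph[vertex]:
        if n not in parents:
            parents[n] = vertex
            dfs_visit(graph, n, parents)


def path_to(parents, target):
    """Return the DFS-tree path from the start vertex to target, or None if unreachable."""
    if target not in parents:
        return None
    path = []
    while target is not None:  # the start vertex has parent None
        path.append(target)
        target = parents[target]
    return path[::-1]


# Graph reconstructed from the walkthrough in section 1 (an assumption).
graph = {
    "A": ["B", "G"],
    "B": ["C"],
    "C": ["A", "D", "F"],
    "D": ["E", "F"],
    "E": [],
    "F": [],
    "G": ["C"],
}

parents = dfs(graph, "A")
print(path_to(parents, "F"))  # ['A', 'B', 'C', 'D', 'F']
```

The path found this way follows tree edges only, so it is not necessarily the shortest path: F is reached through D here even though the graph also has the direct edge from C to F.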

Let's think about the time complexity of the depth-first search. The running time is determined by the number of calls to dfs_visit and the number of iterations of its for-loop, because every other operation takes constant time. A vertex is added to parents before dfs_visit is called on it, so dfs_visit is called at most once per vertex, that is, at most |V| times in total. Inside each call, the for-loop runs once per neighbor of the vertex, which is the degree of that vertex. So the total time is the number of vertices plus the sum of the degrees of all the vertices. For an undirected graph, the sum of the degrees is 2|E| by the handshaking lemma. For a directed graph, each edge appears in exactly one adjacency list, so the sum is |E|. Either way, the depth-first search runs in O(|V|+|E|). Note that V is the set of vertices and E is the set of edges. I explain the handshaking lemma in my other post: Understanding the Breadth-First Search with Python. If you are not familiar with it, please check it out.
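The counting argument can be checked empirically with an instrumented version of the search. This is only an illustration: count_dfs_work is a hypothetical name, and the graph is my reconstruction of the directed example from section 1, which has 7 vertices and 9 edges.

```python
def count_dfs_work(graph, start):
    """Run DFS from start and count dfs_visit-style calls and for-loop iterations."""
    parents = {start: None}
    counts = {"calls": 0, "edge_checks": 0}

    def visit(u):
        counts["calls"] += 1
        for v in graph[u]:
            counts["edge_checks"] += 1  # one loop iteration per entry in graph[u]
            if v not in parents:
                parents[v] = u
                visit(v)

    visit(start)
    return counts


# Graph reconstructed from the walkthrough in section 1 (an assumption).
graph = {
    "A": ["B", "G"],
    "B": ["C"],
    "C": ["A", "D", "F"],
    "D": ["E", "F"],
    "E": [],
    "F": [],
    "G": ["C"],
}

print(count_dfs_work(graph, "A"))  # {'calls': 7, 'edge_checks': 9}
```

Every vertex is reachable from A here, so the search makes one call per vertex (7) and one loop iteration per directed edge (9), which matches |V| + |E|.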

That's all for the relationship between the depth-first search and the topological sort. Topological order is closely related to dynamic programming, so I recommend keeping it in mind if you're going to tackle competitive programming. Thank you for reading.

References

  • MIT OpenCourseWare 6.006 Lecture 14: Depth-First Search (DFS), Topological Sort
  • MIT OpenCourseWare 6.006 Recitation 14: Depth-First Search (DFS)