 # Reese Schultz

## BFS and DFS Demystified

Graph traversals with breadth- and depth-first search explained, including basic graph theory and JavaScript.

Last updated on June 8, 2020. Created on June 7, 2020.

### Introduction

1. Does a connecting path exist between two places or states?

BFS, specifically, helps answer another important question in a wide variety of contexts:

1. What is the shortest path between said places or states?

These algorithms share broad applicability, being as they traverse graphs, which are collections of vertices (singular: vertex). Vertices are like nodes or points, and they can be connected by edges. Neighbors are vertices sharing the same edge. BFS finds the path with the fewest edges, but there are other ways to define the shortest path discussed later. We're mainly concerned with the traversal of acyclic graphs in this tutorial. These are graphs lacking cycles. What is a cycle? Imagine a picturesque oak tree. But what if, instead of having leaves, branches were tendrils extending back into the tree itself? Those would be cycles, reflecting the analogy back to graph theory. Cyclic oak trees are disturbing.

### BFS and DFS

So—what are BFS and DFS, and what's the difference between them?

#### BFS

BFS searches breadth-first, from the starting vertex's immediate neighbors, to their neighbors, to their neighbors' neighbors, and so on. The first neighbor in the search is the first neighbor out—first in, first out—FIFO. The data structure for this is the queue, which is like customers waiting in line at a restaurant. The first customer in line is the first served with a hot plate. As previously stated, BFS is useful for finding the shortest path. While not covered in this tutorial, it's also good for testing bipartiteness, and it's used in the Edmonds-Karp algorithm for calculating maximum flow in a transportation network, among other things.

#### DFS

DFS searches depth-first, from one of the starting vertex's immediate neighbors, to one of that vertex's neighbors, to one of that neighbor's neighbors, until a local branch height has been met, in which case the search resumes at the next unsearched neighbor available—then another local branch height is eventually met—ad nauseam. The last neighbor in the search is the first neighbor out—last in, first out—LIFO. The data structure for this is the stack, which is like a stack of plates. The last plate on the stack is the first plate taken off to serve a customer. DFS can be leveraged for cycle detection, topological sorting of a directed acyclic graph (DAG), finding bridges, etc., but all of that falls out of scope for this tutorial.

### Code

First and foremost, we need a way to represent a graph of vertices with edges connecting one another. One way is to use a map (ideally some kind of hash table or dictionary) as an adjacency list, meaning vertices are mapped to lists or sets of neighbor (adjacent) vertices:

``````const graph = new Map()

graph.set('A', ['B', 'K'])
graph.set('B', ['C', 'E', 'I'])
graph.set('C', ['O'])
graph.set('D', [])
graph.set('E', ['F'])
graph.set('F', [])
graph.set('G', ['D'])
graph.set('H', ['G'])
graph.set('I', ['J', 'H'])
graph.set('J', ['H'])
graph.set('K', ['L', 'J'])
graph.set('L', ['M'])
graph.set('M', ['N'])
graph.set('N', [])
graph.set('O', [])
``````

Time complexity. For both BFS and DFS, correctly operating on an adjacency list representation implies a time complexity of O(V + E), where V is the number of vertices, and E is the number of edges. An adjacency matrix, instead, would imply O(V2), which does not scale as well, and neither would an edge list, implying O(V * E).

Furthermore, note that our example graph can be visualized as follows: To operate on such a graph, let's write two functions:

1. One for checking the existence of a path.
2. Another for getting the ordered vertices of a path.

These functions will be iterative, not recursive. Why? Recursion may be more "elegant," especially for DFS, but it assumes the call stack is an unlimited resource. It's not.

#### 1. Checking Path Existence

Checking path existence is pretty straightforward (you'll have to write input validation yourself):

``````const checkPathExistsWithBfsOrDfs = (graph, start, end, bfs = true) => {
let open = graph.get(start)
const closed = new Set()

while (open) {
const current = bfs ? open.shift() : open.pop()

if (current === end) return true
if (closed.has(current)) continue

open = open.concat(graph.get(current))
}

return false
}
``````

We can run the code and log the output like so:

``console.log(checkPathExistsWithBfsOrDfs(graph, 'A', 'D'))``

The output should be `true` since there is in fact a path from `A` to `D`. This output should be the same whether using BFS or DFS. The next function will show you how to get the shortest path as an ordered list of vertices.

But first, let's go through this function, one part at a time:

``````const checkPathExistsWithBfsOrDfs = (graph, start, end, bfs = true) => {
...
}
``````

We define a function that accepts a graph, a start vertex, end vertex, and a flag. If the flag is `true`, the default, then BFS is used; otherwise, it's DFS.

``````let open = graph.get(start)
``````

First thing we do in the function is initialize what is colloquially known as the open list with the starting vertex in the collection. This collection contains vertices under search. The collection itself should actually be a queue for BFS, and a stack for DFS. Furthermore, the queue should be a double-ended queue called a deque (pronounced "deck") supported by a ring buffer.

Why? For BFS, we need to shift the open list, as in pop the first element off. Using a ring buffer, that is an O(1) operation. With a standard queue or list, that is instead a less efficient O(N) operation. The example here assumes a JavaScript array for ease of demonstration, not efficiency.

``````const closed = new Set()
``````

There is also the closed list, a list of previously searched vertices used to prevent duplicative future searches. It should actually be a set-like data structure in most languages. While in JavaScript a simple object is probably more efficient at runtime, the `Set` is more idiomatic. Note that, specifically for JavaScript, an object would require a default value for each vertex key, such as `true`.

``````while (open) {
...
}
``````

This effectively means, "while the open list is not empty," dependent on JavaScript's notion of "truthiness."

``````const current = bfs ? open.shift() : open.pop()
``````

If using BFS, we shift the (ideally deque) open list, otherwise for DFS, we pop the end (of what is ideally a stack). Whatever is shifted or popped becomes the `current` vertex.

``````if (current === end) return true
``````

If the `current` vertex is actually the goal, as in the `end` of the path, then we return `true` ASAP to process the least possible, expressing there is in fact a path. Exiting like this is also called an "early out," which is Terse, Readable Programming 101.

``````if (closed.has(current)) continue
``````

If the `current` vertex has already been searched, then we skip this iteration of the containing loop.

``````open = open.concat(graph.get(current))
``````

We update the open list to include the `current` vertex's neighbors.

``````closed.add(current)
``````

The final act of the loop is adding the `current` vertex to the closed list, so we don't attempt searching it again.

``````return false
``````

If `return true` never executed from the while loop, then the search failed⁠—`false` is returned since there is no path.

#### 2. Getting a Path

Getting a path is a little more complicated (you'll have to write input validation yourself):

``````const getPathWithBfsOrDfs = (graph, start, end, bfs = true) => {
const open = [[start]]
const closed = new Set()

while (open) {
const path = bfs ? open.shift() : open.pop()
const current = path[path.length - 1]

if (current === end) return path
if (closed.has(current)) continue

const neighbors = graph.get(current)
for (let i = neighbors.length; i--;) {
if (closed.has(neighbors[i])) continue
newPath = [...path]
newPath.push(neighbors[i])
open.push(newPath)
}

}
}
``````

We can now run this function and log its output like so for BFS:

``console.log(getPathWithBfsOrDfs(graph, 'A', 'D'))``

Its output should be `[ 'A', 'K', 'J', 'H', 'G', 'D' ]`.

But what about DFS? We can find that out by setting the `bfs` flag to false:

``console.log(getPathWithBfsOrDfs(graph, 'A', 'D', false))``

Notice how for DFS, the output path is longer, being `['A', 'B', 'I', 'J', 'H', 'G', 'D']`. Now you might ask yourself, "What gives?"

Super important note: Both are technically valid, connected paths, but only one is the shortest. In general, be aware that the output for BFS and DFS can be different. BFS will correctly find the shortest path, but there is no guarantee that DFS will. Why? Because DFS goes deeper and deeper for arbitrary branch heights instead of checking each neighbor at a given level or degree. DFS just spits out the first connected path it finds, not necessarily the shortest one, hence why DFS may be more efficient for determining connectivity.

Like before, we'll go over each part of this function.

``````const getPathWithBfsOrDfs = (graph, start, end, bfs = true) => {
...
}
``````

Again, we define a function that accepts a graph, a start vertex, end vertex, and a flag. If the flag is `true`, the default, then BFS is used; otherwise, it's DFS.

``````const open = [[start]]
``````

Unlike the previous function, we now model the open list as a collection of paths (vertex collections). Commenting on the previous function, recall that the parent collection ought to be a deque for BFS in the name of efficiency, and a stack for DFS to be idiomatic. The first path in this collection is just the `start` vertex.

``````const closed = new Set()
``````

The closed list is basically the same as implemented in the first function.

``````while (open) {
...
}
``````

Observe that the loop works the same as in the first function.

``````const path = bfs ? open.shift() : open.pop()
``````

`path` is the current path—remember that, for this function, `open` is a collection of paths—for BFS, we shift the path, and for DFS, we pop it.

``````const current = path[path.length - 1]
``````

The `current` vertex should obviously be the last one in the `path` since it is, after all, the ordered list of vertices.

``````if (current === end) return path
``````

Nearly the same as the first function—early out—except here the current `path` is returned.

``````if (closed.has(current)) continue
``````

This is no different than the first function, skipping an iteration if the `current` vertex has already been searched.

``````const neighbors = graph.get(current)
``````

We retrieve the neighbor vertices of the `current` vertex.

``````for (let i = neighbors.length; i--;) {
if (closed.has(neighbors[i])) continue
newPath = [...path]
newPath.push(neighbors[i])
open.push(newPath)
}
``````

We find the number of neighbor vertices once, and decrement index `i` to iterate over them. Defining `i` as the length of the neighbor vertices and using `while(i--)` is pretty good too, but other types of looping mechanisms, such as the `for...of` statement in JavaScript, tend to be terribly inefficient.

Moving on, if the current neighbor is in the closed list, then we skip an iteration.

Otherwise, a new path is created for each neighbor, shallow-copied from the `path` so as to not overwrite values by accident, hence `[...path]`. The new path is the "current" path, except it is joined with each neighbor of the `current` vertex. Then each new path, joined with each respective neighbor, is pushed onto the open list, just waiting to be shifted or popped as the next `current` path in an ensuing iteration of the `while` loop.

``````closed.add(current)
``````

Finally, onto the closed list the `current` vertex is added in the same way as the previously defined function.

### Conclusion

You did it. You learned how to do performant graph traversal with BFS and DFS. We noted how data structures such as the deque, stack, and map can be useful to that end. Along the way, tidbits of graph theory were thrown around to acquaint or reintroduce you with the subject.

However, these algorithms have limitations. While BFS can find the path with the least edges, it's not suited for finding the path with the least cost, where weights are assigned to each edge. In other words, the graph traversal in this tutorial is restricted to unweighted graphs. Thus, in a later tutorial, I'll cover Dijkstra's algorithm for finding least-cost paths in weighted graphs. After that, I'll discuss A*, which is essentially an informed optimization of Dijkstra's algorithm.

Thanks for your time, and keep in mind that I'll update this tutorial to link to related ones as time goes on.

My code is released under the MIT license.