One of the first notions we intuitively think of when working with graphs is that of fully or partially traversing the nodes of a graph. This action is called "graph search", and it is achieved through various algorithms built on the notion of "paths".
As previously mentioned, a graph is a set of nodes together with the edges connecting them. We can visualize a path between two nodes as a sequence of edges that leads us from one node to the other. Such a path is not necessarily unique, nor is it guaranteed to exist for every pair of nodes; in fact, the property that a path exists between every two nodes of a graph has a well-defined name: "connectivity".
Let us begin by writing the formal definitions of all these notions:
Let $G = (V, E)$ be a graph.
A path is a sequence of edges $(e_{1}, e_{2}, \ldots, e_{n-1})$ for which there is a sequence of distinct nodes $(v_{1}, v_{2}, \ldots, v_{n})$ such that $e_{i} = (v_{i}, v_{i+1})$ for $i \in [|1, n-1|]$. $(v_{1}, v_{2}, \ldots, v_{n})$ is called the node sequence of the path. If $G$ is a directed graph, then the path is called a directed path.
Let's consider this graph for a moment:
We can see that there is a path between 2 and 3, mathematically that path would be denoted as :
$(e_{1}, e_{2})$ where $e_{1} = (2, 1)$ and $e_{2} = (1, 3)$
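For illustration, this path can be represented in code as a list of edge tuples (the tuple-list representation here is an assumption chosen for this sketch, not something imposed by the definition):

```python
# the path from node 2 to node 3, as the edge sequence (e1, e2)
path = [(2, 1), (1, 3)]

# the node sequence of the path can be recovered from the edges:
# the first node of the first edge, then the second node of every edge
nodes = [path[0][0]] + [edge[1] for edge in path]
print(nodes)  # [2, 1, 3]
```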
$G$ is a connected graph if for every two distinct nodes there exists a path connecting them, i.e.:
$\forall x, y \in V$ with $x \neq y$, there exists a path with node sequence $(v_{1}, v_{2}, \ldots, v_{k})$ such that $v_{1} = x$ and $v_{k} = y$.
The graph shown above is therefore a connected graph, considering that there is a path from every node to every other node. But take this graph for example:
It is a non-connected graph, since no path exists between node 6 and any other node; in fact, the non-existence of a path between 6 and just one other node would already be sufficient to call this graph non-connected.
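Connectivity can also be checked programmatically: traverse the graph from any starting node and test whether every node was reached. A minimal sketch, assuming the graph is given as a dictionary mapping each node to a list of its neighbors (a representation chosen purely for this illustration; the traversal used here is the breadth-first search described below):

```python
from collections import deque

def is_connected(graph):
    """Return True if every node is reachable from every other node.

    graph: dict mapping each node to an iterable of its neighbors.
    """
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    # connected exactly when the traversal reached every node
    return len(seen) == len(graph)

print(is_connected({1: [2], 2: [1, 3], 3: [2]}))  # True
print(is_connected({1: [2], 2: [1], 3: []}))      # False: 3 is isolated
```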
Graph search algorithms aim to visit every node of a graph in a certain order, and two fundamental algorithms exist. Visiting a node consists of calling a certain method on that particular node, hence we say a node is "visited" once the method has been called on it. A node is "discovered" when the algorithm recognizes it as a neighbor of a visited node and plans to visit it eventually.
Breadth-first search, commonly known as BFS, is a search algorithm that makes sure every neighbor of a node is visited before moving deeper. Since nodes are discovered before they are visited, we will need a data structure known as a queue.
A queue is a commonly used data structure following the FIFO (first in, first out) rule, meaning that the first element inserted into it is the first element taken out of it.
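In Python, `collections.deque` provides an efficient queue; a quick illustration of the FIFO behavior:

```python
from collections import deque

queue = deque()
queue.append("first")   # enqueue at the back
queue.append("second")
queue.append("third")

print(queue.popleft())  # "first": the first element in is the first one out
print(queue.popleft())  # "second"
```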
We can implement either an iterative or a recursive algorithm; for this article we will content ourselves with the iterative version, which goes as follows:
BFS(G, v):
    Q = empty queue
    mark v discovered
    Q.enqueue(v)
    visit v
    while Q is not empty do:
        v = Q.dequeue()
        for all nodes w adjacent to v in G do:
            if w is not discovered:
                mark w discovered
                Q.enqueue(w)
                visit w
In the worst case, if we denote by $|E|$ the number of edges of the graph and by $|V|$ the number of its nodes, then:
-The time complexity of the algorithm is: $O(|E|+|V|)$
-The space complexity of the algorithm is: $O(|V|)$
We can implement it in Python depending on how the graph is represented. For the adjacency list representation:
from collections import deque

def visit(node):
    # visiting method (placeholder)
    return 0

def BFS(adjacencyList, v):
    # nodes are labeled 1..n, so index v-1 aligns them with Python's 0-based lists
    visited = [False] * len(adjacencyList)
    Q = deque()
    Q.append(v)
    visit(v)
    visited[v - 1] = True
    while Q:
        v = Q.popleft()
        for w in adjacencyList[v - 1]:
            # visited is set at discovery time, so no queue-membership check is needed
            if not visited[w - 1]:
                Q.append(w)
                visit(w)
                visited[w - 1] = True
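As a quick sanity check, we can run the function above on a small graph, with a visit method that simply records the order of visits (the sample graph and the recording method are illustrative assumptions; the function is restated here for completeness):

```python
from collections import deque

order = []

def visit(node):
    order.append(node)

def BFS(adjacencyList, v):
    visited = [False] * len(adjacencyList)
    Q = deque()
    Q.append(v)
    visit(v)
    visited[v - 1] = True
    while Q:
        v = Q.popleft()
        for w in adjacencyList[v - 1]:
            if not visited[w - 1]:
                Q.append(w)
                visit(w)
                visited[w - 1] = True

# undirected graph, 1-indexed: 1--2, 1--3, 2--5, 3--4
adjacency = [[2, 3], [1, 5], [1, 4], [3], [2]]
BFS(adjacency, 1)
# both neighbors of 1 (nodes 2 and 3) come before the deeper nodes 5 and 4
print(order)  # [1, 2, 3, 5, 4]
```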
For the adjacency matrix representation:
def BFS(adjacencyMatrix, v):
    visited = [False] * len(adjacencyMatrix)
    Q = deque()
    Q.append(v)
    visit(v)
    visited[v - 1] = True
    while Q:
        v = Q.popleft()
        for i in range(len(adjacencyMatrix)):
            # row v-1 holds the neighbors of node v; column i corresponds to node i+1
            if adjacencyMatrix[v - 1][i] == 1 and not visited[i]:
                Q.append(i + 1)
                visit(i + 1)
                visited[i] = True
Depth-first search, commonly known as DFS, is a search algorithm that goes as deep as possible by moving to a neighbor of the current node before backtracking. For the same reasons as before, we will need another data structure known as a stack.
A stack is a commonly used data structure following the LIFO (last in, first out) rule, meaning that the last element inserted into it is the first element taken out of it.
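In Python, a plain list already behaves as a stack via append and pop; a quick illustration of the LIFO behavior:

```python
stack = []
stack.append("first")
stack.append("second")
stack.append("third")   # last in

print(stack.pop())  # "third": the last element in is the first one out
print(stack.pop())  # "second"
```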
The DFS algorithm is almost identical to the BFS, except for the choice of a stack instead of a queue:
DFS(G, v):
    S = empty stack
    mark v discovered
    S.push(v)
    visit v
    while S is not empty do:
        v = S.pop()
        for all nodes w adjacent to v in G do:
            if w is not discovered:
                mark w discovered
                S.push(w)
                visit w
The worst-case time and space complexity of this algorithm are the same as for BFS.
We can implement the algorithm in Python accordingly:
For the adjacency list representation:
def DFS(adjacencyList, v):
    visited = [False] * len(adjacencyList)
    S = deque()
    S.append(v)
    visit(v)
    visited[v - 1] = True
    while S:
        v = S.pop()
        for w in adjacencyList[v - 1]:
            if not visited[w - 1]:
                S.append(w)
                visit(w)
                visited[w - 1] = True
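Likewise, the DFS above can be sanity-checked on the same small graph with a recording visit method (again an illustrative assumption; note that this iterative version calls visit at discovery time, so its order can differ from the classic recursive DFS order):

```python
from collections import deque

order = []

def visit(node):
    order.append(node)

def DFS(adjacencyList, v):
    visited = [False] * len(adjacencyList)
    S = deque()
    S.append(v)
    visit(v)
    visited[v - 1] = True
    while S:
        v = S.pop()
        for w in adjacencyList[v - 1]:
            if not visited[w - 1]:
                S.append(w)
                visit(w)
                visited[w - 1] = True

# undirected graph, 1-indexed: 1--2, 1--3, 2--5, 3--4
adjacency = [[2, 3], [1, 5], [1, 4], [3], [2]]
DFS(adjacency, 1)
# node 4, deeper along 3's branch, is reached before node 5 (unlike BFS)
print(order)  # [1, 2, 3, 4, 5]
```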
For the adjacency matrix representation:
def DFS(adjacencyMatrix, v):
    visited = [False] * len(adjacencyMatrix)
    S = deque()
    S.append(v)
    visit(v)
    visited[v - 1] = True
    while S:
        v = S.pop()
        for i in range(len(adjacencyMatrix)):
            if adjacencyMatrix[v - 1][i] == 1 and not visited[i]:
                S.append(i + 1)
                visit(i + 1)
                visited[i] = True
The next article will cover the motivation behind graph theory and shortest-path algorithms!