Graph Search

One of the first motions we would intuitively think of when working with graphs is the idea of fully or partially traversing the nodes of a certain graph, this action is called "Graph search" and it is achieved through various algorithms using the notion of "paths".

As previously mentionned, a graph is a set of nodes and the edges connecting the nodes, we can visualize a path between two nodes as the set of edges that leads us from one node to another, this path obviously isn't necessarily unique nor does it exist for a certain, in fact the characteristic of paths existing for every two nodes in a graph is well defined as "connectivity".

Let us begin by writing the formal definitions of all these notions:

Definitions:

Path:

Let $G = (V, E)$ be a graph.

A path is a sequence of edges $(e_{1}, e_{2}, ..., e_{n-1})$ for which there is a sequence of distinct nodes $(v_{1}, v_{2}, ... , v_{n})$ such that $e_{i} = (v_{i}, v_{i+1})$ for $i \in [|1,n − 1|]$. $(v_{1}, v_{2}, ..., v_{n})$ is the node sequence of the path. If $G$ was a directed graph then our path would be called a directed path.

Let's consider this graph for a moment:

Gex3.png

We can see that there is a path between 2 and 3, mathematically that path would be denoted as :

$(e_{1},e_{2})$ such as $e_{1}=(2,1), e_{2}=(1,3)$

Gex4.png

Connectivity:

$G$ is a connected graph if for every two distinct nodes there exist a path connecting them ie:

$\forall (x,y) \in V, \exists I \in [|1,n|], I={i_{1},...,i_{k}}, (e_{j})_{j \in I} \subset (e_{i})_{i \in [|1,n|]}$, $e_{i_{1}}=x, e_{i_{k}}=y, \forall j \in I, e_{j}=(v_{j},v_{j+1})$

Example:

The graph shown banotherefore is a connected graph considering that there is a path from every node to another, but take this graph for example:

Gex5.png

It is a non connected graph since no path exists between 6 and any other node, although the non-existence of a pth from 6 and just on another node is sufficient to call this graph non connected.

Graph search algorithms;

Graph search alogirthms pertain to visit every node in a graph in a certain order, two fundamental algorithms exist. Visiting a node consists of calling a certain method on this particular node, hence we use the term "visited" once that node had the method called upon it. Discovering a node is when the algorithm recognize the node as a neighbor of a visited node, and plans to visit it eventually.

Breadth-first search:

Breadth-first search commonly known as BFS, is a search algorithm that makes sure every neighbor of a node is visited before moving to the next, but since nodes are discovered before visited, we will have to use a data structure known as queue.

What is a queue ?

A queue is a commonly used data structure following the FIFO (first in, first out) rule, as in that the first element inserted in it, is the first accessible element to take out of it.

Gex6.png

We can implement either an iterative algorithm or a recursive one, for this article we will suffice ourselves with the iterative version:

The iterative algorithm goes as follows:

BFS(G, v): 
 Q = queue
 v explored
 Q.enqueue(v)
 v visited
 while Q is not empty do:
      v = Q.dequeue()
      for all w adjacent nodes to v in G do:
         if w is not explored and not visited: 
            w  explored
            Q.enqueue(w)
            w  visited
Time and space complexity:

In the worst case scenario, if we note $|E|$ the number of the edges of the graph and $|V|$ the number of its nodes then:

-The time complexity of the algorithm is: $O(|E|+|V|)$

-The space complexity of the algorithm is: $O(|V|)$

We can implement it in python depending on how the graph is represented, for the adjacency list:

For the adjacency matrix representation:

Depth-first search:

Depth-first search commonly known as DFS, is a search algorithm that goes as deeper as possible by going to the next neighbor of each node and then backtracks, for the same reasons previously, we will have to use another data structure known as a stack.

What is a stack?

A stack is a commonly used data structure following the LIFO (last in, first out) rule, as in that the last element inserted in it, is the first accessible element to take out of it.

Gex7.png

The algorithm goes as follows:

The algorithm of the DFS is almost similar to the BFS except in the choice of a stack instead of a queue:

DFS(G, v): 
 S = stack
 v explored
 S.add(v)
 v visited
 while S is not empty do:
      v = Q.pop()
      for all w adjacent nodes to v in G do:
         if w is not explored and not visited: 
            w  explored
            S.add(w)
            w  visited
Time and space complexity:

The time and space complexity of this algorithm in the worst case scenario are the same as the BFS.

We can implement the algorithm in Python accordingly:

For the Adjacency List representation:

For the adjacency Matric representation:

Thank you everyone for reading!

Next article will broach on the motivation for graph theory and the shortest path possible algorithms!