CommonLounge Archive

Prim's algorithm for Minimum Spanning Tree

January 24, 2017

Given an undirected weighted graph, a minimum spanning tree (MST) is a subset of the edges of the graph which form a tree and have the minimum total edge weight. For a MST to exist, the graph must be connected (that is, every pair of nodes must be reachable from each other).

Example of Minimum Spanning Tree. Total edge weight = 5 + 8 + 8 + 4 + 11 + 6 = 42

Prim’s algorithm builds the minimum spanning tree one vertex at a time. To see what that means, watch the following video.

Video with step-by-step example

Algorithm steps

  1. Start at an arbitrary node. Mark the node visited. Mark all other nodes unvisited.
  2. While there is an unvisited node
  3. Consider all edges which connect a visited node to an unvisited node. Out of these edges, pick the one with the minimum weight.
  4. Mark the unvisited node corresponding to the edge visited. Add edge to minimum spanning tree.
  5. Done!

Pseudo code

The following is the pseudo code function definition to implement prim’s algorithm in python:

define prims_algorithm(graph):
    # initialization
    unvisited = set(all nodes in graph)
    heap = empty heap
    MST = empty set
    # start at an arbitrary node
    source = choose node at random
    unvisited.remove(source) # mark visited 
    for neighbor of source:
        # add edge to heap with priority=edge weight
        heap.insert_with_priority(edge_weight(source, neighbor), (source, neighbor))
    # while there is an unvisited node
    while not unvisited.empty():
        # if heap is empty, graph is disconnected
        if heap.empty(): return INFINITY
        # get minimum-weight edge to an unvisited node from heap
        # (note: all edges in heap have at-least one end = visited node)
        edge_weight, (previous, node) = heap.pop()
        if node not in unvisited: continue # already visited 
        # found new reachable node. mark visited and update MST
        unvisited.remove(node)
        MST.add((previous, node))
        # add all edges from this node to the heap
        for neighbor of node:
            # add edge to heap with priority=edge weight
            heap.insert_with_priority(edge_weight(node, neighbor), (node, neighbor))
    return MST

Implementation

The following is the python code for prim’s algorithm:

from collections import defaultdict
import heapq
# initialisation
graph = defaultdict(list)
visited = [0]*100
heap = []
n = 4
# adding the following edges
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 4), (1, 3)]
weight = [1, 4, 2, 6, 7, 5, 3];
for i in range (0, len(edges)):
    x = edges[i][0]
    y = edges[i][1]
    w = weight[i];
    #Adding the weighted edges
    graph[x].append((w,y))
    graph[y].append((w,x))
# start at an arbitrary node - here 0
# add edge to heap with edge weight priority
for i in range(0,len(graph[0])):
    heapq.heappush(heap,(graph[0][i],0))
visited[0] = 1
cost = 0
mst = []
# while there is an unvisited node
# here we consider that the given graph is connected
while len(mst) < n:
    # get minimum-weight edge to an unvisited node from heap
    # heap[0] gives the top element in the min heap with edge weight priority
    w = heap[0][0][0]
    previousnode = heap[0][1]
    node = heap[0][0][1]
    heapq.heappop(heap)
    # already visited 
    if visited[node] == 1:
        continue
    # found new reachable node. Mark visited and update MST        
    visited[node] = 1
    cost = cost + w
    mst.append((node, previousnode))
    # add all edges from this node to the heap
    for i in range(0, len(graph[node])):
        heapq.heappush(heap, (graph[node][i], node))
print("Minimum cost of spanning tree is: " + str(cost))
print("Following are the edges in MST: ")
for i in range(0,len(mst)):
    print(str(mst[i][0]) + " " + str(mst[i][1]))

The following is the implementation in Cpp of Prim’s algorithm to build a minimum spanning tree.

#include <bits/stdc++.h>
using namespace std;
int vis[1001];
int main() 
{
  int n = 5, m;
  vector<pair<int, int> >graph[n];
  vector<pair<int, int> >mst;
  pair<int, int>edges[] = { {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}, {1, 4}, {1, 3} };
  int weight[] = {1, 4, 2, 6, 7, 5, 3};
  m = sizeof(weight) / sizeof(weight[0]);
  // Build the adjacency list
  for(int i = 0; i < m; i++)
  {
    graph[edges[i].first].push_back({edges[i].second, weight[i]});
    graph[edges[i].second].push_back({edges[i].first, weight[i]});
  }
  /*
    priority queue in cpp is
    max heap by default. We need 
    min heap for prim's algorithm.
    We store (weight, (u, v))
  */
  priority_queue< pair<int, pair<int, int> >, vector< pair<int, pair<int, int> > >,greater< pair<int, pair<int, int> > > >pq;
  /*
    We can start with any random node.
    WLOG we start with node 0
  */
  for(auto i: graph[0])
    pq.push({i.second, {i.first, 0}});
  vis[0] = 1;
  int cost = 0;
  while(!pq.empty())
  {
    // get minimum-weight edge to an unvisited node from heap
    // pq.top() gives the (weight,(node,previousnode)) with min weight priority
    int w = pq.top().first;
    int node = pq.top().second.first;
    int previousnode = pq.top().second.second;
    pq.pop();   
    // already visited 
    if(vis[node])continue;
    // found new reachable node. Mark visited and update MST
    vis[node] = 1;
    cost += w;
    mst.push_back({node, previousnode});
    // add all edges from this node to the heap
    for(auto i: graph[node])
      pq.push({i.second, {i.first, node}});
  }
  cout << "Minimum cost of spanning tree is: " << cost << endl;
  cout << "Following are the edges in MST: " << endl;
  for(auto i: mst)
    cout << i.first << " " << i.second << endl;
  return 0;
}

Run-time analysis

Given a graph with $n$ nodes and $m$ edges, Prim’s algorithm has a run-time of O($m \log m$). Since $m \leq n^2$, this is also O($m \log n$). The most expensive operation is heap inserts and heap pops, and we do each operation $m$ times (once per edge).

Correctness of Prim’s algorithm

The cut property

The main idea behind both Prim’s algorithm and Kruskal’s algorithm is the cut property.

Consider any subset S of all vertices V of the graph. Consider all edges going from a vertex in S to a vertex outside S. The cut property states that, if $e$ is the edge with minimum weight among all these edges, then $e$ must be part of the minimum spanning tree.

Proof of cut property: The proof is quite simple. Assume the contrary, that is, the MST of the graph does not include $e$. Since the minimum spanning tree spans all vertices, it has at-least one edge $e'$ going from S to outside S. If this edge is replaced by $e$, we get a spanning tree with lesser weight. This is a contradiction, since the original tree was a minimum spanning tree.

Correctness of Prim’s algorithm

In Prim’s algorithm, in each step, we consider a cut and choose the minimum weight edge crossing the cut. Each of these edges must be part of the MST, as proved above.


© 2016-2022. All rights reserved.