CommonLounge Archive

Hands-on: Implementing Minimum Spanning Tree algorithms (Prim's / Kruskal's)

July 04, 2018

Now that we’ve learnt about minimum spanning trees and related algorithms, it’s time to implement them. You can work in any programming language of your choice - Python, C++, Java, or anything else you want. Let’s get started!

Problem Statement

You are given an undirected weighted graph with $n$ nodes and $m$ edges. The nodes are numbered from $0$ to $n-1$. Find the total weight of the minimum spanning tree, as well as one specific minimum spanning tree.

Note that there may be multiple different minimum spanning trees. You need to construct any one of them.

Constraints: $n \leq 10^5$ and $m \leq 10^6$.

IO description

Input description: The input file contains multiple test cases. Each test case begins with “Test case: X”. This is followed by a line which has two numbers, $n$ (the number of nodes) and $m$ (the number of edges). The next $m$ each contain 3 numbers $u$, $v$ and $w$, denoting that there is an edge $(u, v)$ with weight $w$.

Sample input:

Test case: 1
5 8
3 2 123
3 4 4
4 2 9
1 0 125
3 0 189
0 4 40
1 2 51
3 1 10
Test case: 2
11 44
1 5 23
6 8 66
6 2 45
4 8 38
9 1 695
2 10 15
3 6 4
7 3 6
4 7 7
2 5 948
4 9 223
5 8 83
0 9 264
5 6 65
7 10 8
10 0 582
1 3 9
1 6 33
0 5 9
10 4 977
7 6 13
2 4 47
0 6 25
5 3 879
5 4 5
7 1 239
1 4 879
7 9 6
2 1 22
7 8 49
3 2 17
3 0 735
8 1 145
9 5 5
10 3 26
9 2 311
10 5 9
9 8 524
7 0 55
8 3 5
6 9 164
4 0 115
1 10 31
0 1 4

Output description: Output should contain one line per test case. The first number in the line should be the total weight of a minimum spanning tree. This should be followed by $n-1$ comma-separated pairs $(u, v)$, the list of edges which make a minimum spanning tree.

Note that there may be multiple different minimum spanning trees. You need to construct any one of them. The edges can be listed in any order.

If a minimum spanning tree does not exist (that is, if the graph is disconnected), output INF instead (for infinity).

Sample output:

63 3 4, 4 2, 3 1, 0 4
67 3 6, 0 1, 5 4, 9 5, 8 3, 7 3, 7 9, 7 10, 1 3, 2 10

Test case 1 explanation: The edges in the minimum spanning tree are (3, 4), (4, 2), (3, 1) and (0, 4). The total weight is 4 + 9 + 10 + 40 = 63.

Notes

You can implement either Prim’s algorithm or Kruskal’s algorithm (or both). Implementing Prim’s algorithm is easier, since Kruskal’s algorithm requires implementing disjoint-set data-structure.

Prim’s algorithm

  1. Start by creating an adjacency list from input. Make sure to add edges in both directions.
  2. To implement Prim’s algorithm, you don’t need to implement your own heap or priority queue. Most languages have a pre-implemented one.
  3. When adding an edge to the heap / priority queue, add it as (edge_weight, u, v). This will ensure that the least-weight edge comes out from the heap when you do heap.pop(), but it will also retain the information regarding which two nodes $(u, v)$ the edge connects.

Kruskal’s algorithm

  1. No need to create an adjacency list for Kruskal’s. Easiest format to receive input is a list of edges.
  2. You’ll need to implement an efficient disjoint-set data-structure. A python implementation has been provided below.
## this impliments disjoint-set data structure (very efficient!)
# each "set" is a tree, and the "set representative" is the tree root
# hence, two nodes are in the same set if root(u) == root(v)
# initially, everything is in its own set. hence parent(node) = node
parent = range(nn)
size = [1]*nn
def root(node):
    # to find the root, start from the node and keep going to parent[node]. stop when parent[node] = node.
    # in addition, we set "parent[node] = root(node)" so that next time we look for root(node), we'll get there in 1 step!
    if not parent[node] == node:
        parent[node] = root(parent[node])
    return parent[node]
def merge(uu, vv):
    # to merge the sets, we can simply do parent[root(u)] = root(v)
    # to ensure that tree heights are O(log n), we make the root of the smaller tree a child of the root of the larger tree
    # (since a node's root can't change > log n times)
    root_uu, root_vv = root(uu), root(vv)
    assert root_uu != root_vv, (root_uu, root_vv)
    if size[root_uu] < size[root_vv]:
        parent[root_uu] = root_vv
        size[root_uu] += size[root_vv]
    else:
        parent[root_vv] = root_uu
        size[root_vv] += size[root_uu]

Submitting your solution

Once you are confidant of your solution, run your code on the input available here: mst_input. Then submit your answers in the quiz!

Full solution

The solution for this assignment is included at the end of the quiz. My Kruskal’s implementation takes 6 seconds to run for all test cases combined, and Prim’s implementation takes 8 seconds. Note that if you implemented the solution in C++ / Java, your code should be executing about > 5x faster.


© 2016-2022. All rights reserved.