# Merge-sort: Detailed tutorial with Python & C++ implementation

September 15, 2016

Merge sort is an *efficient* and *recursive* algorithm for sorting an array. To sort an array recursively, merge-sort splits the array into two and sorts each half. Then, it uses the **merge** operation to combine the two sorted halfs into a single sorted array.

First, we will see how the merge operation works. Then, we will see how the merge-sort algorithm works as whole.

# Merge operation

Suppose that we already have two sorted arrays, say **A** and **B**. Then the merge operation merges them into a single sorted array **C**.

For example, let’s say we have

```
A = [22, 43, 59]
B = [17, 31, 64]
```

We want to create a single sorted array **C**, so:

`C = [17, 22, 31, 44, 59, 64]`

The merge operation works as follows.

- We initialize two pointers,
**i**and**j**, which point to the first element of the array**A**and array**B**respectively.**C**starts as an empty array. - If
**A[i]**is less than (or equal to)**B[j]**, we add element**A[i]**to array**C**and increment**i**so that it points to the next element in array**A.**Otherwise, we add element**B[j]**to array**C**and increment j so that it points to the next element in array**B**. - Repeat step 2 while we have not exhausted all elements of arrays
**A**and**B**. At some point, one of**i**or**j**will reach the end of the array. When that happens, we add the remaining elements of the other array to**C**.

For our example above, we start with:

```
A = [22, 43, 59] and B = [17, 31, 64]
C = []
i = 0 (points to 22) and j = 0 (points to 17)
```

**Step 2:**

We compare **A[0]** and **B[0]**. **B[0]** = 17 is smaller. So we add 17 to **C** and update **j**.

```
C = [17]
i = 0 (points to 22) and j = 1 (points to 31)
```

**Step 3 = keep repeating above:**

We compare **A[0]** and **B[1]**. **A[0]** = 22 is smaller. So we add 22 to **C** and update **i**.

```
C = [17, 22]
i = 1 (points to 43) and j = 1 (points to 31)
```

We compare **A[1]** and **B[1]**. **B[1]** = 31 is smaller. So we add 31 to **C** and update **j**.

```
C = [17, 22, 31]
i = 1 (points to 43) and j = 2 (points to 64)
```

We compare **A[1]** and **B[2]**. **A[1]** = 43 is smaller. So we add 43 to **C** and update **i**.

```
C = [17, 22, 31, 43]
i = 2 (points to 59) and j = 2 (points to 64)
```

We compare **A[2]** and **B[2]**. **A[2]** = 59 is smaller. So we add 59 to **C** and update **i**.

```
C = [17, 22, 31, 43, 59]
i = 3 (reached end) and j = 2 (points to 64)
```

Now that we have reached the end of **A**, we can just add the remaining elements of **B** to C. Since j = 2, there’s only one remaining element in **B**: 64.

`C = [17, 22, 31, 43, 59, 64]`

After going through all values in

A, we append the remaining elements of arrayBwithout any further comparisons.

Done! We have built the sorted array **C.**

The two arrays to be merged

need notbe of the same sizes. The merge operation can work on arrays of different sizes.

**Run-time:** Each element is processed and put into array **C** only once. Hence, the time complexity of merge operation is **O( n)**, where

*n*is the total number of elements in

**A**and

**B**combined.

# Python implementation of merge operation

```
def merge(left, right):
left_length = len(left)
right_length = len(right)
array = []
i = 0
j = 0
while i < left_length and j < right_length:
if left[i] < right[j]:
array.append(left[i])
i += 1
else:
array.append(right[j])
j += 1
while i < left_length:
array.append(left[i])
i += 1
while j < right_length:
array.append(right[j])
j += 1
return array
a = [22, 43, 59]
b = [17, 31, 64]
print("Initial arrays:", a, b)
print("Merged array:", merge(a, b))
```

# C++ implementation of merge operation

```
void merge(int a[], int b[], int a_size, int b_size)
{
/*
Size of the new array c
will be the sum of a_size and b_size
*/
int c_size = a_size + b_size;
int c[c_size];
// Keep adding the smaller element to C
int i = 0, j = 0, k = 0;
while (i < a_size && j < b_size) // loop_0
{
if (a[i] <= b[j])
c[k++] = a[i++];
else
c[k++] = b[j++];
}
// Add remaining elements from A or B to C. Only *one* of the below loops execute.
while (i < a_size) // loop_1
c[k++] = a[i++];
while (j < b_size) // loop_2
c[k++] = b[j++];
cout << "Sorted array" << endl;
for (int i = 0; i < c_size; i++ )
cout << c[i] << " ";
}
int main()
{
// Initialising the arrays a and b
int size_a = 3, size_b = 3;
int a[] = {22, 43, 59};
int b[] = {17, 31, 64};
cout << "Initial arrays" << endl;
for (int i = 0; i < 3; i++) cout << a[i] << " ";
cout << endl;
for (int i = 0; i < 3; i++) cout << b[i] << " ";
cout << endl;
merge(a, b, size_a, size_b);
return 0;
}
```

In the above code, **exactly one** of `loop_1`

and `loop_2`

will run depending on which of the initial arrays has leftover elements.

# Merge Sort Algorithm

Now that we know how to merge two sorted arrays into a single sorted array, let’s see how merge-sort works.

We want to sort an array with **N** elements. For example, let’s say we have an array of $8$ elements. If we were given two already *sorted* arrays of size $4$ each, then using the merge operation we can obtain a single sorted array of length $8$.

The question that arises here is how do we get the two sorted arrays. This is achieved by **recursively** calling the function `mergeSort()`

on array’s first half and second half. Recall that recursion is a process in which a function calls itself — we learnt about it in this CommonLounge tutorial on Recursion.

Here’s the **pseudo-code** for the above:

```
function mergeSort(array A)
if length(A) <= 1: don't do anything (base case)
recursively sort first-half of A
recursively sort second-half of A
merge sorted halves into single sorted array
```

The `mergeSort()`

function calls itself on the **first-half** and **second-half** of the array **A**. The **base case** for recursion is when we call `mergeSort()`

for an array of size 1, since a **single element array is always sorted**.

From two arrays of size $1$ to get a sorted array of size $2$. From two sorted arrays of size 2, we get a sorted array of size 4. This process repeats itself and we obtain a sorted array of any given length.

Animation for merge-sort

# Python 3 implementation

```
def merge(left, right):
left_length = len(left)
right_length = len(right)
array = []
i = 0
j = 0
while i < left_length and j < right_length:
if left[i] < right[j]:
array.append(left[i])
i += 1
else:
array.append(right[j])
j += 1
while i < left_length:
array.append(left[i])
i += 1
while j < right_length:
array.append(right[j])
j += 1
return array
def merge_sort(array):
length = len(array)
# base case
if length <= 1: return array
# recursion
midpoint = length // 2
left_half = merge_sort(array[:midpoint])
right_half = merge_sort(array[midpoint:])
return merge(left_half, right_half)
array = [10, 20, 5, 8, 7, 2, 15, 12]
print("Initial array:", array)
print("Sorted array:", merge_sort(array))
```

# C++ Implementation

Code for merge-sort in C++:

```
#include <bits/stdc++.h>
using namespace std;
void merge(int arr[], int left, int right)
{
int mid = (left + right) / 2;
int lsize = mid - left + 1, rsize = right - mid;
int pos = left;
int leftarr[lsize];
int rightarr[rsize];
// Create a copy of left and right sorted arrays.
for (int i = 0; i < lsize; i++)
leftarr[i] = arr[left+i];
for (int i = 0; i < rsize; i++)
rightarr[i] = arr[mid+1+i];
// Keep adding the smaller element to C
int i = 0, j = 0;
while (i < lsize && j < rsize)
{
if (leftarr[i] <= rightarr[j])
arr[pos++] = leftarr[i++];
else
arr[pos++] = rightarr[j++];
}
// Add the remaining elements in either of the two arrays.
while (i < lsize)
arr[pos++] = leftarr[i++];
while (j < rsize)
arr[pos++] = rightarr[j++];
}
/*
Function mergeSort recursively call itself on the left half
and right half and then calls merge function to merge the two
sorted halves
*/
void mergeSort(int arr[], int left, int right)
{
// The following if condition is the base case.
if (left == right)
return;
int mid = (left + right) / 2;
mergeSort(arr, left, mid);
mergeSort(arr, mid+1, right);
merge(arr, left, right);
}
int main()
{
int a[] = {10, 20, 5, 8, 7, 2, 15, 12};
int n = 8;
// Display the array
cout << "Initial array: ";
for (int i = 0; i < n; i++)
cout << a[i] << " ";
cout << endl;
/*
Call function mergeSort on input array
mergeSort(array, start, end)
*/
mergeSort(a, 0, n-1);
// Display the array
cout << "Sorted array: ";
for (int i = 0; i < n; i++)
cout << a[i] << " ";
cout << endl;
return 0;
}
```

# Execution flow

Let us understand the execution flow of the merge-sort implementation given above by the following example.

Let the given array be *a* = [8, 6, 4, 2].

`mergeSort(a, 0, 3)`

is called from the*main*function.`mergeSort(a, 0, 1)`

and`mergeSort(a, 2, 3)`

are called.- From
`mergeSort(a, 0, 1)`

, the calls to`mergeSort(a, 0, 0)`

and`mergeSort(a, 1, 1)`

will return as the condition`left == right`

holds true. It then merges them to get [6, 8]. - Similarly, the call the second-half,
`mergeSort(a, 2, 3)`

, would return [2, 4]. - Code execution returns to
`mergeSort(a, 0, 3)`

. The two halves are merged and we get the following array:*a =*[2, 4, 6, 8].

# Run-time Analysis

The easiest way to understand the run-time complexity of merge-sort is to look at the recursion graph.

- Since we keep on dividing $n$ by $2$ in each subsequent layer till we reach a single element array, there are $O(\log n)$ layers in the tree.
- Within each layer, each element appears exactly once. Hence, each layer executes in $O(n)$ operations.

Therefore, the algorithm complexity is $O(n\times \log n)$.

Recursion graph of merge-sort

For example, above you can see the diagram of the recursion graph of merge-sort on an array of size $7$. The graph has roughly $2\times n\log_2 n$ layers.

Each of the layers in the lower half of the graph executes in $O(n)$ operations as the merge operation has linear complexity.

Combining [27, 38] with [3, 43] takes 4 operations, and combining [9, 82] and [10] takes 3 operations. For that entire layer, the number of operations is roughly 7 operations which is basically $O(n)$. Similarly, we can observe for all layers where the merge operation executes.

Hence, the complexity of merge sort is $O(n\log n)$.

For n = 100,000, merge-sort is about 5,000 times faster than an O(n

^{2}) sorting algorithm!

# Video Tutorial / Review

The first 2.5 minutes of the video above explain the merge-sort algorithm. The next 3 minutes (optional) compare merge-sort and quick-sort in one case where quicksort’s pivot choices get lucky, and another where quicksort gets unlucky.

# Summary

In this tutorial you learned about

**merge sort**— a**recursive**algorithm for sorting**merge operation**— is the core of the merge-sort algorithm. it merges two sorted arrays into a single sorted array- References is an
**efficient algorithm**which executes in $O(n\log n)$