CommonLounge Archive

Our first sorting algorithm: Simple Sort

June 29, 2018

In this tutorial, we’ll see a simple sorting algorithm. It will take a sequence of numbers as input, and will return the same numbers in sorted order.

Example

Let’s see an example.

Suppose the input is as follows: 5 10 8 7 3 6 12 2 7

Then the desired output is: 2 3 5 6 7 7 8 10 12

Algorithm

Here is the algorithm that we will use to perform the sorting

Python Code

# simple_sort.py
def simple_sort(array):
    n = len(array)
    for i in range(n):
        # compare array[i] with array[i+1], array[i+2], ... array[n-1]
        for j in range(i+1, n):
            # swap if we find a smaller value
            if array[i] > array[j]:
                array[i], array[j] = array[j], array[i]
    return array
array = [5, 10, 8, 7, 3, 6, 12, 2, 7];
sorted_array = simple_sort(array)
print("Sorted array")
print(sorted_array)

C++ Code

#include <bits/stdc++.h>
using namespace std;
int main() 
{
  int n = 9;
  int array[] = {5, 10, 8, 7, 3, 6, 12, 2, 7};
  for (int i = 0; i < n; i++)
  {
    for (int j = i + 1; j < n; j++)
    {
      if (array[i] > array[j])
        swap(array[i], array[j]);
    }
  }
  cout << "Sorted array: " << endl;
  for (int i = 0; i < n; i++)
    cout << array[i] << " ";
  return 0;
}

The strategy of the algorithm is that in the first iteration of the outer loop, it makes array[0] the minimum value out of all numbers. It achieves this by comparing array[0] to all the remaining elements - array[1] through array[n-1], and swapping if it finds a smaller value. Then, it repeats the same thing for position 1, by comparing to positions 2 through n-1. And so on, for each position.

Example execution

For our sample input, the following sequence of swaps would occur.

5 10 8 7 3 6 12 2 7

Swap 1:

Initially, i = 0. The first swap happens when j = 4, since array[j] = 3 and array[i] = 5. This gives:

3 10 8 7 5 6 12 2 7

Swap 2:

The next swap happens when array[j] = 2, i.e. j = 7. This gives:

2 10 8 7 5 6 12 3 7

Swaps 3-6:

For i = 1, the following 4 swaps happen.

2 8 10 7 5 6 12 3 7

2 7 10 8 5 6 12 3 7

2 5 10 8 7 6 12 3 7

2 3 10 8 7 6 12 5 7

Now, the first 2 elements are sorted. We keep going, and at the end, all elements become sorted.

Time complexity

Simple sort does the innermost loop work about $n^2/2$ times, where $n$ is the length of the array. Hence, it is an O($n^2$) algorithm.

Since computers can perform about 1 billion operations per second, we’d expect simple sort to be fine for input arrays with few thousand elements. Once the array size gets > 10,000, it should start to become quite slow.

Testing Simple Sort (Hands-on)

Use the following grader to test the speed of Simple Sort.

# grader.py
from __future__ import print_function
import datetime
import random
from simple_sort import simple_sort
npassed, nfailed = 0, 0
for itest in range(1, 13):
    print('Test case:', itest)
    nelements = random.randint(int((10**(itest/2.))/2), int(10**(itest/2.)))
    print('Array size:', nelements)
    array = [random.randint(1, 100*nelements) for _ in range(nelements)]
    tic = datetime.datetime.now()
    submission = simple_sort(array)
    toc = datetime.datetime.now()
    answer = sorted(array)
    correct = len(submission) == len(answer) and (submission == answer)
    if correct:
        print('PASSED (Runtime = %.2f seconds)' % (toc-tic).total_seconds())
        npassed += 1
    else:
        print('FAILED')
        nfailed += 1
        print(array)
        print(submission)
        print(answer)
    print('='*100)
print('TOTAL PASSED: %d. TOTAL FAILED: %d.' % (npassed, nfailed))

Save the file above, and then run the following command:

python grader.py

and you should see the results of running the grader.

The following is the output I got:

Test case: 1
Array size: 2
PASSED (Runtime = 0.00 seconds)
====================================================================================================
Test case: 2
Array size: 9
PASSED (Runtime = 0.00 seconds)
====================================================================================================
Test case: 3
Array size: 22
PASSED (Runtime = 0.00 seconds)
====================================================================================================
Test case: 4
Array size: 73
PASSED (Runtime = 0.00 seconds)
====================================================================================================
Test case: 5
Array size: 172
PASSED (Runtime = 0.00 seconds)
====================================================================================================
Test case: 6
Array size: 572
PASSED (Runtime = 0.02 seconds)
====================================================================================================
Test case: 7
Array size: 2867
PASSED (Runtime = 0.39 seconds)
====================================================================================================
Test case: 8
Array size: 7248
PASSED (Runtime = 2.72 seconds)
====================================================================================================
Test case: 9
Array size: 28978
PASSED (Runtime = 46.94 seconds)
====================================================================================================
Test case: 10
Array size: 95679
PASSED (Runtime = 856.32 seconds)
====================================================================================================
Test case: 11
Array size: 258638
...

I stopped waiting at the 11th test case. For the smaller test cases, even though the run-time of the code is increasing, we don’t notice anything since the time taken is so small.

But test case 7 onwards, we can clearly see that making the test case about 3-4x larger, is making the run-time about 10-15x. That is, the run-time is growing with the square of the array length n.

For test case 12, where the array length is about 1,000,000, simple sort will take about 1010850 seconds to execute = 1 day!

Next steps

In the upcoming tutorials, we’ll see O($n \log n$) sorting algorithms. These algorithms will be able to sort sequences of length 1,000,000 within a few seconds, as opposed to only sequences of length 10,000.


© 2016-2022. All rights reserved.