Overview

Sorting an unordered array is one of the most common utility algorithms in interviews.

Sorting algorithms alone might not be challenging enough to be a standalone interview question. Therefore, sorting is often used to assist other algorithms, such as binary search and greedy algorithms. However, some interviewers may ask candidates to implement the Quick Sort algorithm on the spot. If you cannot write it when asked, the interview is likely to end there.

This note briefly records 10 classic sorting algorithms and 2 less common but practical sorting algorithms.

Stability

Many materials and documents mention the concept of stability when discussing sorting. What is stability? What is its use?

Stability

Stability refers to whether the relative order of equal elements changes after sorting.

For example, in the array [3, 7, 3', 2], if after sorting, the order of 3 and 3' changes, then the sorting is unstable.

Significance of Stability

  • It can save work: if the data is already ordered by a secondary key, a stable sort by the primary key preserves that order, so no extra sorting pass is needed.
  • When sorting by one key and then by another, the result of the first sort is preserved among elements that compare equal under the second key, which makes multi-key sorting possible as a sequence of single-key sorts.

Stable sorting algorithms:

  • Radix Sort, Counting Sort, Insertion Sort, Bubble Sort, Merge Sort

Unstable sorting algorithms:

  • Selection Sort, Heap Sort, Quick Sort, Shell Sort

From a beginner's perspective, the stability of a sorting algorithm might not seem important, which is understandable (as I used to think so). Beginners might focus more on the process of a sorting algorithm. Having practiced many LeetCode problems without much real-world experience, one might mistakenly believe that time complexity and space complexity are the only two metrics to evaluate sorting algorithm performance.

However, in some application scenarios, stability is very important. For instance, when the original order of the target array has significance, an unstable algorithm might disrupt this order.

E.g.

"If a class of students is already sorted by student ID, and now they need to be sorted by age from youngest to oldest. If the ages are the same, they must be sorted by student ID in ascending order. "

In this case, if an unstable sorting algorithm is used to sort by age, the order of student IDs in a group of students with the same age will be messed up, requiring the student ID sort to be repeated. But if the sorting algorithm is stable, sorting by age alone will complete the task."
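For instance, Python's built-in sort is stable (it is Timsort, covered later in this note), so the second sort alone suffices. A small sketch with made-up records:

students = [(1, 15), (2, 14), (3, 15), (4, 14)]  # hypothetical (student_id, age) records, already ordered by ID

# sorted() is stable, so students with equal ages keep their student-ID order.
by_age = sorted(students, key=lambda s: s[1])
print(by_age)  # [(2, 14), (4, 14), (1, 15), (3, 15)]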

Algorithm 1: Selection Sort

Selection Sort is the simplest and most intuitive sorting algorithm. It works by finding the i-th smallest element and swapping it with the element at the i-th position. As long as i increases monotonically from 0 to n-1, the array ends up sorted.

Stability

Because Selection Sort swaps the selected minimum into position i, an element can jump over equal elements. For example, when sorting [3, 3', 2], the first swap exchanges 3 and 2 and places 3 after 3'. Selection Sort is therefore an unstable sorting algorithm.

Time Complexity

O(n^2)

Code Example (Python3)

  def selection_sort(arr, n):
      for i in range(n):
          min_idx = i
          for j in range(i + 1, n):
              if arr[j] < arr[min_idx]:
                  min_idx = j
          arr[i], arr[min_idx] = arr[min_idx], arr[i]

Algorithm 2: Bubble Sort

Bubble Sort is one of the simplest sorting algorithms. It is named Bubble Sort because its sorting logic makes the smaller/larger elements "bubble up" to the top of the array, like bubbles rising in water.

Execution Process

Continuously check two adjacent elements, and if the front element and the back element violate the sorting principle, swap the two elements.

Stopping Condition: When no adjacent elements need to be swapped in a round of traversal, the sorting is complete.

Sorting Logic: The i-th round of scanning places the i-th largest/smallest element in the correct position.

Stability

Bubble Sort is a stable sorting algorithm.

Time Complexity

If the array is already sorted, Bubble Sort is equivalent to just traversing the array without performing any swaps, so the complexity is O(n).

In the worst case, the complexity is O(n^2).

Code Example (Python3)

def bubble_sort(a, n):
    flag = True
    while flag:
        flag = False
        for i in range(n - 1):  # compare adjacent pairs a[i] and a[i + 1]
            if a[i] > a[i + 1]:
                flag = True
                a[i], a[i + 1] = a[i + 1], a[i]

Bubble Sort VS Selection Sort

Firstly, from the perspective of sorting logic, Bubble Sort and Selection Sort share the same logic but execute in reverse order (Selection Sort sorts the left part of the array first, while Bubble Sort sorts the right part first).

The main difference between the two lies in their comparison and swap mechanisms, creating a classic tradeoff between stability and the number of swaps.

  • Bubble Sort relies on adjacent comparison and swap, ensuring stability but may require more rounds of swaps.
  • Selection Sort finds the extreme value in the unsorted part through traversal and then swaps, reducing the number of swaps but sacrificing stability.

Algorithm 3: Insertion Sort

Insertion sort is also a very intuitive and simple sorting algorithm. For those who often play cards (like me), insertion sort is the easiest algorithm to understand because the process of arranging cards in hand after picking them up from the table is essentially insertion sort: pick up a card from the table and insert it into the correct position among the cards in your hand, then repeat with the next card.

Abstractly, the working principle of insertion sort can be summarized as: Divide the elements to be sorted into two parts: "sorted" and "unsorted". Each time, pick an element from the "unsorted" part and insert it into the correct position in the "sorted" part.

(Figure: insertion sort process)

Stability

Insertion sort is a stable sorting algorithm.

Time Complexity

If the array is already sorted or nearly sorted, the efficiency is very high: each element is compared only with its immediate predecessor and stays in place, so insertion sort reduces to a single pass over the array and the complexity is O(n).

In the worst case, the complexity is: O(n^2)

Code Example (Python3)

def insertion_sort(arr, n):
    for i in range(1, n):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j = j - 1
        arr[j + 1] = key

Advanced: Binary Insertion Sort

Binary insertion sort optimizes insertion sort by using binary search to locate each insertion position; the improvement is more noticeable when sorting a large number of elements.

However, since the basic idea is the same as direct insertion sort, binary search only reduces the number of comparisons; the element moves still dominate, so the overall time complexity remains unchanged.

Code Example (Python3)

import bisect

def binary_insertion_sort(arr, n):
    for i in range(1, n):
        key = arr[i]
        # Binary-search the sorted prefix arr[:i] for the insertion point.
        index = bisect.bisect_right(arr, key, 0, i)
        # Shift the tail of the sorted prefix one slot to the right, then insert.
        arr[index + 1:i + 1] = arr[index:i]
        arr[index] = key

Algorithm 4: Counting Sort

Counting sort is a linear time sorting algorithm. The linear time complexity is achieved by trading off space. An additional array, let's call it C, is used during the counting sort process, while the original array is referred to as A. During the traversal of A, the value of A[i] is used as an index to find the corresponding position in C, and then increment the count at that position.

By doing this, all the values are naturally sorted just by traversing the C array by index. Finally, make sure to handle duplicate values correctly by counting them to ensure they are not missed.

Stability

Counting sort is a stable sorting algorithm.

Time Complexity

The time complexity of counting sort is linear.

In the worst case, the complexity is: O(n + w), where w is the size of the auxiliary array.

Code Example (Python3)

min_value, max_value = 0, 100       # assumed key range [0, 100]
cnt = [0] * (max_value + 1)         # one counter per possible key

def counting_sort(arr, n):
    for i in range(n):
        cnt[arr[i]] += 1
    sorted_index = 0
    # Walk the counters in key order and write each key back as many times as it occurred.
    for i in range(max_value + 1):
        while cnt[i] > 0:
            arr[sorted_index] = i
            cnt[i] -= 1
            sorted_index += 1
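The version above rebuilds the array from the counts, which is fine for bare integers but discards the identity of equal elements. The stability claimed earlier comes from the textbook variant that turns the counts into prefix sums and fills an output array from right to left; a sketch, assuming non-negative keys no larger than max_value:

def counting_sort_stable(arr, max_value):
    cnt = [0] * (max_value + 1)
    for x in arr:
        cnt[x] += 1
    # Prefix sums: cnt[v] becomes the number of elements <= v.
    for v in range(1, max_value + 1):
        cnt[v] += cnt[v - 1]
    out = [0] * len(arr)
    # Scan from right to left so equal keys keep their original relative order.
    for x in reversed(arr):
        cnt[x] -= 1
        out[cnt[x]] = x
    return out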

Algorithm 5: Radix Sort

The counting sort discussed in the previous section has a critical flaw: the size of the auxiliary array is completely constrained by the maximum and minimum values. For example, if you perform a counting sort on the array [100000,1], you would need to create an auxiliary array of at least 100001 in length, which is very wasteful. Radix sort addresses this problem to some extent.

Radix sort is a non-comparative integer sorting algorithm that works by cutting the integers into different digits and then comparing them digit by digit. In this way, the size of the auxiliary array can be controlled. This controllable auxiliary array is also called a "bucket." The concept of "buckets" is utilized in radix sort, counting sort, and the bucket sort we will discuss later.

Since integers can also represent strings (like names or dates) and certain formats of floating-point numbers, radix sort is not limited to integers.

E.g., Perform radix sort on an array with a maximum value of 50.

In the first round, establish a "bucket" with a range of [0, 9] and sort the elements by the units digit.

(Figure: radix sort, round 1: bucketing by the units digit)

In the second round, establish another "bucket" with a range of [0, 9] and sort the elements by the tens digit, finally completing the overall sorting.

(Figure: radix sort, round 2: bucketing by the tens digit)

Types of Radix Sort

  • If the comparison is made from the 1st digit to the k-th digit, this type of radix sort is called MSD (Most Significant Digit first) radix sort.
  • If the comparison is made from the k-th digit to the 1st digit, this type of radix sort is called LSD (Least Significant Digit first) radix sort.

Stability

If you look at the sort on any single digit, it is, like counting sort, stable. The question is whether later rounds preserve the order established by earlier rounds. The rule can be stated as follows: if the underlying per-digit sort is stable, then both MSD radix sort and LSD radix sort are stable sorting algorithms.

Time Complexity

Since the range of digit values is difficult to define, the worst-case scenario is too broad to discuss, but it can generally be understood as O(kn + kw), where k is the number of digits, and w is the range of the digit values (assuming each digit has the same range of values).

When the number of digits k is small relative to log n, radix sort is generally faster than comparison-based sorting algorithms such as quicksort.

Code Example (Python3)

def radix_sort(arr, max_digit):
    mod = 10
    dev = 1
    for i in range(max_digit):
        # Ten buckets, one per possible value of the current digit.
        counter = [[] for _ in range(10)]
        for num in arr:
            bucket = (num % mod) // dev  # extract the i-th lowest digit
            counter[bucket].append(num)
        # Collect the buckets back into the array in order (stable within each round).
        pos = 0
        for bucket in counter:
            for num in bucket:
                arr[pos] = num
                pos += 1
        dev *= 10
        mod *= 10
    return arr
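A quick usage check, reusing the [100000, 1] example from the counting sort discussion (100000 has 6 digits, so max_digit is 6):

print(radix_sort([100000, 1], 6))  # [1, 100000]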

Algorithm 6: Bucket Sort

As mentioned in radix sort, the concept of "buckets" is used, and the basic sorting idea is consistent. Bucket sort is more suitable when the data to be sorted has a large range of values but is distributed relatively evenly.

Execution Process

Bucket sort is carried out in the following steps:

  1. Set up a fixed number of arrays to act as empty buckets.
  2. Traverse the sequence, placing each element into the corresponding bucket.
  3. Sort each non-empty bucket.
  4. Place the elements from the non-empty buckets back into the original sequence.

Stability

Like radix sort, if the sorting within each bucket is stable, then bucket sort is also a stable sort.

However, this is based on certain assumptions, because unlike radix sort, bucket sort may alter the order of elements when placing them into buckets. Generally, if each bucket contains only a few elements, insertion sort can be used, making bucket sort a stable sorting algorithm in such cases.

Time Complexity

Overall, the time complexity is approximately O(n + n^2/k + k) (distributing the n elements into k buckets + sorting within the buckets + collecting the results)

  • When n is much larger than k, it approaches O(n^2).
  • When k is much larger than n (which is rare), it approaches O(k).
  • When n is approximately equal to k, the time complexity approaches linear, O(n).
  • The worst-case time complexity of bucket sort is O(n^2).

Code Example (Python3)

def insertion_sort(A):
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key


def bucket_sort(a):
    n = len(a)
    if n == 0:
        return
    w = max(a)                    # upper bound of the key range (assumes non-negative keys)
    bucket_size = w // n + 1      # width of each bucket
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[x // bucket_size].append(x)
    p = 0
    for b in buckets:
        insertion_sort(b)         # sort each (small) bucket with insertion sort
        for x in b:
            a[p] = x
            p += 1
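A quick usage check of the version above (the data is made up):

data = [29, 25, 3, 49, 9, 37, 21, 43]
bucket_sort(data)
print(data)  # [3, 9, 21, 25, 29, 37, 43, 49]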

Radix Sort vs Counting Sort vs Bucket Sort

  • Radix Sort: Distributes the elements into buckets based on each digit of the key.
  • Counting Sort: Each bucket stores only a single key value.
  • Bucket Sort: Each bucket stores a range of values.

Algorithm 7: Merge Sort

Merge sort is a sorting algorithm that often discourages beginners because it is based on the divide-and-conquer strategy to sort array segments and then merge them. Therefore, mastering this algorithm requires a slightly higher level of understanding.

Execution Process

To implement merge sort using the divide-and-conquer method, you can follow these steps during recursion:

  1. If the array length is 1, the array is already sorted, so no further division is needed.
  2. If the array length is greater than 1, the array is likely unsorted. At this point, divide the array into two segments, and then check if each segment is sorted (using step 1).
    • If sorted, merge them into one sorted array.
    • If not sorted, repeat step 2 on the unsorted segment, then merge.

Division

To ensure the complexity of the sort, the array is usually divided into two segments of approximately equal length: mid = floor((l+r) / 2)

Merge

From an implementation perspective, the core part of merge sort is the merge process.

The goal is to merge two sorted arrays a and b into one sorted array c.

This can be easily achieved with the two-pointer method: walk a and b from left to right with pointers i and j, append the smaller of a[i] and b[j] to c at each step, and repeat until one of the arrays is exhausted; then append the remaining elements of the other array to c.

Stability

As long as the merge step takes from the left array when the two candidates are equal (i.e., it compares with <=), equal elements from the right half are never placed before those from the left half. In that case, merge sort is stable.

Time Complexity

The time complexity is O(nlog(n)) in the best, average, and worst cases.

Code Example (Python3)

def merge(a, b):
    i, j = 0, 0
    c = []
    while i < len(a) and j < len(b):
        if b[j] < a[i]:
            c.append(b[j])
            j += 1
        else:
            c.append(a[i])
            i += 1
    c.extend(a[i:])
    c.extend(b[j:])
    return c

def merge_sort(a, ll, rr):
    if rr - ll <= 1:
        return
    mid = (rr + ll) // 2
    merge_sort(a, ll, mid)
    merge_sort(a, mid, rr)
    a[ll:rr] = merge(a[ll:mid], a[mid:rr])
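Note that merge_sort above sorts the half-open slice a[ll:rr) in place, with rr exclusive; a quick usage check:

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data))
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]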

Algorithm 8: Quick Sort

Quick sort, also known as "partition-exchange sort," is currently one of the most widely used sorting algorithms. It is also one of the most likely sorting algorithms to be tested in interviews.

Execution Process

Similar to merge sort, the working principle of quick sort is also to sort an array through the divide and conquer method.

However, unlike merge sort, the first step is not to directly divide the sequence into two parts; instead, the relative size relationship must be ensured during the division process.

Specifically, the first step is to divide the sequence into two parts while ensuring that the numbers in the first sub-sequence are smaller than those in the second sub-sequence. To ensure average time complexity, a random number m is usually chosen as the boundary between the two sub-sequences (thus, the elements in the array can be quickly categorized: greater than m or less than m).

Next, maintain two pointers, p and q, starting from the front and the back respectively. Each pointer checks whether its current element is on the correct side of the boundary. If not (for example, the back pointer q finds a number smaller than m), swap the elements at p and q and advance p by one position; if the current elements are on the correct side, simply advance the pointers. Continue until the two pointers meet.

In fact, quick sort does not specify exactly how to implement the first step. There are multiple ways to choose m and perform the division.

Quick sort involves three steps:

  1. Divide the sequence into two parts (ensuring the relative size relationship).
  2. Recursively perform quick sort on the two sub-sequences.
  3. No merging is required because the sequence is already fully sorted by this point.

(Figure: quick sort animation, from runoob.com)

Stability

Quick sort is a non-stable sorting algorithm.

Time Complexity

  • The best-case time complexity is: O(nlog(n)) (when the selected pivot is always the median of the sequence).
  • The average time complexity is: O(nlog(n))
  • The worst-case time complexity is: O(n^2) (when the selected pivot is always the extreme value of the sequence).

Code Example (Python3)

def quick_sort(alist, first, last):
    if first >= last:
        return
    mid_value = alist[first]  # pivot: the first element leaves a "hole" at index first
    low = first
    high = last
    while low < high:
        # From the right, find an element smaller than the pivot and drop it into the hole.
        while low < high and alist[high] >= mid_value:
            high -= 1
        alist[low] = alist[high]
        # From the left, find an element not smaller than the pivot and drop it into the new hole.
        while low < high and alist[low] < mid_value:
            low += 1
        alist[high] = alist[low]
    alist[low] = mid_value  # the pivot fills the final hole
    quick_sort(alist, first, low - 1)
    quick_sort(alist, low + 1, last)
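The description above mentions choosing a random element m as the boundary, while the code uses the first element as the pivot, so a sorted input triggers the O(n^2) worst case. A minimal randomized variant (a sketch, not the canonical implementation) simply swaps a random element to the front before partitioning:

import random

def quick_sort_random(alist, first, last):
    if first >= last:
        return
    # Swap a randomly chosen element to the front so the partition below uses a random pivot.
    r = random.randint(first, last)
    alist[first], alist[r] = alist[r], alist[first]
    mid_value = alist[first]
    low, high = first, last
    while low < high:
        while low < high and alist[high] >= mid_value:
            high -= 1
        alist[low] = alist[high]
        while low < high and alist[low] < mid_value:
            low += 1
        alist[high] = alist[low]
    alist[low] = mid_value
    quick_sort_random(alist, first, low - 1)
    quick_sort_random(alist, low + 1, last)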

Algorithm 9: Heap Sort

Heap sort is a sorting algorithm built on the binary heap data structure. At first it may seem slightly off-topic, since this note organizes sorting algorithms and is nominally part of the "Array Series". However, a heap can be built directly on top of an array, and heap sort is essentially selection sort driven by a heap, so nothing prevents it from sorting arrays; it only adds the steps of building the heap and writing the result back into the array.

Execution Process

  • First, build a max-heap or min-heap based on the sorting requirement. Then, remove the top element of the heap, which is the maximum value, swap it with the element at the end of the array, and maintain the properties of the remaining heap.
  • Next, remove the top element of the heap, which is the second-largest value, swap it with the element at the second-to-last position in the array, and maintain the properties of the remaining heap.
  • Repeat this process; after n-1 such operations the heap is exhausted and the array is naturally sorted.

About "Heap"

I do not plan to elaborate on the binary heap details in the sorting algorithms section, as I intend to systematically organize heaps and stacks in another section.

However, I can briefly mention why binary heaps do not require a specific tree structure and can be created directly in an array.

By numbering the nodes of a binary heap from top to bottom and left to right, starting at 0, the position of a node's parent and children is fully determined by the node's own index, so the tree can be navigated with simple arithmetic on array indices.

These relationships are:

iParent(i) = (i - 1) / 2;
iLeftChild(i) = 2 * i + 1;
iRightChild(i) = 2 * i + 2;

Stability

Like selection sort, due to the swapping operations, heap sort is a non-stable sorting algorithm.

Time Complexity

For binary heaps, the time complexity is directly O(nlogn)

Code Example (Python3)

def sift_down(arr, start, end):
    # Restore the max-heap property for the subtree rooted at `start`,
    # considering only elements up to index `end`.
    parent = start
    child = parent * 2 + 1
    while child <= end:
        # Pick the larger of the two children.
        if child + 1 <= end and arr[child] < arr[child + 1]:
            child += 1
        if arr[parent] >= arr[child]:
            return
        arr[parent], arr[child] = arr[child], arr[parent]
        parent = child
        child = parent * 2 + 1

def heap_sort(arr, n):
    # Build a max-heap by sifting down from the last parent to the root.
    i = (n - 1 - 1) // 2
    while i >= 0:
        sift_down(arr, i, n - 1)
        i -= 1
    # Repeatedly move the current maximum to the end of the unsorted region.
    i = n - 1
    while i > 0:
        arr[0], arr[i] = arr[i], arr[0]
        sift_down(arr, 0, i - 1)
        i -= 1
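A quick usage check of the two functions above:

data = [4, 10, 3, 5, 1]
heap_sort(data, len(data))
print(data)  # [1, 3, 4, 5, 10]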

Algorithm 10: Shell Sort

Shell sort, also known as the diminishing increment sort, is an improved version of insertion sort. Published by Donald Shell in 1959, it was one of the first algorithms to break through the O(n^2) barrier.

The inventor, Donald Shell, proposed the improvement method based on the following properties of insertion sort:

  • Insertion sort is efficient for nearly sorted data, achieving linear sorting efficiency.
  • However, insertion sort is generally inefficient because it can only move data one position at a time.

Execution Process

Shell sort groups records by a certain increment of their indices, performs insertion sort within each group, and then gradually reduces the increment. As the increment shrinks, each group contains more elements; when the increment reaches 1, the whole sequence forms a single group and the algorithm terminates.

Sorting compares and moves non-adjacent records:

  1. Divide the sequence to be sorted into several subsequences (where the elements of each subsequence are spaced equally in the original array).
  2. Perform insertion sort on these subsequences.
  3. Reduce the spacing between elements in each subsequence and repeat the process until the spacing is reduced to 1.

Flowchart

(Figure: Shell sort flowchart)

Stability

Shell sort is a non-stable sorting algorithm.

Time Complexity

The best-case time complexity of Shell sort is O(n).

However, the average time complexity and worst-case time complexity of Shell sort depend on the choice of the increment sequence:

  • If the increment sequence is not chosen properly, the complexity of the sorting algorithm can degrade to O(n^2).

Shell sort itself is not complex, but the analysis of its time complexity is, and it was worked out over many years after Shell published the algorithm in 1959. The detailed proofs, along with the lemmas and theorems they rely on, are not elaborated here; interested readers can consult the literature on increment (gap) sequences.

Code Example (Python3)

def shell_sort(arr):
    n = len(arr)
    gap = 1
    while gap < n / 3:              # build the increment sequence 1, 4, 13, 40, ...
        gap = gap * 3 + 1
    while gap > 0:
        # Gapped insertion sort for the current increment.
        for i in range(gap, n):
            temp = arr[i]
            j = i - gap
            while j >= 0 and arr[j] > temp:
                arr[j + gap] = arr[j]
                j -= gap
            arr[j + gap] = temp
        gap //= 3                   # shrink the increment
    return arr

Algorithm 11: Tournament Sort

Tournament sort is named after the single-elimination tournament format, in which many participants compete in pairs and the winner of each pair advances to the next round. This elimination process can be represented as a complete binary tree, and the overall winner, the number 1, ends up at the root.

At first glance, it is quite similar to heap sort, but there are significant differences. In heap sort, after removing the top element, the new top is the second largest/smallest element. However, in tournament sort, the player eliminated in the final round is not necessarily the second best, as they might be worse than previously eliminated players (e.g., the second player might have been eliminated in the first round by the first player).

Therefore, tournament sort, also known as tree-based selection sort, actually uses the idea of heap sort (priority queue) on top of selection sort to find the next element to be selected. It is an optimized version of selection sort and a variant of heap sort (both use a complete binary tree).

Execution Process

For example, with the array [7, 2, 3, 4, 6, 10, 8, 5], performing a minimum tournament sort would result in a tournament tree as shown below:

(Figure: tournament sort, round 1)

Clearly, 2 is the smallest element and can be removed first. After one "tournament," the selected element needs to be removed. Set it to +INF (for maximum sorting use -INF), then hold another "tournament" to find the next smallest element.

(Figure: tournament sort, round 2)

Repeat this process until all elements are sorted.

Stability

Tournament sort is a non-stable sorting algorithm.

Time Complexity

The best, average, and worst-case time complexities are all O(nlogn).

Initializing the tournament tree requires O(n) time, and each "tournament" takes O(logn) time, repeated n times.

The concept is not complex, but an implementation needs routines for initializing the tree, playing a match, and replaying matches after a winner is removed; a rough sketch follows.
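Below is a minimal min-tournament ("winner tree") sort in Python. It is my own sketch, not code from the source: it assumes the tree is stored in a 1-indexed array with the players padded to a power-of-two number of leaves, and it returns a new sorted list.

import math

def tournament_sort(arr):
    n = len(arr)
    if n <= 1:
        return list(arr)
    INF = float("inf")
    size = 1 << math.ceil(math.log2(n))   # number of leaves, padded to a power of two
    tree = [INF] * (2 * size)             # 1-indexed winner tree; leaves live at size .. 2*size-1
    tree[size:size + n] = arr
    # Initialization: play every match bottom-up; each parent holds the smaller child.
    for i in range(size - 1, 0, -1):
        tree[i] = min(tree[2 * i], tree[2 * i + 1])
    result = []
    for _ in range(n):
        winner = tree[1]                  # the root holds the current minimum
        result.append(winner)
        # Find the leaf that produced the winner and knock it out with +INF.
        i = 1
        while i < size:
            i = 2 * i if tree[2 * i] == winner else 2 * i + 1
        tree[i] = INF
        # Replay matches only along the path from that leaf back to the root.
        i //= 2
        while i >= 1:
            tree[i] = min(tree[2 * i], tree[2 * i + 1])
            i //= 2
    return result

print(tournament_sort([7, 2, 3, 4, 6, 10, 8, 5]))  # [2, 3, 4, 5, 6, 7, 8, 10]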

Algorithm 12: Tim Sort

Tim sort is a hybrid of merge sort and insertion sort. It is currently one of the most sophisticated sorting algorithms I have encountered. Developed by Tim Peters in 2002 using Python, Tim sort is the standard sorting algorithm in Python, and it is also used by Java SE7 for sorting arrays of non-primitive types.

Execution Process

Merge sort first divides the array into two parts, then recursively performs merge sort on the two subarrays, and finally merges the two subarrays. The smallest unit of merge sort operations is a single element.

However, there may already be contiguous and sorted subarrays within the array, which merge sort cannot utilize.

RUN Units

Tim sort takes advantage of the contiguous and sorted subarrays by using RUN as the smallest unit for merge operations. A RUN is a subarray with the following properties:

  • A RUN is either non-decreasing or strictly decreasing (a strictly decreasing run is reversed in place so that it becomes non-decreasing).
  • A RUN has a minimum length requirement.

Tim sort operates similarly to merge sort by dividing the array into multiple RUNs and then repeatedly merging two RUNs according to certain rules until the array is sorted.

Stability

Tim sort is a stable sorting algorithm.

Time Complexity

  • The best-case time complexity is O(n).
  • The worst-case time complexity is O(nlogn).

Code Example (Java)

The following is a simplified Java implementation in the same spirit (fixed-size initial runs sorted with insertion sort, then bottom-up merging); it assumes a SortAlgorithm interface, an InsertionSort helper with a range-based sort method, and a less comparison helper defined elsewhere.

class TimSort implements SortAlgorithm {
    private static final int SUB_ARRAY_SIZE = 32;
    private Comparable[] aux;

    @Override
    public <T extends Comparable<T>> T[] sort(T[] a) {
        int n = a.length;
        InsertionSort insertionSort = new InsertionSort();
        for (int i = 0; i < n; i += SUB_ARRAY_SIZE) {
            insertionSort.sort(a, i, Math.min(i + SUB_ARRAY_SIZE, n));
        }
        aux = new Comparable[n];
        for (int sz = SUB_ARRAY_SIZE; sz < n; sz = sz + sz) {
            for (int lo = 0; lo < n - sz; lo += sz + sz) {
                merge(a, lo, lo + sz - 1, Math.min(lo + sz + sz - 1, n - 1));
            }
        }
        return a;
    }

    private <T extends Comparable<T>> void merge(T[] a, int lo, int mid, int hi) {
        int i = lo;
        int j = mid + 1;
        System.arraycopy(a, lo, aux, lo, hi + 1 - lo);
        for (int k = lo; k <= hi; k++) {
            if (j > hi) {
                a[k] = (T) aux[i++];
            } else if (i > mid) {
                a[k] = (T) aux[j++];
            } else if (less(aux[j], aux[i])) {
                a[k] = (T) aux[j++];
            } else {
                a[k] = (T) aux[i++];
            }
        }
    }
}
