Sorting
Sorting algorithms come up so frequently that they deserve their own section.
In general, we want to solve the problem:
Given an array of unsorted values, how can we sort them in ascending order?
Algorithm | Best Case | Average Case | Worst Case |
---|---|---|---|
Bubble sort | \(N^2\) | \(N^2\) | \(N^2\) |
Selection sort | \(\frac{N^2}{2}\) | \(\frac{N^2}{2}\) | \(\frac{N^2}{2}\) |
Insertion sort | \(N\) | \(\frac{N^2}{2}\) | \(N^2\) |
Quicksort | \(N \log N\) | \(N \log N\) | \(N^2\) |
Merge sort | \(N \log N\) | \(N \log N\) | \(N \log N\) |
Heap sort | \(N \log N\) | \(N \log N\) | \(N \log N\) |
Counting sort | \(N + K\) | \(N + K\) | \(N + K\) |
Radix sort | \(N \cdot D\) | \(N \cdot D\) | \(N \cdot D\) |
Sorting is often a prerequisite for other algorithms, so presorting the array is a common pre-processing step. This means the efficiency of the sort matters to the overall efficiency of many other problem types.
1. Bubble Sort
1.1. Algorithm
We “bubble up” the next highest unsorted number on each pass-through.
- Start with 2 pointers pointing at the first two values in the array.
- Compare the left and right values. If left_value > right_value, swap them. Otherwise, do nothing.
- Move both pointers one cell rightwards.
- Repeat Steps 2-3 until we reach values that have already been sorted. This completes the first “pass-through” and means we have “bubbled up” the biggest number to the end of the array.
- Start over: repeat Steps 1-4 to bubble up the second biggest number into the second-to-last position. Repeat this until we perform a pass-through with no swaps.
1.2. Implementation
def bubble_sort(array):
    """Bubble sort algorithm to sort an array into ascending order.

    Note we ignore edge cases for the sake of clarity.
    These are left as an exercise for the reader:
    - array is empty
    - array has only 1 element

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    array: list
        The input array sorted in-place.
    """
    # Initially the entire array is unsorted
    last_unsorted_index = len(array) - 1
    is_sorted = False

    while not is_sorted:
        # Set this to True before we pass through the elements;
        # if we need to perform a swap, the array was not sorted, so we set it back to False
        is_sorted = True

        # Perform a pass-through
        for left_pointer in range(last_unsorted_index):
            right_pointer = left_pointer + 1

            if array[left_pointer] > array[right_pointer]:
                # Swap the values
                array[left_pointer], array[right_pointer] = array[right_pointer], array[left_pointer]
                is_sorted = False

        # The pass-through is finished, so the next highest value has been "bubbled up"
        last_unsorted_index -= 1

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
bubble_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
1.3. Complexity
Each pass-through loops through one fewer element:
Pass-through number | Number of operations |
---|---|
1 | N-1 |
2 | N-2 |
3 | N-3 |
4 | N-4 |
5 | N-5 |
… | … |
k | N-k |
So in total, there are \((N-1) + (N-2) + (N-3) + ... + 1\) comparisons. This is the sum of an arithmetic progression, which works out to \(\frac{N(N-1)}{2} \approx \frac{N^2}{2}\).
It is also worth noting that in the worst case - an input array in descending order - each comparison will also result in a swap. This does not affect the Big-O complexity but would slow down the run time in practice. There are \(\frac{N^2}{2}\) comparisons and up to \(\frac{N^2}{2}\) swaps, resulting in \(O(N^2)\) total operations.
The complexity is therefore \(O(N^2)\).
In general, any nested loop should be a hint at quadratic time complexity.
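One way to sanity-check the quadratic growth is to count operations directly. The helper below is a hypothetical sketch (not part of the implementation above) that counts comparisons on worst-case, descending inputs:
def count_bubble_comparisons(array):
    """Hypothetical helper: bubble sort instrumented to count comparisons."""
    comparisons = 0
    last_unsorted_index = len(array) - 1
    is_sorted = False
    while not is_sorted:
        is_sorted = True
        for left in range(last_unsorted_index):
            comparisons += 1
            if array[left] > array[left + 1]:
                array[left], array[left + 1] = array[left + 1], array[left]
                is_sorted = False
        last_unsorted_index -= 1
    return comparisons

for n in [10, 20, 40]:
    print(n, count_bubble_comparisons(list(range(n, 0, -1))))
Doubling \(N\) roughly quadruples the comparison count, as expected for \(\frac{N^2}{2}\).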
2. Selection Sort
2.1. Algorithm
Find the smallest value from the unsorted part of the array and put it at the beginning.
- Check each cell of the array from left to right to find the lowest value. Store the index of the running minimum value.
- At the end of pass-through \(j\) (starting at 0), swap the minimum value with the one at index \(j\).
- Repeat steps 1-2 until we reach a pass-through that would start at the end of the array, i.e. \(j = N-1\)
2.2. Implementation
def selection_sort(array):
    """Selection sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    array: list
        The input array sorted in-place.
    """
    array_length = len(array)

    # Loop through all array elements
    for pass_through_number in range(array_length):
        # Find the minimum element in the remaining unsorted array
        min_index = pass_through_number
        for idx_unsorted_section in range(pass_through_number + 1, array_length):
            if array[idx_unsorted_section] < array[min_index]:
                min_index = idx_unsorted_section

        # Swap the found minimum element with the first element of the unsorted section
        array[pass_through_number], array[min_index] = array[min_index], array[pass_through_number]

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
selection_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
2.3. Complexity
As with bubble sort, on each pass-through we loop through one fewer element, so there are \(\frac{N^2}{2}\) comparisons. But each pass-through only performs 1 swap, so the total number of operations is \(\frac{N^2}{2} + N\).
Therefore this is still \(O(N^2)\) complexity, but in practice it should be about twice as fast as bubble sort.
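As a rough check of that claim, we could time both implementations on the same input. This is just a sketch - absolute timings will vary by machine and input - but the ratio should be visible:
import random
import timeit

data = [random.randint(0, 1000) for _ in range(2000)]

# Time each sort on a fresh copy of the same data
bubble_time = timeit.timeit(lambda: bubble_sort(data.copy()), number=1)
selection_time = timeit.timeit(lambda: selection_sort(data.copy()), number=1)
print(f"bubble: {bubble_time:.3f}s, selection: {selection_time:.3f}s")
We would expect selection sort to come out roughly twice as fast, even though both are \(O(N^2)\).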
3. Insertion Sort
3.1. Algorithm
Remove a value to create a gap, shift values along to move the gap leftwards, then fill the gap.
- Create a gap. For the first pass-through, we temporarily remove the second cell (i.e. index 1) and store it as a temporary variable. This leaves a gap at that index.
- Shifting phase. Take each value to the left of the gap and compare it to the temporary variable. If left_val > temp_val, move the left value to the right. This has the same effect as moving the gap leftwards. As soon as we encounter a value where left_val < temp_val, the shifting phase is complete.
- Fill the gap. Insert the temporary value into the current gap.
- Repeat pass-throughs. Steps 1-3 constitute a single pass-through. Repeat this until the pass through begins at the final index of the array. At this point the array is sorted.
3.2. Implementation
def insertion_sort(array):
    """Insertion sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    array: list
        The input array sorted in-place.
    """
    array_len = len(array)

    for pass_thru_number in range(1, array_len):
        # 1. Create a gap
        temp_val = array[pass_thru_number]
        gap_idx = pass_thru_number

        # 2. Shifting phase
        # Move leftwards from the gap and keep shifting elements right if they are greater than temp_val
        while (gap_idx > 0) and (array[gap_idx - 1] > temp_val):
            array[gap_idx] = array[gap_idx - 1]
            gap_idx -= 1

        # 3. Fill the gap
        array[gap_idx] = temp_val

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
insertion_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
3.3. Complexity
The worst case is when the input array is sorted in descending order.
There are 4 types of operations that occur:
Comparisons
We compare the temporary value to the values left of the gap, so there will be 1 comparison on the first pass-through, 2 on the second, 3 on the third, etc.
So there are \(1 + 2 + ... + (N-1) \approx \frac{N^2}{2}\) comparisons in the worst case.
Shifts
Each comparison could result in a shift, so there are the same number of shifts as comparisons in the worst case.
Removals and Insertions
We remove 1 temp value per pass-through, so there are \(N-1\) removals.
We insert that value at the end of each pass-through, so there are also \(N-1\) insertions.
Total
Operation | Number |
---|---|
Removals | \(N-1\) |
Comparisons | \(\frac{N^2}{2}\) |
Shifts | \(\frac{N^2}{2}\) |
Insertions | \(N-1\) |
Overall there are \(N^2 + 2N - 2\) operations, so the complexity is \(O(N^2)\).
3.4. The Average Case
The complexity generally considers the worst case, but insertion sort varies greatly based on the input.
We saw in the worst case the number of steps is \(N^2 + 2N - 2\).
In the best case the data is already sorted, so we do a remove and an insert on each pass-through, for a total of \(2N - 2\) steps.
In the average case, let’s assume about half of the data is already sorted. We’ll still need to do \(N - 1\) removes and \(N - 1\) inserts in total. We’ll also need to compare about half the data, giving \(\frac{N^2}{4}\) comparisons and the same number of shifts. This gives a total of \(\frac{N^2}{2} + 2N - 2\) steps.
Depending on the state of the input data, the speed can vary considerably. Compare this to selection sort, which will always take \(\frac{N^2}{2}\) steps regardless of the input data.
Algorithm | Best Case | Average Case | Worst Case |
---|---|---|---|
Selection sort | \(\frac{N^2}{2}\) | \(\frac{N^2}{2}\) | \(\frac{N^2}{2}\) |
Insertion sort | \(N\) | \(\frac{N^2}{2}\) | \(N^2\) |
So the choice of which algorithm is “best” depends on the state of your input data. Both have the same \(O(N^2)\) time complexity.
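To make this input-sensitivity concrete, here is a hypothetical step-counting wrapper (for illustration only, using a rough accounting of comparisons, shifts, removes and inserts) run on best-case and worst-case inputs:
def insertion_sort_steps(array):
    """Hypothetical wrapper: insertion sort instrumented to count steps."""
    steps = 0
    for pass_thru_number in range(1, len(array)):
        temp_val = array[pass_thru_number]
        gap_idx = pass_thru_number
        while (gap_idx > 0) and (array[gap_idx - 1] > temp_val):
            steps += 2  # one comparison plus one shift
            array[gap_idx] = array[gap_idx - 1]
            gap_idx -= 1
        steps += 2  # the final comparison, plus the remove/insert pair
        array[gap_idx] = temp_val
    return steps

n = 100
print(insertion_sort_steps(list(range(n))))         # already sorted: about 2N steps
print(insertion_sort_steps(list(range(n, 0, -1))))  # descending: about N^2 steps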
4. Quicksort
Quicksort is a sorting algorithm that relies on the concept of partitioning.
4.1. Partitioning
The idea behind partitioning is we take a random value from the array which we call the pivot.
We then want to ensure any smaller numbers are left of the pivot and any larger numbers to its right.
- If we take the rightmost value as the pivot, we create a left pointer pointing at the leftmost value and a right pointer at the second from the right (i.e. not the pivot but the next rightmost).
- The left pointer moves rightwards until it reaches a value >= the pivot value.
- The right pointer moves leftwards until it reaches a value <= the pivot value.
- If at this point the left pointer has crossed past the right pointer, move straight on to the next step. Otherwise swap the values that the left and right pointers are pointing at.
- Swap the pivot value with the left pointer’s value.
def partition(array):
    """Partition an array around the rightmost value.

    Parameters
    ----------
    array: list
        The array that we wish to partition.

    Returns
    -------
    array: list
        The input array partitioned in-place.
    """
    # 1. Initialise the pivot and pointers
    pivot_idx = len(array) - 1
    left_pointer_idx = 0
    right_pointer_idx = pivot_idx - 1

    # Loop until the left and right pointers cross
    while left_pointer_idx < right_pointer_idx:
        # 2. Move the left pointer rightwards until it reaches a value >= the pivot
        while (left_pointer_idx < pivot_idx) and (array[left_pointer_idx] < array[pivot_idx]):
            left_pointer_idx += 1

        # 3. Move the right pointer leftwards until it reaches a value <= the pivot
        while (right_pointer_idx > 0) and (array[right_pointer_idx] > array[pivot_idx]):
            right_pointer_idx -= 1

        # 4. Swap values if the pointers haven't crossed yet, otherwise end the loop
        if left_pointer_idx < right_pointer_idx:
            array[left_pointer_idx], array[right_pointer_idx] = array[right_pointer_idx], array[left_pointer_idx]
            # Advance the left pointer after a swap so that equal values cannot cause an infinite loop
            left_pointer_idx += 1

    # 5. Swap the pivot and left_pointer values
    array[left_pointer_idx], array[pivot_idx] = array[pivot_idx], array[left_pointer_idx]

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
partition(input_array)
[8, 7, 5, 3, 1, 9, 0, 8, 12, 69, 420, 23]
Partitioning isn’t the same as sorting the array, but it’s “sorted-ish”: everything smaller than the pivot is to its left, everything larger is to its right, and the pivot itself is in its final sorted position.
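A quick check of that property on the partitioned output above:
partitioned = [8, 7, 5, 3, 1, 9, 0, 8, 12, 69, 420, 23]
pivot_position = 8  # where the pivot value 12 ended up
assert all(value <= partitioned[pivot_position] for value in partitioned[:pivot_position])
assert all(value >= partitioned[pivot_position] for value in partitioned[pivot_position + 1:])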
4.2. Quicksort Algorithm
Quicksort is a combination of partitions and recursion.
- Partition the array. The pivot is now at the correct position in the sorted array.
- Consider the subarrays to the left and the right of the pivot as their own arrays. We’ll partition these recursively.
- If a subarray has 0 or 1 elements, that is the base case and we do nothing; it is already sorted.
We’ll make a few tweaks to the partition function to take a start index and an end index, and to also return the pivot index. These extra parameters will allow us to call it recursively in quicksort.
4.3. Implementation
def partition(array, start_idx, end_idx):
    """Partition a subarray around its rightmost value.

    Parameters
    ----------
    array: list
        The array that we wish to partition.
    start_idx: int
        The start index of the subarray to consider.
    end_idx: int
        The end index of the subarray to consider.

    Returns
    -------
    array: list
        The input array partitioned in-place.
    final_pivot_idx: int
        The index position of the pivot point in the partitioned array.
    """
    # 1. Initialise the pivot and pointers
    pivot_idx = end_idx
    left_pointer_idx = start_idx
    right_pointer_idx = pivot_idx - 1

    # Loop until the left and right pointers cross
    while left_pointer_idx < right_pointer_idx:
        # 2. Move the left pointer rightwards until it reaches a value >= the pivot
        while (left_pointer_idx < pivot_idx) and (array[left_pointer_idx] < array[pivot_idx]):
            left_pointer_idx += 1

        # 3. Move the right pointer leftwards until it reaches a value <= the pivot
        while (right_pointer_idx > start_idx) and (array[right_pointer_idx] > array[pivot_idx]):
            right_pointer_idx -= 1

        # 4. Swap values if the pointers haven't crossed yet, otherwise end the loop
        if left_pointer_idx < right_pointer_idx:
            array[left_pointer_idx], array[right_pointer_idx] = array[right_pointer_idx], array[left_pointer_idx]
            # Advance the left pointer after a swap so that equal values cannot cause an infinite loop
            left_pointer_idx += 1

    # 5. Swap the pivot and left_pointer values
    array[left_pointer_idx], array[pivot_idx] = array[pivot_idx], array[left_pointer_idx]
    final_pivot_idx = left_pointer_idx

    return array, final_pivot_idx

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
partition(input_array, 0, 11)
([8, 7, 5, 3, 1, 9, 0, 8, 12, 69, 420, 23], 8)
def quicksort(array, start_idx=0, end_idx=None):
    """Quicksort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.
    start_idx: int
        The start index of the subarray to consider.
    end_idx: int
        The end index of the subarray to consider.

    Returns
    -------
    array: list
        The input array sorted in-place.
    """
    # Default to the whole array. Check against None explicitly, because end_idx = 0 is a valid index
    if end_idx is None:
        end_idx = len(array) - 1

    # 3. Base case - a subarray of 0 or 1 elements is already sorted
    if start_idx >= end_idx:
        return array

    # 1. Partition the array
    array, pivot_idx = partition(array, start_idx, end_idx)

    # 2. Recursively partition the left and right subarrays
    quicksort(array, start_idx=start_idx, end_idx=pivot_idx - 1)
    quicksort(array, start_idx=pivot_idx + 1, end_idx=end_idx)

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
quicksort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
4.4. Complexity
4.4.1. Complexity of Partition
Partitioning involves two types of operation:
Operation | Number |
---|---|
Comparisons | \(N\) |
Swaps | up to \(\frac{N}{2}\) |
Each element is compared to the pivot at least once, because the left and right pointers keep moving until they reach each other. So there are roughly \(N\) comparisons.
Each swap handles two values, so in the worst case, if we swapped every value, there would be \(\frac{N}{2}\) swaps. On average there'd be about half that, so \(\frac{N}{4}\).
Either way, this means a single partition operation is \(O(N)\).
4.4.2. Complexity of Quicksort
Quicksort performs multiple partitions. How many?
In the best case, each partition would place the pivot directly in the middle of the subarray, splitting it cleanly in half each time. So there would be \(\log_2 N\) levels of splits, and each level performs \(O(N)\) work across its subarrays, giving \(O(N \log N)\) overall.
In the average case this is “close enough”. Not every level will split equally, so there’ll be a few extra, but it will still be broadly symmetric, and therefore a logarithmic function of \(N\).
In the worst case, every partition puts all the remaining values on one side of the pivot, so each partition only places a single element into its final position. This means we will partition \(N\) times, and each partition has \(O(N)\) complexity, resulting in a total complexity of \(O(N^2)\).
Algorithm | Best Case | Average Case | Worst Case |
---|---|---|---|
Quicksort | \(N \log N\) | \(N \log N\) | \(N^2\) |
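We can see the difference concretely by totalling how many elements all the partition calls touch. The wrapper below is a hypothetical sketch for illustration, reusing the partition function from above:
import random

def quicksort_counted(array, start_idx=0, end_idx=None):
    """Hypothetical instrumented quicksort: total elements handled across all partitions."""
    if end_idx is None:
        end_idx = len(array) - 1
    if start_idx >= end_idx:
        return 0
    work = end_idx - start_idx + 1
    array, pivot_idx = partition(array, start_idx, end_idx)
    work += quicksort_counted(array, start_idx, pivot_idx - 1)
    work += quicksort_counted(array, pivot_idx + 1, end_idx)
    return work

n = 300
print(quicksort_counted(random.sample(range(n), n)))  # shuffled: on the order of N log N (a few thousand)
print(quicksort_counted(list(range(n))))              # already sorted: on the order of N^2 / 2 (about 45,000)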
4.5. Quickselect
Quickselect is an algorithm with a similar approach. It’s a hybrid of Quicksort and a binary search.
Given an unsorted array, find the fifth-highest value in the array.
One approach would be to sort the array then pick the fifth element from the right. But sorting is \(O(N \log N)\) on average for the better algorithms. We can do better without having to sort the whole array.
Recall that after a partition, the pivot ends up in the correct place in the array. This is crucial. If we want to find the fifth highest value, we want to find the value that ends up in the fifth position from the right. So we can:
- Perform a partition and see where our pivot ends up.
- If our pivot is too far left, partition the right subarray. If the pivot is too far right, partition the left subarray.
- Keep going until we find the pivot which ends up in our target spot.
def quickselect(array, target_position, start_idx=0, end_idx=None):
    """Quickselect algorithm to find the k-th lowest element in an unsorted array.

    Parameters
    ----------
    array: list
        The array that we wish to search.
    target_position: int
        The index the target value would occupy if the array were sorted,
        i.e. k for the k-th lowest value (0-indexed).
    start_idx: int
        The start index of the subarray to consider.
    end_idx: int
        The end index of the subarray to consider.

    Returns
    -------
    value
        The value that would sit at target_position in the sorted array,
        or None if the target position is out of range.
    """
    # Default to the whole array. Check against None explicitly, because end_idx = 0 is a valid index
    if end_idx is None:
        end_idx = len(array) - 1

    # Base case: the window is empty, or the target position is out of range
    if (start_idx > end_idx) or (target_position < 0) or (target_position > end_idx):
        return None

    # 1. Partition the array
    array, pivot_idx = partition(array, start_idx, end_idx)

    # 2. Recursively process the subarrays
    if pivot_idx < target_position:
        # The pivot landed too far left, so check the subarray to the right
        return quickselect(array, target_position, start_idx=pivot_idx + 1, end_idx=end_idx)
    elif pivot_idx > target_position:
        # The pivot landed too far right, so check the subarray to the left
        return quickselect(array, target_position, start_idx=start_idx, end_idx=pivot_idx - 1)
    else:
        # The pivot is at the target position - we have found our target!
        return array[pivot_idx]

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
quickselect(input_array, 5)
8
4.6. Complexity of Quickselect
In the average case, we are splitting the subarrays (roughly) in half each time.
Each split operates on a subarray of half the size, so in total there are about \(N + \frac{N}{2} + \frac{N}{4} + \frac{N}{8} + ... \approx 2N\) steps.
Thus, overall the complexity is \(O(N)\) in the average case.
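A quick way to sanity-check the implementation is to compare it against sorting for every position (quickselect partially reorders the list, hence the copies):
data = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
expected = sorted(data)
for k in range(len(data)):
    assert quickselect(data.copy(), k) == expected[k]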
5. Merge Sort
Merge sort is like organizing a messy pile of papers. You divide the pile into smaller piles, sort each smaller pile, and then merge them back together in the correct order.
5.1. Algorithm
- Divide. Recursively split the array into two halves until each subarray contains only one element (and is therefore sorted).
- Sort. Recursively mergesort each half.
- Merge. As the recursion unwinds, merge the sorted subarrays back together in the correct order.
5.2. Implementation
def merge_sort(array):
    """Merge sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    array: list
        A new, sorted array. Note that unlike the previous algorithms, this
        implementation does not sort in-place.
    """
    # Base case - the array has 0 or 1 elements so is already sorted
    if len(array) <= 1:
        return array

    # 1. Split the array
    idx_mid = len(array) // 2
    left_half = array[:idx_mid]
    right_half = array[idx_mid:]

    # 2. Sort each half recursively
    left_half = merge_sort(left_half)
    right_half = merge_sort(right_half)

    # 3. Merge the subarrays
    return merge(left_half, right_half)

def merge(left_array, right_array):
    """Merge two subarrays together which are each sorted.

    Parameters
    ----------
    left_array: list
        The first subarray to merge.
    right_array: list
        The second subarray to merge.

    Returns
    -------
    result: list
        The merged, sorted array.
    """
    # Initialise pointers at the start of each subarray
    result = []
    left_idx = 0
    right_idx = 0

    # Loop until we reach the end of one of the subarrays
    while left_idx < len(left_array) and right_idx < len(right_array):
        # Compare the values of the left and right array, then insert the smaller of the two into the result
        if left_array[left_idx] < right_array[right_idx]:
            # The left value is smaller, so insert it into the result and increment the pointer
            result.append(left_array[left_idx])
            left_idx += 1
        else:
            # The right value is smaller (or equal), so insert it into the result and increment the pointer
            result.append(right_array[right_idx])
            right_idx += 1

    # The array that reaches the end first will be exhausted, but there may still be elements in the other.
    # So insert any remaining elements at the end of the result
    result.extend(left_array[left_idx:])
    result.extend(right_array[right_idx:])

    return result

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
merge_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
5.3. Complexity
The time complexity of merge sort is the same for the best case, average case, and worst case.
In every case, each divide step splits the array in half, meaning there are \(\log_2(N)\) levels of splits.
For the merge step, the subarrays are looped through element-by-element, so all \(N\) elements are touched at each level.
Therefore, the overall complexity is \(O(N \log N)\).
The space complexity is \(O(N)\).
This is because merge sort requires additional space to store temporary arrays during the merging process. In the worst case, when merging two halves of the array, an additional array of size equal to the original array is needed to store the merged result temporarily.
6. Heap Sort
6.1. Algorithm
A max-heap is “weakly sorted”; the maximum value is the root of the heap, and nodes are greater than all of their descendants.
We can use this property to sort data elements by creating a heap from the data, then popping the maximum value one-by-one to populate an array in order.
- Build a heap: Create a max-heap, which ensures the maximum value is at the root of the heap.
- Pop the root: Remove the root and place it at the end of the result array.
- Heapify: Rearrange the remaining elements to restore the heap condition.
- Repeat: Iterate through the entire heap until all elements are sorted.
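The same idea can be illustrated with Python's built-in heapq module, which implements a min-heap - so repeatedly popping the root yields ascending order directly. This is just an illustration of the principle, not the in-place algorithm implemented below:
import heapq

values = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
heapq.heapify(values)  # build a min-heap in O(N)
print([heapq.heappop(values) for _ in range(len(values))])
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]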
6.2. Implementation
def heapify(array, heap_size, current_idx):
    """Heapify function to maintain the max-heap property.

    Parameters
    ----------
    array: list
        List of elements
    heap_size: int
        Size of heap
    current_idx: int
        Index of current node

    Returns
    -------
    None
        The input heap `array` is heapified in-place.
    """
    # Track the largest node, initially assumed to be the root
    largest = current_idx
    left_child = 2 * current_idx + 1
    right_child = 2 * current_idx + 2

    # Check if the node's children, if it has any, are larger than the current node
    if left_child < heap_size and array[left_child] > array[largest]:
        largest = left_child
    if right_child < heap_size and array[right_child] > array[largest]:
        largest = right_child

    if largest != current_idx:
        # If the largest is not the root, swap it with the root to maintain the heap condition
        array[current_idx], array[largest] = array[largest], array[current_idx]
        # Recursively heapify the affected sub-tree
        heapify(array, heap_size, largest)

def heap_sort(array):
    """Heap sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    array: list
        The input array sorted in-place.
    """
    # 1. Build a max heap.
    # Start from the last non-leaf node, and heapify each node
    array_len = len(array)
    idx_last_non_leaf = array_len // 2 - 1
    idx_last_node = array_len - 1

    for i in range(idx_last_non_leaf, -1, -1):
        heapify(array, array_len, i)

    # 2. Extract elements one-by-one starting from the end
    for i in range(idx_last_node, 0, -1):
        # Move the current maximum to the end of the unsorted portion
        array[i], array[0] = array[0], array[i]
        # 3. Maintain the heap condition for the unsorted portion of the array
        heapify(array, i, 0)

    return array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
heap_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
6.3. Complexity
The pre-processing step is to build the heap. This operation takes \(O(N)\) time.
Once the heap is created, we perform the following operations:
- Pop the max value: This is accessible as it is the root, so \(O(1)\) time.
- Insert at the end of the result array: This is also \(O(1)\) time.
- Heapifying: This takes \(O(\log N)\) time.
These are performed for each of the \(N\) data elements, so the overall time complexity of heap sort is \(O(N \log N)\).
This is the same regardless of the arrangement of the input, so the time complexity of heap sort is the same in the best case, average case, and worst case.
7. Counting Sort
We count how many of each element is in the array, i.e. how many 0s, how many 1s, how many 2s.
From this we can then determine what the starting position of each element in the output will be. E.g. if there are three 0s then these will take the first three slots in the sorted output, so 1s will start at the fourth position.
Counting sort works best when the number of unique items is small, i.e. there are lots of duplicates in the list.
It is a stable sorting algorithm, meaning items with identical values in the input will retain their original ordering in the output.
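Stability is easiest to see with Python's built-in sorted, which is also stable. For example, when sorting the records below by their number, ties keep their original order:
records = [("apple", 2), ("banana", 1), ("cherry", 2), ("date", 1)]
sorted(records, key=lambda record: record[1])
[('banana', 1), ('date', 1), ('apple', 2), ('cherry', 2)]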
7.1. Algorithm
Say we want to sort the following array:
[0, 1, 2, 0, 0, 1, 2, 1, 0, 0, 2, 1, 2]
- Counting: Count the number of occurrences of each unique element in the input list. How many 0s? How many 1s? How many 2s? Etc.
- Cumulative count: Then calculate the cumulative count of the elements. This step determines the starting position of each element in the sorted output.
- Placement: Finally, you place each element in its correct position in the sorted output based on its cumulative count.
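For the sample array above, the counts are five 0s, four 1s, and four 2s, i.e. count_store = [5, 4, 4]. The cumulative counts are [5, 9, 13], which tells us the 0s occupy positions 0-4, the 1s positions 5-8, and the 2s positions 9-12 of the sorted output.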
7.2. Implementation
def counting_sort(array):
    """Counting sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    array: list
        The array that we wish to sort.

    Returns
    -------
    sorted_array: list
        The sorted array.
    """
    # 1. Count the elements in the input.
    max_val = max(array)
    count_store = [0] * (max_val + 1)

    for num in array:
        count_store[num] += 1

    # 2. Cumulative count
    for i in range(1, len(count_store)):
        count_store[i] += count_store[i - 1]

    # 3. Place each element in its correct position in the sorted array.
    # We iterate over the input in reverse so that elements with equal values
    # keep their original relative order, i.e. the sort is stable.
    sorted_array = [0] * len(array)

    for num in reversed(array):
        sorted_array[count_store[num] - 1] = num
        count_store[num] -= 1

    return sorted_array

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
counting_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
7.3. Complexity
The complexity of counting sort depends on the number of elements, \(N\), and the range of the values, \(K\) (in our implementation, max(array) + 1, the size of count_store).
For each of the steps:
- Counting: We loop through all \(N\) elements. We also create a count_store array, which takes \(K\) elements of auxiliary space.
- Cumulative count: We loop through the \(K\) elements in the count_store array.
- Placement: We initialise a sorted_array of size \(N\), which therefore takes \(N\) elements of auxiliary space. We then loop through the \(N\) elements in the original array, using count_store to determine the correct placement.
So there are \(N + K + N\) steps and \(N + K\) elements of auxiliary data.
Time complexity: \(O(N + K)\)
Auxiliary space: \(O(N + K)\)
8. Radix Sort
Radix sort is a sorting algorithm that sorts numbers by processing individual digits.
It sorts numbers by first grouping the individual digits of the same place value together and sorting them. It starts sorting from the least significant digit (rightmost digit) to the most significant digit (leftmost digit).
8.1. Algorithm
- Start from the rightmost digit: Look at the least significant digit of each number. Group the numbers based on this digit.
- Sort each group: After grouping, the numbers are rearranged based on their digit value. So, all the numbers with the same rightmost digit are now together, and they are sorted within this group.
- Move to the next digit: Now, look at the second rightmost digit and repeat the process. Group the numbers based on this digit and sort each group.
- Continue until all digits are considered: Keep repeating this process until you’ve looked at all digits, moving towards the leftmost digit.
By the end of this process, the numbers will be sorted because each time you look at a digit, the numbers are sorted according to that digit.
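For example, sorting [23, 69, 420, 12]: sorting stably by the units digit gives [420, 12, 23, 69]; then by the tens digit gives [12, 420, 23, 69]; and finally by the hundreds digit gives [12, 23, 69, 420]. Note that each pass must be stable - after the tens pass, 420 and 23 (which both have tens digit 2) keep their relative order from the previous pass - for the final result to come out correct.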
8.2. Implementation
def counting_sort_by_digit(arr, exp):
    """Stable counting sort of `arr` by a single digit, in-place.

    Parameters
    ----------
    arr: list
        The array that we wish to sort.
    exp: int
        The place value of the digit to sort by (1 for units, 10 for tens, etc.).
    """
    n = len(arr)
    output = [0] * n
    count = [0] * 10

    # Count occurrences of digits
    for i in range(n):
        index = arr[i] // exp
        count[index % 10] += 1

    # Cumulative count
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build the output array, iterating in reverse to keep the sort stable
    i = n - 1
    while i >= 0:
        index = arr[i] // exp
        output[count[index % 10] - 1] = arr[i]
        count[index % 10] -= 1
        i -= 1

    # Copy the output array to arr, to be ready for the next iteration
    for i in range(n):
        arr[i] = output[i]

def radix_sort(arr):
    """Radix sort algorithm to sort an array into ascending order.

    Parameters
    ----------
    arr: list
        The array that we wish to sort.

    Returns
    -------
    arr: list
        The input array sorted in-place.
    """
    # Find the maximum number to determine the number of digits
    max_num = max(arr)

    # Sort by each digit in turn, starting from the least significant
    exp = 1
    while max_num // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10

    return arr

input_array = [8, 7, 5, 3, 1, 9, 0, 8, 23, 69, 420, 12]
radix_sort(input_array)
[0, 1, 3, 5, 7, 8, 8, 9, 12, 23, 69, 420]
8.3. Complexity
The time complexity of radix sort depends on the number of digits in the input numbers. Let’s denote:
- \(N\) as the number of elements in the array to be sorted.
- \(D\) as the maximum number of digits in the input numbers.
- \(B\) as the base of the number system being used (usually 10 for decimal numbers).
Each pass performs a counting sort over one digit position, which takes \(O(N + B)\) time, and there are \(D\) passes in total. Since \(B\) is a small constant, this gives \(O(D \cdot N)\) overall.
The time complexity is the same in the best case, average case, and worst case, because the algorithm always iterates through each digit position of each number, regardless of how the digits are distributed among the input numbers.
In practice, the value of \(D\) is often considered a constant, because it is bounded by the word size of the machine (e.g. 32 or 64 bits for integers). The time complexity is then often simplified to \(O(N)\), making radix sort very efficient, especially when the range of input numbers is relatively small.
References
- Counting sort: https://www.youtube.com/watch?v=OKd534EWcdk
- Radix sort: https://www.youtube.com/watch?v=XiuSW_mEn7g