This is the continuation of the previous article about the dynamic array data structure. If you didn’t read it you can read it clicking here.
One of the prerequisites of doing a binary search is that the array or list has to be sorted.
The algorithm that is usually taught as the first sorting algorithm is know as bubblesort.
The idea behind it is very simple, we iterate through the list and check each pair of elements. If they are out of order we swap them and check the next, repeating this over and over until we reach the end of the array. Once we’re finished the array will be sorted.
For example, in a list [3, 2, 1] with 3 elements we will repeat the checks three times.
First pass:
3 and 2 are out of order so we swap them and get [2, 3, 1]
3 and 1 are out of order so we swap them and get [2, 1, 3]
Second pass:
2 and 1 should be swapped and we get [1, 2, 3]
2 and 3 are correct so we keep [1, 2, 3]
Third pass:
1 and 2 are correct [1, 2, 3]
2 and 3 are correct [1, 2, 3]
We’re done!
In code this would be the following:
fun <T : Comparable<T>> DynamicArray<out T>.bubbleSort() {
val size = size()
for (i in 0 until size) {
for (j in 0 until size - 1) {
if (this[j] > this[j + 1]) {
swap(this, j, j + 1)
}
}
}
}
This algorithm takes more than three hours to execute on our words list. You’ve read that correctly… THREE HOURS!
First optimization
One of the things the previous algorithm fails to realize is that we don’t need to do all the passes in case we do a pass where we didn’t swap anything. Not swapping means that the list is already sorted and we can stop. We just need a flag to be turned on in case there were no swaps and stop the loop.
For example, let’s assume our list is [1, 3, 2].
First pass:
1 and 3 are correct [1, 3, 2]
3 and 2 should be swapped [1, 2, 3]
Second pass:
1 and 2 are correct [1, 2, 3]
2 and 3 are correct [1, 2, 3]
No swaps, we can stop and avoid the third pass.
This would be implemented like this:
fun <T : Comparable<T>> DynamicArray<out T>.bubbleSort() {
val size = size()
var swapped: Boolean
do {
swapped = false
for (i in 0 until size - 1) {
if (this[i] > this[i + 1]) {
swap(this, i, i + 1)
swapped = true
}
}
} while (swapped)
}
In a small list like that the impact is not relevant but on a big list like our dictionary just this small improvement brings the sorting time a little bit under three hours. Of course you don’t want to wait three hours to sort the dictionary so there has to be something better that we can do.
Second optimization
One thing that the previous algorithm fails to realize is that after each pass the largest element is put in its correct position. Let’s use four elements now to make it really obvious.
Let’s start with [4, 3, 2, 1]
First pass
Swap 4 and 3 [3, 4, 2, 1]
Swap 4 and 2 [3, 2, 4, 1]
Swap 4 and 1 [3, 2, 1, 4]
After this first pass, 4 is in the correct position so we don’t even need to check it again. The second pass would not go until the end but would stop one element before.
Second pass
Swap 3 and 2 [2, 3, 1, 4]
Swap 3 and 1 [2, 1, 3, 4]
Three is in the correct position so we don’t need to check it again.
Third pass
Swap 2 and 1 [1, 2, 3, 4]
Two is in the right position so we don’t need to check it again
We’re finished!
fun <T : Comparable<T>> DynamicArray<out T>.bubbleSort() {
var n = size()
var swapped: Boolean
do {
swapped = false
for (i in 1 until n) {
if (this[i - i] > this[i]) {
swap(this, i - 1, i)
swapped = true
}
}
n--
} while (swapped)
}
This might also look like a small improvement in such a small list but on a big list like ours it drops the execution time from 3 hours to “just” 23 minutes. It’s a huge improvement!
More optimizations?
We could optimize more but maybe we’re approaching the problem in the wrong way. There is another algorithm that usually performs better than bubblesort and we’ll see how it performs in the next article.