Why Sorting?
I started learning Golang last week, and to solidify my understanding of the syntax, I practiced on LeetCode. This led me to revisit the algorithms I learned during my university years, which I had long forgotten. The merge sort algorithm in particular sparked my interest and motivated me to learn more about sorting.
Here are the references for the sorting algorithms:
- Bubble Sort, which has a time complexity of O(n^2)
  - more on bubble sort: https://www.geeksforgeeks.org/bubble-sort/
- Merge Sort, which has a time complexity of O(n log n)
  - more on merge sort: https://www.geeksforgeeks.org/merge-sort/
- Quick Sort, which has a time complexity of O(n log n)
  - more on quick sort: https://www.geeksforgeeks.org/quick-sort/
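To make the comparison concrete, here is a minimal in-place bubble sort in Go. This is an illustrative sketch rather than the exact code from my repo; the function name and the early-exit flag are choices made for this example.

func BubbleSort(arr []int) {
    // after pass i, the largest i+1 elements are in their final positions
    for i := 0; i < len(arr)-1; i++ {
        swapped := false
        for j := 0; j < len(arr)-1-i; j++ {
            if arr[j] > arr[j+1] { // swap adjacent out-of-order elements
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = true
            }
        }
        if !swapped { // no swaps means the slice is already sorted
            break
        }
    }
}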
Since people rarely use bubble sort due to its time complexity, and perhaps because they prefer the built-in sort function, I wondered how inefficient it actually is. To find out, I conducted a simple experiment.
Experiment
The experiment aims to track the time needed for the merge, bubble, quick, and built-in (sort.Ints) sorting functions to sort an array. The whole experiment setup can be seen in my GitHub repo:
Comparison Between Sorting Algorithm (with Go)
My Post
Check out my post on dev.to: https://dev.to/peterchu999/sorting-algorithm-theory-vs-implementation--40ga/
Thought
In theory, bubble sort would have O(n^2) complexity, and the rest (merge, quick, and the built-in sort function) would have O(n log n) complexity. Here is an example graph:
But when I implemented all of the sorting algorithms in Go, here is the graph for the test with array lengths up to 300:
Let's look closely at the benchmark for the first 50 random arrays. We can see that up to an array size of ~25, bubble sort performs better than the merge and built-in sorts. However, according to the theory, it isn't supposed to. Here is the graph comparison.
Weird, isn't it? But when we look at the bigger picture, the theory aligns with the implementation. Let's look at an array size of 1000.
Running Workflow
- ./run-all.sh [n], changing [n] to the maximum array size for the benchmark
Running Python Visualization
- pip install -r …
This is how the time tracking works:
func TestSorting(sfc SortFunction, arr []int) int64 {
    scopedTestList := make([]int, len(arr))
    copy(scopedTestList, arr) // copy the unsorted array to avoid side effects on other tests
    startTime := time.Now()   // track start time
    sfc(scopedTestList)       // call the sorting function
    endTime := time.Now()     // track end time
    return endTime.Sub(startTime).Nanoseconds() // return the elapsed time (endTime - startTime)
}
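The SortFunction type isn't shown in this snippet. Since sort.Ints gets passed to TestSorting in the setup below, it is presumably just a function type over int slices, along these lines (an assumption, not necessarily the repo's exact definition):

type SortFunction func([]int)

With that definition, any in-place sort of the same shape, including the standard library's sort.Ints, can be benchmarked through TestSorting.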
And here is the test setup:
for i := 1; i <= arrLen; i++ {
    testCase := GenerateUniqueRandomNumbers(i, 100) // generate an i-length array of unique random numbers
    mergeT := utils.TestSorting(MergeSortVoid, testCase)
    time.Sleep(time.Millisecond * 5) // in case the computation degrades the machine and makes later cases worse
    bubbleT := utils.TestSorting(BubbleSortVoid, testCase)
    time.Sleep(time.Millisecond * 5) // same cooldown between runs
    quickT := utils.TestSorting(QuickSortVoid, testCase)
    time.Sleep(time.Millisecond * 5) // same cooldown between runs
    biT := utils.TestSorting(sort.Ints, testCase)
    // mergeT, bubbleT, quickT, and biT are saved for the visualization step
}
- loop until arrLen, which is the maximum array length
- for each iteration, generate a random number array with the length of the current iteration index (i); a sketch of this helper follows after this list
- for each iteration, track the time needed for each sorting function to sort the array
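GenerateUniqueRandomNumbers itself isn't shown in this post. Based on the call above and the TutorialsPoint reference in the acknowledgments, a possible sketch is below; the signature (a count plus a spread factor) and the rand.Perm approach are assumptions for illustration, not necessarily what the repo does.

import "math/rand"

// GenerateUniqueRandomNumbers returns n unique pseudo-random integers.
// rand.Perm(n*spread) produces a random permutation of [0, n*spread),
// so its first n entries are n distinct random values.
func GenerateUniqueRandomNumbers(n, spread int) []int {
    return rand.Perm(n * spread)[:n]
}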
The data gathered from the experiment above was saved and visualized using Python code. The visualization plots the time taken to sort each array length.
Result
Interesting results are shown when using a small array length. Let's examine the visualized graph comparing the complexity of theoretical sorting algorithms with the actual implementation time.
Up to an array length of around 30, bubble sort, which theory deems slower, was actually faster than the merge and built-in sort functions. It's quite surprising 🤯!
However, when we look at the bigger picture, the theory and the implementation align. Here is the visualization of the theoretical sort complexity versus the sort implementation time for an array length of 1000.
Conclusion
In the end, when the array size is small, the sorting operation takes a very short amount of time (just a few nanoseconds), so no one really notices the difference. What truly matters is that the O(n log n) sort algorithms (merge sort, quick sort) we learned perform significantly better on larger array sizes.
Please check out my code and feel free to experiment with it. If you find any flaws or errors, please create an issue on GitHub or leave a comment 💬 below!
I'm still unsure why bubble sort performs better on small datasets. My guess is that the operation to create a new array in merge sort takes longer than simply swapping values between indices. What do you think 🤔 ? Please share your thoughts and comments 🙌!
Acknowledgment
- https://unsplash.com/photos/long-exposure-photography-of-road-and-cars-NqOInJ-ttqM (cover image)
- https://geeksforgeeks.org/
- https://blog.boot.dev/golang/quick-sort-golang/ (quick sort)
- https://www.tutorialspoint.com/how-to-generate-a-array-of-unique-random-numbers-in-golang (generate random number array)
Top comments (2)
So, the first thing here is the order of growth: bubble sort grows quadratically, whereas merge sort is O(n log n). As n becomes large (taking the limit), n^2 grows faster than n log n, but this is just an upper bound / asymptotic analysis, which is done after ignoring the lower-order terms, constants, and other external factors such as compiler optimizations and hardware configs. Intuitively it feels (at least for me) that for smaller array sizes, the instructions needed for swapping elements are much faster than loading the recursive stack back and forth into memory.
Yep, I agree with your thought 🙌.
Thanks for commenting ❤️