TL;DR: Comparing Go and Python on execution time, Docker images, and native package distribution.
Recently I was working on a programming challenge and I thought it would be interesting to have an implementation both in Go and Python.
The challenge consists of processing a dataset (specifically a CSV of ~100 MB), computing some aggregations, and writing a JSON file that is used as the data source for a bunch of REST API endpoints.
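As a rough illustration of that kind of pipeline, here is a minimal Python sketch. The column names (`city`, `amount`), the aggregation itself (a count and a total per city), and the sample data are all hypothetical; they are not taken from the challenge or from the PLZ code.

```python
import csv
import io
import json
from collections import defaultdict


def aggregate(rows):
    """Group rows by a hypothetical 'city' column, counting rows and summing 'amount'."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        bucket = totals[row["city"]]
        bucket["count"] += 1
        bucket["total"] += float(row["amount"])
    return dict(totals)


def csv_to_json(src, dst):
    """Stream CSV rows from src, aggregate them, and write the result to dst as JSON."""
    result = aggregate(csv.DictReader(src))
    json.dump(result, dst, indent=2, sort_keys=True)
    return result


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real ~100 MB file.
    sample = io.StringIO("city,amount\nBerlin,10.5\nRome,3\nBerlin,4.5\n")
    out = io.StringIO()
    csv_to_json(sample, out)
    print(out.getvalue())
```

Reading the CSV row by row keeps memory usage flat regardless of the input size, which matters more as the file grows.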
What are we comparing?
But enough with the chit-chat, let's dig into the comparison.
Execution time: how long each implementation takes to run the data aggregation algorithm.
Docker image: complexity in producing a docker image and size of the generated images.
Native distribution: ease of use to install and run the app in a native environment (aka on your laptop).
A bit of context
Both apps are available on GitHub: here is the Go version and here is the Python one.
Specifically, we'll be comparing PLZ Go v0.2.1 with PLZ Py v0.1.3.
The Go and Python versions are go-1.13.8 and python-3.8.5, respectively.
Both implementations are quite small:
- PLZ Go: 250loc
- PLZ Py: 191loc (~23% smaller)
Metrics generated with tokei using:
- for Go
tokei -f -t=Go -e="*_test.go"
- for Py
tokei -f -t=Python -e="tests"
Criterion #1: Execution speed
Both implementations use the same identical sequential algorithm to process the data and both input and output are the same.
Average time [1] to process the ~100 MB CSV input into a 32 KB JSON output:
PLZ Go: ~0.6s (~2.2x faster)
PLZ Py: ~1.3s
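For reference, an average like this can be collected with a small timing harness. The sketch below is an assumption about how one might measure it, not the procedure actually used for the numbers above; `average_seconds`, `speedup`, and the placeholder workload are hypothetical names.

```python
import statistics
import time


def average_seconds(func, runs=5):
    """Call func `runs` times and return (mean, stdev) of the wall-clock durations."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    # stdev needs at least two samples; report 0.0 for a single run.
    spread = statistics.stdev(durations) if runs > 1 else 0.0
    return statistics.mean(durations), spread


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast run is compared to the slow one."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    # Placeholder workload standing in for the real aggregation run.
    mean, spread = average_seconds(lambda: sum(range(1_000_000)))
    print(f"mean={mean:.3f}s stdev={spread:.3f}s")
    # Ratio of the figures reported above: 1.3 s for Python, 0.6 s for Go.
    print(f"speedup={speedup(1.3, 0.6):.1f}x")
```

Reporting the standard deviation next to the mean is one simple way to back up a claim that the timings are consistent across runs.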
The algorithm is implemented in the massage function: here in v0.1.3 for Python and here in v0.2.1 for Go.
If you dig into the code and spot an obvious problem affecting performance, please let me know in the comments.
Not much to say here: Go is about twice as fast as Python in this scenario.
Best: Go
Criterion #2: Docker image
Since the application provides a REST API to serve the results of the aggregation, it makes sense to ship the app as a Docker image.
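To make the "serve the results" part concrete, here is a minimal sketch using only Python's standard library. It is not how PLZ Py implements its API; the `lookup`, `make_handler`, and `serve` names, the URL layout, and the port are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def lookup(data, path):
    """Map a request path like '/Berlin' to (status, payload) using the aggregated data."""
    key = path.strip("/")
    if key == "":
        return 200, sorted(data)  # index: list the available keys
    if key in data:
        return 200, data[key]
    return 404, {"error": "not found"}


def make_handler(data):
    """Build a request handler class that serves the given aggregated data."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, payload = lookup(data, self.path)
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(data, host="127.0.0.1", port=8080):
    """Serve the data until interrupted (blocking call)."""
    HTTPServer((host, port), make_handler(data)).serve_forever()
```

Because the aggregation is computed ahead of time, the server only does dictionary lookups at request time, which is why the JSON file can be baked into the image during the build.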
The approach to build the docker image is to use a multi-stage build: first build the app and aggregate the data and then assemble a "production" image.
For this criterion Go is hands down the best: you can instruct the compiler to build a self-contained binary and package it in a scratch image, for a resulting image size of little more than 9 MB (3.5 MB compressed).
On the other hand, with Python the final image (based on 3.8-slim-buster) is a staggering 175 MB (55 MB compressed). This is due to the many layers that make up the 3.8-slim-buster base image, so there is room for improvement by building a custom image, but that would likely require significant effort to build and maintain.
PLZ Go: ~9.3 MB (~18x smaller)
PLZ Py: ~175 MB
Best: Go
Criterion #3: Native distribution
By native distribution I mean installing the app on a server or on your laptop without having to tinker too much with software requirements. I strongly believe this is an important metric, since complex installation procedures demand a lot of cognitive effort that is ultimately wasted.
In this area Go and Python are more or less equivalent on the surface: with Go you can install the package by running go get github.com/noandrea/plz, and with Python by running pip install plzpy.
A caveat is that Python is far more popular and ships by default with many operating systems, while for Go you will likely have to download and install the Go toolchain first.
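A quick way to see which of the two toolchains a given machine already has is to look them up on the PATH. This is only a small sketch; `find_tools` is a hypothetical helper, and the executable names (`python3`, `pip`, `go`) are assumptions that can differ between systems.

```python
import shutil


def find_tools(names):
    """Return a dict mapping each tool name to its path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["python3", "pip", "go"]).items():
        print(f"{name}: {path or 'not installed'}")
```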
Another notable difference is that with pip install you are installing a "binary" distribution, while with go get you are fetching the source code and compiling it locally.
It is worth mentioning that the maintainer(s) of a Go package may ship prebuilt binaries for specific operating systems and architectures, or packages for package managers (brew, dnf, apt, ...), but that requires additional work and maintenance.
Slightly better: Python
Conclusions
I hope you enjoyed this comparison as much as I enjoyed making it. As for which one is better, Go or Python, the answer is ... 🥁 ... both and neither 😉!
👋
[1] Consistent over multiple executions.