Introduction
This is the third post in the Software Engineer Interviews series. This time I'm walking through a challenge I completed a few years ago for a position I ended up getting, although the process also included other interviews, such as a screening of my past experience.
If you've missed the previous posts in this series, you can find them here.
The Challenge
This challenge was also a take-home coding task, where I had to develop a CLI program that would query the OEIS (On-Line Encyclopedia of Integer Sequences) and return the total number of results, and the names of the first five sequences returned by the query.
Thankfully, the OEIS query system includes a JSON output format, so you can get the results by calling the URL and passing the sequence as a query string.
Example input and output:
oeis 1 1 2 3 5 7
Found 1096 results. Showing first five:
1. The prime numbers.
2. a(n) is the number of partitions of n (the partition numbers).
3. Prime numbers at the beginning of the 20th century (today 1 is no longer regarded as a prime).
4. Palindromic primes: prime numbers whose decimal expansion is a palindrome.
5. a(n) = floor(3^n / 2^n).
Note: this result is outdated!
Solving the Challenge
The plan to solve this challenge is the following:
- Start with a Python file that will be the CLI entrypoint
- It should receive a list of numbers separated by spaces as argument
- Create a client file that will be responsible for fetching data from the OEIS query system
- Create a formatter that will take care of formatting the output for the console
Since this is a coding challenge, I will be using Poetry to help me create the structure of the project, and to make it easier for anyone to run it. You can check how to install and use Poetry on their website.
I’ll start by creating the package with:
poetry new oeis
This will create a folder called oeis, which will contain Poetry’s configuration file, a tests folder, and a folder also called oeis, which will be the root of our project.
I will also add an optional package, called Click, which helps with building CLI tools. It is not required and can be replaced by tools from Python's standard library, such as argparse, although the result is less elegant.
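If you would rather not add a dependency, here is a minimal sketch of how the argument parsing could look with argparse from the standard library. The function name parse_sequence is made up for illustration and is not part of the project built in this post.

```python
import argparse


def parse_sequence(argv: list[str]) -> tuple[str, ...]:
    """Parse numbers separated by spaces using only the standard library."""
    parser = argparse.ArgumentParser(
        prog="oeis", description="Query the OEIS for a sequence."
    )
    # nargs="*" collects every positional argument into a single list.
    parser.add_argument("sequence", nargs="*", help="numbers separated by spaces")
    args = parser.parse_args(argv)
    return tuple(args.sequence)


# In a real entrypoint you would pass sys.argv[1:] instead of a fixed list.
print(parse_sequence(["1", "1", "2", "3", "5", "7"]))
```

It works, but you have to wire up the parser yourself, which is the part Click hides behind decorators.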
Inside the project’s folder, run:
poetry add click
This will add click as a dependency to our project.
Now we can move to our entrypoint file. If you open the folder oeis/oeis, you will see there’s already an __init__.py file. Let’s update it to import Click and define a main function to be called by the command:
# oeis/oeis/__init__.py
import click

@click.command()
def oeis():
    pass

if __name__ == "__main__":
    oeis()
This is the starting point of our CLI. See the @click.command? This is a decorator from Click, which turns the oeis function into a command.
Now, remember we need to receive the sequence of numbers, separated by a space? We need to add this as an argument. Click has an option for that:
# oeis/oeis/__init__.py
import click

@click.command()
@click.argument("sequence", nargs=-1)
def oeis(sequence: tuple[str, ...]):
    # Temporary print to check the arguments are being received
    print(sequence)

if __name__ == "__main__":
    oeis()
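This will add an argument called sequence, and the nargs=-1 option tells Click to accept any number of values for it, so every number typed after the command, separated by spaces, ends up in a single tuple of strings. I added a print so we can test that the argument is being passed correctly.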
To tell Poetry that we have a command, we need to open pyproject.toml and add the following lines:
# oeis/pyproject.toml
[tool.poetry.scripts]
oeis = "oeis:oeis"
This is adding a script called oeis, which calls the oeis function in the oeis module. Now, we run:
poetry install
which will let us call the script. Let’s try it:
❯ poetry run oeis 1 2 3 4 5
('1', '2', '3', '4', '5')
Perfect, we have the command and the arguments being parsed as we expected! Let's move on to the client. Under the oeis/oeis folder, create a folder called clients, and inside it a file called __init__.py and a file called oeis_client.py.
If we expected to have other clients in this project, we could develop a base client class, but since we will only have this single one, that could be considered over-engineering. In the OEIS client class, we should have a base URL, which is the URL without any path, and which we will use to build the query:
# oeis/oeis/clients/oeis_client.py
import requests
from urllib.parse import urlencode

class OEISClient:
    def __init__(self) -> None:
        self.base_url = "https://oeis.org/"

    def query_results(self, sequence: tuple[str, ...]) -> list:
        # Build the query string, call the search endpoint and parse the JSON
        url_params = self.build_url_params(sequence)
        full_url = self.base_url + "search?" + url_params
        response = requests.get(full_url)
        response.raise_for_status()
        return response.json()

    def build_url_params(self, sequence: tuple[str, ...]) -> str:
        # Join the numbers with commas and URL-encode the parameters
        sequence_str = ",".join(sequence)
        params = {"q": sequence_str, "fmt": "json"}
        return urlencode(params)
As you can see, we are importing the requests package. We need to add it to Poetry before we can use it:
poetry add requests
Now, the client has a base url which does not change. Let's dive into the other methods:
- build_url_params
  - Receives the sequence passed as argument from the CLI, and transforms it into a string of numbers separated by commas
  - Builds a dict with the params, q being the query we will run, and fmt being the expected output format
  - Lastly, returns the URL-encoded version of the params, which is a nice way to ensure our string is compatible with URLs
- query_results
  - Receives the sequence passed as argument from the CLI, and builds the URL-encoded params through the build_url_params method
  - Builds the full URL which will be used to query the data
  - Proceeds with the request to the URL built, and raises for any HTTP error status (4xx or 5xx)
  - Returns the JSON data
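To make the URL encoding step more concrete, here is a small standalone sketch of what the params look like for a sample sequence. It copies the logic of build_url_params into a plain function so it can run without the client class or a network call.

```python
from urllib.parse import urlencode


def build_url_params(sequence: tuple[str, ...]) -> str:
    # Same logic as the client method: join with commas, then URL-encode.
    sequence_str = ",".join(sequence)
    return urlencode({"q": sequence_str, "fmt": "json"})


params = build_url_params(("1", "2", "3", "4", "5"))
print(params)  # q=1%2C2%2C3%2C4%2C5&fmt=json
print("https://oeis.org/search?" + params)
```

Notice that the commas are percent-encoded as %2C, which is exactly the kind of detail urlencode takes care of for us.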
We also need to update our main file, to call this method:
# oeis/oeis/__init__.py
import click
from oeis.clients.oeis_client import OEISClient

OEIS_CLIENT = OEISClient()

@click.command()
@click.argument("sequence", nargs=-1)
def oeis(sequence: tuple[str, ...]):
    data = OEIS_CLIENT.query_results(sequence)
    print(data)

if __name__ == "__main__":
    oeis()
Here we build a client instance at module level, outside the command function, so a new one is not created inside every call, and then use it inside the command.
Running this prints a very, very long response, since every OEIS entry carries a lot of data. As we only need to know the total size and the top five entries, we can do the following:
# oeis/oeis/__init__.py
import click
from oeis.clients.oeis_client import OEISClient

OEIS_CLIENT = OEISClient()

@click.command()
@click.argument("sequence", nargs=-1)
def oeis(sequence: tuple[str, ...]):
    data = OEIS_CLIENT.query_results(sequence)
    size = len(data)
    # Slicing is safe even when fewer than five entries are returned
    top_five = data[:5]
    print(size)
    print(top_five)

if __name__ == "__main__":
    oeis()
Running this is already way better than before. We now print the total size, and the top five (if they exist) entries.
But we also don’t need all of that. Let's build a formatter to correctly format our output. Create a folder called formatters, which will have an __init__.py file and an oeis_formatter.py file.
# oeis/oeis/formatters/oeis_formatter.py
def format_output(query_result: list) -> str:
    size = len(query_result)
    top_five = query_result[:5]
    # Single quotes inside the f-string keep this valid before Python 3.12
    top_five_list = [f"{i+1}. {entry['name']}" for i, entry in enumerate(top_five)]
    top_five_str = "\n".join(top_five_list)
    first_line = f"Found {size} results. Showing the first {len(top_five)}:\n"
    return first_line + top_five_str
This file is basically formatting the top five results into what we want for the output. Let’s use it in our main file:
# oeis/oeis/__init__.py
import click
from oeis.clients.oeis_client import OEISClient
from oeis.formatters import oeis_formatter

OEIS_CLIENT = OEISClient()

@click.command()
@click.argument("sequence", nargs=-1)
def oeis(sequence: tuple[str, ...]):
    data = OEIS_CLIENT.query_results(sequence)
    output = oeis_formatter.format_output(data)
    print(output)

if __name__ == "__main__":
    oeis()
If you run this code, you will get this now:
Found 10 results. Showing the first 5:
1. a(n) is the number of partitions of n (the partition numbers).
2. a(n) = floor(3^n / 2^n).
3. Partition triangle A008284 read from right to left.
4. Number of n-stacks with strictly receding walls, or the number of Type A partitions of n in the sense of Auluck (1951).
5. Number of partitions of n into prime power parts (1 included); number of nonisomorphic Abelian subgroups of symmetric group S_n.
It is now returning the format we expect, but notice that it says it found 10 results. This is wrong: 10 is only the number of entries in the JSON response, and if you search on the OEIS website you will see there are way more results. Unfortunately, there was an update to the OEIS API and the result no longer includes a count with the total number of results. This count still shows up in the text formatted output, though. We can use it to know how many results there are.
To do this, we can change the URL to use fmt=text, and use a regex to find the value we want. Let’s update the client code to fetch the text data, and the formatter to use this data so we can output it.
# oeis/oeis/clients/oeis_client.py
import re
import requests
from urllib.parse import urlencode

class OEISClient:
    def __init__(self) -> None:
        self.base_url = "https://oeis.org/"
        # \d+ requires at least one digit, so an empty count never matches
        self.count_regex = re.compile(r"Showing .* of (\d+)")

    def query_results(self, sequence: tuple[str, ...]) -> list:
        url_params = self.build_url_params(sequence, fmt="json")
        full_url = self.base_url + "search?" + url_params
        response = requests.get(full_url)
        response.raise_for_status()
        return response.json()

    def get_count(self, sequence: tuple[str, ...]) -> str:
        # The text output still includes the total number of results
        url_params = self.build_url_params(sequence, fmt="text")
        full_url = self.base_url + "search?" + url_params
        response = requests.get(full_url)
        response.raise_for_status()
        return self.get_response_count(response.text)

    def build_url_params(self, sequence: tuple[str, ...], fmt: str) -> str:
        sequence_str = ",".join(sequence)
        params = {"q": sequence_str, "fmt": fmt}
        return urlencode(params)

    def get_response_count(self, response_text: str) -> str:
        match = self.count_regex.search(response_text)
        if not match:
            raise ValueError("Count not found!")
        return match.group(1)
As you can see, we added two new methods:
- get_count
  - Builds the params for the text API, requests the text output, and passes it to the method that uses the regex to find the number we are looking for
- get_response_count
  - Uses the regex compiled in the class’ __init__ to perform a search and return the first group
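Here is a small sketch of the regex step on its own, so you can try it without hitting the network. The sample text is hand-written and only assumes the text output contains a line shaped like "Showing 1-10 of N", which is what the regex relies on; the real response contains much more. The function name extract_count is made up for this example, and it uses \d+ so that at least one digit is required.

```python
import re

COUNT_REGEX = re.compile(r"Showing .* of (\d+)")

# Hand-written sample, assumed to resemble the relevant part of the text output.
SAMPLE_TEXT = "Search: seq:1,2,3,4,5\nShowing 1-10 of 7821\n\n%I A000027\n"


def extract_count(response_text: str) -> int:
    """Return the total number of results found in the text output."""
    match = COUNT_REGEX.search(response_text)
    if not match:
        raise ValueError("Count not found!")
    return int(match.group(1))


print(extract_count(SAMPLE_TEXT))  # 7821
```

Since this depends on the wording of the text output, it is worth keeping the failure explicit: if the line ever changes, the error tells us right away instead of printing a wrong count.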
# oeis/oeis/formatters/oeis_formatter.py
def format_output(query_result: list, count: str) -> str:
    top_five = query_result[:5]
    # Single quotes inside the f-string keep this valid before Python 3.12
    top_five_list = [f"{i+1}. {entry['name']}" for i, entry in enumerate(top_five)]
    top_five_str = "\n".join(top_five_list)
    first_line = f"Found {count} results. Showing the first {len(top_five)}:\n"
    return first_line + top_five_str
In this file, we only added a new param for the method, and used it instead of the length of the query result.
# oeis/oeis/__init__.py
import click
from oeis.clients.oeis_client import OEISClient
from oeis.formatters import oeis_formatter

OEIS_CLIENT = OEISClient()

@click.command()
@click.argument("sequence", nargs=-1)
def oeis(sequence: tuple[str, ...]):
    data = OEIS_CLIENT.query_results(sequence)
    count = OEIS_CLIENT.get_count(sequence)
    output = oeis_formatter.format_output(data, count)
    print(output)

if __name__ == "__main__":
    oeis()
Here we are just calling the new method on the client, and passing the information to the formatter. Running it again results in the output we were expecting:
❯ poetry run oeis 1 2 3 4 5
Found 7821 results. Showing the first 5:
1. The positive integers. Also called the natural numbers, the whole numbers or the counting numbers, but these terms are ambiguous.
2. Digital sum (i.e., sum of digits) of n; also called digsum(n).
3. Powers of primes. Alternatively, 1 and the prime powers (p^k, p prime, k >= 1).
4. The nonnegative integers.
5. Palindromes in base 10.
The code is basically ready. But for a real challenge, remember to use Git when possible, do small commits, and of course, add unit tests, code formatting libs, type checkers, and whatever else you feel you will need.
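As a starting point for the unit tests, here is a minimal sketch using unittest from the standard library. It carries its own copy of the final format_output so it runs standalone; in the real project you would import it from oeis.formatters.oeis_formatter instead. The entries are made-up dicts with only the name key, which is the only field the formatter reads.

```python
import unittest


def format_output(query_result: list, count: str) -> str:
    # Copy of the formatter from this post, so the example runs on its own.
    top_five = query_result[:5]
    top_five_list = [f"{i + 1}. {entry['name']}" for i, entry in enumerate(top_five)]
    top_five_str = "\n".join(top_five_list)
    first_line = f"Found {count} results. Showing the first {len(top_five)}:\n"
    return first_line + top_five_str


class FormatOutputTest(unittest.TestCase):
    def test_limits_output_to_five_entries(self):
        entries = [{"name": f"Sequence {n}"} for n in range(1, 8)]
        lines = format_output(entries, "7").split("\n")
        self.assertEqual(lines[0], "Found 7 results. Showing the first 5:")
        self.assertEqual(len(lines), 6)  # header plus five entries
        self.assertEqual(lines[-1], "5. Sequence 5")

    def test_handles_fewer_than_five_entries(self):
        output = format_output([{"name": "The prime numbers."}], "1")
        self.assertEqual(
            output, "Found 1 results. Showing the first 1:\n1. The prime numbers."
        )
```

Saved under the tests folder, this can be run with poetry run python -m unittest. Testing the client would need the HTTP call to be mocked, which I am leaving out of this sketch.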
Good luck!