How to Build a Wikipedia CLI Tool Using Python and the Wikipedia API
Creating a command-line interface (CLI) tool for Wikipedia can be a rewarding project, combining Python's simplicity with the vast knowledge base of Wikipedia. In this tutorial, we'll guide you step-by-step on how to build a CLI tool that fetches information from Wikipedia using its API.
Requirements
Before starting, make sure you have the following:
- Python 3.7 or newer installed on your system.
- Basic knowledge of Python and working with APIs.
- An internet connection to access the Wikipedia API.
Step 1: Understand the Wikipedia API
Wikipedia exposes the MediaWiki Action API at https://en.wikipedia.org/w/api.php. This API lets developers query Wikipedia for content, metadata, and more. Rather than separate endpoints, it takes query-string parameters that select what a request does. The key parameters we'll use are:
- action=query: Fetches content and metadata from Wikipedia.
- list=search: Searches for articles by keyword.
- prop=extracts: Retrieves article extracts, which we'll use as summaries.
The base URL for all API requests is:
https://en.wikipedia.org/w/api.php
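To make the parameter-based design concrete, here is a minimal sketch that assembles a search URL by hand using only the standard library. The helper name build_search_url is made up for this illustration; the tool itself lets requests do the encoding.

```python
from urllib.parse import urlencode

WIKIPEDIA_API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(query):
    """Return the full URL for a keyword search (hypothetical helper)."""
    params = {
        "action": "query",   # we want to read data
        "list": "search",    # run a full-text search
        "srsearch": query,   # the search terms
        "format": "json",    # ask for a JSON response
    }
    # urlencode escapes spaces and special characters for us
    return f"{WIKIPEDIA_API_URL}?{urlencode(params)}"


print(build_search_url("Python programming"))
# https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Python+programming&format=json
```

Pasting the printed URL into a browser is a quick way to inspect the raw JSON before writing any parsing code.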
Step 2: Set Up Your Python Environment
Start by creating a Python virtual environment and installing the required library. We'll use requests for making HTTP requests and argparse for handling CLI arguments. argparse ships with Python's standard library, so only requests needs to be installed.
# Create a virtual environment
python -m venv wikipedia-cli-env
# Activate the environment
# On Windows:
wikipedia-cli-env\Scripts\activate
# On Mac/Linux:
source wikipedia-cli-env/bin/activate
# Install dependencies
pip install requests
Step 3: Plan the CLI Functionality
Our CLI tool will include the following features:
- Search Wikipedia Articles: Allow the user to search for articles by keyword.
- Fetch Article Summaries: Retrieve a brief summary of a specific article.
- View CLI Help: Display usage instructions.
Step 4: Implement the CLI Tool
Below is the Python code for the CLI tool:
import argparse

import requests

# Define the base URL for the Wikipedia API
WIKIPEDIA_API_URL = "https://en.wikipedia.org/w/api.php"


def search_articles(query):
    """Search Wikipedia for articles matching the query."""
    params = {
        'action': 'query',
        'list': 'search',
        'srsearch': query,
        'format': 'json',
    }
    # A timeout keeps the CLI from hanging if the network stalls
    response = requests.get(WIKIPEDIA_API_URL, params=params, timeout=10)
    response.raise_for_status()  # Raise an error for bad responses
    data = response.json()
    if 'query' in data:
        return data['query']['search']
    else:
        return []

def get_article_summary(title):
    """Fetch the summary of a Wikipedia article."""
    params = {
        'action': 'query',
        'prop': 'extracts',
        'exintro': 1,      # Only the section before the first heading
        'explaintext': 1,  # Plain text instead of HTML
        'titles': title,
        'format': 'json',
    }
    response = requests.get(WIKIPEDIA_API_URL, params=params, timeout=10)
    response.raise_for_status()
    data = response.json()
    # Pages are keyed by page ID; return the first one that has an extract
    pages = data.get('query', {}).get('pages', {})
    for page in pages.values():
        if 'extract' in page:
            return page['extract']
    return "No summary available."

def main():
    parser = argparse.ArgumentParser(description="A CLI tool for interacting with Wikipedia.")
    subparsers = parser.add_subparsers(dest="command")

    # Sub-command: search
    search_parser = subparsers.add_parser("search", help="Search for articles on Wikipedia.")
    search_parser.add_argument("query", help="The search query.")

    # Sub-command: summary
    summary_parser = subparsers.add_parser("summary", help="Get the summary of a specific Wikipedia article.")
    summary_parser.add_argument("title", help="The title of the Wikipedia article.")

    args = parser.parse_args()

    if args.command == "search":
        results = search_articles(args.query)
        if results:
            print("Search Results:")
            for result in results:
                print(f"- {result['title']}: {result['snippet']}")
        else:
            print("No results found.")
    elif args.command == "summary":
        summary = get_article_summary(args.title)
        print(summary)
    else:
        parser.print_help()


if __name__ == "__main__":
    main()
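One detail worth knowing about the search output: the snippet field returned by list=search typically contains HTML markup (matched terms are wrapped in span tags) along with HTML entities, so it prints with tags in the terminal. The sketch below shows one possible way to clean it with the standard library. The helper name clean_snippet and the sample string are made up for illustration, and a regex is only a rough tag stripper, adequate for simple markup like this.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <span class="..."> or </span>
TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(snippet):
    """Strip HTML tags and decode entities in a search snippet (hypothetical helper)."""
    without_tags = TAG_RE.sub("", snippet)
    return html.unescape(without_tags).strip()


# Hand-written sample in the style of a search result snippet
sample = 'The <span class="searchmatch">Python</span> language &amp; its tools'
print(clean_snippet(sample))
# The Python language & its tools
```

If you adopt something like this, the natural place to call it is where the search results are printed, wrapping result['snippet'].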
Step 5: Test the CLI Tool
Save the script as wikipedia_cli.py. You can now run the tool from your terminal:
- Search for articles:
python wikipedia_cli.py search "Python programming"
- Get an article summary:
python wikipedia_cli.py summary "Python (programming language)"
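Running the commands above exercises the live API. To check the parsing logic without a network connection, one option is to pull it into a small function and feed it hand-written responses. The sketch below assumes the response shape that the tutorial's summary code relies on; the function name extract_summary, the page ID, and the sample text are all invented for the example.

```python
def extract_summary(data):
    """Return the first extract in an API response dict (hypothetical helper)."""
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return "No summary available."


# Hand-written responses, not real API output
found = {
    "query": {
        "pages": {
            "123": {"pageid": 123, "title": "Example", "extract": "Example is a placeholder."}
        }
    }
}
not_found = {"query": {"pages": {"-1": {"title": "Nope", "missing": ""}}}}

print(extract_summary(found))      # Example is a placeholder.
print(extract_summary(not_found))  # No summary available.
print(extract_summary({}))         # No summary available.
```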
Step 6: Enhance the Tool
To make the tool more robust and user-friendly, consider adding the following:
- Error Handling: Provide detailed error messages for failed API requests.
- Formatting: Use libraries like rich for prettier output.
- Caching: Implement caching to avoid repetitive API calls for the same query.
- Additional Features: Add support for fetching related articles, categories, or images.
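As a starting point for the caching idea, here is a minimal sketch of a file-based cache. Because each CLI invocation is a separate process, an in-memory cache would be empty on every run, so this version stores results in a JSON file. The function name cached_call, the one-hour lifetime, and the file layout are assumptions made for the example, not part of the tool above.

```python
import json
import time
from pathlib import Path

CACHE_TTL_SECONDS = 3600  # assumed lifetime: reuse results for one hour


def cached_call(cache_file, key, fetch, ttl=CACHE_TTL_SECONDS):
    """Return fetch(key), reusing a saved result younger than ttl seconds.

    The value returned by fetch must be JSON-serializable.
    """
    path = Path(cache_file)
    try:
        cache = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}  # no cache yet, or an unreadable one: start fresh

    entry = cache.get(key)
    if entry is not None and time.time() - entry["saved_at"] < ttl:
        return entry["value"]  # cache hit: skip the fetch entirely

    value = fetch(key)
    cache[key] = {"saved_at": time.time(), "value": value}
    path.write_text(json.dumps(cache), encoding="utf-8")
    return value
```

With this in place, a call such as cached_call("wiki_cache.json", title, get_article_summary) would only hit the API when no fresh entry exists for that title. The cache file name here is just a placeholder.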
Conclusion
You've successfully built a CLI tool for Wikipedia using Python and its API! This tool can be a great starting point for more advanced projects, such as integrating it into other applications or creating a GUI version. Happy coding!