DEV Community

Code_Jedi


9 Open Source Python Projects to Join in 2022!

Contributing to open source projects is great for your reputation, skill development and knowledge as a developer.
In this article, I will be going through 9 open source Python projects that you can join today!


9. Django

Ah yes, the famous web development framework made for Python. It has more than 60k stars on GitHub and is used by millions of Python developers around the world.

django / django

The Web framework for perfectionists with deadlines.

Django

Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. Thanks for checking it out.

All documentation is in the "docs" directory and online at https://docs.djangoproject.com/en/stable/. If you're just getting started here's how we recommend you read the docs:

  • First, read docs/intro/install.txt for instructions on installing Django.
  • Next, work through the tutorials in order (docs/intro/tutorial01.txt, docs/intro/tutorial02.txt, etc.).
  • If you want to set up an actual deployment server, read docs/howto/deployment/index.txt for instructions.
  • You'll probably want to read through the topical guides (in docs/topics) next; from there you can jump to the HOWTOs (in docs/howto) for specific problems, and check out the reference (docs/ref) for gory details.
  • See docs/README for instructions on building an HTML version of the docs.

Docs are updated rigorously. If you find any problems in the docs, or think they should be…

If you have experience with web development in Python and are looking to join an open source project, Django is the project for you!
Start contributing to Django here.


8. Scrapy

Scrapy is the most popular Python web scraping library, with over 40k stars on GitHub.

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Scrapy

Overview

Scrapy is a BSD-licensed fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Scrapy is maintained by Zyte (formerly Scrapinghub) and many other contributors.

Check the Scrapy homepage at https://scrapy.org for more information, including a list of features.

Requirements

  • Python 3.8+
  • Works on Linux, Windows, macOS, BSD

Install

The quick way:

pip install scrapy

See the install section in the documentation at https://docs.scrapy.org/en/latest/intro/install.html for more details.

Documentation

Documentation is available online at https://docs.scrapy.org/ and in the docs directory.

Releases

You can check https://docs.scrapy.org/en/latest/news.html for the release notes.

Community (blog, twitter, mail list, IRC)

See https://scrapy.org/community/ for details.

Contributing

See https://docs.scrapy.org/en/master/contributing.html for details.

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct.

If you're into web scraping with Python and want to work on improving the web scraping library used by thousands of Python developers, start contributing to Scrapy through this page.


7. Scikit-Learn

If you've been involved in machine learning with Python for some time, you've probably come across this library.

scikit-learn / scikit-learn

scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.

The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the About us page for a list of core contributors.

It is currently maintained by a team of volunteers.

Website: https://scikit-learn.org

Installation

Dependencies

scikit-learn requires:

  • Python (>= 3.9)
  • NumPy (>= 1.19.5)
  • SciPy (>= 1.6.0)
  • joblib (>= 1.2.0)
  • threadpoolctl (>= 3.1.0)

Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4. scikit-learn 1.0 and later require Python 3.7 or newer. scikit-learn 1.1 and later require Python 3.8 or newer.

Scikit-learn plotting capabilities (i.e., functions start with plot_ and classes end with Display) require Matplotlib (>= 3.3.4). For running the examples Matplotlib >= 3.3.4 is required. A few examples require scikit-image >= 0.17.2,…

If you have experience with machine learning and data visualization with Python and want to contribute to one of the most popular Python machine learning libraries, start contributing to scikit-learn here.
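To give a feel for what working with the library looks like, here's a minimal sketch of scikit-learn's fit/predict pattern. The toy data (points on the line y = 2x + 1) and the variable names are made up for illustration, and it assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: four points that lie exactly on y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X.ravel() + 1.0

# Fit a linear model, then predict the value at an unseen point (x = 4)
model = LinearRegression().fit(X, y)
prediction = model.predict(np.array([[4.0]]))[0]
print(round(prediction, 3))
```

Most scikit-learn estimators follow this same fit-then-predict shape, so the pattern carries over when you swap in a different model.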


6. Pandas

Pandas is the most popular data analysis/manipulation library for Python.

pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

pandas: powerful Python data analysis toolkit

What is it?

pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way towards this goal.

Main Features

Here are just a few of the things that pandas does well:

  • Easy handling of missing data (represented as NaN, NA, or NaT) in floating point as well as non-floating…

If you know how to work with data in Python and want to help build the future of data analysis/manipulation in Python, start contributing to pandas here.
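The README excerpt above mentions easy handling of missing data. As a rough illustration, here's a small sketch that counts a missing value and fills it with the column mean. The `readings` table and its column names are hypothetical, and it assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor data: the second temperature reading is missing
readings = pd.DataFrame(
    {"sensor": ["a", "b", "c"], "temp": [21.5, np.nan, 19.0]}
)

# Count the missing values in the column
missing_count = int(readings["temp"].isna().sum())

# Fill the gap with the mean of the values that are present
filled = readings["temp"].fillna(readings["temp"].mean())

print(missing_count)
print(filled.tolist())
```

Filling with the mean is just one possible choice here; dropping the row with `dropna()` would be another.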


5. Flask

Flask is another popular Python web development library, with over 50k stars on GitHub.

pallets / flask

The Python micro framework for building web applications.

Flask

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja, and has become one of the most popular Python web application frameworks.

Flask offers suggestions, but doesn't enforce any dependencies or project layout. It is up to the developer to choose the tools and libraries they want to use. There are many extensions provided by the community that make adding new functionality easy.

A Simple Example

# save this as app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, World!"
$ flask run
  * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Donate

The Pallets organization develops and supports Flask and the libraries it uses. In order to…

If you're looking to help build the future of web development with Python, start contributing to Flask here.


4. Requests

Requests is the OG library, used by millions, for making HTTP requests with Python. That might sound underwhelming, but Requests is used to connect to API endpoints, authenticate web connections, scrape data from the web, test web endpoints and more!
Without the Requests library, Python wouldn't be where it is today.

psf / requests

A simple, yet elegant, HTTP library.

Requests

Requests is a simple, yet elegant, HTTP library.

>>> import requests
>>> r = requests.get('https://httpbin.org/basic-auth/user/pass', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
'{"authenticated": true, ...'
>>> r.json()
{'authenticated': True, ...}

Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your PUT & POST data — but nowadays, just use the json method!
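To make the point about query strings and JSON bodies concrete, here's a small sketch that builds two requests without sending them, so it runs offline. The URLs and payloads are placeholders chosen for illustration, and it assumes the requests package is installed.

```python
import json

import requests

# Build (but do not send) a GET request; the URL and parameters are made up
get_request = requests.Request(
    "GET",
    "https://httpbin.org/get",
    params={"q": "open source", "page": 2},
).prepare()
print(get_request.url)  # the query string has been encoded and appended

# Build a POST request whose body comes from the json= argument
post_request = requests.Request(
    "POST",
    "https://httpbin.org/post",
    json={"name": "Ada"},
).prepare()
print(post_request.headers["Content-Type"])
print(json.loads(post_request.body))
```

In everyday code you'd more likely call `requests.get(url, params=...)` or `requests.post(url, json=...)` directly; preparing the request by hand is only used here to inspect what would be sent.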

Requests is one of the most downloaded Python packages today, pulling in around 30M downloads / week— according to GitHub, Requests is currently depended upon by 1,000,000+ repositories. You may certainly put your trust in this code.

Installing Requests and Supported

Start contributing to Requests here.


3. Matplotlib

Matplotlib is the most popular data visualization library for Python.

matplotlib / matplotlib

matplotlib: plotting with Python

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

Check out our home page for more information.

Matplotlib produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, Python/IPython shells, web application servers and various graphical user interface toolkits.

Install

See the install documentation which is generated from /doc/install/index.rst

Contribute

You've discovered a bug or something else you want to change — excellent!

You've worked out a way to fix it — even better!

You want to tell us about it — best of all!

Start at the contributing guide!

Contact

Discourse is the discussion forum for general questions and discussions and our recommended starting point.

Our active mailing lists (which are mirrored on Discourse) are:

Gitter is for coordinating…

If you're involved with data visualization with Python and want to contribute to the most used and versatile data visualization library in Python, start contributing to Matplotlib here.
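If you haven't used it before, here's a minimal sketch of a Matplotlib line plot rendered to PNG bytes in memory. The data and labels are made up, the non-interactive "Agg" backend is chosen so the snippet runs without a display, and it assumes Matplotlib is installed.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical data: the first few square numbers
xs = [0, 1, 2, 3]
ys = [x * x for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("A tiny example plot")

# Render the figure as PNG into an in-memory buffer instead of a file
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)

print(len(png_bytes) > 0)
```

To write the image to disk instead, pass a file name such as `"plot.png"` to `fig.savefig`.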


2. Keras

With over 50k stars on GitHub, Keras is a simple, versatile and robust library for building neural networks with Python.

keras-team / keras

Deep Learning for humans

Keras 3: Deep Learning for Humans

Keras 3 is a multi-backend deep learning framework, with support for JAX, TensorFlow, and PyTorch. Effortlessly build and train models for computer vision, natural language processing, audio processing, timeseries forecasting, recommender systems, etc.

  • Accelerated model development: Ship deep learning solutions faster thanks to the high-level UX of Keras and the availability of easy-to-debug runtimes like PyTorch or JAX eager execution.
  • State-of-the-art performance: By picking the backend that is the fastest for your model architecture (often JAX!), leverage speedups ranging from 20% to 350% compared to other frameworks. Benchmark here.
  • Datacenter-scale training: Scale confidently from your laptop to large clusters of GPUs or TPUs.

Join nearly three million developers, from burgeoning startups to global enterprises, in harnessing the power of Keras 3.

Installation

Install with pip

Keras 3 is available on PyPI as keras. Note that Keras 2 remains available…

Start contributing to Keras here.


1. TensorFlow

TensorFlow is a sophisticated Python library for neural networks, deep learning and machine learning, used by millions and with over 160k stars on GitHub.

tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working within the Machine Intelligence team at Google Brain to conduct research in machine learning and neural networks. However, the framework is versatile enough to be used in other areas as well.

TensorFlow provides stable Python and C++ APIs, as well as a non-guaranteed backward compatible API for other languages.

Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.

Install

See the TensorFlow install guide for the pip package, to enable GPU support, use a Docker container, and build from source.

To install the current release…

Start contributing to TensorFlow here.


Conclusion

I hope that in this article, you've found the open source project that you would like to contribute to, and help build the future of Python.

Educative

Before I end this article, I'd like to recommend Educative for developers looking to learn.
Why Educative?
It is home to hundreds of development courses, hands-on tutorials, guides and demonstrations to help you stay ahead of the curve in your development journey.

You can get started with Educative here.

Byeeee👋

Top comments (2)

Pramod

Hey, I was just looking for a place where I can talk about a small project I started a few months back. This post came first in the Google search, so I'm posting here. It is a Python package to construct Python objects from and to dict (JSON). Happy to listen and get feedback/counter-arguments on it :)

Here is the project: github.com/pskd73/pydictable

Timur Carpeev

Do you mind making it 10 by adding the most popular open source e-commerce project? :D
github.com/saleor/saleor