Hridya for Learn Earn & Fun

Quickstart to Flyte

Introduction

What is Flyte?

  • Kubernetes-native workflow automation platform
  • Open-source
  • Makes it easy to create concurrent, scalable, and maintainable workflows
  • LF AI & Data incubation project
  • Opinionated, scalable, and hosted workflow automation platform
  • Extensible, Auditable, Observable

Integrations

Flyte supports many integrations, including Hugging Face, Vaex,
Polars, Modin, BigQuery, DuckDB, and Hive.

[Image: overview of the integrations Flyte supports]

You can check out all the supported integrations in the Flyte documentation.

Trusted by Companies

Flyte is used in production at LinkedIn, Spotify, Intel and others.

Setting Up Flyte

Note: If you don't want to install Flyte on your PC, you can skip this step and use Flyte in the browser instead: https://sandbox.union.ai/

Requirements

  • Docker
  • Python

Ensure that your Docker daemon is running before you continue.

Installation

pip install flytekit flytekitplugins-deck-standard scikit-learn

Installing FlyteCTL

FlyteCTL is the command-line interface for Flyte.

OSX
brew install flyteorg/homebrew-tap/flytectl
Other Operating Systems
curl -sL https://ctl.flyte.org/install | sudo bash -s -- -b /usr/local/bin
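Before moving on, it can help to confirm that the tools installed above are actually reachable. The following is a minimal sketch, not part of Flyte: the helper names `missing_tools` and `docker_daemon_running` are made up for illustration, and it assumes that `pip install flytekit` put the `pyflyte` command on your PATH and that `docker info` exits with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess

# Executables this guide relies on; adjust the tuple for your own setup.
REQUIRED_TOOLS = ("docker", "flytectl", "pyflyte")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def docker_daemon_running(timeout=10):
    """Return True if `docker info` exits successfully, False otherwise."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # docker is not installed, not executable, or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    elif not docker_daemon_running():
        print("Docker is installed but the daemon does not seem to be running.")
    else:
        print("All prerequisites found.")
```

If the script reports a missing tool, repeat the matching installation step before continuing.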

Creating an Example Flyte Script

To check that your setup works, and to have a bit of fun with Flyte, let's create an example script that:

  • Trains a model on the Wine Dataset from sklearn

Here's the script. Save it in a Python file such as example.py.

import pandas as pd
from flytekit import task, workflow
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression


@task
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame


@task
def process_data(data: pd.DataFrame) -> pd.DataFrame:
    """Simplify the task from a 3-class to a binary classification problem."""
    # Keep class 0 as-is and collapse classes 1 and 2 into a single class 1.
    return data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))


@task
def train_model(data: pd.DataFrame) -> LogisticRegression:
    """Train a model on the wine dataset."""
    features = data.drop("target", axis="columns")
    target = data["target"]
    return LogisticRegression(max_iter=1000).fit(features, target)


@workflow
def training_workflow() -> LogisticRegression:
    """Put all of the steps together into a single workflow."""
    # Inside a workflow, Flyte tasks must be called with keyword arguments.
    data = get_data()
    processed_data = process_data(data=data)
    return train_model(data=processed_data)


if __name__ == "__main__":
    print(f"Running training_workflow() {training_workflow()}")

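The least obvious step in the script is process_data, which turns the three wine classes into two. As a rough illustration of that rule only, here is the same logic in plain Python on a list of labels. The helper name `binarize_labels` is hypothetical; the script itself does this with pandas `where`, which keeps values where the condition holds and replaces the rest.

```python
def binarize_labels(labels):
    """Keep class 0 unchanged and map every other class to 1."""
    return [label if label == 0 else 1 for label in labels]


if __name__ == "__main__":
    # The wine dataset uses the class labels 0, 1 and 2.
    print(binarize_labels([0, 1, 2, 0, 2]))
```

After this step the model only has to separate class 0 from everything else.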

Running Flyte workflows

You can run the workflow in example.py in a local Python environment or on a Flyte cluster.

Running a workflow using a local Python env

Run this command to execute your newly created workflow in your local Python environment.
NOTE: Replace example.py with the name of your Python file!

pyflyte run example.py training_workflow

Creating a Demo Flyte Cluster

Run this command to start a demo Flyte cluster on your machine. It runs inside Docker, which is why the Docker daemon has to be running.

flytectl demo start

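If the output of flytectl demo start asks you to set the FLYTECTL_CONFIG environment variable, do that first. Then run the workflow on the cluster with the following command: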

pyflyte run --remote example.py training_workflow 

If you have set everything up correctly, you should receive a message like the one in this screenshot:

[Image: output of the remote workflow run]

Great! You have successfully set up Flyte on your computer and run a workflow.
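The local and cluster runs above differ only in the --remote flag. As a small sketch of that difference, the helper below builds either command as an argument list. `pyflyte_run_command` is a hypothetical name used here for illustration; only `pyflyte run` and `--remote` come from the commands shown in this guide.

```python
import shlex


def pyflyte_run_command(script, workflow, remote=False):
    """Build the argument list for `pyflyte run` (illustrative helper)."""
    command = ["pyflyte", "run"]
    if remote:
        # --remote sends the execution to the configured Flyte cluster
        # instead of running it in the local Python environment.
        command.append("--remote")
    command += [script, workflow]
    return command


if __name__ == "__main__":
    print(shlex.join(pyflyte_run_command("example.py", "training_workflow")))
    print(shlex.join(pyflyte_run_command("example.py", "training_workflow", remote=True)))
```

The returned list can be passed to `subprocess.run(command, check=True)` if you want to launch the workflow from another Python script.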

Conclusion

🎉 Congratulations! In this getting started guide, you:

  • 🤓 Learned what Flyte is
  • 💻 Set up Flyte on your computer
  • 📜 Created a Flyte script
  • 🛥 Created a demo Flyte cluster on your local system
  • 👟 Ran a workflow locally and on a demo Flyte cluster

Flyte is a great workflow automation tool for data and machine learning processes.

Lastly, don't forget to leave a LIKE and share your feedback in the comments!
