Ashok Nagaraj
Using DSPy to Enhance Prompt Engineering with OpenAI APIs

Introduction

Prompt engineering is the foundation of building effective applications with Large Language Models (LLMs) like OpenAI’s GPT-4. Whether you're creating a chatbot, automating workflows, or extracting insights from text, crafting precise prompts is essential. However, manual prompt tuning can be tedious, inconsistent, and challenging to scale.

This is where DSPy, a Python framework developed by Stanford University, comes into play. DSPy simplifies prompt engineering by enabling

  • programmatic task definitions,
  • modular pipelines, and
  • self-improving workflows.

It abstracts away the complexities of prompt crafting and optimization, allowing developers to focus on solving real-world problems.

In this tutorial, we’ll explore how DSPy can help you:

  1. Get started with OpenAI’s API.
  2. Automate zero-shot, few-shot, and multi-shot prompting.
  3. Build a compelling real-world application: a personal travel assistant that answers queries about destinations, plans itineraries, and provides recommendations.

By the end of this tutorial, you'll understand how DSPy can enhance your generative AI journey and make prompt engineering scalable and efficient.


Step 1: Setting Up Your Environment

Install DSPy

Start by installing DSPy and its dependencies:

pip install dspy openai mlflow

Configure OpenAI API Key

DSPy integrates seamlessly with OpenAI’s GPT models. Set your API key as an environment variable:

export OPENAI_API_KEY="your-api-key"

Alternatively, configure it programmatically:

import dspy
dspy.configure(lm=dspy.LM("openai/gpt-4", api_key="your-api-key"))

Optional: Enable MLflow for Experiment Tracking

DSPy integrates with MLflow to track prompt optimization progress:

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("DSPy Tutorial")
mlflow.dspy.autolog()

Start the MLflow UI in a separate terminal:

mlflow ui --port 5000

Step 2: Zero-Shot Prompting

Zero-shot prompting is the simplest form of interaction with LLMs—it involves providing only instructions without examples. This approach works well for straightforward tasks like text classification or summarization.

Let’s start by building a basic travel destination summarizer using DSPy’s Predict module.

Code Example: Zero-Shot Travel Destination Summarizer

from dspy import Predict

# Define a zero-shot task from an "input -> output" signature
destination_summary = Predict("destination -> summary")

# Run the task on an input (the signature expects a destination name,
# not a full question)
response = destination_summary(destination="Paris")
print(f"Summary: {response.summary}")

Output:

Summary: Paris is known as the City of Light, famous for its art,
 fashion, gastronomy, and landmarks like the Eiffel Tower.

Key Benefits:

  • No need for labeled examples.
  • Ideal for simple tasks where LLMs rely on pre-trained knowledge.

Step 3: Few-Shot Prompting

Few-shot prompting improves accuracy by providing 2–5 examples that guide the model’s output. This approach works well for tasks requiring nuanced understanding or specific formatting.

Let’s extend our travel assistant to recommend activities based on user preferences.

Code Example: Few-Shot Activity Recommendation

import dspy
from dspy.teleprompt import LabeledFewShot

# Few-shot examples, expressed as labeled dspy.Example objects
trainset = [
    dspy.Example(
        request="User loves art and history; Destination: Paris",
        activities="Visit the Louvre; Explore Notre-Dame Cathedral",
    ).with_inputs("request"),
    dspy.Example(
        request="User enjoys nature; Destination: Kyoto",
        activities="Walk through Arashiyama Bamboo Grove; Visit Kinkaku-ji Temple",
    ).with_inputs("request"),
]

# Base predictor for the recommendation task
recommend = dspy.Predict("request -> activities")

# Compile the examples into the prompt as few-shot demonstrations
few_shot_module = LabeledFewShot(k=2).compile(recommend, trainset=trainset)

# Run the module on new input
response = few_shot_module(request="User loves food; Destination: Rome")
print(f"Recommended Activities: {response.activities}")

Output:

Recommended Activities: Try authentic pasta dishes; Visit Campo de' Fiori market

Step 4: Multi-Shot Prompting

Multi-shot prompting uses many examples to handle complex queries or improve generalization across diverse inputs. Let’s build a travel itinerary generator that combines multiple modules into a pipeline.

Workflow Diagram: Multi-Shot Travel Itinerary Pipeline

+-------------------+
| User Query        |
+-------------------+
          |
          v
+-------------------+       +-------------------+
| Retrieval Module  | ----> | Relevant Context  |
+-------------------+       +-------------------+
          |                           |
          v                           v
+-------------------+
| Generation Module |
+-------------------+
          |
          v
+-------------------+
| Final Itinerary   |
+-------------------+

Code Example: Multi-Shot Travel Itinerary Generator

import dspy

# Pipelines in DSPy are dspy.Module subclasses that compose sub-modules.
class TravelItineraryPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        # Generation module: drafts an itinerary from context + request
        self.generate = dspy.ChainOfThought("context, request -> itinerary")

    def retrieve(self, request):
        # Mocked retrieval results for simplicity; in a real system, use
        # dspy.Retrieve with a configured retrieval model.
        return ["Rome is known for its historical landmarks like the Colosseum and Vatican City."]

    def forward(self, request):
        context = self.retrieve(request)
        return self.generate(context="\n".join(context), request=request)

# Run the pipeline on a user query
travel_pipeline = TravelItineraryPipeline()
response = travel_pipeline(request="I want a 3-day itinerary for Rome focusing on food and history.")
print(f"Generated Itinerary: {response.itinerary}")

Output Example:

Generated Itinerary:
Day 1: Explore the Colosseum and Roman Forum; Dinner at Trattoria da Enzo.
Day 2: Visit Vatican City; Lunch at Campo de' Fiori market.
Day 3: Walk through Trastevere; Try gelato at Giolitti.

Step 5: Automating Prompt Optimization

DSPy provides prompt optimizers (historically called teleprompters), such as COPRO, that refine prompt instructions iteratively based on evaluation metrics.

Code Example: Optimizing Prompts with COPRO

import dspy
from dspy.teleprompt import COPRO

# DSPy metrics take (example, prediction, trace=None) and return a score
def activity_match_metric(example, prediction, trace=None):
    return float(example.activities.lower() in prediction.activities.lower())

# A small labeled training set (reuse or extend the Step 3 examples)
trainset = [
    dspy.Example(
        request="User loves art and history; Destination: Paris",
        activities="Visit the Louvre",
    ).with_inputs("request"),
    dspy.Example(
        request="User enjoys nature; Destination: Kyoto",
        activities="Walk through Arashiyama Bamboo Grove",
    ).with_inputs("request"),
]

# Optimize the predictor's instructions with the COPRO optimizer
recommend = dspy.Predict("request -> activities")
optimizer = COPRO(metric=activity_match_metric, breadth=5, depth=2)
optimized_task = optimizer.compile(
    recommend,
    trainset=trainset,
    eval_kwargs=dict(num_threads=1, display_progress=False),
)

# Test the optimized task on new input
response = optimized_task(request="User loves architecture; Destination: Barcelona")
print(f"Optimized Recommendations: {response.activities}")

Why Use DSPy?

  1. Ease of use:
     • Declarative programming simplifies complex workflows.
     • Modular design allows rapid iteration.
  2. Scalability:
     • Automates prompt optimization across zero-shot, few-shot, and multi-shot workflows.
     • Tracks performance metrics with MLflow integration.
  3. Flexibility:
     • Works with OpenAI APIs as well as locally hosted models (e.g., from Hugging Face).
  4. Self-improving systems:
     • Feedback loops refine prompts over time using evaluation metrics.
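On the flexibility point: switching DSPy from OpenAI to a locally served model is a one-line configuration change. The provider string and port below are illustrative values (DSPy routes model names through LiteLLM), not requirements:

```python
import dspy

# Illustrative: point DSPy at a local model served by Ollama instead of OpenAI.
local_lm = dspy.LM("ollama_chat/llama3", api_base="http://localhost:11434")
dspy.configure(lm=local_lm)
```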

Conclusion

DSPy transforms prompt engineering from manual trial-and-error into a structured programming process. Whether you’re just starting out with OpenAI APIs or building advanced LLM-powered applications, DSPy provides tools to automate workflows efficiently.

By implementing zero-shot summarization, few-shot recommendations, and multi-shot itinerary generation in this tutorial, you’ve seen how DSPy simplifies LLM-powered development while enhancing scalability. Try it out today to take your generative AI journey to new heights!


References

  1. DSPy GitHub Repository: https://github.com/stanfordnlp/dspy
  2. Stanford Natural Language Processing Group: https://nlp.stanford.edu/
  3. OpenAI API Documentation: https://beta.openai.com/docs/api-reference
  4. MLflow Documentation: https://mlflow.org/docs/latest/index.html
