Crafting Structured {JSON} Responses: Ensuring Consistent Output from any LLM 🦙🤖

Rishab Dugar

Cover image generated by DALL·E 3

Large Language Models (LLMs) are revolutionizing how we interact with data, but getting these models to generate well-formatted, usable JSON responses consistently can feel like herding digital cats. You ask for structured data and get a jumbled mess interspersed with friendly commentary. Frustrating, right?
A reliable JSON output is crucial, whether you're categorizing customer feedback, extracting structured data from unstructured text, or automating data pipelines. This article aims to provide a comprehensive, generalized approach to ensure you get perfectly formatted JSON from any LLM, every time.

The Problem

Imagined with Meta AI
LLMs are trained on massive text datasets, making them adept at generating human-like text. However, this strength becomes a weakness when seeking precise, structured output such as JSON or a Python dictionary.
Common issues include:

  • Inconsistent Formatting: Random spaces, line breaks, and inconsistent quoting can break JSON parsers.
  • Extraneous Text: LLMs often add conversational fluff before or after the JSON, making extraction difficult (see the short sketch after this list).
  • Hallucinations: LLMs might invent data points or misinterpret instructions, leading to invalid or inaccurate JSON.
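
To make the "extraneous text" failure concrete, here is a minimal sketch; the chatty response string below is invented purely for illustration:

import json

# A typical "helpful" LLM response: valid JSON wrapped in conversational fluff.
llm_response = """Sure! Here is the data you asked for:
{"model": "Toyota Corolla", "color": "Silver"}
Let me know if you need anything else!"""

try:
    data = json.loads(llm_response)
except json.JSONDecodeError as e:
    # Fails immediately: the parser chokes on "Sure!" before ever reaching the JSON.
    print(f"Parsing failed: {e}")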

These issues can disrupt downstream processes and lead to significant inefficiencies. Let's explore some proven techniques to overcome these challenges.

The Solution: A Multi-Layered Approach

1. Guiding the LLM with Clear Instructions

  • Explicitly Request JSON: Clearly state that you expect the output in JSON format. Mentioning the intended use of the JSON output in the prompt can significantly improve its validity, and explicit instructions in the system prompt to return a structured response also help.
json_prompt = """Ensure the output is valid JSON as it will be parsed 
                 using `json.loads()` in Python. 
                 It should be in the schema: 
                <output>
                {
                "cars": [
                    {
                    "model": "<model_name1>",
                    "color": "<color1>"
                    },
                    {
                    "model": "<model_name2>",
                    "color": "<color2>"
                    },
                    {
                    "model": "<model_name3>",
                    "color": "<color3>"
                    },
                    {
                    "model": "<model_name4>",
                    "color": "<color4>"
                    },
                    {
                    "model": "<model_name5>",
                    "color": "<color5>"
                    }
                ]
                }
                </output>
                """
# Define the system prompt
system_prompt = "You are an AI language model that provides structured JSON outputs."
  • Provide a JSON Schema: Define the exact structure of the desired JSON, including keys and data types.
  • Use Examples: Show the LLM examples of correctly formatted JSON output for your specific use case, as in the short sketch after this list.
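
A minimal few-shot block appended to a prompt might look like the sketch below; the sentiment-classification task and field names here are invented purely for illustration:

few_shot_example = """Classify the sentiment of the review and respond with JSON only.

Example input: "The battery lasts all day, love it!"
Example output: {"sentiment": "positive", "confidence": "high"}

Now classify: "The screen cracked within a week."
"""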

As suggested in the Anthropic documentation, another effective method is to guide the LLM by pre-filling the assistant's response with the beginning of the JSON structure. This technique leverages the model's ability to continue from a given starting point.

Example:

import boto3
import json
from botocore.exceptions import ClientError
from dotenv import load_dotenv
import os

load_dotenv()

# AWS Bedrock setup: build a session with credentials and region from environment variables.
session = boto3.Session(
    region_name=os.getenv("AWS_DEFAULT_REGION"),
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

# Create a Bedrock Runtime client from the session.
client = session.client(service_name="bedrock-runtime")

# Set the model ID for Claude.
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# Define the JSON schema and example prefill response with stop sequences.
output_start = """<output>\n{\n"cars":"""
closing_bracket = "]\n}\n</output>"
json_prompt = """Ensure the output is valid JSON as it will be parsed 
                 using `json.loads()` in Python. 
                 It should be in the schema: 
                <output>
                {
                "cars": [
                    {
                    "model": "<model_name1>",
                    "color": "<color1>"
                    },
                    {
                    "model": "<model_name2>",
                    "color": "<color2>"
                    },
                    {
                    "model": "<model_name3>",
                    "color": "<color3>"
                    },
                    {
                    "model": "<model_name4>",
                    "color": "<color4>"
                    },
                    {
                    "model": "<model_name5>",
                    "color": "<color5>"
                    }
                ]
                }
                </output>
                """

# Define the prompt for the model.
prompt = f"""Provide an example of 5 cars with their color and models in JSON format enclosed in <output></output> XML tags.
            {json_prompt}"""

# Prefilled part of the response.
prefilled_response = output_start

# Define the system prompt.
system_prompt = "You are an AI language model that provides structured JSON outputs."

# Format the request payload using the model's native structure.
native_request = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "temperature": 0.01,
    "stop_sequences": ["\n\nHuman:", closing_bracket],
    "system": f"<system>{system_prompt}</system>",
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": prefilled_response}]
        }
    ],
}

# Convert the native request to JSON.
request = json.dumps(native_request)

try:
    # Invoke the model with the request.
    response = client.invoke_model(modelId=model_id, body=request)

    # Decode the response body.
    model_response = json.loads(response["body"].read())

    # Extract and print the response text.
    completion = model_response["content"][0]["text"]
    final_result = prefilled_response + completion + closing_bracket

    print(final_result)

except ClientError as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)

Output:

<output>
{
"cars":
[
    {
    "model": "Toyota Corolla",
    "color": "Silver"
    },
    {
    "model": "Honda Civic",
    "color": "Blue"
    },
    {
    "model": "Ford Mustang",
    "color": "Red"
    },
    {
    "model": "Chevrolet Camaro",
    "color": "Black"
    },
    {
    "model": "Nissan Altima",
    "color": "White"
    }
]
}
</output>

The salient features of this method are:

  • Prefilling the Response: "Put words in the LLM's mouth" by starting the assistant's response with the opening brace { or another leading sequence, as we did above with <output>\n{\n"cars":. This nudges the model to continue in the expected format.
  • Strategic Stop Sequences: Define stop sequences (such as } or a specific closing marker like ]\n}\n</output>) to prevent the LLM from appending extraneous text after the JSON.
  • Leveraging Tags for Complex Outputs: For multiple JSON objects, ask for the output to be enclosed in unique tags (e.g., <output>...</output> XML tags). This allows easy extraction using regular expressions.

Extracting the JSON Response Between XML Tags

When working with APIs or systems that return responses wrapped in XML tags, it becomes crucial to extract and utilize the JSON data embedded within those tags. Below, we'll explore methods to extract JSON data from XML tags both with and without the use of regular expressions (regex), followed by saving the extracted data to a JSON file.

Using Regular Expressions (Regex)

Regex can be a powerful tool for pattern matching and extraction. In this case, we can use regex to locate the JSON content within the specified XML tags.

import json
import re

def extract_json_with_regex(response: str):
    pattern = r"<output>(.*?)</output>"
    # Search for the pattern <output>...</output>
    match = re.search(pattern, response, re.DOTALL)

    if match:
        # Extract the content between the tags
        json_str = match.group(1).strip()
        try:
            # Parse the string to a JSON object
            json_data = json.loads(json_str)
            return json_data
        except json.JSONDecodeError:
            # Return None if JSON parsing fails
            return None
    # Return None if no match is found
    return None

In this function, re.search() is used to find the first occurrence of the pattern <output>...</output> in the response. If found, it extracts the content between these tags and attempts to parse it as JSON. If parsing fails, it returns None.
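
For instance, calling it on the tagged output produced by the earlier script (assuming that script has run, so `final_result` holds the <output>...</output> string) might look like:

raw_response = final_result  # the tagged string printed by the first example

cars_data = extract_json_with_regex(raw_response)
if cars_data:
    # The parsed result is a regular Python dict.
    print(f"Extracted {len(cars_data['cars'])} cars; first model: {cars_data['cars'][0]['model']}")
else:
    print("No valid JSON found between <output> tags.")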

Without Using Regular Expressions

For scenarios where you prefer not to use regex, a more manual approach can be employed to achieve the same goal.

import json

def extract_json_without_regex(response: str):
    start_tag = "<output>"
    end_tag = "</output>"
    # Find the start and end indices of the tags
    start_index = response.find(start_tag)
    end_index = response.find(end_tag)

    if start_index != -1 and end_index != -1:
        # Adjust start index to get the content after the start tag
        start_index += len(start_tag)
        # Extract the content between the tags
        json_str = response[start_index:end_index].strip()
        try:
            # Parse the string to a JSON object
            json_data = json.loads(json_str)
            return json_data
        except json.JSONDecodeError:
            # Return None if JSON parsing fails
            return None
    # Return None if tags are not found
    return None

This function locates the starting and ending positions of the <output>...</output> tags manually, extracts the content between them, and attempts to parse it as JSON. Like the regex approach, it returns None if parsing fails or the tags are not found.

Saving Extracted JSON to a File

After extracting the JSON data, the next step is to save it to a file for further processing or record-keeping. The function below handles this task.

def save_json_to_file(json_data, file_name='output.json'):
    with open(file_name, 'w') as json_file:
        # Save the JSON data to the specified file with indentation for readability
        json.dump(json_data, json_file, indent=4)
        print(f"JSON data saved to {json_file.name}")

This utility function opens a file in write mode and uses json.dump() to write the JSON data to it, ensuring the output is formatted with an indentation of 4 spaces for better readability.
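
Putting the pieces together, a short end-to-end sketch (assuming `final_result` still holds the tagged model output from the first example) could be:

json_data = extract_json_without_regex(final_result)
if json_data is not None:
    save_json_to_file(json_data, file_name='output.json')
else:
    print("Extraction failed; nothing was saved.")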

Final JSON result (output.json):

{
    "cars": [
        {
            "model": "Toyota Corolla",
            "color": "Silver"
        },
        {
            "model": "Honda Civic",
            "color": "Blue"
        },
        {
            "model": "Ford Mustang",
            "color": "Red"
        },
        {
            "model": "Chevrolet Camaro",
            "color": "Black"
        },
        {
            "model": "Nissan Altima",
            "color": "White"
        }
    ]
}

2. Validating and Repairing JSON Response

Despite employing the earlier techniques, minor syntax errors can occasionally disrupt the JSON structure. These errors can be fixed with two simple methods:

  • Requesting the LLM to Correct the JSON: Feed the malformed JSON back to the LLM and prompt it to correct the errors.
  • Utilizing JSON Repair Tools: Tools like [json_repair](https://github.com/mangiucugna/json_repair) or [half-json](https://github.com/half-pie/half-json) can correct these errors quickly.

The second method is generally more economical, faster, and reliable for straightforward cleanup tasks. In contrast, the first method may be more effective for addressing complex issues, albeit at the cost of additional time and an extra LLM call.

Example (using json-repair):

pip install json-repair

from json_repair import repair_json

cleaned_final_result = repair_json(final_result)

You can also use this library to completely replace json.loads():

import json_repair

decoded_object = json_repair.loads(json_string)
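
In practice, you may want to attempt strict parsing first and fall back to repair only when needed. Here is a minimal sketch of that pattern; the helper name `parse_llm_json` is ours, not part of the library:

import json
from json_repair import repair_json

def parse_llm_json(raw: str):
    try:
        # Fast path: the output is already valid JSON.
        return json.loads(raw)
    except json.JSONDecodeError:
        # Slow path: repair the string, then parse the corrected version.
        return json.loads(repair_json(raw))

# The closing brace is missing; repair_json adds it before parsing.
print(parse_llm_json('{"model": "Honda Civic", "color": "Blue"'))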

Example (asking the LLM to fix broken JSON):

import boto3
import json
from botocore.exceptions import ClientError
from dotenv import load_dotenv
import os

# Load environment variables from a .env file
load_dotenv()

# AWS Bedrock setup with credentials and region from environment variables
session = boto3.Session(
    region_name=os.getenv("AWS_DEFAULT_REGION"),
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

# Create a Bedrock Runtime client from the session
client = session.client(service_name="bedrock-runtime")

# Set the model ID for Claude
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# Define the prefill response with stop sequences.
output_start = "{"
closing_bracket = "\n}"

# Example of broken/invalid JSON
json_prompt = """
{
    "cars": [
        {
            "model": Toyota Corolla, # Missing quotes around the value
            "color": "Silver"
        },
        {
            "model": "Honda Civic",
            "color": "Blue          # Missing closing quote
        },
        {
            "model": "Ford Mustang",
            "color": "Red"
        },
        {
            model: Chevrolet Camaro, # Missing quotes around the key and value
            "color": 'Black"         # Mixed quotes, opening with ' and closing with "
        ,                            # Missing closing brace for the object
        {
            "model": "Nissan Altima",
            "color": "White          # Missing closing quote and closing brace for the object
        }
    ]                           
}    
"""

# Define the prompt for the model
prompt = f"""Fix the JSON below:\n{json_prompt}"""

# Prefilled part of the response
prefilled_response = output_start

# Generic System prompt for JSON Repairing via LLM.
system_prompt = """

### Instruction

Your task is to act as an expert JSON fixer and repairer. You are responsible for correcting any broken JSON and ensuring there are no syntax errors. The resulting JSON should be validated and easily parsed using `json.loads()` in Python.

### Context

JSON is built on two primary structures:
1. A collection of name/value pairs, realized in various languages as an object, record, struct, dictionary, hash table, keyed list, or associative array.
2. An ordered list of values, realized in most languages as an array, vector, list, or sequence.

These structures are supported by virtually all modern programming languages, making JSON a widely used data interchange format.

In JSON, the structures take the following forms:
- An **object** is an unordered set of name/value pairs. An object begins with a `{` (left brace) and ends with a `}` (right brace). Each name is followed by a `:` (colon) and the name/value pairs are separated by `,` (comma).
- An **array** is an ordered collection of values. An array begins with a `[` (left bracket) and ends with a `]` (right bracket). Values are separated by `,` (comma).

### Requirements
1. Repair only the JSON structure without changing or modifying any data or values of the keys.
2. Ensure that the data is accurately represented and properly formatted within the JSON structure.
3. The resulting JSON should be validated and able to be parsed using `json.loads()` in Python.

### Example

#### Broken JSON
{
    "name": "John Doe",
    "age": 30,
    "isStudent": false
    "courses": ["Math", "Science"]
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zipcode": "12345"
    }

#### Fixed JSON

{
    "name": "John Doe",
    "age": 30,
    "isStudent": false,
    "courses": ["Math", "Science"],
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zipcode": "12345"
    }
}

### Notes
- Pay close attention to missing commas, unmatched braces or brackets, and any other structural issues.
- Maintain the integrity of the data without making assumptions or altering the content.
- Ensure the output is clean, precise, and ready for parsing in Python.
"""

# Format the request payload using the model's native structure
native_request = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "temperature": 0.01,
    "stop_sequences": ["\n\nHuman:", closing_bracket],
    "system": f"<system>{system_prompt}</system>",
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": prompt}],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": prefilled_response}]
        }
    ],
}

# Convert the native request to JSON
request = json.dumps(native_request)

try:
    # Invoke the model with the request
    response = client.invoke_model(modelId=model_id, body=request)

    # Decode the response body
    model_response = json.loads(response["body"].read())

    # Extract and print the response text
    completion = model_response["content"][0]["text"]
    final_result = prefilled_response + completion + closing_bracket

    print(final_result)

except ClientError as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)


Output (as JSON):

{
    "cars": [
        {
            "model": "Toyota Corolla",
            "color": "Silver"
        },
        {
            "model": "Honda Civic",
            "color": "Blue"
        },
        {
            "model": "Ford Mustang",
            "color": "Red"
        },
        {
            "model": "Chevrolet Camaro",
            "color": "Black"
        },
        {
            "model": "Nissan Altima",
            "color": "White"
        }
    ]
}

Balanced Perspective

While these techniques can significantly improve the consistency of JSON output from LLMs, they are not foolproof. Potential challenges include:

  • Increased complexity in prompt design
  • Additional computational overhead for post-processing
  • Dependency on external libraries for validation

Moreover, ethical considerations such as data privacy and model biases should always be taken into account when deploying LLMs in production environments.

Actionable Insights

  1. Start with a Clear JSON Template: Define the JSON structure and use it as a guide for the LLM with few-shot prompting examples.
  2. Leverage Post-Processing Tools: Use tools like [json_repair](https://github.com/mangiucugna/json_repair) to correct minor syntax errors in the JSON output.
  3. Iterate and Improve: Continuously refine your prompts and validation rules based on the output and feedback.

By following these steps, we can ensure that our LLM consistently generates well-formatted JSON, making our AI-driven applications more reliable and efficient.

Conclusion

Generating perfectly formatted JSON from LLMs is a common yet challenging task. By guiding the JSON syntax, communicating its intended usage, and using repair tools like json_repair, we can significantly improve the consistency and reliability of the output. By combining clear instructions, strategic prompting, and robust validation, we can transform our LLM interactions from a gamble into a reliable pipeline for structured data.
That's all for the day, folks! Stay informed, iterate, and refine your approach to master the art of JSON generation from any LLM.
