Shawn Wang

How to Use ComfyUI API with Python: A Complete Guide

ComfyUI is an open-source, node-based application for creating images, videos, and audio with GenAI. While the graphical interface is user-friendly, programmatic access via its API enables automation and integration into your own applications. This guide walks you through two approaches to interacting with the ComfyUI API using Python.

Prerequisites

  • Python 3.x
  • websocket-client library (pip install websocket-client)
  • Pillow library (pip install pillow) for decoding the returned images
  • A running ComfyUI instance
    • For local deployment: use 127.0.0.1:8188
    • For remote deployment: use your server's IP address, e.g. 192.168.1.100:8188
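
All of the snippets below share the same imports and a server_address variable pointing at your ComfyUI instance (urllib.error is included here for the error-handling example later; adjust the address to match your deployment):

import io
import json
import urllib.error
import urllib.parse
import urllib.request
import uuid

import websocket  # from the websocket-client package
from PIL import Image  # used to decode the returned image bytes

server_address = "127.0.0.1:8188"  # or e.g. "192.168.1.100:8188" for a remote server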

Method 1: Basic API with Image Saving

This method is used when your ComfyUI workflow contains SaveImage nodes, which save generated images to disk on the server. Your client then retrieves the saved images through the server's HTTP endpoints (/history and /view).

Key Steps

1. Prepare the Workflow Prompt

The workflow prompt is a JSON structure that defines your entire generation pipeline, including all nodes (model loading, sampling, encoding, and so on) and their connections. You can export it from the ComfyUI interface after creating your desired workflow: enable the dev mode options in the settings, then use the "Save (API Format)" button, since the regular workflow save format is not what the API expects.

Workflow Export

prompt_text = """
{
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "cfg": 8,
            "denoise": 1,
            "seed": 8566257,
            "steps": 20,
            "sampler_name": "euler",
            "scheduler": "normal",
            "latent_image": ["5", 0],
            "model": ["4", 0],
            "positive": ["6", 0],
            "negative": ["7", 0]
        }
    },
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {
            "ckpt_name": "v1-5-pruned-emaonly.safetensors"
        }
    },
    # ... other nodes configuration
}
"""
prompt = json.loads(prompt_text)
Note that this snippet is abridged and the comment is only a placeholder: JSON does not allow comments, so json.loads will only succeed once the string contains the complete, valid workflow JSON.

2. Customize the Prompt

Before execution, you can modify various parameters in the prompt to customize the generation. Common modifications include changing the text prompt, seed, or sampling parameters.

# Modify the text prompt for the positive CLIPTextEncode node
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"

# Change the seed for different results
prompt["3"]["inputs"]["seed"] = 5

3. Set Up WebSocket Connection

ComfyUI uses WebSocket to provide real-time updates about the generation process. This connection allows you to monitor the execution status and receive preview images during generation.

client_id = str(uuid.uuid4())  # Generate a unique client ID
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")

4. Queue the Prompt

Submit the generation request to ComfyUI's queue. Each request receives a unique prompt_id that we'll use to track its execution and retrieve results.

def queue_prompt(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']
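
If the server rejects a prompt (for example, an invalid node input or a missing model file), urllib.request.urlopen raises an HTTPError whose body contains ComfyUI's validation message. Here is a sketch of a more defensive variant; the try/except wrapper is an addition of this guide, not part of the official example:

def queue_prompt_safe(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    try:
        return json.loads(urllib.request.urlopen(req).read())
    except urllib.error.HTTPError as e:
        # The response body typically describes which node or input failed validation
        print("Queueing failed:", e.code, e.read().decode('utf-8', errors='replace'))
        raise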

5. Monitor Execution Status

Listen to WebSocket messages to track the generation progress. The server sends updates about which node is currently executing and when the entire process is complete. You can also receive preview images during generation.

while True:
    out = ws.recv()
    if isinstance(out, str):
        message = json.loads(out)
        if message['type'] == 'executing':
            data = message['data']
            if data['node'] is None and data['prompt_id'] == prompt_id:
                break  # Execution complete
    else:
        # Binary data (preview images)
        continue
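
If you want to use the live previews instead of discarding them, note that each binary frame carries an 8-byte header ahead of the encoded preview image, the same layout Method 2 strips below. A minimal sketch of an alternative else branch, assuming the payload after the header is a standard JPEG/PNG preview:

    else:
        # Binary frame: 8-byte header, then the encoded preview image
        preview = Image.open(io.BytesIO(out[8:]))
        print("preview size:", preview.size)  # or display it in your UI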

6. Get History and Retrieve Images

Once execution is complete, we need to:

  1. Fetch the execution history to get information about generated images
  2. Use that information to retrieve the actual image data through the view endpoint

def get_history(prompt_id):
    with urllib.request.urlopen(f"http://{server_address}/history/{prompt_id}") as response:
        return json.loads(response.read())

def get_image(filename, subfolder, folder_type):
    data = {"filename": filename, "subfolder": subfolder, "type": folder_type}
    url_values = urllib.parse.urlencode(data)
    with urllib.request.urlopen(f"http://{server_address}/view?{url_values}") as response:
        return response.read()

# Get history for the executed prompt
history = get_history(prompt_id)[prompt_id]

# Since a ComfyUI workflow may contain multiple SaveImage nodes,
# and each SaveImage node might save multiple images,
# we need to iterate through all outputs to collect all generated images
output_images = {}
for node_id in history['outputs']:
    node_output = history['outputs'][node_id]
    images_output = []
    if 'images' in node_output:
        for image in node_output['images']:
            image_data = get_image(image['filename'], image['subfolder'], image['type'])
            images_output.append(image_data)
    output_images[node_id] = images_output

7. Process Images and Clean Up

Finally, process the retrieved images as needed (save to disk, display, or further processing) and clean up resources by closing the WebSocket connection.

# Process the generated images
for node_id in output_images:
    for image_data in output_images[node_id]:
        # Convert bytes to PIL Image
        image = Image.open(io.BytesIO(image_data))
        # Process image as needed
        # image.save(f"output_{node_id}.png")

# Always close the WebSocket connection
ws.close()

Method 2: WebSocket-Based Image Transfer

This method is used when your ComfyUI workflow contains SaveImageWebsocket nodes, which stream generated images directly through the WebSocket connection without saving to disk. This is more efficient for real-time applications.

Key Steps

1. Prepare and Customize Prompt

Similar to Method 1, but with a SaveImageWebsocket node in place of SaveImage:

prompt_text = """
{
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "cfg": 8,
            "denoise": 1,
            "seed": 8566257,
            "steps": 20,
            "sampler_name": "euler",
            "scheduler": "normal",
            "latent_image": ["5", 0],
            "model": ["4", 0],
            "positive": ["6", 0],
            "negative": ["7", 0]
        }
    },
    # ... other nodes remain the same ...
    "save_image_websocket_node": {
        "class_type": "SaveImageWebsocket",
        "inputs": {
            "images": ["8", 0]
        }
    }
}
"""
prompt = json.loads(prompt_text)

# Customize the prompt
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"
prompt["3"]["inputs"]["seed"] = 5

2. Set Up WebSocket Connection

client_id = str(uuid.uuid4())
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")

3. Queue the Prompt

def queue_prompt(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']

4. Monitor Execution Status

As in Method 1, we monitor WebSocket messages to track execution progress, but here we also record which node is currently executing so that binary frames can be attributed correctly: once save_image_websocket_node is running, any binary data received is image data, collected directly from the WebSocket stream.

current_node = ""
output_images = {}

while True:
    out = ws.recv()
    if isinstance(out, str):
        message = json.loads(out)
        if message['type'] == 'executing':
            data = message['data']
            if data['prompt_id'] == prompt_id:
                if data['node'] is None:
                    break  # Execution complete
                else:
                    current_node = data['node']
    else:
        # Handle binary image data from SaveImageWebsocket node
        if current_node == 'save_image_websocket_node':
            images_output = output_images.get(current_node, [])
            images_output.append(out[8:])  # Strip the 8-byte binary header preceding the image bytes
            output_images[current_node] = images_output

5. Process Images and Clean Up

Once all images are collected, we can process them as needed:

# Process the images
for node_id in output_images:
    for image_data in output_images[node_id]:
        # Convert binary data to PIL Image
        image = Image.open(io.BytesIO(image_data))
        # Process image as needed
        # image.show()

# Clean up
ws.close()

Complete Example Code

For complete working examples of both methods, please refer to the script_examples directory in the official ComfyUI repository.
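
To tie the pieces together, here is Method 1 wrapped into a single helper, a minimal sketch that assumes the imports, client_id, and the queue_prompt/get_history/get_image functions defined above:

def generate_images(prompt):
    """Queue a prompt, wait for it to finish, and return images keyed by node ID."""
    ws = websocket.WebSocket()
    ws.connect(f"ws://{server_address}/ws?clientId={client_id}")
    try:
        prompt_id = queue_prompt(prompt)['prompt_id']
        # Block until the server reports that this prompt has finished executing
        while True:
            out = ws.recv()
            if isinstance(out, str):
                message = json.loads(out)
                if message['type'] == 'executing':
                    data = message['data']
                    if data['node'] is None and data['prompt_id'] == prompt_id:
                        break
        # Collect every image produced by every SaveImage node
        history = get_history(prompt_id)[prompt_id]
        output_images = {}
        for node_id, node_output in history['outputs'].items():
            if 'images' in node_output:
                output_images[node_id] = [
                    get_image(img['filename'], img['subfolder'], img['type'])
                    for img in node_output['images']
                ]
        return output_images
    finally:
        ws.close()

# Usage:
# images = generate_images(prompt)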

💡 Looking for AI Image Inspiration?

Explore VisionGeni AI: a completely free, no-signup gallery of Stable Diffusion 3.5 & Flux images with prompts. Try our Flux prompt generator instantly to spark your creativity.

Choosing Between Methods

  • Use Method 1 (Basic API) when:

    • You need to persist images to disk
    • You want simpler error recovery
    • Network stability is a concern
  • Use Method 2 (WebSocket) when:

    • You need real-time image processing
    • You want to avoid disk I/O
    • You're building an interactive application
    • Performance is critical
