Shawn Wang

How to Use ComfyUI API with Python: A Complete Guide

ComfyUI is an open-source, node-based application for creating images, videos, and audio with GenAI. While the graphical interface is user-friendly, programmatic access via its API enables automation and integration into your own applications. This guide walks you through two approaches to interacting with the ComfyUI API using Python.

Prerequisites

  • Python 3.x
  • websocket-client library (pip install websocket-client)
  • Pillow library (pip install pillow) for decoding the returned images
  • A running ComfyUI instance
    • For local deployment: use 127.0.0.1:8188
    • For remote deployment: use your server's IP address, e.g. 192.168.1.100:8188
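
All of the snippets below share the same imports and a server_address variable pointing at your ComfyUI instance (urllib.error is included here for the error-handling example later; adjust the address to match your deployment):

import io
import json
import urllib.error
import urllib.parse
import urllib.request
import uuid

import websocket  # from the websocket-client package
from PIL import Image  # used to decode the returned image bytes

server_address = "127.0.0.1:8188"  # or e.g. "192.168.1.100:8188" for a remote server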

Method 1: Basic API with Image Saving

This method is used when your ComfyUI workflow contains SaveImage nodes, which save generated images to disk on the server. Your client then retrieves the saved images through the server's HTTP endpoints (/history and /view).

Key Steps

1. Prepare the Workflow Prompt

The workflow prompt is a JSON structure that defines your entire generation pipeline, including all nodes (model loading, sampling, encoding, and so on) and their connections. You can export it from the ComfyUI interface after creating your desired workflow: enable the dev mode options in the settings, then use the "Save (API Format)" button, since the regular workflow save format is not what the API expects.

Workflow Export

prompt_text = """
{
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "cfg": 8,
            "denoise": 1,
            "seed": 8566257,
            "steps": 20,
            "sampler_name": "euler",
            "scheduler": "normal",
            "latent_image": ["5", 0],
            "model": ["4", 0],
            "positive": ["6", 0],
            "negative": ["7", 0]
        }
    },
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {
            "ckpt_name": "v1-5-pruned-emaonly.safetensors"
        }
    },
    # ... other nodes configuration
}
"""
prompt = json.loads(prompt_text)
Note that this snippet is abridged and the comment is only a placeholder: JSON does not allow comments, so json.loads will only succeed once the string contains the complete, valid workflow JSON.

2. Customize the Prompt

Before execution, you can modify various parameters in the prompt to customize the generation. Common modifications include changing the text prompt, seed, or sampling parameters.

# Modify the text prompt for the positive CLIPTextEncode node
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"

# Change the seed for different results
prompt["3"]["inputs"]["seed"] = 5

3. Set Up WebSocket Connection

ComfyUI uses WebSocket to provide real-time updates about the generation process. This connection allows you to monitor the execution status and receive preview images during generation.

client_id = str(uuid.uuid4())  # Generate a unique client ID
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")

4. Queue the Prompt

Submit the generation request to ComfyUI's queue. Each request receives a unique prompt_id that we'll use to track its execution and retrieve results.

def queue_prompt(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']
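
If the server rejects a prompt (for example, an invalid node input or a missing model file), urllib.request.urlopen raises an HTTPError whose body contains ComfyUI's validation message. Here is a sketch of a more defensive variant; the try/except wrapper is an addition of this guide, not part of the official example:

def queue_prompt_safe(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    try:
        return json.loads(urllib.request.urlopen(req).read())
    except urllib.error.HTTPError as e:
        # The response body typically describes which node or input failed validation
        print("Queueing failed:", e.code, e.read().decode('utf-8', errors='replace'))
        raise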

5. Monitor Execution Status

Listen to WebSocket messages to track the generation progress. The server sends updates about which node is currently executing and when the entire process is complete. You can also receive preview images during generation.

while True:
    out = ws.recv()
    if isinstance(out, str):
        message = json.loads(out)
        if message['type'] == 'executing':
            data = message['data']
            if data['node'] is None and data['prompt_id'] == prompt_id:
                break  # Execution complete
    else:
        # Binary data (preview images)
        continue
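
If you want to use the live previews instead of discarding them, note that each binary frame carries an 8-byte header ahead of the encoded preview image, the same layout Method 2 strips below. A minimal sketch of an alternative else branch, assuming the payload after the header is a standard JPEG/PNG preview:

    else:
        # Binary frame: 8-byte header, then the encoded preview image
        preview = Image.open(io.BytesIO(out[8:]))
        print("preview size:", preview.size)  # or display it in your UI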

6. Get History and Retrieve Images

Once execution is complete, we need to:

  1. Fetch the execution history to get information about generated images
  2. Use that information to retrieve the actual image data through the view endpoint

def get_history(prompt_id):
    with urllib.request.urlopen(f"http://{server_address}/history/{prompt_id}") as response:
        return json.loads(response.read())

def get_image(filename, subfolder, folder_type):
    data = {"filename": filename, "subfolder": subfolder, "type": folder_type}
    url_values = urllib.parse.urlencode(data)
    with urllib.request.urlopen(f"http://{server_address}/view?{url_values}") as response:
        return response.read()

# Get history for the executed prompt
history = get_history(prompt_id)[prompt_id]

# Since a ComfyUI workflow may contain multiple SaveImage nodes,
# and each SaveImage node might save multiple images,
# we need to iterate through all outputs to collect all generated images
output_images = {}
for node_id in history['outputs']:
    node_output = history['outputs'][node_id]
    images_output = []
    if 'images' in node_output:
        for image in node_output['images']:
            image_data = get_image(image['filename'], image['subfolder'], image['type'])
            images_output.append(image_data)
    output_images[node_id] = images_output

7. Process Images and Clean Up

Finally, process the retrieved images as needed (save to disk, display, or further processing) and clean up resources by closing the WebSocket connection.

# Process the generated images
for node_id in output_images:
    for image_data in output_images[node_id]:
        # Convert bytes to PIL Image
        image = Image.open(io.BytesIO(image_data))
        # Process image as needed
        # image.save(f"output_{node_id}.png")

# Always close the WebSocket connection
ws.close()

Method 2: WebSocket-Based Image Transfer

This method is used when your ComfyUI workflow contains SaveImageWebsocket nodes, which stream generated images directly through the WebSocket connection without saving to disk. This is more efficient for real-time applications.

Key Steps

1. Prepare and Customize Prompt

Similar to Method 1, but with a SaveImageWebsocket node in place of SaveImage:

prompt_text = """
{
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "cfg": 8,
            "denoise": 1,
            "seed": 8566257,
            "steps": 20,
            "sampler_name": "euler",
            "scheduler": "normal",
            "latent_image": ["5", 0],
            "model": ["4", 0],
            "positive": ["6", 0],
            "negative": ["7", 0]
        }
    },
    # ... other nodes remain the same ...
    "save_image_websocket_node": {
        "class_type": "SaveImageWebsocket",
        "inputs": {
            "images": ["8", 0]
        }
    }
}
"""
prompt = json.loads(prompt_text)

# Customize the prompt
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"
prompt["3"]["inputs"]["seed"] = 5

2. Set Up WebSocket Connection

client_id = str(uuid.uuid4())
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")

3. Queue the Prompt

def queue_prompt(prompt):
    p = {"prompt": prompt, "client_id": client_id}
    data = json.dumps(p).encode('utf-8')
    req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']

4. Monitor Execution Status

As in Method 1, we monitor WebSocket messages to track execution progress, but here we also record which node is currently executing so that binary frames can be attributed correctly: once save_image_websocket_node is running, any binary data received is image data, collected directly from the WebSocket stream.

current_node = ""
output_images = {}

while True:
    out = ws.recv()
    if isinstance(out, str):
        message = json.loads(out)
        if message['type'] == 'executing':
            data = message['data']
            if data['prompt_id'] == prompt_id:
                if data['node'] is None:
                    break  # Execution complete
                else:
                    current_node = data['node']
    else:
        # Handle binary image data from SaveImageWebsocket node
        if current_node == 'save_image_websocket_node':
            images_output = output_images.get(current_node, [])
            images_output.append(out[8:])  # Strip the 8-byte binary header preceding the image bytes
            output_images[current_node] = images_output

5. Process Images and Clean Up

Once all images are collected, we can process them as needed:

# Process the images
for node_id in output_images:
    for image_data in output_images[node_id]:
        # Convert binary data to PIL Image
        image = Image.open(io.BytesIO(image_data))
        # Process image as needed
        # image.show()

# Clean up
ws.close()

Complete Example Code

For complete working examples of both methods, please refer to the script_examples directory in the official ComfyUI repository.
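
To tie the pieces together, here is Method 1 wrapped into a single helper, a minimal sketch that assumes the imports, client_id, and the queue_prompt/get_history/get_image functions defined above:

def generate_images(prompt):
    """Queue a prompt, wait for it to finish, and return images keyed by node ID."""
    ws = websocket.WebSocket()
    ws.connect(f"ws://{server_address}/ws?clientId={client_id}")
    try:
        prompt_id = queue_prompt(prompt)['prompt_id']
        # Block until the server reports that this prompt has finished executing
        while True:
            out = ws.recv()
            if isinstance(out, str):
                message = json.loads(out)
                if message['type'] == 'executing':
                    data = message['data']
                    if data['node'] is None and data['prompt_id'] == prompt_id:
                        break
        # Collect every image produced by every SaveImage node
        history = get_history(prompt_id)[prompt_id]
        output_images = {}
        for node_id, node_output in history['outputs'].items():
            if 'images' in node_output:
                output_images[node_id] = [
                    get_image(img['filename'], img['subfolder'], img['type'])
                    for img in node_output['images']
                ]
        return output_images
    finally:
        ws.close()

# Usage:
# images = generate_images(prompt)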

💡 Looking for AI Image Inspiration?

Explore VisionGeni AI: a completely free, no-signup gallery of Stable Diffusion 3.5 & Flux images with prompts. Try our Flux prompt generator instantly to spark your creativity.

Choosing Between Methods

  • Use Method 1 (Basic API) when:

    • You need to persist images to disk
    • You want simpler error recovery
    • Network stability is a concern
  • Use Method 2 (WebSocket) when:

    • You need real-time image processing
    • You want to avoid disk I/O
    • You're building an interactive application
    • Performance is critical
