ComfyUI is an open-source, node-based application for creating images, videos, and audio with generative AI. While the graphical interface is user-friendly, programmatic access via the API enables automation and integration into your own applications. This guide walks you through two approaches to interacting with the ComfyUI API from Python.
Prerequisites
- Python 3.x
- websocket-client library (pip install websocket-client)
- A running ComfyUI instance
  - For local deployment: use 127.0.0.1:8188
  - For remote deployment: use your server's IP address, e.g. 192.168.1.100:8188
Method 1: Basic API with Image Saving
This method is used when your ComfyUI workflow contains SaveImage nodes, which save generated images to the disk of the machine running ComfyUI. The API then retrieves these saved images through HTTP endpoints.
Key Steps
1. Prepare the Workflow Prompt
The workflow prompt is a JSON structure that defines your entire generation pipeline. You can export it from the ComfyUI interface after creating your desired workflow. It includes all nodes (such as model loading, sampling, and encoding) and their connections. Note that the snippet below is truncated for brevity; json.loads requires the complete workflow JSON with the placeholder comment removed.
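All of the snippets in this guide assume the following imports and a server_address variable (a minimal setup sketch; point server_address at your own deployment, per the Prerequisites):

import io
import json
import uuid
import urllib.request
import urllib.parse
import websocket  # from the websocket-client package
from PIL import Image

server_address = "127.0.0.1:8188"  # Or your remote server's address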
prompt_text = """
{
"3": {
"class_type": "KSampler",
"inputs": {
"cfg": 8,
"denoise": 1,
"seed": 8566257,
"steps": 20,
"sampler_name": "euler",
"scheduler": "normal",
"latent_image": ["5", 0],
"model": ["4", 0],
"positive": ["6", 0],
"negative": ["7", 0]
}
},
"4": {
"class_type": "CheckpointLoaderSimple",
"inputs": {
"ckpt_name": "v1-5-pruned-emaonly.safetensors"
}
},
# ... other nodes configuration
}
"""
prompt = json.loads(prompt_text)
2. Customize the Prompt
Before execution, you can modify various parameters in the prompt to customize the generation. Common modifications include changing the text prompt, seed, or sampling parameters.
# Modify the text prompt for the positive CLIPTextEncode node
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"
# Change the seed for different results
prompt["3"]["inputs"]["seed"] = 5
3. Set Up WebSocket Connection
ComfyUI uses WebSocket to provide real-time updates about the generation process. This connection allows you to monitor the execution status and receive preview images during generation.
client_id = str(uuid.uuid4()) # Generate a unique client ID
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")
4. Queue the Prompt
Submit the generation request to ComfyUI's queue. Each request receives a unique prompt id that we'll use to track its execution and retrieve results.
def queue_prompt(prompt):
p = {"prompt": prompt, "client_id": client_id}
data = json.dumps(p).encode('utf-8')
req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
return json.loads(urllib.request.urlopen(req).read())
# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']
5. Monitor Execution Status
Listen to WebSocket messages to track the generation progress. The server sends updates about which node is currently executing and when the entire process is complete. You can also receive preview images during generation.
while True:
out = ws.recv()
if isinstance(out, str):
message = json.loads(out)
if message['type'] == 'executing':
data = message['data']
if data['node'] is None and data['prompt_id'] == prompt_id:
break # Execution complete
else:
# Binary data (preview images)
continue
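The 'executing' message is not the only type the server sends; for example, a 'progress' message reports per-step sampling progress. A sketch of a more informative loop (the message fields follow the current ComfyUI protocol and may change between versions):

while True:
    out = ws.recv()
    if isinstance(out, str):
        message = json.loads(out)
        if message['type'] == 'progress':
            data = message['data']
            print(f"step {data['value']}/{data['max']}")  # Sampling progress
        elif message['type'] == 'executing':
            data = message['data']
            if data['node'] is None and data['prompt_id'] == prompt_id:
                break  # Execution complete
    else:
        continue  # Binary data (preview images)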
6. Get History and Retrieve Images
Once execution is complete, we need to:
- Fetch the execution history to get information about generated images
- Use that information to retrieve the actual image data through the view endpoint
def get_history(prompt_id):
with urllib.request.urlopen(f"http://{server_address}/history/{prompt_id}") as response:
return json.loads(response.read())
def get_image(filename, subfolder, folder_type):
data = {"filename": filename, "subfolder": subfolder, "type": folder_type}
url_values = urllib.parse.urlencode(data)
with urllib.request.urlopen(f"http://{server_address}/view?{url_values}") as response:
return response.read()
# Get history for the executed prompt
history = get_history(prompt_id)[prompt_id]
# Since a ComfyUI workflow may contain multiple SaveImage nodes,
# and each SaveImage node might save multiple images,
# we need to iterate through all outputs to collect all generated images
output_images = {}
for node_id in history['outputs']:
node_output = history['outputs'][node_id]
images_output = []
if 'images' in node_output:
for image in node_output['images']:
image_data = get_image(image['filename'], image['subfolder'], image['type'])
images_output.append(image_data)
output_images[node_id] = images_output
7. Process Images and Clean Up
Finally, process the retrieved images as needed (save to disk, display, or further processing) and clean up resources by closing the WebSocket connection.
# Process the generated images
for node_id in output_images:
for image_data in output_images[node_id]:
# Convert bytes to PIL Image
image = Image.open(io.BytesIO(image_data))
# Process image as needed
# image.save(f"output_{node_id}.png")
# Always close the WebSocket connection
ws.close()
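Putting steps 4 through 7 together, a minimal end-to-end helper could look like the following (a sketch reusing queue_prompt, get_history, and get_image from above; error handling omitted):

def generate_images(prompt):
    # Queue the prompt and wait for the completion signal
    prompt_id = queue_prompt(prompt)['prompt_id']
    while True:
        out = ws.recv()
        if isinstance(out, str):
            message = json.loads(out)
            if message['type'] == 'executing':
                data = message['data']
                if data['node'] is None and data['prompt_id'] == prompt_id:
                    break
    # Collect image bytes from the history, grouped by node ID
    output_images = {}
    history = get_history(prompt_id)[prompt_id]
    for node_id, node_output in history['outputs'].items():
        if 'images' in node_output:
            output_images[node_id] = [
                get_image(img['filename'], img['subfolder'], img['type'])
                for img in node_output['images']
            ]
    return output_images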
Method 2: WebSocket-Based Image Transfer
This method is used when your ComfyUI workflow contains SaveImageWebsocket nodes, which stream generated images directly over the WebSocket connection without writing them to disk. This is more efficient for real-time applications.
Key Steps
1. Prepare and Customize Prompt
Similar to Method 1, but with a SaveImageWebsocket node added to the workflow (again truncated for brevity; the placeholder comment must be removed before json.loads):
prompt_text = """
{
"3": {
"class_type": "KSampler",
"inputs": {
"cfg": 8,
"denoise": 1,
"seed": 8566257,
"steps": 20,
"sampler_name": "euler",
"scheduler": "normal",
"latent_image": ["5", 0],
"model": ["4", 0],
"positive": ["6", 0],
"negative": ["7", 0]
}
},
# ... other nodes remain the same ...
"save_image_websocket_node": {
"class_type": "SaveImageWebsocket",
"inputs": {
"images": ["8", 0]
}
}
}
"""
prompt = json.loads(prompt_text)
# Customize the prompt
prompt["6"]["inputs"]["text"] = "masterpiece best quality man"
prompt["3"]["inputs"]["seed"] = 5
2. Set Up WebSocket Connection
client_id = str(uuid.uuid4())
ws = websocket.WebSocket()
ws.connect(f"ws://{server_address}/ws?clientId={client_id}")
3. Queue the Prompt
def queue_prompt(prompt):
p = {"prompt": prompt, "client_id": client_id}
data = json.dumps(p).encode('utf-8')
req = urllib.request.Request(f"http://{server_address}/prompt", data=data)
return json.loads(urllib.request.urlopen(req).read())
# Get prompt_id for tracking the execution
prompt_id = queue_prompt(prompt)['prompt_id']
4. Monitor Execution Status
Similar to Method 1, we monitor WebSocket messages to track execution progress, but here we also track which node is currently executing so that incoming binary frames can be attributed correctly. Once the save_image_websocket_node is executing, any subsequent binary data received is image data, which we collect directly from the WebSocket stream.
current_node = ""
output_images = {}
while True:
out = ws.recv()
if isinstance(out, str):
message = json.loads(out)
if message['type'] == 'executing':
data = message['data']
if data['prompt_id'] == prompt_id:
if data['node'] is None:
break # Execution complete
else:
current_node = data['node']
else:
# Handle binary image data from SaveImageWebsocket node
if current_node == 'save_image_websocket_node':
images_output = output_images.get(current_node, [])
images_output.append(out[8:]) # Skip first 8 bytes of binary header
output_images[current_node] = images_output
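The stripped header is two big-endian 32-bit integers: a binary event type followed by an image format code. This matches how ComfyUI currently frames binary WebSocket messages, but it is an implementation detail that could change; a sketch of parsing it explicitly:

import struct

# First 4 bytes: event type; next 4 bytes: image format code
event_type, image_format = struct.unpack(">II", out[:8])
image_bytes = out[8:]  # The remaining bytes are the encoded image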
5. Process Images and Clean Up
Once all images are collected, we can process them as needed:
# Process the images
for node_id in output_images:
for image_data in output_images[node_id]:
# Convert binary data to PIL Image
image = Image.open(io.BytesIO(image_data))
# Process image as needed
# image.show()
# Clean up
ws.close()
Complete Example Code
For complete working examples of both methods, please refer to the official ComfyUI repository:
- Method 1 (Basic API): websockets_api_example.py
- Method 2 (WebSocket): websockets_api_example_ws_images.py
Choosing Between Methods
Use Method 1 (Basic API) when:
- You need to persist images to disk
- You want simpler error recovery
- Network stability is a concern

Use Method 2 (WebSocket) when:
- You need real-time image processing
- You want to avoid disk I/O
- You're building an interactive application
- Performance is critical