In the first article of this series, we introduced the SAGA pattern and demonstrated how a minimal Orchestration can manage distributed transactions with a central orchestrator.
Let’s get real! This time, we’ll dive into the Choreography approach, where services coordinate workflows by autonomously emitting and consuming events.
To make this practical, we’ll implement a multi-service healthcare workflow using Go and RabbitMQ. Each service will have its own main.go, making it easy to scale, test, and run independently.
What is SAGA Choreography?
Choreography relies on decentralized communication. Each service listens for events and triggers subsequent steps by emitting new events. There’s no central orchestrator; the flow emerges from the interactions of individual services.
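To make the idea concrete before bringing in a broker, here is a minimal in-process sketch. The Bus type and RunDemo function are hypothetical names invented for this illustration (they are not part of the series' repository), and the bus delivers events synchronously, which a real broker does not. The point is only that no component knows the whole workflow: each handler reacts to one event and may emit the next.

```go
package main

import "fmt"

// Bus is a hypothetical in-memory stand-in for a message broker.
type Bus struct {
	handlers map[string][]func(payload string)
}

// NewBus returns an empty bus with no subscribers.
func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(string))}
}

// Subscribe registers a handler for the given event name.
func (b *Bus) Subscribe(event string, h func(payload string)) {
	b.handlers[event] = append(b.handlers[event], h)
}

// Publish delivers the event synchronously to every subscriber.
func (b *Bus) Publish(event, payload string) {
	for _, h := range b.handlers[event] {
		h(payload)
	}
}

// RunDemo wires two "services" together and returns the events
// in the order they were consumed.
func RunDemo() []string {
	bus := NewBus()
	var consumed []string

	// Scheduler: reacts to PatientVerified and emits ProcedureScheduled.
	bus.Subscribe("PatientVerified", func(p string) {
		consumed = append(consumed, "PatientVerified")
		bus.Publish("ProcedureScheduled", p)
	})

	// Inventory: reacts to ProcedureScheduled (it would reserve supplies here).
	bus.Subscribe("ProcedureScheduled", func(p string) {
		consumed = append(consumed, "ProcedureScheduled")
	})

	// Patient service kicks off the flow; nothing else drives it.
	bus.Publish("PatientVerified", "patient-42")
	return consumed
}

func main() {
	fmt.Println(RunDemo()) // [PatientVerified ProcedureScheduled]
}
```

Notice that adding a third service only requires another Subscribe call; the existing handlers stay untouched.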
Key Benefits:
- Decoupled Services: Each service operates independently.
- Scalability: Event-driven systems handle high loads efficiently.
- Flexibility: Adding new services doesn’t require changing the workflow logic.
Challenges:
- Debugging Complexity: Tracking events across multiple services can be tricky. (I'll write an article dedicated to this topic, stay tuned!)
- Infrastructure Setup: Services require a robust message broker (e.g., RabbitMQ) to connect all the dots.
- Event Storms: Poorly designed workflows can overwhelm the system with events.
Practical Example: Healthcare Workflow
Let’s revisit our healthcare workflow from the first article:
- Patient Service: Verifies patient details and insurance coverage.
- Scheduler Service: Schedules the procedure.
- Inventory Service: Reserves medical supplies.
- Billing Service: Processes billing.
Each service will:
- Listen for specific events using RabbitMQ.
- Emit new events to trigger subsequent steps.
Setting Up RabbitMQ with Docker
We’ll use RabbitMQ as the event queue. Run it locally using Docker:
docker run --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:4.0.5-management
Access the RabbitMQ management interface at http://localhost:15672 (username: guest, password: guest).
Exchanges, Queues, and Bindings Setup
We need to configure RabbitMQ to accommodate our events. Here’s an example init.go file for setting up the RabbitMQ infrastructure:
package main

import (
	"log"

	"github.com/rabbitmq/amqp091-go"
)

func main() {
	// Connect to the local broker started with Docker.
	conn, err := amqp091.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatalf("Failed to connect to RabbitMQ: %v", err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatalf("Failed to open a channel: %v", err)
	}
	defer ch.Close()

	// Declare a durable direct exchange that all services publish to.
	err = ch.ExchangeDeclare("events", "direct", true, false, false, false, nil)
	if err != nil {
		log.Fatalf("Failed to declare an exchange: %v", err)
	}

	// Declare a durable queue for the PatientVerified event.
	_, err = ch.QueueDeclare("PatientVerified", true, false, false, false, nil)
	if err != nil {
		log.Fatalf("Failed to declare a queue: %v", err)
	}

	// Route messages published with the "PatientVerified" routing key to that queue.
	err = ch.QueueBind("PatientVerified", "PatientVerified", "events", false, nil)
	if err != nil {
		log.Fatalf("Failed to bind a queue: %v", err)
	}
}
Full code here!
Note: In a production setting, you might want to manage this setup using a GitOps approach (e.g., with Terraform) or let each service handle its own queues dynamically.
Implementation: Service Files
Each service will have its own main.go. We’ll also include compensation actions for handling failures gracefully.
1. Patient Service
This service verifies patient details and emits a PatientVerified event. It also compensates by notifying the patient if a downstream failure occurs.
// patient/main.go
package main

import (
	"fmt"
	"log"

	"github.com/rabbitmq/amqp091-go"
	"github.com/thegoodapi/saga_tutorial/choreography/common"
)

func main() {
	conn, err := amqp091.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatalf("Failed to connect to RabbitMQ: %v", err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatalf("Failed to open a channel: %v", err)
	}
	defer ch.Close()

	// Compensation path: react when the Scheduler cancels the procedure.
	go func() {
		fmt.Println("[PatientService] Waiting for events...")
		msgs, err := common.ConsumeEvent(ch, "ProcedureScheduleCancelled")
		if err != nil {
			log.Fatalf("Failed to consume event: %v", err)
		}
		for range msgs {
			fmt.Println("[PatientService] Processing event: ProcedureScheduleCancelled")
			if err := notifyProcedureScheduleCancellation(); err != nil {
				log.Fatalf("Failed to notify patient: %v", err)
			}
		}
	}()

	// Forward path: start the workflow by announcing a verified patient.
	common.PublishEvent(ch, "events", "PatientVerified", "Patient details verified")
	fmt.Println("[PatientService] Event published: PatientVerified")

	// Block forever so the consumer goroutine keeps running.
	select {}
}

func notifyProcedureScheduleCancellation() error {
	fmt.Println("Compensation: Notify patient of procedure cancellation.")
	return nil
}
2. Scheduler Service
This service listens for PatientVerified and emits ProcedureScheduled. It compensates by canceling the procedure if a downstream failure occurs.
// scheduler/main.go
package main

import (
	"fmt"
	"log"

	"github.com/rabbitmq/amqp091-go"
	"github.com/thegoodapi/saga_tutorial/choreography/common"
)

func main() {
	conn, err := amqp091.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatalf("Failed to connect to RabbitMQ: %v", err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatalf("Failed to open a channel: %v", err)
	}
	defer ch.Close()

	// Forward path: schedule a procedure for every verified patient.
	go func() {
		fmt.Println("[SchedulerService] Waiting for events...")
		msgs, err := common.ConsumeEvent(ch, "PatientVerified")
		if err != nil {
			log.Fatalf("Failed to consume event: %v", err)
		}
		for range msgs {
			fmt.Println("[SchedulerService] Processing event: PatientVerified")
			if err := scheduleProcedure(); err != nil {
				// Scheduling failed: announce it so upstream services can react.
				common.PublishEvent(ch, "events", "ProcedureScheduleFailed", "Failed to schedule procedure")
				fmt.Println("[SchedulerService] Compensation triggered: ProcedureScheduleFailed")
			} else {
				common.PublishEvent(ch, "events", "ProcedureScheduled", "Procedure scheduled successfully")
				fmt.Println("[SchedulerService] Event published: ProcedureScheduled")
			}
		}
	}()

	// Block forever so the consumer goroutine keeps running.
	select {}
}

func scheduleProcedure() error {
	fmt.Println("Step 2: Scheduling procedure...")
	return nil // or simulate a failure
}
Additional Services
The Inventory Service and Billing Service implementations follow the same structure as above. Each service listens for the previous event and emits the next one, with compensation logic in place for failures.
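As a rough idea of what the Inventory Service's core logic could look like, here is a sketch that leaves RabbitMQ out entirely. The inventory type, its fields, and the publishFunc type are hypothetical names made up for this example; in the real service the publish callback would be replaced by the repository's common.PublishEvent, whose exact signature is not shown in this article. The SuppliesReserveFailed branch is included for completeness even though the article's workflow does not implement it.

```go
package main

import (
	"errors"
	"fmt"
)

// publishFunc is a hypothetical abstraction over the real publish helper.
type publishFunc func(event, payload string)

// inventory holds the in-memory state used by this sketch.
type inventory struct {
	reserved bool // true while supplies are held for a procedure
	fail     bool // set to true to simulate a reservation failure
}

// reserveSupplies is the forward action (T3).
func (i *inventory) reserveSupplies() error {
	if i.fail {
		return errors.New("out of stock")
	}
	i.reserved = true
	return nil
}

// releaseSupplies is the compensating action (C3).
func (i *inventory) releaseSupplies() {
	i.reserved = false
}

// handle reacts to one incoming event and emits the next one.
func (i *inventory) handle(event string, publish publishFunc) {
	switch event {
	case "ProcedureScheduled":
		if err := i.reserveSupplies(); err != nil {
			publish("SuppliesReserveFailed", err.Error())
			return
		}
		publish("SuppliesReserved", "Supplies reserved successfully")
	case "BillingFailed":
		i.releaseSupplies()
		publish("ReservedSuppliesReleased", "Reserved supplies released")
	}
}

func main() {
	inv := &inventory{}
	publish := func(event, payload string) {
		fmt.Printf("[InventoryService] Event published: %s\n", event)
	}
	inv.handle("ProcedureScheduled", publish) // forward step
	inv.handle("BillingFailed", publish)      // compensation step
}
```

Keeping the decision logic in a function that receives its publisher as a parameter also makes the service easy to unit test without a running broker.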
Full code here!
Running the Workflow
Start RabbitMQ:
docker run --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:4.0.5-management
Run Each Service:
Open separate terminals and run:
# one-time script to set up RabbitMQ
go run choreography/init/main.go
# services
go run choreography/billing/main.go
go run choreography/inventory/main.go
go run choreography/scheduler/main.go
go run choreography/patient/main.go
Observe Output:
Each service processes events in sequence, logging the workflow progress.
What happened?
Let's break it down!
First of all, for the purpose of this article, we are not implementing SuppliesReserveFailed and ProcedureScheduleFailed, to avoid unnecessary complexity.
We are implementing the following events:
Steps (or transactions):
- T1: (init): PatientVerified
- T2: ProcedureScheduled
- T3: SuppliesReserved
- T4: BillingSuccessful
Compensations:
- C4: BillingFailed
- C3: ReservedSuppliesReleased
- C2: ProcedureScheduleCancelled
- C1: NotifyFailureToUser (not implemented)
The implementation follows this diagram:
This diagram represents a common approach to documenting choreography. However, I find it somewhat difficult to understand and a bit frustrating, particularly for those who are not familiar with the implementation or the pattern.
Let's break it down!
The diagram above is far more verbose, and it breaks down each step, making it easier to understand what's going on.
In a nutshell:
1. Patient service verifies patient details successfully
2. Patient service emits PatientVerified
3. Scheduler service consumes PatientVerified
4. Scheduler service schedules the appointment successfully
5. Scheduler service emits ProcedureScheduled
6. Inventory service consumes ProcedureScheduled
7. Inventory service reserves the supplies successfully
8. Inventory service emits SuppliesReserved
9. Billing service consumes SuppliesReserved
10. Billing service fails to charge the customer and starts the compensation
11. Billing service emits BillingFailed
12. Inventory service consumes BillingFailed
13. Inventory service releases the supplies reserved in step 7
14. Inventory service emits ReservedSuppliesReleased
15. Scheduler service consumes ReservedSuppliesReleased
16. Scheduler service deletes the appointment scheduled in step 4
17. Scheduler service emits ProcedureScheduleCancelled
18. Patient service consumes ProcedureScheduleCancelled
19. Patient service notifies the customer of the error
Note that we are not implementing failures for steps 1, 4, and 7 for the sake of brevity; however, the approach would be the same. Each of these failures would trigger a rollback of the preceding steps.
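To illustrate how those other failures could roll back, here is a small sketch that walks a table of "event consumed, event emitted" pairs. The first two entries mirror the compensation flow implemented in this article. The SuppliesReserveFailed entry is an assumption about how the Scheduler might react, not something taken from the repository, and the reactions map and rollbackChain function are hypothetical names.

```go
package main

import "fmt"

// reactions maps a failure or compensation event to the compensation
// event that the consuming service emits next.
var reactions = map[string]string{
	// Implemented in this article:
	"BillingFailed":            "ReservedSuppliesReleased",   // Inventory releases supplies
	"ReservedSuppliesReleased": "ProcedureScheduleCancelled", // Scheduler cancels the procedure
	// Assumption for illustration only:
	"SuppliesReserveFailed": "ProcedureScheduleCancelled", // Scheduler cancels the procedure
}

// rollbackChain returns the events emitted during a rollback,
// starting with the failure event itself.
func rollbackChain(failure string) []string {
	chain := []string{failure}
	for {
		next, ok := reactions[chain[len(chain)-1]]
		if !ok {
			// No further reaction: the last event is consumed by the
			// service that ends the rollback (the Patient service here).
			return chain
		}
		chain = append(chain, next)
	}
}

func main() {
	for _, failure := range []string{"BillingFailed", "SuppliesReserveFailed", "ProcedureScheduleFailed"} {
		fmt.Println(rollbackChain(failure))
	}
}
```

The later the failure happens, the longer the chain: a billing failure has to undo two earlier steps, while a scheduling failure only needs the Patient service to notify the user.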
Observability
Observability is essential for debugging and monitoring distributed systems. Implementing logs, metrics, and traces ensures that developers can understand system behavior and diagnose issues efficiently.
Logging
- Use structured logging (e.g., JSON format) to capture events and metadata.
- Include correlation IDs in logs to trace workflows across services.
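As a sketch of both logging points, the example below uses Go's standard log/slog package (available since Go 1.21) to emit JSON logs and then reconstructs one workflow from interleaved log lines. The newLogger, servicesFor, and demo functions, the correlation_id field name, and the saga-1/saga-2 IDs are all assumptions chosen for this illustration; a shared in-memory buffer stands in for a real log aggregator.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
)

// newLogger returns a JSON logger that tags every line with the service name.
func newLogger(w io.Writer, service string) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, nil)).With("service", service)
}

// servicesFor scans JSON log lines and returns, in log order, the services
// that logged something for the given correlation ID.
func servicesFor(logs io.Reader, correlationID string) []string {
	var services []string
	sc := bufio.NewScanner(logs)
	for sc.Scan() {
		var entry struct {
			Service       string `json:"service"`
			CorrelationID string `json:"correlation_id"`
		}
		if err := json.Unmarshal(sc.Bytes(), &entry); err != nil {
			continue // skip lines that are not valid JSON
		}
		if entry.CorrelationID == correlationID {
			services = append(services, entry.Service)
		}
	}
	return services
}

// demo logs two interleaved workflows and traces only the first one.
func demo() []string {
	var logs bytes.Buffer // stands in for a shared log sink
	patient := newLogger(&logs, "patient")
	scheduler := newLogger(&logs, "scheduler")

	patient.Info("event published", "event", "PatientVerified", "correlation_id", "saga-1")
	patient.Info("event published", "event", "PatientVerified", "correlation_id", "saga-2")
	scheduler.Info("event consumed", "event", "PatientVerified", "correlation_id", "saga-1")

	return servicesFor(&logs, "saga-1")
}

func main() {
	fmt.Println(demo()) // [patient scheduler]
}
```

In a real deployment the correlation ID would travel inside the event itself (for example in a message header), so that each service can copy it into its own log lines.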
Metrics
- Monitor queue sizes and event processing times.
- Use tools like Prometheus to collect and visualize metrics.
Tracing
- Implement distributed tracing (e.g., with OpenTelemetry) to track events across services.
- Annotate spans with relevant data (e.g., event names, timestamps) for better insights.
We'll dive into observability in choreography later in this series, stay tuned!
Key Takeaways
- Decentralized Control: Choreography enables autonomous collaboration.
- Event-Driven Simplicity: RabbitMQ simplifies message exchange.
- Scalable Architecture: Adding new services is seamless.
- Choreography can be overwhelming at first, but as always: practice makes you better!
Stay tuned for the next article, where we’ll explore Orchestration!
Check out the full repository for this series here. Let’s discuss in the comments!