Slack Rate Limiting with SQS FIFO and Lambda
Overview

This document explains how to manage the Slack API rate limit (1 message per second per channel) in a system that sends messages to a small number of Slack channels (2~5) using AWS SQS FIFO and Lambda. About 5000 tasks publish the messages, and the approach handles them efficiently without DynamoDB.
Architecture

Components

- SQS FIFO Queue (`slack-queue.fifo`)
  - The 5000 tasks publish messages directly to the queue.
  - `MessageGroupId` is set to the Slack channel ID to guarantee per-channel ordering.
- Lambda
  - Dequeues messages from the SQS FIFO queue.
  - Sends each message to the Slack API and re-queues it on a 429 error.
Code

1. SQS FIFO Publishing (Fargate Task)
```python
import boto3
import json
import os
import uuid

sqs = boto3.client('sqs')

def publish_messages(channels, title, message):
    queue_url = os.environ['SQS_QUEUE_URL']  # slack-queue.fifo
    for channel in channels:
        sqs.send_message(
            QueueUrl=queue_url,
            # The body fields match what the Lambda's SlackPayload.from_body expects.
            MessageBody=json.dumps({"channel": channel, "title": title, "body": message}),
            MessageGroupId=channel,  # serializes delivery per channel
            # FIFO queues need a deduplication ID unless ContentBasedDeduplication is enabled.
            MessageDeduplicationId=str(uuid.uuid4()),
        )

if __name__ == "__main__":
    channels = ['channel1', 'channel2', 'channel3']
    publish_messages(channels, "Fargate Test", "Test message from Fargate")
```
2. Lambda (SQS FIFO → Slack with Rate Limit Handling)
```python
from dataclasses import dataclass
import logging
import json
import os

from slack_sdk.webhook.client import WebhookClient
from slack_sdk.http_retry.builtin_handlers import (
    ConnectionErrorRetryHandler,
    RateLimitErrorRetryHandler,
    ServerErrorRetryHandler,
)
import boto3

logger = logging.getLogger()
logger.setLevel("INFO")

# Not referenced by the handler below; the SQS trigger delivers and deletes messages.
sqs = boto3.client("sqs")
queue_url = "slack-queue.fifo"  # batch 1

WORKSPACE_HOOK_URL = os.environ.get("WORKSPACE_HOOK_URL")
MAX_RETRY_COUNT = 3

client = WebhookClient(
    url=WORKSPACE_HOOK_URL,
    retry_handlers=[
        ConnectionErrorRetryHandler(max_retry_count=MAX_RETRY_COUNT),
        RateLimitErrorRetryHandler(max_retry_count=MAX_RETRY_COUNT),  # 429
        ServerErrorRetryHandler(max_retry_count=MAX_RETRY_COUNT),  # 500, 503
    ],
    logger=logger,
)


@dataclass
class SlackPayload:
    channel: str
    title: str
    body: str
    username: str
    icon_emoji: str
    extra_blocks: list

    @classmethod
    def from_body(cls, body: dict):
        return cls(
            channel=body["channel"],
            title=body["title"],
            body=body["body"],
            username=body.get("username", "NotificationBot"),
            icon_emoji=body.get("icon_emoji", ":satellite:"),
            extra_blocks=body.get("extra_blocks", []),
        )

    def _build_blocks(self):
        blocks = [
            {"type": "divider"},
            {"type": "section", "text": {"type": "mrkdwn", "text": f"*{self.title}*"}},
            {"type": "section", "text": {"type": "mrkdwn", "text": self.body}},
        ]
        if self.extra_blocks:
            blocks.extend(self.extra_blocks)
        blocks.append({"type": "divider"})
        return blocks

    def to_dto(self):
        return {
            "channel": self.channel,
            "username": self.username,
            "icon_emoji": self.icon_emoji,
            "blocks": self._build_blocks(),
        }


def handle_event(record):
    body = json.loads(record["body"])
    if body is None:
        logger.error("Empty body received")
        # Dequeue event when None
        return
    payload = SlackPayload.from_body(body)
    logger.info(payload)
    response = client.send_dict(
        headers={"Content-Type": "application/json;charset=utf-8"},
        body=payload.to_dto(),
    )
    if response.status_code != 200:
        logger.error(f"Webhook failed: {response.status_code} - {response.body}")
        raise Exception(f"Webhook failed with status {response.status_code}")
    logger.info(f"Message sent to {payload.channel}")


def lambda_handler(event, context):
    """
    Lambda handler for processing SQS FIFO queue events and sending Slack messages.

    Intent:
    - This function processes records from an SQS FIFO queue with BatchSize=1. Each invocation
      handles a single message, which is acknowledged (ACKed) and deleted from the queue only
      when the function returns {"statusCode": 200}. If an exception occurs, the function fails,
      and the single message remains in the queue (NACKed) for retry after VisibilityTimeout.

    Behavior:
    - Success: "When your function successfully processes a batch, Lambda deletes
      its messages from the queue."
    - Failure: "If your function fails to process a batch (by throwing an exception or timing out),
      Lambda retries the entire batch until it succeeds or the messages expire
      and are sent to a dead-letter queue (DLQ), if configured."

    Relevant AWS Documentation:
    - [SQS-Lambda Integration](https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html)
    - [FIFO Queue Behavior](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html)
      - Ensures message order within the same MessageGroupId and supports parallel processing.

    Notes:
    - BatchSize=1 ensures failures affect only the individual message, not a group of messages.
    """
    logger.info(f"Received event: {json.dumps(event)}")
    records = event.get("Records", [])
    if not records:
        logger.warning("No records found in event")
        return {"statusCode": 200, "body": "No records to process"}
    for record in records:  # With BatchSize=1, this loop processes exactly one record
        handle_event(record)
    return {"statusCode": 200, "body": "Processed all records"}
```
Configuration

SQS FIFO Queue

- Name: `slack-queue.fifo`
- VisibilityTimeout: 5 seconds (enough time to process a single message)
- Delivery Delay: 0 seconds (no queue-level delay needed)
- WaitTimeSeconds: 20 seconds (long polling to minimize empty receives)
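A minimal boto3 sketch that creates the queue with exactly these attributes. Enabling `ContentBasedDeduplication` here is an assumption, added so publishers may omit an explicit `MessageDeduplicationId`.

```python
import boto3

sqs = boto3.client("sqs")

# Create the FIFO queue with the attributes listed above.
sqs.create_queue(
    QueueName="slack-queue.fifo",
    Attributes={
        "FifoQueue": "true",
        "VisibilityTimeout": "5",
        "DelaySeconds": "0",
        "ReceiveMessageWaitTimeSeconds": "20",
        # Assumption: lets senders skip MessageDeduplicationId if they want to.
        "ContentBasedDeduplication": "true",
    },
)
```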
Lambda

- Trigger: `slack-queue.fifo`
- Batch Size: 1 (one message per invocation)
- Environment Variables:
  - `WORKSPACE_HOOK_URL`: Slack webhook URL used by the Lambda's WebhookClient
  - `SQS_QUEUE_URL`: URL of `slack-queue.fifo`, used by the Fargate publisher
- IAM Permissions:
  - `sqs:SendMessage`, `sqs:ReceiveMessage`, `sqs:DeleteMessage`
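Wiring the queue to the function with batch size 1 is an event source mapping. A boto3 sketch, in which the function name and queue ARN are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

# Connect the FIFO queue to the Lambda function, one message per invocation.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:slack-queue.fifo",  # placeholder ARN
    FunctionName="slack-sender",  # placeholder function name
    BatchSize=1,
)
```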
Rate Limit Management

How It Works

- SQS FIFO serialization:
  - `MessageGroupId` is set to the channel ID, so messages for the same channel are processed sequentially.
  - Different channels are processed in parallel, so 5 channels allow up to 5 messages per second.
- Lambda processing:
  - Batch size 1 means one message is dequeued per invocation.
  - Slack API calls return well within 1 second (200~500 ms), so 2~3 calls per second per channel can be attempted.
- 429 handling:
  - When Slack returns 429 (Too Many Requests), the message is retried after the `Retry-After` header value (typically 1 second); a manual handling sketch follows this list.
  - The 1 message/second limit is respected without losing any messages.
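The Lambda above delegates this to slack_sdk's `RateLimitErrorRetryHandler`. For illustration only, a minimal manual sketch of the same idea, assuming the `WebhookResponse` object exposes the HTTP response headers; raising makes the invocation fail, so SQS redelivers the message after VisibilityTimeout while FIFO ordering for that channel's `MessageGroupId` is preserved:

```python
def send_with_manual_429_handling(payload: SlackPayload):
    # Send without relying on the built-in retry handler.
    response = client.send_dict(body=payload.to_dto())
    if response.status_code == 429:
        # Slack signals the required backoff (in seconds) via the Retry-After header.
        retry_after = int(response.headers.get("Retry-After", "1"))
        logger.warning(f"Rate limited; Slack asked for a {retry_after}s backoff")
        # Raising NACKs the message, so SQS redelivers it after VisibilityTimeout.
        raise Exception("Rate limited by Slack; the message will be retried")
    return response
```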
Performance

- Throughput: 5 channels → 5 messages per second.
- Total processing time: 25,000 messages → about 5,000 seconds (~83 minutes); the arithmetic is sketched below.
- Minimal 429s: only the excess calls (1~2 of the 2~3 attempted per second) trigger a retry, which is handled efficiently.
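The time estimate follows from simple arithmetic, assuming each of the 5000 tasks posts to all 5 channels (which is what the 25,000-message figure implies):

```python
tasks = 5000
channels = 5
messages = tasks * channels         # 25,000 messages in total (assumed fan-out)
throughput = channels * 1           # 1 msg/sec per channel -> 5 msg/sec overall
total_seconds = messages / throughput
print(total_seconds, total_seconds / 60)  # 5000.0 seconds, ~83 minutes
```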
Considerations

- Slack policy: frequent 429s should be avoided, but an occasional one is not a problem.
- Efficiency: relying on `Retry-After` keeps delays minimal, balancing stability and performance.
- Scalability: more channels may increase the 429 frequency, so monitoring is needed (one possible approach is sketched below).
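One way to make the 429 frequency visible is a custom CloudWatch metric emitted whenever Slack rate-limits a call. This is only a sketch; the namespace, metric name, and the idea of calling it from `handle_event` are assumptions, not part of the original setup:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_rate_limit(channel: str):
    # Emit a custom metric so 429 frequency can be graphed and alarmed on.
    # Namespace and metric name are example values.
    cloudwatch.put_metric_data(
        Namespace="SlackNotifier",
        MetricData=[
            {
                "MetricName": "SlackRateLimited",
                "Dimensions": [{"Name": "Channel", "Value": channel}],
                "Value": 1,
                "Unit": "Count",
            }
        ],
    )
```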
Conclusion

With SQS FIFO and Lambda, Slack rate limits can be managed effectively without DynamoDB. `MessageGroupId` guarantees per-channel serialization, and 429 errors are handled with `Retry-After`-based retries, so the 1 message/second limit is respected. This setup is stable and efficient for 5000 tasks and 2~5 channels.
Lambda Layer (slack_sdk)

- Generate the layer:

```bash
mkdir python
cd python
pip install slack_sdk -t .
cd ..
zip -r slack_sdk_layer.zip python
```

- Upload it to Lambda: Lambda → "Layers" → "Create layer".