I see best practices as recipes: you do the same thing over and over again, mixing the ingredients in different ways, until you get an outcome that is easily reproducible (a standard) and of superior quality.
Let's look at the following scenario of an endless loop and the options for solving it by applying the right ingredients:
Use case prerequisite
- The message published on the queue is invalid
- The exception thrown by the receiver that picked up the message for processing is not caught by any catch block
Use case description
- A requestor publishes an invalid message to a message channel (queue).
- There is one listener / receiver on this queue, and it will get the message.
a) In case of a valid message, an acknowledgement (ACK) is sent back, saying that the message is now in safe hands.
b) In our case, the invalid message combined with incorrect error handling leads to an uncaught exception, and the message stays locked "in-flight" (a message awaiting ACK or NACK) until the default lock TTL (time-to-live) expires, for example after 2 minutes. Once the lock TTL has expired, the invalid message is made visible to other receivers. In our case, the same receiver picks up the message again and resumes from step b).
The poison message will be kept on the queue for 7 days (TTL = 7 days), which gives us a nicely hidden and practically infinite loop.
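To get a feeling for the size of this loop, here is a rough back-of-the-envelope calculation in Python. It uses the two values from the scenario (2-minute lock TTL, 7-day message TTL) and assumes that every attempt holds the lock until it expires; if the message is released sooner, the number of attempts only goes up. The function name is made up for illustration.

```python
from datetime import timedelta

# Values taken from the scenario above.
LOCK_TTL = timedelta(minutes=2)   # how long an unacknowledged message stays in flight
MESSAGE_TTL = timedelta(days=7)   # how long the message lives on the queue


def redelivery_attempts(message_ttl: timedelta, lock_ttl: timedelta) -> int:
    """Delivery attempts if each attempt waits for the full lock TTL (assumption)."""
    return message_ttl // lock_ttl


print(redelivery_attempts(MESSAGE_TTL, LOCK_TTL))  # 5040
```

So a single invalid message can trigger the receiver logic thousands of times before it finally expires.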
What do you do when you see that the message keeps being released on the queue and the receiver logic gets executed at a speed that only makes you happy when you are looking for fast performing systems? You of course try to solve the puzzle...
As you probably noticed from the prerequisites above, there were already two major issues:
- Message validation was not applied before publishing to the queue
- Not all exceptions were caught in the receiver. Always catch ALL exceptions, no matter what!
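Both fixes can be sketched in a few lines of Python. This is not Mule or Anypoint MQ code: `validate`, `handle`, the `order_id` field and the `ack` / `nack` callbacks are hypothetical names, and a real receiver would use whatever the messaging client provides. The point is only that validation happens first and that every failure path ends in an explicit answer to the broker.

```python
import json
from typing import Callable


def validate(payload: str) -> dict:
    """Parse the payload and check the required fields (hypothetical schema)."""
    data = json.loads(payload)  # raises ValueError on malformed JSON
    if not isinstance(data, dict) or "order_id" not in data:
        raise ValueError("missing required field: order_id")
    return data


def handle(payload: str, process: Callable[[dict], None],
           ack: Callable[[], None], nack: Callable[[], None]) -> bool:
    """Process one message and always answer the broker with ACK or NACK."""
    try:
        process(validate(payload))
    except Exception as exc:  # catch everything so the message is never left in flight
        print(f"processing failed: {exc}")
        nack()
        return False
    ack()
    return True
```

The same `validate` function could also be called by the publisher before the message is sent. Keep in mind that a NACK on its own only returns the message to the queue, so it still needs a retry limit to break the loop.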
To err is human, hence you should always design your solution for failure.
Several suggestions:
- Always use a Dead Letter Queue assigned to your processing queue so that the invalid messages are moved out to the Dead Letter Queue after a given number of retries. This way you at least shorten the life of the infinite loop.
- Set up monitoring for your application / services in relation to the non-functional requirements: if you expect a load of approx. 10 messages per minute on your queue and you end up processing 2000, then this most probably requires your attention.
- Review your configuration settings: the defaults are good, but custom values chosen case by case might be even better
- Unit test your code, including error handling
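To illustrate the Dead Letter Queue suggestion, here is a small in-memory simulation in Python. It is a sketch under assumptions, not how Anypoint MQ implements it: the queue is a plain `deque`, the retry limit `MAX_DELIVERIES` is an arbitrary value, and `drain` and `process` are made-up names. It shows that with a delivery counter the invalid message is attempted a fixed number of times and then parked, instead of looping until its TTL expires.

```python
from collections import deque

MAX_DELIVERIES = 3  # assumed retry limit; pick a value that fits your use case


def drain(queue: deque, dead_letters: list, process) -> int:
    """Deliver messages until the queue is empty; return the number of attempts."""
    attempts = 0
    while queue:
        message = queue.popleft()
        message["deliveries"] += 1
        attempts += 1
        try:
            process(message["body"])
        except Exception:
            if message["deliveries"] >= MAX_DELIVERIES:
                dead_letters.append(message)  # park it for later inspection
            else:
                queue.append(message)         # release it for another attempt
    return attempts


def process(body: str) -> None:
    """Stand-in for the receiver logic: fails on the invalid message."""
    if body == "invalid":
        raise ValueError("cannot process message")


queue = deque([{"body": "ok", "deliveries": 0}, {"body": "invalid", "deliveries": 0}])
dead_letters: list = []
attempts = drain(queue, dead_letters, process)
print(attempts, [m["body"] for m in dead_letters])  # 4 ['invalid']
```

A check like this also works as a unit test of the error handling: one attempt for the valid message, three for the invalid one, and an empty queue at the end.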
Technology used: Anypoint MQ (MuleSoft's cloud messaging service) and Mule ESB
Top comments (4)
I don't understand this article. Which best practices are you talking about? Why would you make an invalid message visible to other receivers after a lock timeout? Can you explain a little better?
All good questions, thanks! I understand that this post might be difficult to digest. The first part is a walkthrough of a use case inspired by real life, where an invalid message was published to a processing queue (why? due to a bug in the developer's code) and caused continuous looping between the queue and the listener of that queue.
The best practices I highlighted in a subtle manner are:
Hope this clarifies it.
Cool, that does help my understanding. It's worth keeping in mind that you can't rely on message validation before it is published to the queue (think malicious user!), so having the receiver perform validation to pass any invalid messages to a dead letter queue is essential.
Keep up the writing!
Excellent, glad to hear! Thanks, will do! :)