Confluent librdkafka kafka.Consumer.ReadMessage timeout

I use the Go librdkafka client (confluent-kafka-go) to consume messages from Kafka; the code looks like:
msg, err := c.kafkaConsumer.ReadMessage(2 * time.Second)
The 2-second timeout is more a guess than a conscious decision.
I'd like to drop this much lower, e.g. ReadMessage(50 * time.Millisecond), if possible (for... reasons!).
Reading the comments on func (h *handle) eventPoll(...), which ReadMessage essentially wraps, it says // eventPoll polls an event from the handler's C rd_kafka_queue_t, so I don't believe this will spam the network if I do this. Are there other reasons not to make this read loop faster?
Answer
There are a few considerations when adjusting that timeout:
First, the good news:
- The timeout only controls how long ReadMessage blocks before returning a kafka.ErrTimedOut error; it is not a network round-trip interval.
- You're right that it won't spam the network: librdkafka fetches messages in background threads and queues them locally, so the poll just drains that in-memory queue.
- librdkafka is quite efficient underneath.
But some things to watch for:
- Slightly higher CPU usage from more frequent wakeups when the topic is idle.
- At 50ms you'll see kafka.ErrTimedOut returned far more often; these are normal and need to be filtered out rather than treated as failures.
- Rebalance events are delivered through the same poll loop, so a short timeout won't hurt rebalancing; it's very long timeouts that delay rebalance handling.
What I typically recommend:
```go
// A 100ms middle ground often works well.
msg, err := c.kafkaConsumer.ReadMessage(100 * time.Millisecond)
```
When 50ms makes sense:
- You're doing real-time processing
- You have a constant, high message flow
- You've tested and handle the timeout errors properly
To handle those timeout errors, check the error code inside your read loop:
```go
if err != nil {
    if kerr, ok := err.(kafka.Error); ok && kerr.Code() == kafka.ErrTimedOut {
        // This is just a normal poll timeout, not a failure.
        continue
    }
    // Handle real errors here.
}
```
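Put together, a minimal read loop with a 50ms timeout might look like this sketch. The broker address, group id, and topic name are placeholders, and the import path assumes confluent-kafka-go v2; adjust all of these for your setup.

```go
package main

import (
	"fmt"
	"os"
	"time"

	"github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

func main() {
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092", // placeholder
		"group.id":          "example-group",  // placeholder
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer c.Close()

	if err := c.Subscribe("example-topic", nil); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	for {
		msg, err := c.ReadMessage(50 * time.Millisecond)
		if err != nil {
			if kerr, ok := err.(kafka.Error); ok && kerr.Code() == kafka.ErrTimedOut {
				continue // normal: no message arrived within the timeout
			}
			// Real errors (broker down, auth failures, etc.) land here.
			fmt.Fprintf(os.Stderr, "consumer error: %v\n", err)
			continue
		}
		fmt.Printf("message on %s: %s\n", msg.TopicPartition, string(msg.Value))
	}
}
```

Note that a tighter timeout only changes how often the loop wakes up with nothing to do; it doesn't change how quickly a message is delivered once it's already sitting in librdkafka's local queue.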
If you're still considering other approaches, you might also look at:
- Using Poll() instead, for more control over event handling
- Implementing batching logic
- Monitoring your consumer lag metrics
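For the Poll() route, here's a sketch of an event loop. Poll takes its timeout in milliseconds and returns nil when the timeout elapses with no event (the analogue of ErrTimedOut), and it also surfaces non-message events like errors. Config values and the topic name are placeholders.

```go
package main

import (
	"fmt"
	"os"

	"github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

func main() {
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092", // placeholder
		"group.id":          "example-group",  // placeholder
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer c.Close()

	if err := c.Subscribe("example-topic", nil); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	run := true
	for run {
		ev := c.Poll(50) // timeout in milliseconds
		if ev == nil {
			continue // timeout elapsed with no event
		}
		switch e := ev.(type) {
		case *kafka.Message:
			fmt.Printf("message on %s: %s\n", e.TopicPartition, string(e.Value))
		case kafka.Error:
			fmt.Fprintf(os.Stderr, "error: %v\n", e)
			if e.Code() == kafka.ErrAllBrokersDown {
				run = false
			}
		default:
			// Other event types (stats, offsets committed, etc.) can be ignored here.
		}
	}
}
```

The main difference from ReadMessage is that timeouts come back as a nil event rather than an error, which keeps the hot path free of error-type assertions.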
The key is to test with your actual workload: try 50ms and watch CPU usage and consumer lag. Values in the 50-200ms range are common for low-latency consumers, so if your metrics look healthy at 50ms, go for it.