22/02/2020

[Kafka] Delay poll return for testing scenarios

One of the challenges you will face when testing applications that rely on Kafka to send and receive messages is that consumer behaviour can appear quite erratic.

This is all due to how the call to the poll method works.

In short, three key factors are considered before the method returns any messages, namely two consumer configurations and the poll timeout:
  • fetch.min.bytes: as soon as this many bytes can be read from the desired topic, the consumer's fetch request is ready to be fulfilled and all the messages retrieved by then are returned. Default is 1 byte.
  • fetch.max.wait.ms: how long the fetch request is kept waiting if there is not enough data to return (see the previous parameter). Default is 500 ms.
  • poll timeout: how long the consumer waits for the fetch request to provide some data before returning. 0 means no wait.
As you can see, in a test environment you might want to control these timings very precisely, to ensure all your send and retrieve message requests are correctly and consistently fulfilled.
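To make the three knobs concrete, here is a minimal sketch of where each one lives, using only the JDK and the defaults mentioned above. The class name TestConsumerSettings, the method name and the String deserializers are assumptions for illustration; the property keys are the standard consumer configuration names.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical helper: collects the settings that influence when poll returns.
final class TestConsumerSettings {

    // The poll timeout is not a configuration property: it is the argument
    // passed to poll. Zero means "do not wait".
    static final Duration POLL_TIMEOUT = Duration.ZERO;

    private TestConsumerSettings() {}

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Assumption: String keys and values; swap in your own deserializers.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Minimum amount of data before a fetch request is answered (default: 1 byte).
        props.setProperty("fetch.min.bytes", "1");
        // Longest time a fetch request is kept waiting for that data (default: 500 ms).
        props.setProperty("fetch.max.wait.ms", "500");
        return props;
    }
}
```

With the Kafka client library on the classpath, these properties would typically be handed to new KafkaConsumer<>(props), and the timeout to consumer.poll(TestConsumerSettings.POLL_TIMEOUT).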

I suggest increasing your fetch.min.bytes configuration right away to something reasonable for your test message size.
Then you might want to slow the application down a little with the fetch.max.wait.ms setting to account for slight delays in message reception by the test broker.
Lastly, set the poll timeout to a value higher than fetch.max.wait.ms, so that your consumer never returns too early while some test messages might still be in the queue for the specific test case. Also remember that behind the scenes poll does MUCH MORE than simply retrieving messages, so you want to allow enough time for it to perform its various administrative tasks.
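The three suggestions above can be sketched as a small calculation. This is only one possible reading of them: the class PollTimings, its method names, the "message size times expected message count" rule and the idea of an explicit safety margin are all assumptions for illustration, not something Kafka prescribes.

```java
import java.time.Duration;

// Hypothetical helper that derives test timings from the advice above.
final class PollTimings {

    private PollTimings() {}

    // Assumption: a "reasonable" fetch.min.bytes is the payload size of the
    // messages a test case expects to receive in one go.
    static int fetchMinBytes(int messageSizeBytes, int expectedMessages) {
        if (messageSizeBytes <= 0 || expectedMessages <= 0) {
            throw new IllegalArgumentException("sizes and counts must be positive");
        }
        // multiplyExact throws instead of silently overflowing.
        return Math.multiplyExact(messageSizeBytes, expectedMessages);
    }

    // The poll timeout must be strictly higher than fetch.max.wait.ms, so a
    // positive margin is added to leave room for poll's other work.
    static Duration pollTimeout(int fetchMaxWaitMs, Duration margin) {
        if (fetchMaxWaitMs < 0) {
            throw new IllegalArgumentException("fetch.max.wait.ms must not be negative");
        }
        if (margin.isZero() || margin.isNegative()) {
            throw new IllegalArgumentException("margin must be positive");
        }
        return Duration.ofMillis(fetchMaxWaitMs).plus(margin);
    }
}
```

For example, with ten test messages of roughly 200 bytes each, a fetch.max.wait.ms of 2000 and a one second margin, this gives a fetch.min.bytes of 2000 and a poll timeout of 3 seconds. The right margin depends on your test broker and is best found by experiment.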
