Apache Kafka is designed to scale up to handle trillions of messages per day. Kafka is well known for its large-scale deployments (LinkedIn, Netflix, Microsoft, Uber …), but it has an efficient implementation and can be configured to run surprisingly well on systems with limited resources for low-throughput use cases too. Here are the key settings you'll need to change to get Kafka running on a low-end VPS or a Raspberry Pi:
Java heap settings
The default Java heap sizes for ZooKeeper and Kafka are 512MB and 1GB respectively. By overriding the KAFKA_HEAP_OPTS environment variable, you can take these down to as low as 4MB and 32MB:
KAFKA_HEAP_OPTS="-Xmx4M -Xms4M" bin/zookeeper-server-start etc/kafka/zookeeper.properties
KAFKA_HEAP_OPTS="-Xmx32M -Xms32M" bin/kafka-server-start etc/kafka/server.properties
Kafka config settings
The most important configuration setting to tweak is log.cleaner.dedupe.buffer.size. For Kafka > v0.9.0 this defaults to 128MB, all of which is pre-allocated on the Java heap. You can reduce it to as low as ~11MB. You could also disable log compaction altogether by setting log.cleaner.enable to false, but you won't want to do this if you're using Kafka to keep track of consumer offsets, as these are stored in compacted topics.
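As a concrete illustration, here is a minimal sketch of what the relevant lines in server.properties might look like on a memory-constrained broker. The 12MB figure is just an example value above the minimum mentioned above; pick whatever suits your workload:

# server.properties tweaks for a memory-constrained broker (illustrative values)
# Shrink the log cleaner's dedupe buffer from its 128MB default; the value is in bytes.
log.cleaner.dedupe.buffer.size=12582912
# Alternatively, disable compaction entirely -- but NOT if you rely on Kafka-managed
# consumer offsets, which live in the compacted __consumer_offsets topic.
# log.cleaner.enable=false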
Finally, there are other parameters that could potentially be tuned down as well, for example background.threads (threads have an associated memory cost); a sketch follows below. However, I've never bothered: I've successfully deployed and run Kafka as part of a hobby project on a low-end VPS with just the two changes above.
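For completeness, here is a hedged sketch of the kind of thread-count trimming that paragraph alludes to. The defaults noted in the comments are the standard broker defaults, and the reduced values are purely illustrative, not recommendations:

# Optional server.properties trims (illustrative; not needed for the setup described above).
# Each thread carries some stack and heap overhead, so fewer threads means a smaller footprint.
background.threads=2     # default is 10
num.network.threads=1    # default is 3
num.io.threads=2         # default is 8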