Why is ZooKeeper always configured with an odd number of nodes?

This is a 2 min read. Someone on Quora asked me, "Why is ZooKeeper always configured with an odd number of nodes?" That's a great question, but the sad part is that few practitioners, even among those who run ZooKeeper in production, can explain it simply. I will try to keep this really simple, I promise. ZooKeeper (ZK) is a highly available, highly reliable and…

Introducing FunnelCloud – A lightweight abstraction atop Apache Storm

This is a 4 min read. The idea behind building a lightweight abstraction on top of Storm is to combine the best of micro-batching with Storm's processing flexibility. FunnelCloud also has a few added practical features. Gwen Shapira of Confluent explains the value of micro-batching and how it improves throughput in distributed architectures where network round trips are inevitable. Here is the full post. Let's say due…

“Exactly-once” with a Kafka-Storm Integration

This is a 4 min read. Update 4, Nov 2016: When I first wrote this post, it was met with outright mockery and contempt. But the Google Dataflow paper (the unified Google framework for batch (FlumeJava) and stream processing (MillWheel)) and the Google MillWheel paper clearly explain that this is exactly the approach the Google team has taken to solve the duplicate events problem…

What Wikipedia can’t tell you about Apache Storm and Apache Spark Streaming

This is a 1 min read. I am seeing a lot of questions about Spark Streaming and Storm on Quora: when to choose which, and what their performance, reliability, and support are like. As usual, there are plenty of comparisons available on the web if you google around. But instead of comparing them side by side, I thought of talking…