Why ZooKeeper is always configured with an odd number of nodes?

This is a 2 min read.

Someone on Quora.com asked me, “Why is ZooKeeper always configured with an odd number of nodes?” Well, that's a great question, but the sad part is that not many practitioners, even those who use ZooKeeper in production, can explain it simply. I will try to keep this really simple, I promise. ZooKeeper (ZK) is a highly-available, highly-reliable and… Continue reading

Introducing FunnelCloud – A lightweight abstraction atop Apache Storm

This is a 4 min read.

The idea of building a lightweight abstraction on top of Storm is to bring together the best of micro-batching and the processing flexibility of Storm. FunnelCloud also has a few added practical features. Gwen Shapira of Confluent explains the value of micro-batching and how it improves throughput in distributed architectures where network round trips are inevitable. Here is the full post. Let’s say due… Continue reading

“Exactly-once” with a Kafka-Storm Integration

This is a 4 min read.

Update 4, Nov 2016: When I first wrote this post it was outright mockery and contempt. But the Google Dataflow paper (the unified Google framework for batch (FlumeJava) and stream processing (MillWheel)) and the Google MillWheel paper clearly explain that this is exactly the same approach the Google team has taken to solve the duplicate events problem…. Continue reading

What Wikipedia can’t tell you about Apache Storm and Apache Spark Streaming

This is a 1 min read.

I am seeing a lot of questions about Spark Streaming and Storm on Quora: when to choose which, and what their performance, reliability, and support are like. As usual, there are a lot of comparisons available on the web if you google around. But instead of comparing them side by side, I thought of talking… Continue reading