Internals of Spark Streaming

This is a 3 min read. Some context… As the title of the post suggests, this is not a Spark Streaming primer. Frankly, this post is written for readers who want to build on an existing foundation in Spark and Spark Streaming. I also find a surprising number of developers programming in Spark Streaming without knowing the inner… Continue reading

What you need to know before writing Streaming APIs

This is a 2 min read. What are Streaming APIs? Streaming APIs are not to be confused with multimedia streaming services like Netflix or YouTube. Industry is starting to use a newer breed of REST APIs, called Streaming APIs, to offer a “high-throughput” pipeline for receiving curated data. With these APIs you can capture information in real time. It’s… Continue reading

What Wikipedia can’t tell you about Apache Storm and Apache Spark Streaming

This is a 1 min read. I am seeing a lot of questions about Spark Streaming and Storm on Quora: when to choose which, and what their performance, reliability, and support are like. As usual, there are plenty of comparisons available on the web if you google around. But instead of comparing them side by side, I thought of talking… Continue reading

What you didn’t know about Real-time notification systems

This is a 2 min read. I have been intrigued by event notification systems for a long time now. In fact, this started in my programming days in legacy environments like iSeries. So I started working on a toy project, which evolved into a solid project. I thought I would muse about that recent project, RealTimeNotification. But before going into the details of the… Continue reading

Polyglot persistence and NoSQL

This is a 2 min read. Recently I had a real-life encounter with a scenario where I had to design a solution using a strategy that, as I learned later from some research, is called polyglot persistence. In fact, Martin Fowler put forth the concept of polyglot persistence four years before I wrote this post. So what’s polyglot persistence?… Continue reading

I Love scikit-learn and NLTK for Python

This is a 1 min read. I have recently been trying out scikit-learn and NLTK with Python, mostly “classifying” data. As they say, it’s the best available combination to “teach yourself data science”. I am in love with features like the pre-packaged corpora NLTK comes with, such as movie reviews. NLTK comes with over 50 corpora and lexical resources which you can play around… Continue reading