What is so sexy about Unikernels?

Unikernels (sounds almost like unicorns) are the newest advancement, or the latest buzzword, in the infrastructure virtualization space, to say the least. Unikernel.org and Wikipedia offer great definitions of unikernels, but I felt that stacking them up against other virtualization techniques would be a good addition to those definitions. So, here is a quick comparison and a… Continue reading

Continuous streaming integration with streaming data platforms

Someone asked me on Quora, “Should I use Gobblin or Spark Streaming to ingest data from Kafka to HDFS?” Here is what I wrote: This introduces a new architecture pattern called continuous streaming integration (CSI) with streaming data platforms (SDP) for solving app and data integration challenges. Short answer: If your data sink is… Continue reading

Doing “exactly-once” in stream processing, a Google Cloud Dataflow perspective

As the title suggests, the scope of this multipart post is to evaluate how exactly-once processing is proposed in the Google Cloud Dataflow paper (link shared below) and hence implemented in the Dataflow service (which is the basis for Apache Beam). Although the titles are different, these posts should be considered precursors to this post (here… Continue reading