I recently answered “Which NoSQL DB is more advantageous for IoT data?” on Quora. Let me answer the question from two different aspects: 1. Design and 2. Choosing. 1. Design: The question: what kind of data store and upstream system is warranted to back a large IoT (or even an Industrial IoT)… Continue reading
Post Category → NoSQL
Modernising data architecture for enterprises
Prelude: Before getting into the topic of focus, i.e. how to modernise data architecture for large enterprises (which typically come with a lot of legacy baggage and organisational memory), I would like to set the context by clearing the air around one thing that is related to this subject. The first step in Big Data solutions consulting, like any… Continue reading
Experiences with Kafka and exactly-once processing in IoT apps
Some context on message brokers and delivery guarantees (If you have a fair amount of experience with message processing and delivery guarantees, please skip to the next part of this post.) Message delivery guarantees are one of the canonical requirements for message brokers, and they are relevant to all types of brokers: the ones based on queue semantics and the ones… Continue reading
Wide row data modelling with Apache Cassandra
I have always been intrigued by the performance claims of Apache Cassandra. So, I wanted to put “wide rows”, and the performance edge that the wide-row data model is said to offer, to the test. Rumour has it that Facebook hired ex-Amazon engineers who wrote Dynamo to build Cassandra. Anyway, a sound starting point is to… Continue reading
Terminology confusion: Column Stores and Column oriented databases
This is my attempt to clear the air on the subjects of column stores and column-oriented databases (both at the terminology level and at the level of understanding). I will be talking a bit about how terrible the idea of grouping column-oriented databases as a flavour of NoSQL data stores is. What is a column store, really? There is no scope… Continue reading
How does the Log-Structured-Merge-Tree work?
If you are wondering why you should care about LSM-Trees: in one of my previous posts, Art of choosing a datastore, I briefly touched upon them. But this writeup is the best out there if you want to learn the inner workings of an LSM-Tree. How does the Log-Structured-Merge-Tree work? This was a Quora answer by David Jeske… Continue reading
Art of choosing a datastore
Update 3, Nov 2016: When I first wrote this post, there were a lot of opinions/comments (on my older blog) about how I am wrong in thinking that choosing a datastore is almost like choosing a data structure when writing a program. Here is an excerpt from Nathan Marz’s book “Big Data: principles and best practices of… Continue reading
C for Consistency : Inconsistencies about “consistency” in data systems
There are countless resources explaining CAP, yet there is still a lot of confusion around this subject. CAP has everything to do with distributed systems; NoSQL data stores happen to be one type of distributed system, but surprisingly CAP has been widely misunderstood since NoSQL data stores became popular. IMHO no one explains these better than these… Continue reading