I have met quite a few people who ask, “Who is really dealing with Big Data, in the said volume, variety or velocity?” It’s a great question, and this is how I would like to take the conversation forward with them.
So are you sceptical about research predictions like “the amount of data in the universe by 20xx will be in the order of zettabytes or even more”? Good. Just think about this for a minute: why can’t this be true? What could be the possible drivers for such predictions? What is different now, what has changed, where can all this data possibly come from, and why didn’t we have this volume of data a decade or a couple of decades back? I will explain this in an enterprise context.
# Emerging Sources
Web 2.0 (2004 onwards)
One of the biggest transformations of our time is the shift from Web 1.0 to Web 2.0, which includes social media, Wikipedia, blogs, YouTube and self-service forums. What do all of these have in common? They all empower users to generate content. Basically, Web 2.0 is a fancy name for user-generated content. Users of Web 2.0 can generate far more data than they could in the pre-social-media era. Web 2.0 has potential for volume, variety and velocity.
Web 2.0 + Mobile evolution (2010 onwards)
With the mobile revolution, users started generating more content than before.
# Futuristic Sources
IoT isn’t a new concept as such. It just had a different guise and was called M2M or telematics in previous decades. Telematics as a discipline has existed for a long time. M2M and telematics typically worked with huge appliances and bigger machines. How many such big machines could have existed in the world, using telematics technology and churning out data? Probably not a lot. What has changed now? The difference is participation and empowerment. How? Far more ubiquitous devices can participate in M2M today, thanks to some great innovations and miniaturisation in related fields. We have ant-sized radios, antennas, sensors and micro energy harvesters today. With these advances, it is only appropriate to reincarnate this discipline as the Internet of Things (IoT). Just to get a feel for things, take a look at the IoT timeline.
# Existing Sources
If you take typical enterprise data, this might not entirely make sense. Still, the overall volume of data has grown over time, and some enterprises need to handle the velocity of their data with affordable scalability to keep up with market changes. So typical enterprise data poses mainly velocity and volume challenges. Usually the data we are talking about are transactions and different types of log data. Affordable scalability is the key phrase here: it is the reason Hadoop and NoSQL solutions came into existence, not that an RDBMS cannot handle the data.
Coming back to the question “Who is creating Big Data?”: the simple answer is you. And “Who is dealing with Big Data?” Sooner or later, every enterprise, in some way, shape or form.