The title of this post might sound like a response to a BigData survey or research poll, but I felt it was a good discussion point. Even though there is a lot of hype and movement around the BigData space, if you do some digging you will find plenty of research results showing otherwise when it comes to adoption. Enterprises still lack clarity on how BigData will make things better than they are today. In spite of compelling use cases, there is still a good amount of resistance towards spending money in this space. If an enterprise has done its due diligence and figured out that this is not for them, by all means they can disregard this space. The interesting trend is that, even after finding out that adopting BigData could take their business up a couple of notches, some enterprises simply hesitate. Could the resistance be due to security concerns, or is it simply “We don’t want to get caught up in the hype ;-)”? Adding to this, there is already the cloud, which a lot of people are apprehensive about. I did a bit of thinking about what the reason for the holdup might be, and came up with my version of the possible reasons.
Any smart enterprise has plenty of lessons learnt from its past experience of adopting the so-called next big things. Everybody wants to experiment, and if they fail at all, they want to fail cheap. I think most enterprises are daunted by the “first step”, which is getting the data into the Hadoop ecosystem. All they want is a simple and inexpensive means to ingest all the relevant data into the Hadoop ecosystem. All data problems can be classified into four major categories (credit: Oracle BigData team):
- Acquire and Organise data
- Enable great access to wider data
- Analyse and refine critical data
- Decide and publish insights (real time or batch)
We are going to talk about each of these items in greater detail in subsequent posts. Let's see how daunting the “Acquire and Organise data” step can be. This step is referred to as Data Ingestion and, as with any implementation, the torment of choice will be the first thing to deal with. Picking the right tools, ones that align with our technical and strategic roadmap, is critical.
Some of the possible questions you might have to answer are:
- What are the different kinds of data (transactional and non-transactional) we have, and how many different sources are they coming from?
- Market Intelligence (Noise vs Inputs)
- Go with ad hoc data loads?
- Go with open source frameworks and tools which are standalone?
- Go with managed / hosted services off the shelf?
- Should we have a different short-term solution to shorten time to market?
Each of these questions in turn has several sub-questions, or answers with choices of their own.
How did you deal with your first step towards a BigData-driven solution? What was your enterprise data ingestion strategy for different types of data?