The Middle Road to Big Data

Big Data is polarizing. Depending on the audience, the concept can elicit either overzealous enthusiasm or pointed disdain. After several years of hype, the technology press has recently turned to a strong backlash. As with most developing trends, both camps have valid points, but the practical approach lies somewhere in the middle.

What is Big Data?

The first problem is that the concept of “Big Data” is quite nebulous. A Gartner analyst used the term to describe the emergence of technology architectures and processes that address the web-scale data problems of internet giants such as Google, Yahoo, and Amazon. Essentially, the term describes a challenge rather than a solution. The team at Yahoo! developed Hadoop, a more specific example of Big Data technology, because they needed a highly scalable, reliable, general-purpose batch processing platform to run a variety of routines on very large data sets. This innovative approach started with a specific use case: processing web crawler data in a timely fashion on data sets that were growing exponentially. The outcome is a large-scale data processing platform and programming model that can be applied to a variety of batch data processing tasks.

Where Did Big Data Begin?

Following much-heralded success at internet companies, the technology press started hyping Big Data solutions as a remedy for every challenge involving large data sets. One 2008 Wired article went so far as to predict the obsolescence of the scientific method, with mathematical algorithms and huge data sets replacing methodology. More recently, the hype has given way to an impassioned backlash declaring Big Data sets biased and the conclusions drawn from Big Data analysis flawed. Critics charge that Big Data advocates suffer from “Big Data Fundamentalism”, the idea that larger data sets bring us closer to the objective truth. The philosophical lines are drawn.

Practical Uses of Big Data

Big Data technologies provide useful capabilities for processing and analyzing huge data sets, including unstructured and semi-structured data. Uses include operational intelligence, business intelligence, risk analysis and fraud detection, brand and sentiment analysis, and personalized marketing, to name a few. Big Data does not necessarily replace existing business intelligence solutions; rather, it augments an organization's ability to gain new insights from existing and new data sources.

Who is wrong and who is right? Neither camp is entirely one or the other. The reality lies somewhere between the extremes.

Implementation Pitfalls to Avoid

Big Data technologies are evolving. The opportunity should not be ignored; however, adoption is not without difficulty. Challenges include a lack of skills with the technologies, the immaturity of the technologies and their ecosystem, and the cost of investing in infrastructure to support Big Data platforms. The dearth of skills and experience with Big Data technologies and processes may be the most significant gap today. Organizations need to decide whether to acquire those skills, build them from within, or find strategic partners.

With these challenges in mind, organizations should evaluate whether Big Data technologies support current or emerging business objectives. Apply the technology to solve a business challenge; do not do the reverse by deploying the technology and then searching for a challenge to solve. Choose an objective that brings real value with a reasonable expectation of success. Weigh the cost of investment against the expected benefit or outcome. Evaluate the maturity and capability of the organization. Choose partners focused on business outcomes. Avoid the hype and temper the criticisms to find the middle road to Big Data.

Tim Eck
