A CIO's Guide To Big Data: Storage, Processing And Analysis


Big data has a vital role to play in making the incomprehensible understandable. Businesses, NGOs and governments are gradually taking on board the need to target the right people at the right time, as the world realises the power of large volumes of rapidly changing data to explain behaviour and to reach the hardest-to-reach segments of the population.

Organisations need to roll out big data in a considered fashion and, as the saying goes, you should walk before you run. A large part of any big data undertaking is making sure you have the correct infrastructure in place to handle large data sets. This needn't be cost-heavy, but how do you apply best practice to storing big data, and what do CIOs need to think about?

Firstly, data storage has to be flexible enough to handle the vast array of information you gather. Object storage architectures are worth considering and may be the best way to manage unstructured data assets at scale: they separate different classes of data object and attach metadata to each one, so the system can automatically recognise what type of data has been received and package it up in an accessible way. That built-in intelligence saves time and makes data retrieval much quicker, enabling real-time processing.
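As a rough sketch of the idea (the bucket name, metadata fields and client library are assumptions for illustration, not part of this piece), an S3-compatible object store lets you write each piece of data alongside descriptive metadata, so its class can be recognised later without opening the payload:

    # Illustrative sketch: tagging objects with metadata in an S3-compatible store
    # so the class of data can be recognised at retrieval time without parsing it.
    # Bucket name, keys and metadata fields are hypothetical.
    import boto3

    s3 = boto3.client("s3")

    def store_event(key, payload, data_class, source):
        # Write the raw payload with descriptive metadata attached.
        s3.put_object(
            Bucket="customer-signals",  # hypothetical bucket
            Key=key,
            Body=payload,
            Metadata={"data-class": data_class, "source": source},
        )

    def classify(key):
        # Read only the metadata (a HEAD request) to decide how to route the object.
        head = s3.head_object(Bucket="customer-signals", Key=key)
        return head["Metadata"].get("data-class", "unknown")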

Much consumer data involves an underlying emotion being expressed, whether it's someone showing their friends a recently purchased BMW on Facebook or someone tweeting about a poor experience while waiting for a train. Poorly structured data storage will inhibit the ability to react to these key signals in real time. If a company or organisation needs to respond, it's important to do so in a timely manner, and sometimes that means instantly. Customers appreciate it when a company understands how they like to be communicated with, on their terms.

Handling extremely large volumes of data on a daily basis lets you derive insight from online behaviour across the myriad devices that today's connected consumer uses. That means knowing what works and what doesn't, and in terms of analytics platforms there is no single definitive answer. With processing power and techniques evolving rapidly since the turn of the millennium, there are now a number of ways to take on the biggest customer engagement challenges without losing your marbles.

Take the concept of ACID (Atomicity, Consistency, Isolation, Durability) within database systems, for example. These four properties of database transactions, defined in the 1970s, have enabled massive relational database systems to operate consistently for many years. Their Achilles' heel today is that enforcing them complicates building the scalable, distributed data stores often needed to solve even larger big data problems.
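To make those guarantees concrete, here is a minimal sketch of an ACID transaction using SQLite; the table and amounts are invented for illustration. Either both updates are applied or neither is:

    # Minimal sketch of an ACID transaction: both updates succeed or neither does.
    # SQLite is used purely for illustration; table and values are made up.
    import sqlite3

    conn = sqlite3.connect("shop.db")
    conn.execute("CREATE TABLE IF NOT EXISTS accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT OR IGNORE INTO accounts VALUES ('alice', 100), ('bob', 0)")
    conn.commit()

    try:
        with conn:  # opens a transaction: commit on success, rollback on any error
            conn.execute("UPDATE accounts SET balance = balance - 40 WHERE id = 'alice'")
            conn.execute("UPDATE accounts SET balance = balance + 40 WHERE id = 'bob'")
    except sqlite3.Error:
        pass  # atomicity: if either update fails, neither is applied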

However, you don't have to throw the baby out with the bathwater. Modern approaches to data management, such as those applied within MongoDB and Dynamo, relax some of these ACID properties, enabling even larger systems that deal with even more rapidly changing data, react faster, run more reliably and deliver even richer insight and capability.
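As an illustrative sketch (the connection string, database, collection and field names are assumptions), MongoDB lets you choose per operation how much durability you are willing to trade for speed, via its write concern setting:

    # Sketch of relaxing guarantees for throughput in MongoDB; names are hypothetical.
    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://localhost:27017")
    events = client["analytics"]["events"]

    # Fire-and-forget writes (w=0): the client does not wait for acknowledgement,
    # accepting a small risk of loss in exchange for a much higher ingest rate.
    fast_events = events.with_options(write_concern=WriteConcern(w=0))
    fast_events.insert_one({"user": "u123", "signal": "viewed_product", "sku": "bmw-3"})

    # Critical writes can still demand acknowledgement from a majority of replicas.
    safe_events = events.with_options(write_concern=WriteConcern(w="majority"))
    safe_events.insert_one({"user": "u123", "signal": "purchase", "sku": "bmw-3"})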

Furthermore, technologies like Hadoop give you the power to turn your data processing into a digital car production line of sorts, partitioning and compartmentalising your data activities to enable far greater parallel work. Take SETI (the Search for Extra-Terrestrial Intelligence), for example: the sheer volume of data that needed to be processed posed a huge problem for researchers.

By developing SETI@home at the University of California, Berkeley, which grew into the largest distributed computing effort of its kind with over three million users, researchers were able to partition the data into manageable chunks and process them on many computers at once, dramatically improving their processing power. The same principle works for processing big data in other industries: by chopping up your data and sending it to multiple processors, you can dramatically increase your processing power, giving you the ability to solve problems that a decade ago would have seemed impossible.
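The pattern can be sketched in a few lines: split the data into chunks, farm the chunks out to parallel workers, then combine the partial results. The example below uses Python's multiprocessing on a single machine with invented data; frameworks such as Hadoop or Spark apply the same idea across whole clusters:

    # Divide-and-conquer sketch: partition a large dataset into chunks and
    # process them in parallel, then combine the partial results. Data is made up.
    from multiprocessing import Pool

    def count_purchase_signals(chunk):
        # Stand-in for real per-chunk analysis work.
        return sum(1 for event in chunk if event.get("signal") == "purchase")

    def chunked(data, size):
        for i in range(0, len(data), size):
            yield data[i:i + size]

    if __name__ == "__main__":
        events = [{"signal": "purchase"} if i % 50 == 0 else {"signal": "view"}
                  for i in range(1_000_000)]
        with Pool() as pool:
            partial_counts = pool.map(count_purchase_signals, chunked(events, 100_000))
        print(sum(partial_counts))  # combine the partial results, MapReduce-style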

Once you've done all of this, you're ready for the exciting part: analysis. This is the area where the most progress can be made over the coming years. It should be the aim of every CIO or CMO to turn their big data into fast data, reacting in real time to key behavioural signals that show when a person is looking to purchase a certain product or service.
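A minimal sketch of that reactive loop, with invented signal names and a print statement standing in for a real response, might look like this:

    # "Fast data" sketch: scan a live event stream for behavioural signals that
    # suggest purchase intent and react immediately. Signals and data are hypothetical.
    INTENT_SIGNALS = {"added_to_basket", "price_check", "store_locator_search"}

    def react(event):
        # Stand-in for a real-time response, e.g. a tailored offer or an alert.
        print(f"Responding to {event['user']} about {event['sku']}")

    def process_stream(events):
        for event in events:  # events could come from a message queue or stream
            if event.get("signal") in INTENT_SIGNALS:
                react(event)

    process_stream([
        {"user": "u123", "signal": "price_check", "sku": "train-ticket"},
        {"user": "u456", "signal": "page_view", "sku": "bmw-3-series"},
    ])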

This is where the tech turns human: a company's muscle memory has to be nurtured to understand the data and the sentiment behind it, and then to react appropriately. A company's peripheral vision allows it to detect these key signals of purchase intent as and when they occur and to act on them. Despite great progress in these fields, bringing this sort of reactive intelligence to big data analytics is a difficult undertaking, and only partners that offer true insight can help you do it.

David Keens

David Keens is a consultant at big data and SaaS specialist Acxiom. He works in the Marketing Technology Practice of Acxiom's Consulting organisation, helping clients derive insights from the information they hold about their customers and optimise their technology.