CIO

Strategies to create a big data management plan: 2015 roadmap

Success lies in starting small with your big data projects.
Big data - starting out small

In the race to embrace big data, companies benefit from a clearly defined action plan. Here is a step-by-step guide to managing your big data strategy. This roadmap clarifies the concepts and terminology and ensures your project team is well-armed to walk the talk.

Step 1: Sidestep the jargon

Behind the jargon, concepts around big data are still evolving. Start by clarifying the distinctions between big data and conventional data management. Explain the concepts to key stakeholders and refine the more manageable components.

The danger lies in concepts being lost in translation. Traditional data is clean, with the gaps filled and outliers removed. Hypotheses can be tested against supporting evidence. This evidence, or data, is collected and stored in traditional enterprise data warehouses.

Big data is messier and comprises structured, semi-structured or unstructured content. This comes from many different sources including mobile devices, internet traffic, streaming, machine-to-machine communication, sensors, or GPS tracking systems.

In this dynamic and unpredictable space, today’s big data may become tomorrow’s old data. Nothing stays constant around the ticker-tape of human communication or interaction.

As a traveler on the big data journey, start small. Leave the science to the big data scientists, a niche breed perhaps best left to smashing atoms at the Large Hadron Collider. Inside the less exciting trenches, ask yourself: what is your big data strategy?

Does this strategy adapt to your lines-of-business, service delivery and operational needs? Which technologies, standards and practices complement what you want?

Step 2: Avoid more of the same

The danger lies in rebadging your information management plan as a big data strategy. To profitably analyse, share and leverage unstructured information, first clarify your high-value data sets.

These data sets are open, readily available and can be freely used, re-used or redistributed by anyone. Beyond the semantics, assess how the analysis of big data can help allocate services as and where needed, clarify policy or improve business processes and governance.

Avoid jumping in to interrogate your big data. A plethora of commercially available analytics tools do this for you. Rather, this journey is about exploration, detours, more fluid relationships and adapting to a shifting landscape.

Step 3: Take a look back

If you step “back to the future”, take a closer look at your available information resources. Who owns which piece of the puzzle? Strategic planning, a somewhat over-used term, comes into play. This strategy takes a closer look at the available data sources, the potential of this data, and the costs and barriers to access.

This strategy straddles scientific, economic or social research. At an operational level, analytics is useful for customer or client segmentation, market research, managing campaigns or tracking domestic or global economic trends. Fraud detection or managing risk offers untapped potential.

If you’re in the front line and make quick, high-volume or time-sensitive decisions, big data comes into its own. A broader variety of data sources offers deeper insights into business problems. This comes into sharper focus where data is limited or not readily available. More specifically, you need to be able to predict events with better accuracy, or connect the dots between the casual but tightly knit relationships.

Step 4: Where did this come from?

In the rush to implement a big data strategy, the danger lies in losing sight of accuracy or authenticity. In the government, healthcare or education space, being accountable takes centre stage. The devil lies in the detail: for example, being able to identify and verify diverse data sources.

The goal is to make intelligent and informed use of data. While policing your data assets may be unrealistic, it helps to identify high-value data sources, or the crown jewels. Keeping a register of these resources also helps.
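As a rough illustration, a register of data sources can start as something very small. The sketch below is one possible shape, not a prescribed format: the field names (`owner`, `origin`, `verified`, `high_value`) and the sample entries are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """One entry in a hypothetical data register (all fields are illustrative)."""
    name: str
    owner: str                # who owns this piece of the puzzle
    origin: str               # where the data comes from
    verified: bool = False    # has the source been identified and verified?
    high_value: bool = False  # is it one of the "crown jewels"?


def crown_jewels(register):
    """Return the entries flagged as high-value."""
    return [source for source in register if source.high_value]


def unverified(register):
    """Return the names of sources that still need verification."""
    return [source.name for source in register if not source.verified]


# Sample entries, invented for this sketch.
register = [
    DataSource("customer_transactions", "Finance", "internal warehouse",
               verified=True, high_value=True),
    DataSource("web_clickstream", "Marketing", "internet traffic",
               high_value=True),
    DataSource("sensor_feed", "Operations", "machine-to-machine"),
]

for source in crown_jewels(register):
    print(source.name, "owned by", source.owner)
```

Even a list this simple answers two of the questions raised above: which sources matter most, and which ones have not yet been verified.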

Big data, like other data, is governed by government policies, regulation and laws. This is increasingly so as consumer data is exchanged or shared in a commercial environment. Take time to consider ways to manage the data trail, or leverage the more clearly-defined audits.

You can treat your data as a strategic resource. But take time to examine the privacy and security implications. The conundrum lies in balancing open data access with responsible and accountable information exchange.

Step 5: Why privacy matters

Regulators are coming down hard on privacy and security breaches, especially as the exchange of personal data becomes pervasive, commercially attractive and within global reach. You may hear about privacy-by-design principles.

Before jumping into a data lake or a large object-based storage repository, learn more about building your de-identification capabilities. Personal identifiers may be removed through bare-bones deletion, an approach also called safe harbour. Or you may try masking, aggregation or expert determination. Another tack is to separate duties among the staff working with the data.
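To make these techniques concrete, here is a minimal sketch of deletion, masking and aggregation applied to a single record. It is an illustration under stated assumptions rather than a compliant de-identification process: the field names, the salt handling and the ten-year age bands are all hypothetical, and a salted hash is a pseudonym, not a guarantee of anonymity.

```python
import hashlib

# Hypothetical direct identifiers; a real list depends on your data and regulator.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def mask(value: str, salt: str) -> str:
    """Masking: replace a value with a truncated, salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def age_band(age: int, width: int = 10) -> str:
    """Aggregation: report an age range instead of the exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def de_identify(record: dict, salt: str) -> dict:
    """Delete direct identifiers, mask the customer ID and band the age."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["customer_id"] = mask(str(record["customer_id"]), salt)
    cleaned["age"] = age_band(record["age"])
    return cleaned


# Sample record, invented for this sketch.
record = {
    "customer_id": 101,
    "name": "A. Example",
    "email": "a@example.com",
    "phone": "0000 000 000",
    "age": 34,
    "postcode": "2000",
}

result = de_identify(record, salt="keep-this-secret")
print(result["age"])     # 30-39
print(sorted(result))    # ['age', 'customer_id', 'postcode']
```

Whether the remaining fields (here, an age band and a postcode) are safe to release together is exactly the kind of question an expert determination is meant to answer.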

Privacy safeguards need a closer look at the point of data collection. This incorporates online, mobile or other sensors. While privacy impact statements glaze the eye, the fallout comes at a cost, including unexpected penalties.

Among the remedies, track the information flow across your project. Analyse and assess the impact on people, services or operations. Ideally, this assessment starts in the planning stages and is not an after-thought.

Be proactive rather than reacting to the bushfires. Make privacy the default setting, ensure it is embedded into design and, more importantly, invest in end-to-end security rather than plugging the gaps.

This guide sources material from the Australian government's Australian Public Service Better Practice Guide for Big Data (version 2.0, January 2015). A useful list of data analytics resources is available in that guide. Additional insights are sourced from related IDG content.