If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:
Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976
Many thanks.
So you want to ‘do’ Big Data
Now that everyone is doing Big Data, you don’t want to be the odd one out, right? Of course not.
If you are serious about looking at Big Data from a business perspective, then I will try to offer you some advice. If you are doing it from an IT or technology perspective, then I wish you good luck, and I hope that your Big Data initiative doesn’t turn into another tech crash-and-burn show.
Some Big Data pros are telling us that the place to start with Big Data is with strategy. I’m too polite to call this out as abject bullshit, even though it is, and will instead content myself with offering a simple alternative approach to Big Data.
My first piece of advice is this. DON’T START WITH STRATEGY!
Don’t start with Strategy
Strategy is a coherent, cohesive and executable response to a significant challenge.
Strategy is not a definition of objective, a wish list of what you are trying to achieve or aspirational goals of a nebulous nature. No, strategy is not the objective but a means of reaching that objective. Strategy is real, tangible and executable. Strategy is doing.
So what is a Big Data strategy?
If a company is looking at its Big Data options, the last place it should want to start from is strategy. That is as silly an idea as they come. Starting with strategy on the road to formulating viable responses to significant challenges and opportunities is like saying that before you choose strategic options and a realisable strategy, you must already have a strategy in place.
Strategy is not working out what you want to achieve. That sort of thing should happen prior to any strategic work. Neither is strategy an exercise in establishing starting points, formulating questions or understanding the challenges. All of this should come well before the major strategy aspects even kick in.
Big Data strategy is a realisable, tangible and manageable response to a significant challenge, one that depends heavily on the availability, usability and credibility of Big Data (or Very Large Data Bases) and the business value of processing that Big Data.
So, a word of advice. If you are thinking of embarking on a Big Data initiative, do not start with strategy. That is a really daft place to start.
Start with business imperatives
Start here instead. With real business imperatives. This is where you are thinking about the big and significant challenges to the business, and how, at a high level of abstraction, you could go about meeting those challenges. Here you identify your challenges and your responses, aligned to your objectives.
If you can identify business imperatives that make it absolutely necessary to include elements of Big Data, then go forward with that mandatory requirement in mind. If not, then don’t try to shoe-horn Big Data into a place where it really isn’t needed or wanted. Because if you go against the grain in this way it may well hurt you and your business, in more ways than you bargained for.
Know what you are looking for
In order to go out looking for data requirements driven by business imperatives, we really need to know what we are looking for.
What we are looking for may be highly tangible or less so. We may have to derive the data we are looking for by refining, aggregating, enriching, filtering and cleansing. With those and other aspects in mind, we can go out and find what we need.
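As a minimal sketch of what deriving data can mean in practice, the following Python snippet cleanses, filters, enriches and aggregates a handful of made-up sales records. The field names, the region lookup and the records themselves are all hypothetical; they stand in for whatever your own business imperative calls for.

```python
from collections import defaultdict

# Hypothetical raw records; names and values are illustrative only.
raw_sales = [
    {"store": " Leeds ", "amount": "120.50"},
    {"store": "leeds", "amount": "79.50"},
    {"store": "York", "amount": ""},  # missing amount
    {"store": "Bath", "amount": "40.00"},
]

# Hypothetical reference data used for enrichment.
REGION_BY_STORE = {"leeds": "North", "york": "North", "bath": "South"}


def derive_sales_by_region(records):
    """Cleanse, filter, enrich and aggregate raw sales records."""
    totals = defaultdict(float)
    for record in records:
        store = record["store"].strip().lower()  # cleanse the store name
        if not record["amount"]:  # filter out rows with no amount
            continue
        region = REGION_BY_STORE.get(store, "Unknown")  # enrich with a region
        totals[region] += float(record["amount"])  # aggregate per region
    return dict(totals)


sales_by_region = derive_sales_by_region(raw_sales)
print(sales_by_region)
```

The point is not the code but the order of thought: the business question (sales by region) decides which cleansing, filtering and enrichment steps are worth doing at all.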
How to find what you are looking for
From looking at the data requirements, you should have a good idea of potential sources of that data. Agility in this aspect is predicated on the premise that one knows the systems on the IT landscape, the business processes and all the potential sources of data – at a high level at least. So, this is not the sort of work you can do remotely with little or no knowledge of the client’s business, IT setup, processes or culture.
But anyway, after you identify the sources you move on to the next step.
Check data availability
Here you discuss aspects of the data you require with the database / application platform owners to ensure that:
- they have the data you are looking for
- the quality of the data is known and can be addressed
- the data is relevant for what is needed
- the cost of providing this data is not prohibitive
- this data can be made available to you
- service levels could be put in place, if and when required
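One way to keep these conversations honest is to record the answers per source. The sketch below is only an illustration: the `SourceCheck` class, its field names and the example source are hypothetical, with one flag for each question in the list above.

```python
from dataclasses import dataclass, fields


@dataclass
class SourceCheck:
    """One flag per checklist question (all names are hypothetical)."""
    has_data: bool
    quality_known_and_addressable: bool
    relevant: bool
    cost_acceptable: bool
    can_be_made_available: bool
    service_levels_possible: bool


def failed_checks(check):
    """Return the names of the checklist items that are not satisfied."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


# Hypothetical answers gathered from the owner of a CRM platform.
crm = SourceCheck(
    has_data=True,
    quality_known_and_addressable=True,
    relevant=True,
    cost_acceptable=False,
    can_be_made_available=True,
    service_levels_possible=True,
)
print(failed_checks(crm))
```

A source with an empty list of failed checks is a candidate for the next step; anything else goes back to the platform owner for another conversation.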
So far so good. Once past these hurdles (and don’t forget this is a super-simplification) we move on to the next step.
Make proofs of concept
So, now we know:
- What data we need
- Where we can get it from
- How we get it
- What we need to do to make it usable
- How we need to analyse it
Therefore, we go ahead and create a proof of concept or three. Simples!
However, make sure that all prototypes are governed by these simple timeless guidelines:
- The proof of concept should be small enough to be doable in a reasonable time-frame. I would be rather generous for the very first pilot of its type in a company, but would set that limit at 90 days, tops.
- Make sure that the proof of concept is big enough to be significant. Again, ‘simple enough to be realisable’ and ‘large enough to be significant’, should go hand in hand.
- Arrange your proof of concept execution into sprints. So your 90 days may be made up of nine 10-day sprints.
- Don’t try to shoe-horn infrastructure aspects of your initiative into sprints. It just doesn’t work, and it simply pisses people off.
- If a proof of concept looks like it will fail, then make sure it fails early. There’s nothing worse than having people insist on pushing a dead project to live the full length of its planned term. Failing early means that business doesn’t take a dim view of the pilot, and will be more open to new proof of concept initiatives.
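The time-boxing arithmetic in these guidelines can be sketched in a few lines of Python. The function name `plan_sprints` and its parameters are my own invention for illustration, and whether the days are calendar days or working days is something you would have to decide for yourself.

```python
def plan_sprints(total_days=90, sprint_days=10):
    """Split a time-boxed proof of concept into (start_day, end_day) sprints."""
    if total_days <= 0 or sprint_days <= 0:
        raise ValueError("total_days and sprint_days must be positive")
    sprints = []
    start = 1
    while start <= total_days:
        # The final sprint is shortened if the days do not divide evenly.
        end = min(start + sprint_days - 1, total_days)
        sprints.append((start, end))
        start = end + 1
    return sprints


sprints = plan_sprints()
print(len(sprints), sprints[0], sprints[-1])
```

With the defaults this gives the nine 10-day sprints mentioned above. Each sprint boundary is also a natural checkpoint for the fail-early decision.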
Analyse the outcomes
You run your proof of concept. You analyse, assess and represent your outcomes. You socialise, present and interpret.
Revise your strategic outlook accordingly
When you’ve done that, you are now in a good position to estimate the usefulness of the exercise, from both a qualitative and quantitative perspective.
Did I mention technology?
I did not want to touch on specific aspects of technology in this piece, in part because I did not consider it a central issue in the scheme of things. Of course, as part of creating proofs of concept and pilot schemes you may want to experiment with the swathe of technologies out there. So go ahead and evaluate ‘Big Data’ technologies, and don’t forget, the answer to every Big Data technology question isn’t an automatic ‘Hadoop’. There are other valid Big Data technology options around, such as Lustre and GPFS, or even Oracle, Teradata or EXASol. Also, remember this: if all you are working on is a prototype, a proof of concept or a pilot, then you can try to negotiate a free license with any of the major DBMS vendors for that initiative. So negotiate, bargain and get the most appropriate technologies with the best deals.
That’s all folks
Finally, I will leave you with three guidelines to consider:
- Don’t ask ‘how can I do Big Data?’ but ‘what data do we need?’
- You don’t need to seek out Big Data. If you really need it, and it’s available, and it’s adequate and appropriate, then you’ll be getting it soon enough.
- Avoid searching for a Big Data problem you don’t have, which can only be solved by Big Data technology you don’t need.
Many thanks for reading.
In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of novel and innovative techniques, technologies and methods into well-architected mainstream information supply frameworks, for primarily strategic and tactical objectives.
As always, please reach out and share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps you will even consider sending me a LinkedIn invite if you feel our data interests coincide. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.
