Martyn Richard Jones
To begin at the beginning
This is the first in a series of collections of talking points on three related themes: the processing of extensive data sets by non-relational or pseudo-relational means; speculative data analytics on these large data sets, which typically consist of non-operational data and social media data obtained from internet sources; and how usable outcomes, if any are derived, can be integrated into strategic, tactical and operational decision support.
Currently, this area is parked under the misleadingly named ‘Big Data’ umbrella, but in the near future I predict this niche will be merged into the more recognisable, business-oriented areas of data warehousing, data architecture, and business intelligence, and rebadged to avoid further confusion.
Each issue in this series will address seven talking points.
Here are the first seven talking points that address aspects of primary mass data processing, speculative analytics, and outcome and result persistence and association.
Keep it simple
The leading and continuous mantra for all ‘Big Data’ initiatives should be simplicity.
Simplicity means identifying a well-bounded speculative opportunity and then focusing on it, whilst not allowing for scope creep until the work is done and a following iteration is defined.
Simplicity means taking the data that is needed, along with the useless baggage it is unfortunately bundled with, and reducing it to the essentials at the earliest possible moment.
Simplicity means moving the data reduction problem upstream, preferably to the point where it is generated and stored.
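As a rough illustration of reducing data to its essentials at the earliest possible moment, here is a minimal Python sketch. The field names and the sample records are hypothetical, invented purely for illustration; the real essentials depend entirely on the bounded opportunity being pursued.

```python
from typing import Iterable, Iterator

# Hypothetical: the handful of fields the bounded opportunity actually needs.
ESSENTIAL_FIELDS = ("user_id", "timestamp", "text")


def reduce_records(records: Iterable[dict]) -> Iterator[dict]:
    """Yield only complete records, stripped down to the essential fields."""
    for record in records:
        # Discard records missing any essential field as early as possible.
        if any(record.get(field) in (None, "") for field in ESSENTIAL_FIELDS):
            continue
        # Leave the baggage (avatars, themes, and so on) behind.
        yield {field: record[field] for field in ESSENTIAL_FIELDS}


# Made-up sample data standing in for a raw feed.
raw = [
    {"user_id": 1, "timestamp": "2015-01-05T10:00:00", "text": "hello",
     "avatar": "a.png", "theme": "dark"},
    {"user_id": 2, "timestamp": None, "text": "no time", "avatar": "b.png"},
]

reduced = list(reduce_records(raw))
```

Because the function is a generator, it can sit directly on the incoming feed, so the baggage never has to be stored downstream in the first place.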
Simplicity means not flabbergasting businesspeople with the supposed benefits of ‘Big Data’. It implies avoiding patronising language akin to “Just do Big Data, because everyone will have to be doing it, and don’t worry your pretty little head about what it’s actually doing under the bonnet”. It means being frank, open and earnest about ‘Big Data’.
Hold this thought: You cannot bullshit simplicity.
Appropriate is good
The great economist John Kenneth Galbraith once observed that “The real accomplishment of modern science and technology consists in taking ordinary men, informing them narrowly and deeply and then, through appropriate organisation, arranging to have their knowledge combined with that of other specialised but equally ordinary men. This dispenses with the need for genius. The resulting performance, though less inspiring, is far more predictable.”
Appropriateness is one of the more important aspects of supplying data for strategic, tactical and operational decision support: the data supplied must, by its very nature, be appropriate.
Appropriateness addresses the need for the correct data.
Hold this thought: Appropriateness is good
Adequate is sufficient
Another important aspect is adequacy. Adequacy means that enough data is supplied to meet the requirements for that data. It addresses the need for the right volume of the correct data, at the right levels of abstraction.
I know that people in IT find it tempting to second-guess requirements and to pile up unasked-for feature additions like they were going out of fashion, but in the lean and iterative age of agile we can no longer afford to be so reckless in how we manage requirements, projects and resources, especially those assigned to ‘Big Data’ projects.
Just hold this thought: Adequate really is enough
Timeliness kills the competition
Another important aspect of the ‘Big Data’ field is the timely provision of data and the rapid delivery of usable outcomes. This requires not only ‘Big Data’ but also big data management smarts.
Timeliness addresses the need to provide decision makers with appropriate, adequate data on time and every time, in order to maximise its use and therefore increase the chances of it having business value.
Hold this thought: Speed kills the competition.
Integration makes sense
If, after running speculative analysis (diagnostic or predictive, etc.), you are lucky enough to end up with something tangible and useful, you may also want to consider linking this or integrating its outcomes into mainstream, quality-assured strategic and tactical decision-support and analysis data.
This is where Bill Inmon’s Data Warehouse concept comes into its own, because Enterprise Data Warehousing (and especially DW 3.0) provides a conceptual data architecture and data management protocols to support the appropriate, timely scaling of data set sizes from gigabytes to terabytes, and then to petabytes and beyond, if that is really what is needed.
Hold this thought: Integrate without losing essence
Big Data Science name change
There has been so much misleading, unreliable and unrepresentative puff built up around Big Data that it seems like an appropriate time to give it a ‘legal, decent and honest’ makeover, and to also change its name to something more suitable, such as Janus Data Analytics (JDA for short) or New Wave Punk Data.
I believe that Janus Data Analytics may be a good name for this niche technology field because it accurately reflects what it is and, at the same time, is intrinsically linked to beginnings and transitions, to gates, doors, doorways, passages, and endings. Janus Data Analytics looks into the future and the past, and presides over the beginning and end of conflict, war, and peace.
There is also an attraction to the term New Wave Punk Data. It sends a strong and uncompromising signal to business. It deftly and simply describes the two key aspects of what is currently being touted as ‘Big Data’. New Wave Punk Data reflects the rapid, sharp-edged, and primal slicing, dicing, and reduction of extensive datasets, together with short-term speculation and stripped-down analytics, often driven by opinionated and alternative drivers. It embraces a DIY ethic; many businesses that lead the movement (Yahoo, Google, Facebook, etc.) started with self-developed ‘Big Data’ tools (often initially as simple variations on the Unix power-chord themes of parallel grep, awk and cat) and shared them through open source channels.
The third option is to simply place the data aspect of ‘Big Data’ under the data architecture and data management umbrella as a facet of Data Warehousing and to place the ‘data science’ aspect of ‘Big Data’ under the statistics and data analytics umbrella, with a close association with the sub-class known as business intelligence. The true data-mining and machine-learning aspects of ‘Big Data’ can sensibly continue under the umbrella of Artificial Intelligence.
Hold this thought: A rose by any other name
Keep it legal, decent and honest
Potentially, there are methods, technologies, and techniques under the ‘Big Data’ big-top that could be used to accrue real business value; however, those benefits are at risk due to the quality and quantity of puff in the environment, as alluded to in the previous talking point.
The point is this. Banging on about the same nebulous futures of ‘Big Data’, rather than being specific, clear and verifiable about what is really going on, is, to state it simply, going to ‘queer the pitch’ for everyone: the good, the bad and the ugly… but especially the good.
Therefore, I would suggest that we all make an additional New Year’s resolution on ‘Big Data’: in future, to refer to the application and benefits of ‘Big Data’ and ‘Big Data’ analytics in terms that can only be construed as legal, decent and honest.
Hold this thought: “If you are not a better person tomorrow than you are today, what need have you for a tomorrow?” – Rebbe Nachman of Breslov
That’s all folks
So, that is all from me in the first of what I hope will be many issues in the series Big Data 7s.
I would like to leave you with this fabulous quote from James Carville… just because.
“Sometimes the right thing gets done for the wrong reason and sometimes, unfortunately, the wrong thing gets done for the right reason”.
As always, many thanks for reading.