Why buy when you can get it for free?
Back at you! Here is the third fantastic delivery of an amazing and fabulous selection of free and widely available business analytics learning content, which has been prepared… just for you.
08 Tue Mar 2016
Posted in All Data, Analytics, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, business strategy, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, Good Strategy, Inform, educate and entertain., IT strategy, Marty does, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, Process, Strategy, The Amazing Big Data Challenge, The Big Data Contrarians
Why buy when you can get it for free?
Back at you! Here is the third fantastic delivery of an amazing and fabulous selection of free and widely available business analytics learning content, which has been prepared… just for you.
05 Sat Mar 2016
Posted in All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, business strategy, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, Good Strat, Good Strategy, goodstrat, Inform, educate and entertain., IT strategy, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, Strategy, The Amazing Big Data Challenge, The Big Data Contrarians
Martyn Richard Jones
Dusseldorf, August 2006
Data Warehousing provides possibly one of the best opportunities for IT organizations to deliver a valuable business solution that addresses a set of business needs: requirements that go well beyond day-to-day operational support and traditional applications (web-enabled or not). When Data Warehousing is done the right way, and for the right reasons, its payback to all of its stakeholders can be significant.
05 Sat Mar 2016
Posted in All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, hadoop, Inform, educate and entertain., Marty does, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, The Amazing Big Data Challenge, The Big Data Contrarians
This is the story of how the amazing Hadoop ecosphere revolutionised IT. If you enjoy it, then consider joining The Big Data Contrarians.
Before the advent of Hadoop and its ecosphere, IT was a desperate wasteland of failed opportunities, archaic technology and broken promises.
In the dark Cambrian days of bits, mercury delay lines and ferrite cores, we knew nothing about digital. The age of big iron did little to change matters, and vendors made enormous profits selling systems that nobody could use and even fewer people could understand.
10 Thu Dec 2015
MARTYN RICHARD JONES
Despite the best efforts of Hadoop evangelists, consulting houses, and IT infrastructure and service vendors, Big Data – hailed as the greatest thing since the dawn of greatest things – is failing, and dramatically so, to produce the necessarily corresponding quantity and quality of tangible, detailed and verifiable success stories.
17 Tue Nov 2015
Many people come up to me in the street and beg me to write about the truths, myths and unwise things said about Big Data. I am offered gifts of goats, partners and riches beyond the dreams of avarice just to pronounce on such things. I am not in the habit of bowing to such street-pressure, but I have finally come round to doing something, if only to placate the river of rose-petal bearing infants’ tears flowing past my abode.
01 Sun Mar 2015
Posted in Ask Martyn, Big Data, Consider this

We’ve been told that Big Data is the greatest thing since sliced bread, and that its major characteristics are massive volumes (so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it), high variety (not only structured data, but also the whole range of digital data), and high velocity (the speed at which data is generated and transmitted). Also, from time to time, much to the chagrin of some Big Data disciples, a whole slew of new identifying Vs are produced, touted and then dismissed (check out my LinkedIn Pulse article on Big Data and the Vs).
So, beware. Things in Big Data may not be as they may seem.
It’s not about big
I have been waging an uphill battle against the nonsensical and unsubstantiated idea that more data is better data, but now this view is getting some additional support, and from some surprising corners.
In a recent blog piece on IBM’s Big Data and Analytics Hub (Big data: Think Smarter, not bigger), Bernard Marr wrote that “the truth is, it isn’t how big your data is, it’s what you do with it that matters!”
Elsewhere, SAS echoed similar sentiments on their web site: “The real issue is not that you are acquiring large amounts of data. It’s what you do with the data that counts.”
Can we call that ‘strike one’ for Big Data Vs?
It’s not about variety
It is claimed that only 20% of digital data is structured, a claim based on the problematic suggestion that structured data is uniquely relational. It is also claimed that unstructured data includes CSV files and XML data, and that this makes up far more than the 20% of the data generated. But this definition is simply wrong.
If anything, CSV data is structured, and XML data is highly structured, and it’s typically regular ASCII data. So it does not add variety, even though it is not structured in the ways that some people might expect, especially if those people lack the required knowledge and experience. Simply stated, CSV data is structured; it just lacks rich metadata, and that doesn’t make it unstructured.
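To make the point concrete, here is a minimal Python sketch, using only the standard library, of how CSV and XML both parse straight into the same rows and columns. The sample data, the field names (customer_id, region, revenue) and the helper functions are all invented for illustration; they are assumptions, not anything taken from a real system.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented purely for illustration.
CSV_TEXT = "customer_id,region,revenue\n101,North,2500.50\n102,South,1800.00\n"
XML_TEXT = (
    "<customers>"
    "<customer id='101'><region>North</region><revenue>2500.50</revenue></customer>"
    "<customer id='102'><region>South</region><revenue>1800.00</revenue></customer>"
    "</customers>"
)


def read_csv_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_xml_records(text):
    """Parse the XML text into the same list-of-dicts shape."""
    root = ET.fromstring(text)
    return [
        {
            "customer_id": node.get("id"),
            "region": node.findtext("region"),
            "revenue": node.findtext("revenue"),
        }
        for node in root.findall("customer")
    ]


csv_records = read_csv_records(CSV_TEXT)
xml_records = read_xml_records(XML_TEXT)

# Both sources yield the same rows and columns.
print(csv_records == xml_records)

# What CSV lacks is metadata: every value arrives as a string,
# so the reader has to supply the types.
total_revenue = sum(float(record["revenue"]) for record in csv_records)
print(total_revenue)
```

The structure is there in both cases; the only thing the reader has to bring is the knowledge that revenue is a number, which is exactly the metadata that CSV does not carry.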
“But”, I hear you say “what about all the non-textual data such as multi-media, and what about the masses of unstructured textual data?”
Take it from me, most businesses will not be basing their business strategies on the analysis of a glut of selfies, home videos of cute kittens, or the complete works of William Shakespeare or Dan Brown. Almost all business analysis will continue to be carried out on structured data obtained primarily from internal operational systems and external structured data providers.
Strike two! Third time lucky?
It’s not even about velocity
So, if we accept that Big Data isn’t really about the data volumes or data variety, that leaves us with velocity, right? Well no, because if it isn’t about record-breaking VLDB or significant data variety, then for most commercial businesses the management of data velocity becomes either less of an issue or no issue at all. The fact that some software vendors and IT service suppliers set up this ‘straw man’ argument and then knock it down with the ‘amazing powers’ of their products and services is quite another matter.
Strike three, and counting.
It’s not about the manageability of Big Data
We have been told time and time again that the major difference between a data scientist and a professional statistician is that the ‘scientists’ know how to cope very well with massive volumes, varieties and velocities of data. Now it turns out that this is also questionable.
According to Bob Violino writing in Information Management (Messy Big Data Overwhelms Data Scientists – 20 February 2015) “Data scientists see messy, disorganized data as a major hurdle preventing them from doing what they find most interesting in their jobs”. So, when it comes to data quality and structure the ‘scientists’ don’t really have an advantage over professional statisticians.
Last year Thomas C. Redman, writing in the Harvard Business Review (Data’s Credibility Problem), noted that when Big Data is unreliable “managers quickly lose faith” and “fall back on their intuition to make decisions, steer their companies, and implement strategy”, and that when this happens there is a propensity to reject potentially “important, counterintuitive implications that emerge from big data analyses.”
Strike four?
The new analytics aren’t new
Data science and Big Data analytics are the new kids on the block, aren’t they?
Well, here are some real life scenarios.
A major banking equipment supplier: A lot of banking equipment is hybrid analogue-digital; a simple example of this would be a photocopier or a physical document processing device. One major supplier decided to incorporate the capture of sensor data produced by their devices to predict failure and problems. Predictive preventive maintenance rules are created and corroborated using the data generated by sensors on each customer device, and these rules then get incorporated into the devices’ logic.
A major IT vendor: What happens when you create an intersection and convergence between technologies, techniques and methods from areas of mainstream IT, data architecture and management, statistics (quantitative and qualitative analytics) and data visualisation, artificial intelligence/machine learning and knowledge management? This is precisely what one of the main European IT vendors did, and the idea proved to be quite attractive to customers, prospects and investors.
A major integrated circuit supplier: The testing of ICs at the ‘fabs’ (manufacturing plants) generates a serious amount of data. This data is used to detect errors in the IC manufacturing process. It is captured and analysed in as near real-time as possible, which is necessary because over-running the production of faulty ICs is so costly. To get around this problem the company uses a combination of fast data capture, transformation and loading of data into a data analytics area to ensure early and precise problem detection.
All Big Data Analytics success stories?
The first happened in 1989, the second in 1993 and the third in 2001. Yes, Big Data and Big Data analytics are sort of newish.
Strike five.

What is science?
According to Vasant Dhar of the Stern School of Business (Data Science and Prediction), Jeff Leek (The key word in “Data Science” is not Data, it is Science), and repeated on Wikipedia, “In general terms, data science is the extraction of knowledge from data”. Well, excuse me if I beg to differ. I have seen data scientists at work, and the word science doesn’t actually jump out and grab you. It’s difficult to make the connection, just as it is to accurately connect some popular science magazines with fundamental scientific research.
If a professional and qualified statistician wants to label themselves a data scientist then I have no issue with that, it’s their problem, but I am not willing to lend credibility to the term ‘data scientist’ when it is merely an interesting job title, with at most a tenuous connection to the actual role, and one that is liberally applied, with the almost customary largesse of IT, to creative code hackers and business-averse dabblers in data.
As Hazelcast VP Miko Matsumura suggested in Data Science is Dead “… put “Data Scientist” on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you’ll be the King or Queen of a rotting whale-carcass of data” and “Don’t be the data scientist tasked with the crime-scene cleanup of most companies’ “Big Data” – be the developer, programmer, or entrepreneur who can think, code, and create the future.”
Strike six.
And the value is questionable
DATA: “Data is a super-class of a modern representation of an arcane symbology.” – Anon
If I had a dollar for every time I heard someone claim that data has intrinsic positive value then I would be as wealthy as Warren Buffett.
If I have said it once, I have said it a hundred times. In order for data to be more than an operational necessity it requires context.
Providing valid data with valid context turns that data into information.
Data can be relevant and data can be irrelevant. That relevance or irrelevance of data may be permanent or temporary, continuous or episodic, qualitative or quantitative.
Some data is meaningless, and there are cases whereby nobody can remember why it was collected or what purpose it serves.
Taking all this into account we can ask the deadly pragmatic question: what value does this data have? Which is sometimes answered with a pertinent ‘no value whatsoever’.
Strike seven.
It is said that Big Data is changing the world, but for all intents and purposes, and shamed by previous Big Data excesses, some people are rapidly changing the definitions and parameters of Big Data in order to position it as being more tangible and down-to-earth, whilst moving it away from its position as an overhyped and dead-ended liability.
Big Data is a dopey term, applied necessarily ambiguously to a surfeit of tenuously connected vagaries, and its time has come and gone. So, let’s drop the Big Data moniker, and embrace the fact that data is data, and long live ‘All Data’, yes, all digital data. Let’s consider all data for what it’s worth to the business, and not for what some chatterers reckon its value is – having, as they do, little or no insight into the businesses to which they refer, or into the data that these businesses possess.
So, when push comes to shove, is Big Data really about high volumes, high velocity and high variety, or is it in fact about much noise, too much pomposity and abundant similarity leading to unnecessary high anxiety?
Thanks very much for reading.
08 Mon Dec 2014
Posted in accountability, agile, Ask Martyn, Consider this, dark data, data science
Martyn Richard Jones
In the eighties, there was a company called Sperry Univac, which was part of the once-famous Sperry Corporation.
At that time, a significant manufacturing concern in the industrialised Midlands of England was looking to automate and computerise operations. This meant that they would be in the market for some serious heavy iron – to use the old euphemism for mainframe computers.
01 Mon Dec 2014
Posted in Ask Martyn
Martyn Richard Jones
As an effective business process paradigm and a powerful mixture of technology, engineering and pragmatic design, Enterprise Data Warehousing (EDW) is arguably unmatched in its capacity to address complex and essential information requirements in Management Reporting, Business Intelligence (BI) and Data Analytics. In spite of the occasional outbreak of anecdotal ruminations to the contrary, key indications point to a bright future for EDW. Indeed, the breadth and richness of nascent applications that the EDW model can adequately support will drive an expansion in its utility, acceptance and advancement. For instance, a significant and practical way of maximising the value of innovation and investment in Enterprise Data Warehousing is by reusing and extending the Inmon paradigm to construct the enterprise information hub of the future.
30 Sun Nov 2014
Posted in Ask Martyn, Consider this, Infotrends
Martyn Richard Jones
Hold on to your seats! And, get ready for a bumpy ride of super-sized, baffling and thoroughly absurd dimensions.
30 Sun Nov 2014
Posted in Ask Martyn, Consider this, Data Warehouse, Data Warehousing
Martyn Richard Jones
More than 80% of advertising is ignored. More than 50% of Data Warehouse projects fail in one way or another. The information explosion has been accompanied by a massive increase in the ranks of the willfully stupid.