Why buy when you can get it for free?
Here is the first fantastic delivery of an amazing and fabulous selection of free and widely available business analytics learning content, which has been prepared… just for you.
05 Sat Mar 2016
Posted in All Data, Analytics, Big Data, Big Data 7s, Big Data Analytics, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, Inform, educate and entertain., pig data, statistics, The Amazing Big Data Challenge, The Big Data Contrarians
Martyn Richard Jones
Dusseldorf, August 2006
Data Warehousing provides possibly one of the best opportunities for IT organizations to deliver a valuable business solution that addresses a set of business needs: requirements that go well beyond day-to-day operational support and traditional applications (web-enabled or not). When Data Warehousing is done the right way, and for the right reasons, its payback to all of its stakeholders can be significant.
05 Sat Mar 2016
Posted in All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, hadoop, Inform, educate and entertain., Marty does, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, The Amazing Big Data Challenge, The Big Data Contrarians
This is the story of how the amazing Hadoop ecosphere revolutionised IT. If you enjoy it, then consider joining The Big Data Contrarians.
Before the advent of Hadoop and its ecosphere, IT was a desperate wasteland of failed opportunities, archaic technology and broken promises.
In the dark Cambrian days of bits, mercury delay lines and ferrite cores, we knew nothing about digital. The age of big iron did little to change matters, and vendors made enormous profits selling systems that nobody could use and even fewer people could understand.
14 Sat Mar 2015
Posted in Big Data, Consider this, dark data, Martyn Jones
“In this country, it is good to kill an admiral from time to time, to encourage the others” (Dans ce pays-ci, il est bon de tuer de temps en temps un amiral pour encourager les autres) – Voltaire
My gran used to tell me that honesty pays. Of course, she never really understood banking or IT, probably because she didn’t want to know anything about them, and she never lived to witness the amazing hype circuses, the spin doctors’ spiel or the focus-group dog-and-pony show of the 21st century. Indeed, if honesty were a guaranteed payer my gran would have amassed more wealth than even Warren Buffett himself.
If my gran lived today, she might reflect on what Big Data might be about – maybe she would even consider it benignly, as a sort of shelter for fallen men of once uncertain virtue. We will never know. So onwards and upwards.
The Harvard Business Review contemplated honesty in somewhat different terms:
“Honesty is, in fact, primarily a moral choice. Businesspeople do tell themselves that, in the long run, they will do well by doing good. But there is little factual or logical basis for this conviction. Without values, without a basic preference for right over wrong, trust based on such self-delusion would crumble in the face of temptation.”
In a marvellous book, A Few Good Men from Univac, David E. Lundstrom narrates the story of Sperry Univac in the 1960s, one of the truly great innovators in the first forty years of IT, and includes an allegory taken from the engineering front-line. I will recount it here, edited to highlight the zeitgeist, for your entertainment and, as Voltaire put it, “to encourage the others”:
In the beginning was the Big Data Plan.
And then came the Big Data Assumptions.
And the Assumptions were without form.
And the Plan was without substance.
And darkness was upon the face of the Workers.
And they spoke amongst themselves, saying: “It is a crock of shit, and it stinketh.”
And the workers went unto their Supervisors and said: “It is a pail of dung, and none may abide the odor thereof.”
And the Supervisors went unto their Managers, saying: “It is a container of excrement, and it is very strong, such that none may abide by it.”
And the Managers went unto their Directors, saying: “It is a vessel of fertilizer, and none may abide its strength.”
And the Directors spoke amongst themselves, saying to one another: “It contains that which aids plant growth, and it is very powerful.”
And the Vice Presidents went unto the President, saying unto him: “This new plan will actively promote the growth and vigor of the company, with powerful effects.”
And the President looked upon the Big Data Plan, and saw that it was good.
“But,” I hear you say, “why fight it? Why not take advantage of the Big Data zeitgeist?”, “Why not cash in on the grand bonanza Big Data bandwagon?” or “Monetise the three famous Vs of Big Data?”
Well, it had crossed my mind, briefly, and (outside of the USA) we’ve all done stuff we have not entirely believed in, so the temptation to cash in is present, capisci? This paraphrasing of a piece from My Blue Heaven might give you a better idea:
One of my best friends makes his living as a completely phony Big Data Scientist. For two hundred bucks he can make you a Data Scientist or a Big Data guru. Some guys give you an education but this guy gives you immediate access to high paying jobs, sex that would make the 256 trillion Shades of Blah blush and a life in the City, the Big Apple or a small town in Germany.
Moreover, for an extra 250 bucks (limited time offer) you can also become a certified Big Data Neuro Trainer, which will allow you to do unto others what has been done unto you.
I also considered Big Data Brokerage, Big Data Certification and Big Data Independent Trading (New York – Paris – Peckham). The opportunities are immense.
However, what happens when the Big Data well runs dry, and I (and many others tarnished with the mark of Big Data) become a pariah by complicity, collusion or simple association?
That question I will leave for another day. But just consider the following.
All right, I admit, I am a big long-time fan of comic genius Mel Brooks, who has a knack for capturing deep insight from the human condition, especially when the human condition is off guard and shallow. In that vein, this is how I like to think the dialogue from the Dole Office scene from History of the World, Part I would have gone, if he were writing it today:
Dole Office Clerk: Occupation?
Data Magnus Comicus: Stand-up Big Data scientist.
Dole Office Clerk: What?
Data Magnus Comicus: Stand-up Big Data scientist. I coalesce the vaporous datas of the human interaction with the social-media networking, Internet of Everything, and always-connected experience into a… viable, analytical and meaningful predictive-comprehension.
Dole Office Clerk: Oh, a Big Data bullshit artist!
Data Magnus Comicus: *Grumble*…
Dole Office Clerk: Did you bullshit Big Data last week?
Data Magnus Comicus: No.
Dole Office Clerk: Did you try to bullshit Big Data last week?
Data Magnus Comicus: Yes!
Finally, I leave you with some wise words from Israeli-American professor of psychology and behavioural economics, Dan Ariely:
“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”
Many thanks for reading.
10 Tue Mar 2015
Posted in All Data, Big Data, Consider this, dark data, Good Strat
Tags
All Data, Big Data, dark data, data architecture, data management, Good Strat, Martyn Jones, Martyn Richard Jones
Dark data, what is it and why all the fuss?
First, I’ll give you the short answer. The right dark data, just like its brother right Big Data, can be monetised – honest, guv! There’s loadsa money to be made from dark data by ‘them that want to’, and as value propositions go, seriously, what could be more attractive?
Let’s take a look at the market.
Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes” (IT Glossary – Gartner)
Techopedia describes dark data as being data that is “found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making.” (Techopedia – Cory Janssen)
Cory also wrote that “IDC, a research firm, stated that up to 90 percent of big data is dark data.”
In an interesting whitepaper from C2C Systems it was noted that “PST files and ZIP files account for nearly 90% of dark data by IDC Estimates” and that dark data is “Very simply, all those bits and pieces of data floating around in your environment that aren’t fully accounted for”. (Dark Data, Dark Email – C2C Systems)
Elsewhere, Charles Fiori defined dark data as “data whose existence is either unknown to a firm, known but inaccessible, too costly to access or inaccessible because of compliance concerns.” (Shedding Light on Dark Data – Michael Shashoua)
Not quite the last insight, but in a piece published by Datameer, Joe Nicholson wrote that “Research firm IDC estimates that 90 percent of digital data is dark” and went on to state that “This dark data may come in the form of machine or sensor logs”. (Shine Light on Dark Data – Joe Nicholson via Datameer)
Finally, Luc Burgelman of NGDATA wrote this in a sponsored piece in Wired: “It” – dark data – “is different for each organization, but it is essentially data that is not being used to get a 360 degree view of a customer.”
Okay, let’s see if we can be a bit more specific about the content of dark data.
Items on the dark data ticket include: email; instant messages; documents; SharePoint content; content of collaboration databases; ZIP files; log files; archived sensor and signal data; archived web content; aged audit trails; operational database backups (full and incremental); roll-back, redo and spooled data files; sunsetted applications (code and documentation); partially developed and then abandoned applications; and code snippets.
Most importantly, dark data is data that is not actively in use, is underutilised, or is something else. Seriously.
So, the conclusion that some have come to is this: there is a vast collection of data in various formats waiting to be monetised.
Personally, the idea that really grabs my attention is the potential ability to do novel forensic research on email. If only to find out what happened in the past.
For example, maybe it would be fascinating to see how significant challenges were identified, flagged and discussed; how strategic responses to those challenges were formulated, chosen and executed; and, how the outcomes of all of that process were reflected in email communications.
I think that this line of work can be very interesting for some people, and that interesting insights may be uncovered, but I would hate to have to put a tangible value on it, if only to avoid adding to the already galactic magnitudes of nonsense and hype surrounding certain data topics.
There are other more mundane uses of dark data.
Imagine that you are just about to embark on a Data Warehouse project (you really are a late adopter, aren’t you?), and you want to establish a base collection of historical data. Where do you get that historical data from?
Right! Operational databases are not characteristically used to store significant amounts of historical reference data and historical transactions beyond a certain time window; there are performance and other reasons for keeping OLTP systems as lean as possible. So, initial loads of historical data are typically recreated in the Data Warehouse from backups, audit trails or logs.
You don’t need a Chief Data Officer in order to be able to catalogue all your data assets. However, it is still a good idea to have a reliable inventory of all your business data, including the euphemistically termed Big Data and dark data.
If you have such an inventory, you will know:
What you have, where it is, where it came from, what it is used in, what qualitative or quantitative value it may have, and how it relates to other data (including metadata) and the business.
What needs to be kept, and for how long, and what can be safely discarded, and when.
The risks associated with the retention or loss of that data.
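For the more hands-on reader, here is a minimal sketch of what one entry in such an inventory might look like. It is only an illustration, not a recommended product or standard: the class `InventoryEntry`, its field names, the helper `can_let_go` and the sample entries are all made up for this example, and treating "nothing uses it" as the test for dark data is a deliberate simplification.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class InventoryEntry:
    """One row of a hypothetical data inventory (all names are illustrative)."""
    name: str                  # what you have
    location: str              # where it is
    source: str                # where it came from
    created: date              # when it was produced or archived
    used_in: List[str] = field(default_factory=list)     # what it is used in
    related_to: List[str] = field(default_factory=list)  # links to other data
    value_note: str = "unknown"           # qualitative or quantitative value
    retention_days: Optional[int] = None  # None: no retention rule decided yet
    risk_note: str = "unassessed"         # risk of keeping or losing the data

    def is_dark(self) -> bool:
        """Simplification: data that nothing uses counts as dark."""
        return not self.used_in

    def discard_after(self) -> Optional[date]:
        """Earliest date the data may be discarded, if a rule exists."""
        if self.retention_days is None:
            return None
        return self.created + timedelta(days=self.retention_days)


def can_let_go(entry: InventoryEntry, today: date) -> bool:
    """Unused data whose retention period has expired is a candidate to drop."""
    cutoff = entry.discard_after()
    return entry.is_dark() and cutoff is not None and today >= cutoff


# Made-up sample entries, purely for illustration.
old_logs = InventoryEntry(
    name="web server logs 2009",
    location="tape archive",
    source="retired web shop",
    created=date(2009, 12, 31),
    retention_days=5 * 365,
)
sales = InventoryEntry(
    name="sales transactions",
    location="operational database backups",
    source="order entry system",
    created=date(2010, 1, 1),
    used_in=["monthly sales report"],
    retention_days=7 * 365,
)

for entry in (old_logs, sales):
    print(entry.name, "| dark:", entry.is_dark(),
          "| can let go:", can_let_go(entry, date(2015, 3, 10)))
```

Note that an entry with no retention rule is never flagged for disposal: if nobody has decided how long something must be kept, the sketch errs on the side of keeping it until someone does.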
If you don’t have such a catalogue and have never done a data inventory then a full data inventory and audit seems to be your new best friend.
Simply stated, you may have dark data that has value, or it may be a simple collection of worthless digital nostalgia. But if you don’t know what you have, it may pay to find out what’s there, and if necessary, to let it go.
There is no point in hoarding unneeded and unwanted rubbish data. That is simply not good data management.
Finally, a word on all the fuss surrounding dark data.
Failure to monetise when there is value to be obtained from dark data is one thing. Claiming that value can invariably be obtained, whilst not actually knowing what the data is or how it could be monetised, is just adding to the mountain of data-related ‘nonsense and hype’ doing the rounds these days. Please consider not adding to that mountain.
British Rail, the UK’s national rail company, used to be notorious for the number of delays and cancellations to services, and their reasons for failing to meet their obligations became stranger and stranger.
In winter, it would snow and there would be problems. And people would ask ‘how come you couldn’t deal with the snow this year, we’ve had snow for centuries?’ And back came the answer ‘Yes, Sir, but this year it was the wrong type of snow’. In autumn (the fall), it was ‘the wrong type of leaves’ and ‘the wrong type of rain’, and in summer, the ‘wrong type of sunshine’, and so on and so forth.
I hope this will not be the excuse from the Big Data and dark data pundits and punters when the much-vaunted and ‘almost’ guaranteed monetisation isn’t frequently realised.
‘Of course Big Data gives you big dollar benefits, it was just littered with the wrong type of data’ or ‘you just weren’t trying hard enough’.
Many thanks for reading.
08 Mon Dec 2014
Posted in accountability, agile, Ask Martyn, Consider this, dark data, data science
Tags
Martyn Richard Jones
In the eighties, there was a company called Sperry Univac, which was part of the once-famous Sperry Corporation.
At that time, a significant manufacturing concern in the industrialised Midlands of England was looking to automate and computerise operations. This meant that they would be in the market for some serious heavy iron – to use the old euphemism for mainframe computers.