GOOD STRATEGY

~ for every significant challenge

Tag Archives: All Data

The Big Data Contrarians: The Agora for Big Data dialogue

11 Wednesday Nov 2015

Posted by Martyn Jones in 4th generation Data Warehousing, All Data, Analytics, Big Data, statistics, The Big Data Contrarians

Tags

All Data, Analytics, Big Data, Martyn Jones

“In fact men will fight for a superstition quite as quickly as for a living truth – often more so, since a superstition is so intangible you cannot get at it to refute it, but truth is a point of view, and so is changeable.”

Hypatia

On the 1st of July, I decided to set up a professional group on LinkedIn in order to create a hype-free Agora for Big Data dialogue. I called the group The Big Data Contrarians and, although it is a closed group, all those with an interest in an open, informed and honest exchange of ideas on data, from whatever angle they approach it, are very welcome to join in. (URL: http://www.linkedin.com/grp/home?gid=8338976)

So, why is the group called The Big Data Contrarians and not something more generic, such as The Data Contrarians?

Continue reading →

Who’s afraid of the Big Data Contrarians? Here’s 500 reasons not to be

11 Wednesday Nov 2015

Posted by Martyn Jones in 4th generation Data Warehousing, All Data, Big Data, Business Intelligence, Cambriano, Consider this, Good Strategy, Strategy

Tags

All Data, Analytics, aspiring tendencies in IM, Big Data, cambriano, Martyn Jones, The Big Data Contrarians

If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:

Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976

Many thanks.

When I first started The Big Data Contrarians group on LinkedIn I was thinking that maybe we would get 100 members within three or four months. Well, I was mistaken. Since the 1st of July, the membership of The Big Data Contrarians has risen to over 500. However, it’s not about the quantity, it’s about the quality, and The Big Data Contrarians is ‘the nicest Big Data community that you are ever likely to encoun… Continue reading →

The Million-Dollar Big Data Briefing

11 Wednesday Nov 2015

Posted by Martyn Jones in 4th generation Data Warehousing, agile, Big Data, Consider this, Data Lake, Martyn Jones, Martyn Richard Jones

Tags

All Data, Analytics, Big Data

If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976

Many thanks, Martyn.

Big Data, together with Cloud computing, the Internet of Things and Machine Learning, are topics that are very much to the fore in contemporary trends in Information Management. But is Big Data really the revolution that people have been waiting for or is it simply about the next steps in the evolution of business data architecture and management?  Continue reading →

The Amazing Big Data Challenge – 2015

11 Wednesday Nov 2015

Posted by Martyn Jones in 4th generation Data Warehousing, Analytics, Big Data, goodstrat, Martyn Jones, Martyn Richard Jones, Strategy, The Amazing Big Data Challenge

Tags

All Data, Big Data, Martyn Jones, Strategy

Those of you who are familiar with the world of Big Data will also be aware of the vanguard data community known as The Big Data Contrarians (the most fabulous Big Data community online).

Launched today (23 September 2015), the Big Data Contrarian’s Challenge is destined to fast become the most prestigious, enviable and prized challenge on the entire global world-wide-web. Continue reading →

Whither Big Data bullshit?

11 Wednesday Nov 2015

Posted by Martyn Jones in All Data, Big Data, Good Strategy, goodstrat, Martyn Jones

Tags

All Data, Big Data, Martyn Jones, Martyn Richard Jones

Pundits far and wide are hailing the end of the period of big data babble, hyperbole and bullshit and are looking forward to an epoch of practical, tangible and verifiable Big Data success stories.

Gartner themselves came out some time ago and declared that Big Data was no longer in the hype cycle. Some took this as a sign that the Big Data bullshit bonanza was over, others were more cynical and suspected a highly orchestrated ruse, a move to the next level in the game plan.

But does this new attitude towards Big Data really ring true?

Accompanying this apparent bold openness, frankness and humility in the ranks of the rehabilitated Big Data bullshit babblers there is an awful lot of what appears to be ‘more of the same’. Or as the people of Thailand might say, “same, same, but different”.

As some of you might know, I am the administrative owner of The Big Data Contrarians community group on LinkedIn, and even I was somewhat taken aback by a recent piece by Bernard Marr entitled 20 Stupid Claims About Big Data. So much so that I wrote a fairly complimentary comment on LinkedIn about it. The thing is, even as I posted it I was thinking to myself “you’ll be sorry”.

Today I read yet another Big Data ‘reformation’ piece on LinkedIn Pulse, this time from Matthew Reaney and with the compelling title of The 5 Myths of Big Data.

Call me naïve, call me deluded, call me a believer in humankind’s need for basic decency, but I frequently have the idea that praising moderately acceptable behaviour leads to even more good behaviour. But it was not to be, and as fast as one could say ‘what the hell is going on here?’ back came a surfeit of astroturfed Big Data bananas – from all directions – bigger, brasher and more bogus than ever before.

Make no mistake, Big Data hype hasn’t gone away, it has become more subtle, more cunning and even more misleading.

Leading the charge is the initiative to discredit Data Warehousing by all means possible, and the amount of bullshit, disinformation and blatant lies doing the rounds is beginning to look like Big Data hype reflecting Big Data itself, if only in terms of the vast volumes, varieties and velocities that this Big Data babbling bullshit comes in.

But seriously, we are simply getting more of the same. As the end of the Big Data hype war is declared, we are subjected to a bombardment of Big Data baloney via Cloud, IoT, the Hadoop ecosphere (as if using Hadoop were somehow linked to ecology and saving the planet), and especially this incredibly obnoxious and dopey vehicle for Big Data tripe known widely as the Data Lake – more on that stupidity at some other time. But onwards and upwards…

This all reminds me of a joke from many decades ago, retold in part from memory.

A teacher was looking for a subject about which her class pupils could write, to set as a homework exercise.

After much deliberation she decided to ask the children to write about what they thought of the police.

Sure, not a good question, I know, and as I stated, this was many decades ago, when even grown-ups could be innocent and naïve and hopeful.

Anyway, when the children had handed in all their essays, the teacher read the essays and was disappointed to find that most of them were very wishy-washy and that the children were almost all unanimously indifferent or grudgingly respectful of the police, except for one. One of the children, let’s call him Dave, was very critical and had written “I don’t think much of the police.” When the teacher asked Dave why he had written that, he replied “All police is bastards, Miss”. The teacher was vexed by the reply, but being a good and caring teacher she considered how she could change this obviously hostile view of the bobby on the beat and the police detective taking evil doers out of circulation, so she decided to do something about it.

She had a bright idea and took her problem to the police and discussed what could be done to give the children a much more positive view of the police and the work they did, so they would see the police as a necessary part of society, to be respected but not feared.

As a result, the teacher and the police organised a police day at the school. It was a big party, with lots of free goodies, badges and posters, rides in patrol cars, sirens, interesting stories and a movie, and a big discussion with the police dog handler and his faithful and brave police-dog, Ajax. The police took special interest in Dave, he was the one they wanted to convince the most, and he was the one they made the most fuss of.

At the end of the day, the teacher again asked the children to write about what they got from the school police day that she had organised.

The following Monday, after all the essays had been handed in by the children, she sought out and read Dave’s essay, eager with anticipation.

This time it contained the surprising phrase of “I really, really don’t think much of the police.”

Again, the teacher asked Dave why he had written what he had written, especially considering all the effort the police had gone to in order to leave a good and lasting impression with the children in general, and Dave in particular.

He simply replied “the Police is cunning bastards, Miss.”

Personally, I have respect for the professionalism, courage and hard work of many officers in our police forces, but when it comes to my view of certain Big Data pundits – and naming no names, just watch my eyes – the feeling is not the same.

Make of that what you will.

Many thanks for reading.

If you enjoyed this piece or found it useful then please consider joining The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976

Many thanks,

Martyn.

Big Data Explained to My Grandchildren

29 Monday Jun 2015

Posted by Martyn Jones in Big Data

Tags

All Data, Big Data, made simple

Ban pan doeth peir

ogyrwen awen teir

The Book of Taliesin 

Once upon a time, a hobgoblin of digital moonshine stalked the land. Its name was Shirley Temple (but, it was better known as Big Data), and it had many followers.

Few really knew where Big Data had come from, because it just appeared overnight. Like owls, snow, rumour and astroturfing flim-flam merchants.

Some say the Gardner brought it in on the bottom of their wellies after a particularly tough night on the lemonade.

One night, a man with a black dog told me that it was really all a load of old nonsense, dreamed up by Redwood Shore Larry, to shake things up a bit.

Others, of a more superstitious bent, claimed that the giant who lived in the Big Blue mansion on the hill had concocted it from sugar and spice and all things dodgy and nice.

The more cynical amongst the population just pointed at its high priests, acolytes and bicycle boys, and had a good old laugh.

Yet others claimed that it was a digital immaculate conception and a divine-revelation of mega-trend setting proportions that would change the face of the Lleyn peninsula, forever.

Elsewhere, some talked of dark deeds, of wickedness, or that it was a psycho-paranormal phenomenon closely associated with the cultish cult of the badly drawn Yellow Elephant. A wonderfully wacky, off-the-wall and global orco-centric sect that sacrificed the processing-cycles of reason, strategy and coherence on the altar of half-baked pragmatism, bodgerism and winging-it.

Nonetheless, some of the global villagers did express the opinion that this was no new phenomenon, and that they had seen such a sinister semblance before. Knowledge and experience had informed them. They seemed to know intrinsically that a timeless feature of data is its variable volumes, its variable velocities, its increasing varieties and, because we insisted on hoarding so much of it, its increasingly expansive footprint.

But, how did they know?

They only had the gardener, the butler and the cook to corroborate their suspicions.

We know what we know, and what we don’t know we know what we don’t know, now, that is, but don’t tell, unless we do, or don’t, or not. So I’m glad we cleared that one up.

Down the valleys, across the moors and over the waves. From Bangor to Abertawe via Machynlleth and Caerfilli. Big Data, it moved and expanded, and expanded and moved again. Dong! Dong! Dong! Ominous, humongous and smelling of sulphur and a badly spiced kebab.

The Big Data mini-meme spread like wildfire fuelled by petrol and crack. It was a force to be harnessed, a force for good and bad. Even though no one really knew what it was, and nobody knew how to do it, many claimed to have done it, and successfully so. It didn’t matter how, what, where or when. Front interface, back-end processor, client-server, no one and nothing was free or safe.

The benevolent Big Data virus reached everyone. The rich, the poor, the shop on the corner, and the girl next door. Everyone knew its name and that it was new and mega and good and bad and all of that.

The thing is, Big Data wasn’t really anything new. As we now know.

But, at the time, for some people, especially the so called high-ups and professional people, it was a big deal, even in Pontypridd! Like a major inflection point in the evolution of the generation and use of what we now call data, or to use the vernacular, digital gold-gold.

You see, back then, people wanted to make another class of data and another class of gold, and another series of lovely, chubby little verbose categories to describe it. People needed another name, one that represented some data class values, spirituality and imprecision. It was a time of post-modernism, and we were always stoned, mazed or drunk.

Some of my peers at the time – not all of them, just the exceptionally plain Jane, weird and alternative ones – told me that Big Data was data that came in bigger volumes, at greater velocities and in greater varieties.

The first time I heard that, I was gobsmacked… Honest to God!

I know you’ll probably laugh at that, now, and you may even tell your friends and butties just how superstitious, primitive and money-grubbing we were back then.

So whilst you are making fun of your old granddad, do not forget either, that is the way it was in those days, down the data mining towns and Big Data pits of South Wales.

I know it is hard to believe, but back in those days, people really believed in that nonsense. I didn’t have any time for it myself, but many did, and many people made a living of sorts just talking and writing about it.

It’s hard to believe now, but at one time we were really dumb, but as we didn’t want to stand out from the crowd by revealing that we didn’t know, we just stood back and let the buffoons, clowns and comics run the show. Whilst we exclaimed, “these guys are really good”, “I wonder if he can turn Big Data into wine” or “maybe he does requests and children’s parties”.

Now we look at data, all data, and we call it…. Yes… data.

How we have advanced. It’s amazing.

But then, when it all died down, as these things do, in their given time, we regained our senses, of sorts, we got back on our feet and we progressed in a sane, rational and humane way.

Now we can look back and see things for what they were. The Big Data hullabaloo, that you have never heard of, was the last gasp of some of the biggest IT dinosaurs, who had tied their hopes and aspirations, ridiculously so in my view, to some toys created by the naïve for the gullible. It was always going to end in tears, and it did.

Now you know.

So, this ends the story of Big Data, also known as Shirley Temple.

Goodnight kids!

Many thanks for reading.

As always, please share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps even send me a LinkedIn invite. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.

For more on this and other topics, check out my other recent posts:

  • Big Data Predictions for 2015
  • 7 New Big Data Roles for 2015
  • Data Made Simple – Even ‘Big Data’
  • Big Data is Dead!
  • Why Destructive Eagerness? The Data Warehouse Example
  • Big Data and the Vs
  • Did Big Data Kill the Statistician?
  • Infotrends 2015: 21 Directions in Information Management
  • On not knowing Climate Change
  • Big Data Robitussin – Big Data: Read all about it!
  • Absolute certainty…
  • Mugged in Data Hell

#BigData #BigDataAnalytics #Decency #Ethics

On Not Knowing Sentiment Analysis

12 Tuesday May 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, good start, goodstart, sentiment analysis

Tags

All Data, Analytics, aspiring tendencies in IM, awareness, good start, Good Strat, goodstart, Martyn Jones, Strategy

If you know all about Sentiment Analysis, you’ve come to the right place. Because I don’t have a clue if what I know about it is accurate or not.

I started to do a bit of research into this Sentiment Analysis lark, in particular with the theoretical idea of using it to analyse and draw conclusions from comments on Pulse – assuming that this is what it can be used for.

To begin at the beginning, which is a good place to start, I read the piece on Wikipedia, and this was how it began:

“Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials.

Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).” Source: Wikipedia Link: http://en.wikipedia.org/wiki/Sentiment_analysis

Well, that’s a fairly intuitive description. I could almost have guessed as much.

But, back to the aim of analysing sentiment in Pulse comments: where to start and what to do?

What would sentiment analysis make of these:

On the death of an IT-business celebrity. What would sentiment analysis make of the very emotive comments of desolation, sadness and poignancy of people who didn’t personally know the departed, even remotely, or maybe didn’t even know of them until after they had ‘shuffled off life’s mortal coil’? How would that work? What would sentiment analysis make of the maudlin aphorisms, surrogate grief and bizarre sorrow of people separated by more degrees than Kofi Annan and Mork from Ork? What additional insight does sentiment analysis give us when these comments are analysed along with the body of the text and the other comments that trigger them?

In a similar vein, how does sentiment analysis catch instances of sycophancy? Especially considering the fact that some of it is so ‘in your face’ and blatant that it oftentimes seems to be a bad parody of a bad parody. “Oh, Ricky, why are you such a sexy brainbox?” How does it work in those situations?

Worse than that is the preening, gushing and obtuse texts of massive, errm… fabulators[i]. If it wasn’t about Big Data or Strategy or IT, it would be about something else, usually about the writer themselves. “I give Rafa and Rodge tips on tennis! I went to the University of the Universe and got a first! I challenged Superman to a race, and won! I have read the entire works of Dan Brown, 25 times…Neeeh!” What would sentiment analysis do with that sort of gold?

Also, what does sentiment analysis do with texts so ambiguously daft that they could mean anything? Okay, it might be able to pick up a few trigger words here or there, “rubbish”, “of”, “load”, “a”, “what”, etc. However, how does it know when “excellent” is being used in a way that means anything but excellent? For example, “Excellent Big Data job there”, with the silent “if you want a job doing properly then do it yourself”.
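To make the trigger-word worry concrete, here is a minimal Python sketch of a naive word-counting scorer. The lexicon, its weights and the function name `naive_polarity` are all assumptions made up for this illustration. Real sentiment analysis tools are considerably more sophisticated than this, but the sketch shows why a sincere “excellent” and a sarcastic one look identical when only the words are counted.

```python
# Hypothetical toy lexicon: the words and weights are illustrative assumptions,
# not taken from any real sentiment analysis library.
LEXICON = {"excellent": 1.0, "good": 0.5, "rubbish": -1.0, "daft": -0.5}


def naive_polarity(text: str) -> float:
    """Sum the weights of known trigger words; unknown words count as zero."""
    # Strip surrounding punctuation and lower-case each word before lookup.
    words = (w.strip(".,!?\"'“”‘’").lower() for w in text.split())
    return sum(LEXICON.get(w, 0.0) for w in words)


# The scorer only sees words, so tone of voice is invisible to it.
print(naive_polarity("What a load of rubbish"))        # negative score
print(naive_polarity("Excellent Big Data job there"))  # positive, sarcastic or not
```

The second call scores as positive whether the comment was meant as praise or as a put-down, which is exactly the gap the questions above are probing.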

Finally, for the purpose of this little piece, what would sentiment analysis do with term abuse, if it could actually identify it? Going back to the use of the terms such as Big Data or Strategy, how can sentiment analysis discern between the dopey and wrong-headed use of the term, and when it is actually being used in a coherent, cohesive and consistent way, in line more or less with its formal definition? I suppose we can always write a mountain of rules to help us out:

If topic in focus of piece is strategy

And context of topic is business

And author of piece is Richard Rumelt

Then the credibility of text is good (with a certainty of 100%)

But you try to maintain a rule base with instances like that. It soon becomes a management nightmare.
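For what it is worth, the rule above might be written down roughly as follows in Python. This is only a sketch: the `Piece` and `Rule` classes, the `assess` function and the field names are hypothetical, invented here for illustration, and are not taken from any real sentiment analysis product.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Piece:
    """A text under assessment (hypothetical fields, for illustration only)."""
    topic: str
    context: str
    author: str


@dataclass(frozen=True)
class Rule:
    """One hand-written rule: a condition plus the verdict it implies."""
    condition: Callable[[Piece], bool]
    credibility: str
    certainty: float  # 1.0 means 100%


# The single rule from the text; every further case needs another entry.
RULES: List[Rule] = [
    Rule(
        condition=lambda p: (
            p.topic == "strategy"
            and p.context == "business"
            and p.author == "Richard Rumelt"
        ),
        credibility="good",
        certainty=1.0,
    ),
]


def assess(piece: Piece) -> Optional[Rule]:
    """Return the first rule whose condition matches, or None if none does."""
    for rule in RULES:
        if rule.condition(piece):
            return rule
    return None


verdict = assess(Piece("strategy", "business", "Richard Rumelt"))
print(verdict.credibility if verdict else "no rule applies")
```

Every new author, topic or context needs another hand-written entry in `RULES`, and any piece that matches no entry gets no verdict at all, which is where the maintenance burden comes from.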

Alternatively, maybe it could be used to analyse this text. It’ll have its work cut out, that’s for sure. Does sentiment analysis do sarcasm and cynicism?

Anyway! I bet you might know how this sentiment analysis works, don’t you? On the other hand, if not, then it will be someone else who ‘knows’. But of course, all will not be revealed, because it’s a secret so powerful, that in the wrong hands it could be used to dominate the entire galaxy.

Only joking; and many thanks for reading.

[i]To engage in the composition of fables or stories, especially those featuring a strong element of fantasy: “a land which … had given itself up to dreaming, to fabulating, to tale-telling” (Lawrence Durrell).

What’s all the fuss about Dark Data? Big Data’s New Best Friend

10 Tuesday Mar 2015

Posted by Martyn Jones in All Data, Big Data, Consider this, dark data, Good Strat

Tags

All Data, Big Data, dark data, data architecture, data management, Good Strat, Martyn Jones, Martyn Richard Jones

What is Dark Data?

Dark data, what is it and why all the fuss?

First, I’ll give you the short answer. The right dark data, just like its brother right Big Data, can be monetised – honest, guv! There’s loadsa money to be made from dark data by ‘them that want to’, and as value propositions go, seriously, what could be more attractive?

Let’s take a look at the market.

Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes” (IT Glossary – Gartner)

Techopedia describes dark data as being data that is “found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making.” (Techopedia – Cory Janssen)

Cory also wrote that “IDC, a research firm, stated that up to 90 percent of big data is dark data.”

In an interesting whitepaper from C2C Systems it was noted that “PST files and ZIP files account for nearly 90% of dark data by IDC Estimates.” and that dark data is “Very simply, all those bits and pieces of data floating around in your environment that aren’t fully accounted for:” (Dark Data, Dark Email – C2C Systems)

Elsewhere, Charles Fiori defined dark data as “data whose existence is either unknown to a firm, known but inaccessible, too costly to access or inaccessible because of compliance concerns.” (Shedding Light on Dark Data – Michael Shashoua)

Not quite the last insight, but in a piece published by Datameer, Joe Nicholson wrote that “Research firm IDC estimates that 90 percent of digital data is dark”, and went on to state that “This dark data may come in the form of machine or sensor logs” (Shine Light on Dark Data – Joe Nicholson via Datameer)

Finally, Luc Burgelman of NGDATA wrote this in a sponsored piece in Wired: “It” – dark data – “is different for each organization, but it is essentially data that is not being used to get a 360 degree view of a customer.”

Say what?

Okay, let’s see if we can be a bit more specific about the content of dark data.

Items on the dark data ticket include: email; instant messages; documents; SharePoint content; content of collaboration databases; ZIP files; log files; archived sensor and signal data; archived web content; aged audit trails; operational database backups – full and incremental; roll-back, redo and spooled data files; sunsetted applications (code and documentation); partially developed and then abandoned applications; and code snippets.

Most importantly, dark data is data that is not actively in use, is underutilised, or is something else. Seriously.

What can you do with it?

So, the conclusion that some have come to is this: there is a vast collection of data in various formats waiting to be monetised.

Personally, the idea that really grabs my attention is the potential ability to do novel forensic research on email. If only to find out what happened in the past.

For example, maybe it would be fascinating to see how significant challenges were identified, flagged and discussed; how strategic responses to those challenges were formulated, chosen and executed; and, how the outcomes of all of that process were reflected in email communications.

I think that this line of work can be very interesting for some people, and that interesting insights may be uncovered, but I would hate to have to put a tangible value on it, if only to avoid adding to the already galactic magnitudes of nonsense and hype surrounding certain data topics.

There are other more mundane uses of dark data.

Imagine that you are just about to embark on a Data Warehouse project (you really are a late adopter, aren’t you?), and you want to establish a base collection of historical data. Where do you get that historical data from?

Right! Operational databases are not characteristically used to store significant amounts of historical reference data and historical transactions beyond a certain time window; there are performance and other reasons for keeping OLTP systems as lean as possible. So, initial loads of historical data are typically recreated in the Data Warehouse from backups, audit trails or logs.

Dark data and data governance

You don’t need a Chief Data Officer in order to be able to catalogue all your data assets. However, it is still a good idea to have a reliable inventory of all your business data, including the euphemistically termed Big Data and dark data.

If you have such an inventory, you will know:

What you have, where it is, where it came from, what it is used in, what qualitative or quantitative value it may have, and how it relates to other data (including metadata) and the business.

What needs to be kept, and for how long, and what can be safely discarded, and when.

The risks associated with the retention or loss of that data.
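As a rough illustration of what one entry in such an inventory might record, here is a minimal Python sketch. The class name `InventoryEntry`, its fields, the helper `safe_to_discard` and the sample values are all assumptions chosen to mirror the three points above. They are not a standard schema and not a description of any particular cataloguing tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class InventoryEntry:
    """One data asset in a hypothetical inventory."""
    name: str                                            # what you have
    location: str                                        # where it is
    source: str                                          # where it came from
    used_in: List[str] = field(default_factory=list)     # what it is used in
    estimated_value: Optional[str] = None                 # qualitative or quantitative
    related_to: List[str] = field(default_factory=list)  # links to other data
    retain_until: Optional[date] = None                   # None = not yet decided
    risk_note: str = ""                                  # risk of keeping or losing it


def safe_to_discard(entry: InventoryEntry, today: date) -> bool:
    """True only if a retention date is set, has passed, and nothing uses the data."""
    return (
        entry.retain_until is not None
        and today > entry.retain_until
        and not entry.used_in
    )


# Made-up example entry, purely for illustration.
old_logs = InventoryEntry(
    name="2009 web server logs",
    location="tape archive",
    source="retired web shop",
    retain_until=date(2014, 12, 31),
)
print(safe_to_discard(old_logs, date(2015, 3, 10)))
```

The point of the sketch is that "can we let it go?" only becomes answerable once the retention date and the current uses are actually written down somewhere.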

If you don’t have such a catalogue and have never done a data inventory then a full data inventory and audit seems to be your new best friend.

What does it mean?

Simply stated, you may have dark data that has value, or it may be a simple collection of worthless digital nostalgia. But if you don’t know what you have, it may pay to find out what’s there, and if necessary, to let it go.

There is no point in hoarding unneeded and unwanted rubbish data. That is simply not good data management.

Finally a word on all the fuss surrounding dark data.

Failure to monetise when there is value to be obtained from dark data is one thing; claiming that value can invariably be obtained whilst not actually knowing what the data is, or how it could be monetised, is quite another, and just adds to the mountain of data-related ‘nonsense and hype’ doing the rounds these days. Please consider not adding to that mountain.

That’s all folks

British Rail, the UK’s national rail company, used to be notorious for the number of delays and cancellations to services, and their reasons for failing to meet their obligations became stranger and stranger.

In winter, it would snow and there would be problems. And people would ask ‘how come you couldn’t deal with the snow this year, we’ve had snow for centuries?’ And back came the answer ‘Yes, Sir, but this year it was the wrong type of snow’. In autumn (the fall), it was ‘the wrong type of leaves’ and ‘the wrong type of rain’, and in summer, the ‘wrong type of sunshine’, and so on and so forth.

I hope this will not be the excuse from the Big Data and dark data pundits and punters when the much-vaunted and ‘almost’ guaranteed monetisation isn’t frequently realised.

‘Of course Big Data gives you big dollar benefits, it was just littered with the wrong type of data’ or ‘you just weren’t trying hard enough’.

Many thanks for reading.

Big Data in Question – Again

01 Sunday Mar 2015

Posted by Martyn Jones in All Data, Big Data, Consider this

Tags

All Data, Big Data, data management, Good Strat, good strat blog, Good Strategy, Martyn Jones, Martyn Richard Jones

Big Data is now an inhospitable and unhealthy land inhabited by those who, through accident or design, deceive naïve and sentimental bystanders and those who are willingly misled.

When all of this Big Data malarkey started it was sort of funny, humorous and occasionally witty, especially in the affected, bizarre and frequently uninhibited ways that freshly-minted self-appointed gurus and experts would “big it up”.

Doctor Freud would have had a field day with all of that, being as it was, and for that matter still is, a postmodern mishmash of Riefenstahl, Freddie Mercury and Monty Python on steroids. However, after that extended, operatic and high-camp hiatus it all went downhill.

The Big Data scene is fast becoming an outrageous and brash festival of deception, disinformation and obliviousness. Which is a pity, because it does the industry no good whatsoever.

It is telling that Big Data evangelists, gurus and assorted sycophants cannot even define Big Data adequately, never mind discuss (or for that matter, point at) tangible success stories, without falling into contradictions on all of the key defining characteristics of volume, variety and velocity, and resorting to crude debating devices to avoid or finesse the concerns and the questions.

Almost every morning I check out the industry news, and almost invariably, it comes with new mind-boggling examples of Big Data nonsense.

However, it isn’t always nonsense for nonsense’s sake. There are agendas, and there are rational explanations for why Big Data has become, at the same time, one of the most hyped-up fads in the history of IT and one that its supporters find so difficult to actually explain and justify in any reasonable sort of way.

Therefore, when it comes to Big Data, beyond the surfeit of platitudes, clichés, bluff and bluster, the only things in play are the interests of industry, the patrons, the courtesans and their entourage of the innocent and the beguiled.

One of the biggest deceptions in Big Data is in the misleadingly named ‘success stories’. The thing is that most of the success stories that I have read have been:

  • So vague that it’s difficult to know how success is being defined never mind reached.
  • So secretive and obtuse in avoiding names, locations and other relevant Big Data references that it’s impossible to corroborate whether the claims are actually true. Disclaimer: I have worked for some of the biggest IT vendors, and in senior roles, and I know what is behind comments such as “the Big Data project is a success, although the client name and project are confidential” and “it’s delivering such major competitive advantages that we are obliged to keep it under wraps”.
  • Stories stolen from elsewhere, such as from Data Warehousing, Business Intelligence, VLDB or Business Application projects.
  • Borderline fantasies and badly contrived technology fan fiction.

However, it doesn’t stop there.

One of the clearest examples of the questionable nature of Big Data evangelism is when it is used to piggyback Big Data hype on simple, tangible and immediately recognisable artefacts or applications that have little in common with Big Data.

This is an extreme illustration, but it works like this: “iPhones are commercially successful, iPhones are part of Big Data, and therefore Big Data is commercially successful.”

As if the mere conjuring up of association, affinity and proximity will convince people of the great and growing value of Big Data.

What I am also referring to are publicity pieces that may as well have been titled:

  • Smith, Galbraith, Mies, Keynes, Homer Simpson and the economic justification of Big Data
  • Lovelace, Babbage, von Neumann, Eckert, Davies, Codd, Knuth, Naur and the technological underpinnings of Big Data
  • Einstein, Freud, Edison, Faraday, Recorde and the intellectual structure of Big Data
  • Socrates, Kant, Hegel, Marx, Adorno and the philosophical correctness of Big Data
  • Great quotes about Big Data, from the Cambrian era to the postmodern époque
  • Great jokes about Big Data, from Mel Brooks to Steve Martin
  • Sportspeople and Big Data, from Lottie Dod and Babe Ruth to Rafa Nadal and CR7
  • Industry support of Big Data, from Henry Ford to Neutron Jack

Do you recognise similarities?

It’s no big deal, just the use of unreliable, misleading and inappropriate fallacies, dressed up as cute, plausible and accessible collateral. People may think that such things are clever and witty, but they aren’t; they are just misleading.

Let’s continue with something simple.

Evasion is, in ethics, an act that deceives by making a true statement that is immaterial or leads to a false deduction. For example, citing events, persons or anecdotes from the history of IT to justify the supposed or imaginary value of Big Data. This is close to the notion of a non sequitur, which of course is an argument whose conclusion does not follow from its premises. It falls short of being full-on sophistry, purely because the simplistic, puerile and superficial arguments put forward in favour of Big Data do not match those of the true sophist, who seeks to reason with clever but fallacious and deceptive arguments. Too many of the Big Data arguments are fallacious and deceptive, but no one equipped with a reasonable capacity for critical thinking should take such ‘arguments’ as valid.

Hold this thought: Big Data hype is a viper’s nest of logical fallacies, white lies and disinformation.

Just when I think things could not get any weirder, they do, and the Big Data ceiling of hyperbole rises even higher, up to the rarer atmosphere of extreme tendentiousness.

There is a growing mass of Big Data hoop-la, hyperbole and flim flam that exceeds all previous bounds of overstatement, solecism and confabulation. This is where the real volumes, varieties and velocities are in Big Data: in hokum.

We live, as Oscar Wilde said in his day, in an age of surfaces. Yes, superficiality, puerility and short-termism are the competing orders of the day. However, I am still amazed – and maybe wrongly so – by what ostensibly professional, experienced and knowledgeable people are willing, able and prepared to accept, especially when it comes to Big Data flim flam sauce.

Here are some examples of the nonsense about Big Data that is taken as gospel by ‘adults’:

Data Warehousing is part of Big Data: No comment.

Big Data will replace Enterprise Data Warehousing: People can’t even explain the features and benefits of Big Data. I try to make it as easy as possible: ‘if you can’t say it, point to it’. But, seriously, people can’t even relate tangible and credible Big Data success stories, never mind show how it will replace Enterprise Data Warehousing, whether that’s the Inmon or Kimball flavour, take your pick.

Everyone and every organisation can benefit from Big Data: If people can’t explain this, and they don’t in terms of tangible benefits, then the claim should remain questionable.

Data Scientists will replace Statisticians: Why is that so? It is claimed that Data Scientists are uniquely equipped to handle massive volumes, varieties and velocities of data – well, as it turns out, this isn’t certain either.

Big Data is in its infancy: I think we may be confusing infancy with lack of real traction, and of time and place utility.

You cannot be serious: Just what are people talking about here? I have read vague, naïve and ill-informed pieces about data management, data architecture, data warehousing, reporting, business intelligence and a plethora of etcetera that have been passed off as observations and commentary on Big Data. So, what makes people recycle hackneyed, misleading and badly conceptualised ‘content’?

In the commentary on one of Bernard Marr’s pieces on LinkedIn (a professional networking site) I observed that no one can adequately explain what Big Data is without falling into contradictions and fancies, and no one seems to be capable or willing to provide tangible success stories.

Bernard responded to this comment by pointing out “the reason for that is that Big Data means different things to different people.”

Fair enough. It’s an explanation.

That said, I have always had more than a tenuous dislike of postmodern thinking, in fact most things ‘postmodern’. Call me old fashioned, jaded or cynical, but to me, the idea that everything can mean anything is an aberration that I prefer to leave to others.

I am at a loss to explain why so many reasonable people are willing to embrace the hype surrounding Big Data and Big Data Analytics, including the attendant surfeit of nonsense, incongruences and contradictions. From my perspective, it defies reason and good sense.

Therefore, I will just end again with a fabulous quote from Ben Goldacre:

“You cannot reason people out of a position that they did not reason themselves into”.

Many thanks for reading.

All Data: It’s about statistics

30 Friday Jan 2015

Posted by Martyn Jones in All Data, Consider this, DW 3.0, Good Strat, Good Strategy, Information Supply Frameowrk, Martyn Jones, Martyn Richard Jones, statistics

≈ Leave a comment

Tags

All Data, Big Data, business intelligence, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones, statistics


A big computer, a complex algorithm and a long time does not equal science.

Robert Gentleman

To begin at the beginning

Fueled by the new fashions on the block, principally Big Data, the Internet of Things, and to a lesser extent Cloud computing, there’s a debate quietly taking place over what statistics is and is not, and where it fits in the whole new brave world of data architecture and management. For this piece I would like to put aspects of this discussion into context, by asking what ‘Core Statistics’ means in the context of the DW 3.0 Information Supply Framework.

Core Statistics on the DW 3.0 Landscape

The following diagram illustrates the overall DW 3.0 framework:

There are three main concepts in this diagram: Data Sources; Core Data Warehousing; and, Core Statistics.

Data Sources: All current sources, varieties, velocities and volumes of data available.

Core Data Warehousing: All required content, including data, information and outcomes derived from statistical analysis.

Core Statistics: This is the body of statistical competence, and the data used by that competence. A key data component of Core Statistics is the Analytics Data Store, which is designed to support the requirements of statisticians.

The focus of this piece is on Core Statistics. It briefly looks at the aspect of demand driven data provisioning for statistical analysis and what ‘statistics’ means in the context of the DW 3.0 framework.

Demand Driven Data Provisioning

The DW 3.0 Information Supply Framework isn’t primarily about statistics; it’s about data supply. However, the provision of adequate, appropriate and timely demand-driven data to statisticians for statistical analysis is very much an integral part of the DW 3.0 philosophy, framework and architecture.

Within DW 3.0 there are a number of key activities and artifacts that support the effective functioning of all associated processes. Here are some examples:

All Data Investigation: An activity centre that carries out research into potential new sources of data and analyses the effectiveness of existing sources of data and its usage. It is also responsible for identifying markets for data owned by the organization.

All Data Brokerage: An activity that focuses on all aspects of matching data demand to data supply, including negotiating supply, service levels and quality agreements with data suppliers and data users. It also deals with contractual and technical arrangements to supply data to corporate subsidiaries and external data customers.

All Data Quality: Many of the requirements for clean and useable data, regardless of data volumes, variety and velocity, have been addressed by methods, tools and techniques developed over the last four decades. Data migration, data conversion, data integration, and data warehousing have all brought about advances in the field of data quality. The All Data Quality function focuses on providing quality in all aspects of information supply, including data quality, data suitability, quality and appropriateness of data structures, and data use.

All Data Catalogue: The creation and maintenance of a catalogue of internal and external sources of data, its provenance, quality, format, etc. It is compiled based on explicit demand and implicit anticipation of demand, and is the result of an active scanning of the ‘data markets’, ‘potential new sources’ of data and existing and emerging data suppliers.

All Data Inventory: This is a subset of the All Data Catalogue. It identifies, describes and quantifies the data in terms of a full range of metadata elements, including provenance, quality, and transformation rules. It encompasses business, management and technical metadata; usage data; and, qualitative and quantitative contribution data.
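The relationship between the All Data Catalogue and the All Data Inventory can be pictured with a minimal Python sketch. Everything named here is hypothetical and is not part of the DW 3.0 framework itself: the `CatalogueEntry` fields, the sample sources, and in particular the `in_use` flag, which simply assumes that the inventory is the part of the catalogue the organisation actually holds and uses.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One known source of data, internal or external (hypothetical fields)."""
    name: str
    internal: bool
    provenance: str
    quality: str          # free-text rating; the scale is an assumption
    data_format: str
    in_use: bool = False  # assumption: marks sources that belong in the inventory
    metadata: dict = field(default_factory=dict)  # e.g. transformation rules


def inventory(catalogue):
    """Return the subset of the catalogue treated here as the inventory."""
    return [entry for entry in catalogue if entry.in_use]


# Two made-up sources: one held internally, one merely catalogued.
catalogue = [
    CatalogueEntry(
        "crm_customers", True, "internal CRM", "high", "relational",
        in_use=True,
        metadata={"transformation_rules": ["deduplicate on customer_id"]},
    ),
    CatalogueEntry("weather_feed", False, "external supplier", "unknown", "JSON"),
]

inventory_names = [entry.name for entry in inventory(catalogue)]
print(inventory_names)
```

The only point of the sketch is the subset relationship: every inventory entry is also a catalogue entry, and the inventory carries the richer metadata.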

Of course there are many more activities and artifacts involved in the overall DW 3.0 framework.

Yes, but is it all statistics?

Statistics, it is said, is the study of the collection, organization, analysis, interpretation and presentation of data. It deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments; learning from data, and of measuring, controlling, and communicating uncertainty; and it provides the navigation essential for controlling the course of scientific and societal advances[i]. It is also about applying statistical thinking and methods to a wide variety of scientific, social, and business endeavors in such areas as astronomy, biology, education, economics, engineering, genetics, marketing, medicine, psychology, public health and sports, among many others.

Core Statistics supports micro and macro oriented statistical data, and metadata for syntactical projection (representation-orientation); semantic projection (content-orientation); and, pragmatic projection (purpose-orientation).

The Core Statistics approach provides a full range of data artifacts, logistics and controls to meet an ever growing and varied demand for data to support the statistician, including the areas of data mining and predictive analytics. Moreover, and this is going to be tough for some people to accept, the focus of Core Statistics is on professional statistical analysis of all relevant data of all varieties, volumes and velocities, and not, for example, on the fanciful and unsubstantiated data requirements of amateur ‘analysts’ and ‘scientists’ dedicated to finding causation free correlations and interesting shapes in clouds.

That’s all folks

This has been a brief look at the role of DW 3.0 in supplying data to statisticians.

One key aspect of the Core Statistics element of the DW 3.0 framework is that it renders irrelevant the hyperbolic claims that statisticians are not equipped to deal with data variety, volumes and velocity.

Even with the advent of Big Data, alchemy is still alchemy, and data analysis is still about statistics.

If you have any questions about this aspect of the framework then please feel free to contact me, or to leave a comment below.

Many thanks for reading.

Catalogue under: #bigdata #technology

[i] Davidian, M. and Louis, T. A., doi:10.1126/science.1218685


File under: Good Strat, Good Strategy, Martyn Richard Jones, Martyn Jones, Cambriano Energy, Iniciativa Consulting, Iniciativa para Data Warehouse, Tiki Taka Pro
