
GOOD STRATEGY

~ for every significant challenge


Category Archives: Consider this

7 Signals that someone has quit

14 Saturday Mar 2015

Posted by Martyn Jones in Consider this, good start, Good Strat, goodstart, Martyn Richard Jones


Tags

careers, Consider this, good start, Good Strat, Good Strategy, goodstart, Martyn Jones, Martyn Richard Jones, quit


You are the boss. You are the leader, coach and manager, and there are some things that you have just got to learn, like it or not. One of these skills is being able to identify when someone has quit. “How dare they?” I hear you ask.

The first time I quit a job and didn’t tell anybody was when I was in the RAF working as a fighter pilot in World War 2, and I accidentally bombed Newport in South Wales, and was given a stern talking to for my troubles. Well, I didn’t actually quit and I was never in the armed forces and I was born into the era of the Beat Generation, but that’s by the by, it’s just there for effect, to create some artificial empathy between me and those who have actually quit a job and not told anyone about it. Myself, I would never do such a thing. Although to be fair, Newport has looked like it has been freshly bombed with dark green, brown and grey shades of poster paints and self-raising flour, since forever.

Consider this: Big Data Forever!

14 Saturday Mar 2015

Posted by Martyn Jones in Big Data, Consider this, dark data, Martyn Jones


Tags

Big Data, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


“Dans ce pays-ci, il est bon de tuer de temps en temps un amiral pour encourager les autres” (In this country, it is good to kill an admiral from time to time to encourage the others) – Voltaire

My gran used to tell me that honesty pays. Of course, she never really understood banking or IT, probably because she didn’t want to know anything about them, and she never lived to witness the amazing hype circuses, the spin doctors’ spiel or the focus-group dog-and-pony show of the 21st century. Indeed, if honesty were a guaranteed payer my gran would have amassed more wealth than even Warren Buffett himself.

If my gran lived today, she might reflect on what Big Data might be about – maybe she would even consider it benignly, as a sort of shelter for fallen men of once uncertain virtue. We will never know. So onwards and upwards.

The Harvard Business Review contemplated honesty in somewhat different terms:

“Honesty is, in fact, primarily a moral choice. Businesspeople do tell themselves that, in the long run, they will do well by doing good. But there is little factual or logical basis for this conviction. Without values, without a basic preference for right over wrong, trust based on such self-delusion would crumble in the face of temptation.”

In a marvellous book, A Few Good Men from Univac, David E. Lundstrom narrates the story of Sperry Univac in the 1960s, one of the truly great innovators in the first forty years of IT, and includes an allegory taken from the engineering front-line. I will recount it here, edited to highlight the zeitgeist, for your entertainment and, as Voltaire put it, “to encourage the others”:

In the beginning was the Big Data Plan.

And then came the Big Data Assumptions.

And the Assumptions were without form.

And the Plan was without substance.

And darkness was upon the face of the Workers.

And they spoke amongst themselves, saying: “It is a crock of shit, and it stinketh.”

And the workers went unto their Supervisors and said: “It is a pail of dung, and none may abide the odor thereof.”

And the Supervisors went unto their Managers, saying: “It is a container of excrement, and it is very strong, such that none may abide by it.”

And the Managers went unto their Directors, saying: “It is a vessel of fertilizer, and none may abide its strength.”

And the Directors spoke amongst themselves, saying to one another: “It contains that which aids plant growth, and it is very powerful.”

And the Vice Presidents went unto the President, saying unto him: “This new plan will actively promote the growth and vigor of the company, with powerful effects.”

And the President looked upon the Big Data Plan, and saw that it was good.

“But,” I hear you say, “why fight it? Why not take advantage of the Big Data zeitgeist?”, “Why not cash in on the grand bonanza Big Data bandwagon?” or “Monetise the three famous Vs of Big Data?”

Well, it had crossed my mind, briefly, and (outside of the USA) we’ve all done stuff we have not entirely believed in, so the temptation to cash in is present, capisci? This paraphrasing of a piece from My Blue Heaven might give you a better idea:

One of my best friends makes his living as a completely phony Big Data Scientist. For two hundred bucks he can make you a Data Scientist or a Big Data guru. Some guys give you an education but this guy gives you immediate access to high paying jobs, sex that would make the 256 trillion Shades of Blah blush and a life in the City, the Big Apple or a small town in Germany.

Moreover, for an extra 250 bucks (limited time offer) you can also become a certified Big Data Neuro Trainer, which will allow you to do unto others what has been done unto you.

I also considered Big Data Brokerage, Big Data Certification and Big Data Independent Trading (New York – Paris – Peckham). The opportunities are immense.

However, what happens when the Big Data well runs dry, and I (and many others) get tarnished with the mark of Big Data and become pariahs by complicity, collusion or simple association?

That question I will leave for another day. But just consider the following.

All right, I admit, I am a big long-time fan of comic genius Mel Brooks, who has a knack for capturing deep insight from the human condition, especially when the human condition is off guard and shallow. In that vein, this is how I like to think the dialogue from the Dole Office scene from History of the World, Part I would have gone, if he were to write it today:

Dole Office Clerk: Occupation?

Data Magnus Comicus: Stand-up Big Data scientist.

Dole Office Clerk: What?

Data Magnus Comicus: Stand-up Big Data scientist. I coalesce the vaporous datas of the human interaction with the social-media networking, Internet of Everything, and always-connected experience into a… viable, analytical and meaningful predictive-comprehension.

Dole Office Clerk: Oh, a Big Data bullshit artist!

Data Magnus Comicus: *Grumble*…

Dole Office Clerk: Did you bullshit Big Data last week?

Data Magnus Comicus: No.

Dole Office Clerk: Did you try to bullshit Big Data last week?

Data Magnus Comicus: Yes!

Finally, I leave you with some wise words from Israeli American professor of psychology and behavioural economics, Dan Ariely:

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”

Many thanks for reading.

Consider this: Big Data and the Curse of the Temple of Java

13 Friday Mar 2015

Posted by Martyn Jones in agile, Big Data, Consider this, java


Tags

agile, Big Data, java, refractor



“Rats, rats for sale. Get your rats. Good for rat stew, rat soup, or the ever-popular ratatouille”. – Mel Brooks

Hold this thought: Everything that the Templars of Java touch turns to dreck.

In a small and timeless village in misty and mountainous Transylvania, the locals mourn the passing of yet another victim.

On the windswept beaches of a wintry Costa Blanca, the reverberating voice of childish despair is barely perceptible through the crashing of the waves on the grey, cold and craggy rocks.

In Victorian London, a hobgoblin of indescribable and vacuous insanity stalks the silent and rain-drizzled streets.

Cracking this curse will take more than the combined powers of Clint Eastwood, Mel Brooks and Homer Simpson.

A spectre haunts the face of Europe, the spectre of Big Data and the Curse of the Temple of Java.

Everything that the disciples of the Temple touch turns to blah. Everything that the disciples call their own has been blagged from elsewhere.

Take the very language of Java itself, an authentic eccentricity amongst computing languages. If Java code were real coffee grains, it would be used to make the shittiest coffee in the history of humankind.

Given the vast amounts of knowledge and experience that were washing around IT at the time of Java’s hatching, it must be considered to be the most demonic aberration of a programming language ever conceived by woman, man or beast.

“Cats have a scam going – you buy the food, they eat the food, they go away; that’s the deal.” – Eddie Izzard

If ever there was an excuse in IT for failing to deliver or for delivering badly and late, then Java is your friend.

In the hands of the right people, Java can turn a one year and $3M project into a five year and $300M project, and still not deliver anything of use.

Yet magically, and out of the people directly responsible for these debacles, no one is sacked, sued or busted as a result, the incumbent supplier either quietly leaves the scene or is rewarded for their gross incompetence and dishonesty, and in many cases a success is hailed, even if that success looks remarkably like abject failure. It is totally false, absolutely dishonest and thoroughly unprofessional. But that’s what we have, like it or not.

Java sucks. It is a horrid language, aesthetically and functionally; it’s a piecemeal pile of doo-doo, a dirty old ragbag of ‘object-oriented’ hacks, logical aberrations and lagoons of missing structure, dysfunctional rationality and discontinuity – and that’s not just my opinion:

“I spent several months programming in Java. Contrary to its authors’ prediction, it did not grow on me. I did not find any new insights – for the first time in my life, programming in a new language did not bring me new insights. It keeps all the stuff that I never use in C++ – inheritance, virtuals – OO gook – and removes the stuff that I find useful.” – Alexander Stepanov

“Claiming Java is easier than C++ is like saying that K2 is shorter than Everest.” – Larry O’Brien

“I would rather use Java than Perl. And I’d rather be eaten by a crocodile than use Java.”

“If I wanted plastic scissors I’d use Java. Give me my scalpel back.”

And for the record, even Linus Torvalds hates it.

But if you thought Java was a horrid, hype-infested viper’s den of programming bad practice and hyper-hype, just wait until you see what’s behind Hadoop.

As long as the world is turning and spinning, we’re gonna be dizzy and we’re gonna make mistakes. – Mel Brooks

Hadoop must be the biggest piece of technical and rhetorical bullshit in the history of data management.

Repackage a series of Unix primitives (cat, grep, awk, cut, sed, wc) built on top of parallel Linux or Unix. Dress it up, take it out on the town, and call it the greatest thing since sliced bread. It is nothing less than a brazen and blatant con. Want to count words? Use wc (Unix wordcount).
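For readers without a Unix shell to hand, here is a minimal Python sketch of the word-counting exercise in question. It illustrates only the counting itself, not how Hadoop or wc are implemented; the function names and the sample text are my own assumptions.

```python
import re
from collections import Counter


def word_count(text: str) -> int:
    """Count whitespace-separated words, in the spirit of `wc -w`."""
    return len(text.split())


def word_frequencies(text: str) -> Counter:
    """Count how often each lower-cased word occurs (hypothetical helper)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Illustrative sample text, not taken from any real data set.
sample = "rats for sale get your rats"
total = word_count(sample)
frequencies = word_frequencies(sample)
print(total)
print(frequencies.most_common(1))
```

On a single machine and a modest file, a few lines like these (or wc itself) do the job; whether that holds at scale is a separate question.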

Let me repeat that, using other words. If you made a compilation of extracts from the works of the world’s greatest thinkers and authors, randomly replaced some of the words, and produced and published this compilation as all your own work, what would you call that?

So back to when this happens, frequently, in IT.

This might fool the foolish who don’t have the first idea about anything technical, objective or rational beyond WhatsApp, kiddy scripting and HTML, but if you have a clue, you know that this is a scam, a very big one. It is also dishonest.

So how do they (the scammers) get away with it?

Easy. You have bad apples everywhere. But there is another reason. For well over a decade the world of IT has become the dumping ground for the stupid, lazy and indolent kids of the comfortable middle-classes and also a hunting ground for unscrupulous wide-boys.

Listen up parents!

Do you think that your kid is way too thick to be a doctor, scientist, lawyer, researcher, professor, teacher, statistician, health worker, politician, bus driver, street cleaner, entrepreneur, sandwich maker or economist?

Your kid has no creativity beyond messing with their food?

Your kid has no sporting ability apart from skills at gaming?

The only academic ability your kid has is your money?

No worries!

IT for you, my son!

So if that’s you, then lap it up. Real knowledge and experience will not come your way, but you will learn the dogma of the Temple of Java, and you will be able to repeat it to perfection, just like Pavlov’s favourite dog.

You will learn to be pliable, usable and even more gullible. You will know bugger all about practical IT or the architecture, evolution and application of information technology and data, and vendors will love you for it, for you will be just an extension of their idea of increasing the profit rate.

This is how IT business has become the refuge of liars, cheats, pimps and the chronically dopey, and this is why Java and Hadoop have become the ultimate expression in programming and data. It’s a geeky Greek tragedy being played out as we speak. O tempora, o morons.

But it isn’t just about Java and Hadoop. Everything the Templars of Java touch turns to dreck. Whether we are looking at aberrations and failures in rapid joint application development, end user computing, database design (refractor this, dimwits!), or solutions and domain architecture, and more, the dead cold hand of the Java Mafia is invariably behind it.

And now, to top it all off, the miserable Templars of Java want to take over and displace Bill’s Data Warehousing. You couldn’t make it up.

So, who will save the IT world from the evil doers?

To paraphrase Homer Simpson: I’m not normally a praying man, but if you’re up there, please save us, Wonder Woman.

Thank you so much for reading.

Big Data and the Rise of Instant Analytics: The word

12 Thursday Mar 2015

Posted by Martyn Jones in Consider this, spoof


Tags

spoof


Magnus data wis novum bulla – data est homo ex Walliam

The subject of Big Data and the rise of instant analytics has been covered intensively by the world’s media over the past decade or so, it has been hyped to the heavens and praised to the skies. The constantly changing fashionable take on Big Data and the intensification of instantaneous analytics demonstrates the depth of the subject and the volumes, varieties and velocities of the data and the speculation. Given that its influence pervades our privileged society, it is important to remember that ‘what goes up must come down’, ‘what goes around, comes around’ and ‘a data byte in time, generates 9 terabytes of machine learning’. It is therefore an unfortunate consequence of our society’s history that Big Data and instant analytics is rarely given rational consideration by global commercial initiatives and global governance, whom I can say no more about due to legal constraints. Though I would rather not be in consort with the devil – i.e. the downside of human projection and superstition – I will now examine the primary drivers behind Big Data and the rise of instant analytics.

Social Factors

Society is a human product. When J H Darcy said ‘fervour will spread’ [1] she must have been referring to Big Data. Both tyranny and democracy are tried and questioned. Yet Big Data, The Mail on Sunday and instant analytics raises the question ‘why not?’

When one is faced with people of today a central theme emerges – Big Data and the role of the ‘data scientist’ is either adored or despised, it leaves no one undecided. It has been said that the one type of society that could survive a nuclear attack is a Big Data driven one. This is hypothetically incorrect, actually nuclear powered neuro-cockroaches are the only things that can survive an all-out nuclear attack perpetrated by the evil doers.

Economic Factors

Is unemployment inherently bad for an economy? Yes. We shall examine the Maiden-Tuesday-Lending model, which I hope will be familiar to most readers.

[Graph: National Debt plotted against Big Data and the rise of instant analytics]

Clearly, the graph demonstrates a strong correlation. Why is this? Obviously, the national debt will continue to follow Big Data, data science and instant analytics for the near future. The financial press seems unable to make up its mind on these issues, which unsettles investors.

Political Factors

Politics, we all agree, is a fact. Comparing the electoral politics of most Western and Eastern European countries is like comparing pre and post war views of Big Data and the rise of instant analytics.

It is always enlightening to consider the words of one of the great political analysts Augstin Rock ‘A man must have his cake and eat it in order to justify his actions.’ [2] What a fantastic quote. Both spectacular failure and unequalled political accomplishment may be accredited to Big Data and the rise of instant analytics.

Is Big Data and the rise of instant analytics politically correct, in every sense? Each man, woman and to a lesser extent, child, must make up their own mind.

Conclusion

In conclusion, Big Data and the rise of instant analytics may not be the best thing since sliced bread, but it’s still important. It sings a new song, brought up a generation and statistically it’s great.

Here with the final word is Hollywood’s Denzel Travolta: ‘You win some, you lose some, but Big Data and instant analytics wins most often.’ [3]

Thanks

I would like to thank Professor Afilonius Jones, Professor Chon Quenadi and Doctor Ardio Weltweit for collaborating in the writing of this piece and for correcting the draft.

[1] J H Darcy – The Spaniard – 1988 – PPT

[2] Rock – Roll It Up – 1977 – F. Lower Publishing

[3] Weekly Big Data and the rise of instant analytics – Issue 54 – Rhino Media

What’s all the fuss about Dark Data? Big Data’s New Best Friend

10 Tuesday Mar 2015

Posted by Martyn Jones in All Data, Big Data, Consider this, dark data, Good Strat


Tags

All Data, Big Data, dark data, data architecture, data management, Good Strat, Martyn Jones, Martyn Richard Jones


What is Dark Data?

Dark data, what is it and why all the fuss?

First, I’ll give you the short answer. The right dark data, just like its brother right Big Data, can be monetised – honest, guv! There’s loadsa money to be made from dark data by ‘them that want to’, and as value propositions go, seriously, what could be more attractive?

Let’s take a look at the market.

Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes” (IT Glossary – Gartner)

Techopedia describes dark data as being data that is “found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making.” (Techopedia – Cory Janssen)

Cory also wrote that “IDC, a research firm, stated that up to 90 percent of big data is dark data.”

In an interesting whitepaper from C2C Systems it was noted that “PST files and ZIP files account for nearly 90% of dark data by IDC Estimates.” and that dark data is “Very simply, all those bits and pieces of data floating around in your environment that aren’t fully accounted for:” (Dark Data, Dark Email – C2C Systems)

Elsewhere, Charles Fiori defined dark data as “data whose existence is either unknown to a firm, known but inaccessible, too costly to access or inaccessible because of compliance concerns.” (Shedding Light on Dark Data – Michael Shashoua)

Not quite the last insight, but in a piece published by Datameer, Joe Nicholson wrote that “Research firm IDC estimates that 90 percent of digital data is dark” and went on to state that “This dark data may come in the form of machine or sensor logs” (Shine Light on Dark Data – Joe Nicholson via Datameer)

Finally, Lug Bergman of NGDATA wrote this in a sponsored piece in Wired: “It” – dark data – “is different for each organization, but it is essentially data that is not being used to get a 360 degree view of a customer.”

Say what?

Okay, let’s see if we can be a bit more specific about the content of dark data.

Items on the dark data ticket include: email; instant messages; documents; SharePoint content; content of collaboration databases; ZIP files; log files; archived sensor and signal data; archived web content; aged audit trails; operational database backups – full and incremental; roll-back, redo and spooled data files; sunsetted applications (code and documentation); partially developed and then abandoned applications; and code snippets.

Most importantly, dark data is data that is not actively in use, is underutilised, or is something else. Seriously.

What can you do with it?

So, the conclusion that some have come to is this: there is a vast collection of data in various formats waiting to be monetised.

Personally, the idea that really grabs my attention is the potential ability to do novel forensic research on email. If only to find out what happened in the past.

For example, maybe it would be fascinating to see how significant challenges were identified, flagged and discussed; how strategic responses to those challenges were formulated, chosen and executed; and, how the outcomes of all of that process were reflected in email communications.

I think that this line of work can be very interesting for some people, and that interesting insights may be uncovered, but I would hate to have to put a tangible value on it, if only to avoid adding to the already galactic magnitudes of nonsense and hype surrounding certain data topics.

There are other more mundane uses of dark data.

Imagine that you are just about to embark on a Data Warehouse project (you really are a late adopter, aren’t you?), and you want to establish a base collection of historical data. Where do you get that historical data from?

Right! Operational databases are not characteristically used to store significant amounts of historical reference data and historical transactions beyond a certain time window. There are performance and other reasons for keeping OLTP systems as lean as possible, so initial loads of historical data are typically recreated in the Data Warehouse from backups, audit trails or logs.

Dark data and data governance

You don’t need a Chief Data Officer in order to be able to catalogue all your data assets. However, it is still a good idea to have a reliable inventory of all your business data, including the euphemistically termed Big Data and dark data.

If you have such an inventory, you will know:

What you have, where it is, where it came from, what it is used in, what qualitative or quantitative value it may have, and how it relates to other data (including metadata) and the business.

What needs to be kept, and for how long, and what can be safely discarded, and when.

The risks associated with the retention or loss of that data.

If you don’t have such a catalogue and have never done a data inventory then a full data inventory and audit seems to be your new best friend.
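By way of illustration only, the inventory described above can be sketched as a simple record per data asset. This is a minimal Python sketch under my own assumptions: the `DataAsset` fields, the `can_discard` rule and the example asset are hypothetical, not a prescribed catalogue design.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DataAsset:
    """One entry in a data inventory (all field names are illustrative)."""
    name: str                                            # what you have
    location: str                                        # where it is
    source: str                                          # where it came from
    used_in: List[str] = field(default_factory=list)     # what it is used in
    value_note: str = ""                                 # qualitative or quantitative value
    related_to: List[str] = field(default_factory=list)  # links to other data and metadata
    keep_until: Optional[date] = None                    # None means no retention decision yet
    risk_note: str = ""                                  # risk of retaining or losing it


def can_discard(asset: DataAsset, today: date) -> bool:
    """Assumed rule: discard only if retention has expired and nothing uses the asset."""
    if asset.keep_until is None:
        return False
    return today > asset.keep_until and not asset.used_in


# Hypothetical example entry.
old_logs = DataAsset(
    name="web server logs 2009",
    location="tape archive",
    source="retired web shop",
    keep_until=date(2015, 1, 1),
    risk_note="may contain customer IP addresses",
)
print(can_discard(old_logs, date(2015, 3, 10)))
```

The point of the sketch is that an asset with no retention decision is never silently discarded; somebody has to decide first.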

What does it mean?

Simply stated, you may have dark data that has value, or it may be a simple collection of worthless digital nostalgia. But if you don’t know what you have, it may pay to find out what’s there, and if necessary, to let it go.

There is no point in hoarding unneeded and unwanted rubbish data. That is simply not good data management.

Finally a word on all the fuss surrounding dark data.

Failure to monetise when there is value to be obtained from dark data is one thing; claiming that value can invariably be obtained whilst actually not knowing what the data is, or how it could be monetised, is just adding to the mountain of data-related ‘nonsense and hype’ doing the rounds these days. Please consider not adding to that mountain.

That’s all folks

British Rail, the national UK rail company, used to be notorious for the number of delays and cancellations to services, and their reasons for failing to meet their obligations became stranger and stranger.

In winter, it would snow and there would be problems. And people would ask ‘how come you couldn’t deal with the snow this year, we’ve had snow for centuries?’ And back came the answer ‘Yes, Sir, but this year it was the wrong type of snow’. In autumn (the fall), it was ‘the wrong type of leaves’ and ‘the wrong type of rain’, and in summer, ‘the wrong type of sunshine’, and so on and so forth.

I hope this will not be the excuse from the Big Data and dark data pundits and punters when the much-vaunted and ‘almost’ guaranteed monetisation isn’t frequently realised.

‘Of course Big Data gives you big dollar benefits, it was just littered with the wrong type of data’ or ‘you just weren’t trying hard enough’.

Many thanks for reading.

Consider this: Big Data is not Data Warehousing

06 Friday Mar 2015

Posted by Martyn Jones in Big Data, Consider this, Data Warehousing, Good Strat, hadoop, hdfs, Martyn Jones


Tags

Big Data, enterprise data warehousing, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


Hold this thought: To paraphrase the great Bob Hoffman, just when you think that if the Big Data babblers were to generate one more ounce of bull**** the entire f****** solar system would explode, what do they do? Exceed expectations.

I am a mild-mannered person, but if there is one thing that irks me, it is when I hear variations on the theme of “Data Warehousing is Big Data”, “Big data is in many ways an evolution of data warehousing” and “with Big Data you no longer need a Data Warehouse”.

Big Data is not Data Warehousing, it is not the evolution of Data Warehousing and it is not a sensible and coherent alternative to Data Warehousing. No matter what certain vendors will put in their marketing brochures or stick up their noses.

In spite of all of the high-visibility screw-ups that have carried the name of Data Warehousing, even when they were not Data Warehouse projects at all, the definition, strategy, benefits and success stories of data warehousing are known, they are in the public domain and they are tangible.

Data Warehousing is a practical, rational and coherent way of providing information needed for strategic and tactical option-formulation and decision-making.

Data Warehousing is a strategy-driven, business-oriented and technology-based business process.

We stock Data Warehouses with data that, in one way or another, comes from internal and optional external sources, and from structured and optional unstructured data. The process of getting data from a data source to the target Data Warehouse involves extraction, scrubbing, transformation and loading, ETL for short.
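To make the four ETL steps concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: the `orders` table, the sample CSV and the scrubbing rule (drop rows with a missing key or amount) are hypothetical, and a real warehouse load would involve far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical extract from an operational system (illustrative data only).
RAW = """order_id,country,amount
1, es ,10.50
2,uk,
3,UK,7.25
"""


def extract(source):
    """Extraction: read the source rows as dictionaries."""
    return list(csv.DictReader(source))


def scrub(rows):
    """Scrubbing: trim whitespace and drop rows missing a key or an amount."""
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        if row["order_id"] and row["amount"]:
            cleaned.append(row)
    return cleaned


def transform(rows):
    """Transformation: convert types and make country codes consistent."""
    return [(int(r["order_id"]), r["country"].upper(), float(r["amount"])) for r in rows]


def load(conn, rows):
    """Loading: insert the prepared rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(conn, transform(scrub(extract(io.StringIO(RAW)))))
print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Keeping each step as its own function mirrors the way the process is described: each stage can be inspected, tested and replaced without touching the others.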

Data Warehousing’s defining characteristics are:

Subject Oriented: Operational databases, such as order processing and payroll databases and ERP databases, are organized around business processes or functional areas. These databases grew out of the applications they served. Thus, the data was relative to the order processing application or the payroll application. Data on a particular subject, such as products or employees, was maintained separately (and usually inconsistently) in a number of different databases. In contrast, a data warehouse is organized around subjects. This subject orientation presents the data in a much easier-to-understand format for end users and non-IT business analysts.

Integrated: Integration of data within a warehouse is accomplished by making the data consistent in format, naming and other aspects. Operational databases, for historic reasons, often have major inconsistencies in data representation. For example, a set of operational databases may represent “male” and “female” by using codes such as “m” and “f”, by “1” and “2”, or by “b” and “g”. Often, the inconsistencies are more complex and subtle. In a Data Warehouse, on the other hand, data is always maintained in a consistent fashion.

Time Variant: Data warehouses are time variant in the sense that they maintain both historical and (nearly) current data. Operational databases, in contrast, contain only the most current, up-to-date data values. Furthermore, they generally maintain this information for no more than a year (and often much less). In contrast, data warehouses contain data that is generally loaded from the operational databases daily, weekly, or monthly, which is then typically maintained for a period of 3 to 10 years. This is a major difference between the two types of environments.

Historical information is of high importance to decision makers, who often want to understand trends and relationships between data. For example, the product manager for a Liquefied Natural Gas soda drink may want to see the relationship between coupon promotions and sales. This is information that is almost impossible – and certainly in most cases not cost effective – to determine with an operational database.

Non-Volatile: Non-volatility means that after the data warehouse is loaded there are no changes, inserts, or deletes performed against the informational database. The Data Warehouse is, of course, first loaded with cleaned, integrated and transformed data that originated in the operational databases.
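Two of the characteristics above lend themselves to a short illustration: integration (mapping inconsistent source codes to one representation) and the time-variant, non-volatile store (dated snapshots that are appended and never updated). The following is a minimal Python sketch under my own assumptions; the source system names, the `AppendOnlyHistory` class and its methods are hypothetical.

```python
from datetime import date

# Assumed code tables for three hypothetical operational systems,
# following the "m"/"f", "1"/"2" and "b"/"g" example in the text.
GENDER_CODES = {
    "payroll": {"m": "male", "f": "female"},
    "orders": {"1": "male", "2": "female"},
    "crm": {"b": "male", "g": "female"},
}


def integrate_gender(source_system: str, code: str) -> str:
    """Integration: translate a source-specific code into one consistent value."""
    return GENDER_CODES.get(source_system, {}).get(code.strip().lower(), "unknown")


class AppendOnlyHistory:
    """Time-variant, non-volatile store: snapshots are added, never changed."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot_date: date, records):
        """Append a dated copy of each record; existing rows are left untouched."""
        for record in records:
            self._rows.append({"snapshot_date": snapshot_date, **record})

    def as_of(self, snapshot_date: date):
        """Return the rows exactly as they were loaded on a given date."""
        return [row for row in self._rows if row["snapshot_date"] == snapshot_date]


history = AppendOnlyHistory()
history.load(date(2015, 3, 1), [{"employee": 7, "gender": integrate_gender("payroll", "F")}])
history.load(date(2015, 3, 8), [{"employee": 7, "gender": integrate_gender("orders", "2")}])
print(history.as_of(date(2015, 3, 1)))
```

Note that the lookup is keyed by source system as well as by code, since a bare “1” or “b” means nothing until you know which system produced it.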

We build Data Warehouses iteratively, a piece or two at a time, and each iteration is primarily a result of business requirements, and not technological considerations.

Each iteration of a Data Warehouse is well bound and understood – small enough to be deliverable in a short iteration, and large enough to be significant.

Conversely, Big Data is characterised as being about:

Massive volumes: so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it, and

High variety: not only structured data, but also the whole range of digital data, and

High velocity: the speed at which data is generated, transmitted and received.

These are known as the three Vs of Big Data, and they are subject to significant and debilitating contradictions, even amongst the gurus of Big Data (as I have commented elsewhere: Contradictions of Big Data).

From time to time, Big Data pundits slam Data Warehousing for not being able to cope with the Big Data type hacking that they are apparently used to carrying out, but this is a mistake of those who fail to recognise a false Data Warehouse when they see one.

So let’s call these false flag Data Warehouse projects something else, such as Data Doghouses.

“Data Doghouse, meet Pig Data.”

Failed or failing Data Doghouses fail for the same reasons that Big Data projects will frequently fail. Both will almost invariably fail to deliver artefacts on time and to expectations; there will be failures to deliver value or even simply to return a break even in costs versus benefits; and of course, there will be failures to deliver any recognisable insight.

Failure happens in Data Doghousing (and quite possibly in Big Data as well) because there is a lack of coherent and cohesive arguments for embarking on such endeavours in the first place; a lack of real business drivers; and, a lack of sense and sensibility.

There is also a willing tendency to ignore the advice of people who warn against joining in the Big Data hubris. Why do so many ignore the ulterior motives of interested parties who are solely engaged in riding on the faddish Big Data bandwagon to maximise the revenue they can milk off punters? Why do we entertain pundits and charlatans who ‘big up’ Big Data whilst simultaneously cultivating an ignorance of data architecture, data management and business realities?

Some people say that the main difference between Big Data and Data Warehousing is that Big Data is technology, and Data Warehousing is architecture.

Now, whilst I totally respect the views of the father of Data Warehousing himself, I also think that he was being far too kind to the Big Data technology camp. However, of course, that is Bill’s choice.

Let me put it this way, if Oracle gave me the code for Oracle 3, I could add 256 bit support, parallel processing and give it an interface makeover, and it would be 1000 times better than any Big Data technology currently in the market (and that version of Oracle is from about 1983).

Therefore, Data Warehousing has no serious competing paragon. Data Warehousing is a real architecture, it has real process methodologies, it is tried and proven, it has success stories that are no secrets, and these stories include details of data, applications and the names of the companies and people involved, and we can point at tangible benefits realised. It’s clear, it’s simple and it’s transparent.

Just like Big Data, right?

Well, no.

See what I mean?

Therefore, the next time someone says to you that Big Data will replace Data Warehousing or that Data Warehousing is Big Data, or any variations on that sort of ‘stupidity’ theme, you can now tell them to take a hike, in the confidence that you are on the side of reason.

Many thanks for reading.

More perspectives on Big Data

Aligning Big Data: http://www.linkedin.com/pulse/aligning-big-data-martyn-jones

Big Data and the Analytics Data Store: http://www.linkedin.com/pulse/big-data-analytics-store-martyn-jones

A Modern Manager’s Guide to Big Data: http://www.linkedin.com/pulse/managers-guide-big-data-context-martyn-jones

Core Statistics coexisting with Data Warehousing

Accommodating Big Data

And a big thank you to Bill Inmon (the father of Data Warehousing and of DW 2.0)

Aligning Big Data – Chinese

03 Tuesday Mar 2015

Posted by Martyn Jones in Big Data, Consider this

Tags

Big Data, Good Strat, Martyn Jones


Aligning Big Data – Chinese version is thanks to Optimus Prime – published on http://www.36dsj.com/archives/23692

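Translation: Data Warehouse DW 3.0, a general architectural framework and model for Big Data

A 36大数据 feature. Original author: Martyn Jones. This article was translated by 欧显东 of 1号店 and submitted to 36大数据, which has been granted exclusive publication rights. Reproduction requires the consent of this site and of the author; any reproduction that does not credit the author and the source is refused.

Introduction:

To bring some semblance of simplicity, coherence and integrity to the Big Data debate, I am sharing an evolving model for pervasive information architecture and management.

It aligns and positions Big Data within a more general architectural framework, one that integrates data warehousing (DW 2.0), business intelligence and statistical analysis.

The model is currently called the DW 3.0 Information Supply Framework, or DW 3.0 for short.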

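Recap

A previous and related blog piece, “Data Made Simple – Even ‘Big Data’”, looked at three broad types of data: Enterprise Operational Data; Enterprise Process Data; and Enterprise Information Data. See the figure below:

Figure 1 – Simplified data model

In short, the data types can be defined as follows:

Enterprise Operational Data: This is the data used in applications that support the day-to-day running of a business.

Enterprise Process Data: This is the measurement and management data collected to show how the operational systems are performing.

Enterprise Information Data: This is primarily data collected from internal and external data sources, the most significant source typically being Enterprise Operational Data.

These three underlying types of data are the foundation of DW 3.0.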

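The main body

The following figure shows the overall DW 3.0 framework:

Figure 2 – DW 3.0 Information Framework

There are three main elements in this diagram: Data Sources, Core Data Warehousing and Core Statistics.

Data Sources: This element covers all the current sources, varieties and volumes of data available to support the processes of ‘challenge identification’, ‘option definition’ and decision making, including statistical analysis and scenario methods.

Data Warehousing: This is an evolutionary path from the DW 2.0 model. It extends the data warehousing paradigm to include not only unstructured and complex data, but also the information and outcomes derived from statistical analysis performed outside the Core Data Warehousing landscape.

Core Statistics: This element covers the core statistical capability, especially, but not only, with regard to evolving data volumes, data velocity, data quality and data variety.

The focus of this piece is Core Statistics. The relationships between the three elements, and their combined effect, will also be mentioned.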

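Core Statistics:

The following figure focuses on the core elements of the model:

Figure 3 – DW 3.0 Core Statistics

The figure above illustrates the flow of data and information through the process of data acquisition, and then on to statistical analysis and the integration of outcomes.

The model also introduces the concept of the Analytics Data Store. This is arguably the most important architectural element.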

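Data Sources

For the sake of simplicity there are three explicitly named data sources in the diagram (the enterprise data warehouse or its dependent data marts can of course also act as a data source), but in this piece I focus on the following three: Complex Data; Event Data; and Infrastructure Data.

Complex Data: This is unstructured or highly complexly structured data contained in documents and other complex data artefacts, such as multimedia files.

Event Data: This is an aspect of Enterprise Process Data, typically at a fine-grained level of abstraction. It includes business process logs, internet web activity logs and other similar sources of event data. The volumes generated by these sources tend to be higher than those of other data sources, and they are the ones currently associated with Big Data: the masses of information generated by tracking even the most minor piece of behavioural data, for example someone casually browsing a web site.

Infrastructure Data: This aspect includes data that could be described as signal data: continuous high-velocity streams of potentially highly volatile data, processed through complex event correlation and analysis components.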

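The revolution starts here

Here I will briefly highlight some of the guiding principles behind the architectural elements.

Without a business imperative there is no reason to do it: What does this mean? It means that every significant action, even a highly speculative one, must have a tangible and credible business case behind it. The distinction is as clear as the difference between the Sage of Omaha and Santa Claus.

All architectural decisions are based on a full and deep understanding of what needs to be achieved and of all the available options: For example, there must be a reason for rejecting the use of a high-performance database management product, even if that reason is cost. The rejection should not be based on technical opinions such as “I don’t like that vendor”. If Hadoop makes sense, use it; if Exasol, Oracle or Teradata makes sense, use them. You have to be technology agnostic, not a technology dogmatist.

Statistics and non-traditional data sources are fully integrated into the future data warehousing architectural landscape: Building more corporate silos, whether through acts of commission or omission, will lead to greater inefficiencies, greater misunderstanding and greater risk.

The architecture must be coherent, cohesive, usable and cost-effective: If not, what’s the point, right?

No technology, technique or method is ruled out: We need to be able to incorporate, at low cost, any relevant existing or emerging technology.

Reduce early, reduce often: Massive volumes of data, especially at high velocity, are problematic. Reducing those volumes, even if we cannot in theory reduce the velocity, is absolutely essential. I elaborate on this point separately.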

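Reduce early, reduce often

Here I expand on the theme of early data reduction through filtering and aggregation. We may be generating ever greater volumes of data, but that does not mean we need to hoard all of it in order to get some value from it.

Put simply, this means carrying out the initial ETL (extract and transform) as close to the data generator as possible. It is the concept of the database adapter, but in reverse.

Let’s look at a scenario.

A company wants to carry out some speculative analysis of the many internet web-site activity logs it collects every minute of every day, so it runs the massive log files through a distributed map-reduce platform.

It can then analyse the resulting data.

The problem, as with many web sites developed by hackers and designers rather than by engineers, architects and database experts, is that the company is lumbered with enormous and unwieldy artefacts, such as massive log files of verbose, obtuse data that adds little value.

What do we need in order to remove this challenge?

We need to rethink web logging, and then we need to redesign it.

We need to be able to parse log data in order to reduce the massive data footprint created by badly designed and verbose data.

We need the dual option of being able to send data continuously to an event appliance, which can be used to reduce data volumes on an event-session basis.

If we must use log files, then many small log files are preferable to a few massive ones, and more log cycles are preferable to fewer. We must also maximise the benefits of parallel logging.

So now the log data reaches the point of use either through log files, through log files produced by an event appliance (as part of the toolkit of the analytics data harvesting adapter), or through messages sent by that appliance to a signalling point.

Once the data has been transmitted (by traditional file transfer/sharing or by messaging) we can move to the next step: ET(A)L – Extract, Transform, Analyse and Load.

For log files we would typically employ ET(A)L, but for messages we of course do not need the E, the extract, because that is a direct connection.

Again, ET(A)L is another form of reduction mechanism, which is why the analysis aspect is included: it ensures that the data that gets through is the data that is needed, and that the junk and noise of no recognised value is cleaned out early and often.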
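To make ‘reduce early, reduce often’ a little more tangible, here is a minimal Python sketch of an ET(A)L pass over a web activity log. Everything in it is an assumption made for illustration: the log format, the idea that `heartbeat` events are noise, and the function names are hypothetical and are not part of DW 3.0 itself. It shows one possible shape of the idea, not a definitive implementation.

```python
from collections import defaultdict

# Hypothetical log format, assumed for illustration only:
# "<ISO timestamp> <session id> <event> <path>"
RAW_LOG = """\
2015-03-03T10:00:01 s1 view /home
2015-03-03T10:00:02 s1 heartbeat -
2015-03-03T10:00:05 s2 view /pricing
2015-03-03T10:00:09 s1 view /pricing
2015-03-03T10:00:11 s2 heartbeat -
2015-03-03T10:00:15 s1 purchase /checkout
"""

# Assumption: these event types carry no recognised analytical value.
NOISE_EVENTS = {"heartbeat"}


def transform(line):
    """T: parse one raw log line into a small record."""
    timestamp, session, event, path = line.split()
    return {"ts": timestamp, "session": session, "event": event, "path": path}


def analyse(record):
    """A: decide whether a record is worth keeping at all."""
    return record["event"] not in NOISE_EVENTS


def reduce_by_session(lines):
    """Filter out noise, then aggregate to one summary per session."""
    summary = defaultdict(lambda: {"events": 0, "paths": set(), "purchased": False})
    for line in lines:
        if not line.strip():
            continue
        record = transform(line)
        if not analyse(record):
            continue  # junk and noise is dropped here, early
        row = summary[record["session"]]
        row["events"] += 1
        row["paths"].add(record["path"])
        row["purchased"] = row["purchased"] or record["event"] == "purchase"
    return dict(summary)


def load(summary):
    """L: stand-in for loading into an analytics data store."""
    return [
        {
            "session": session,
            "events": row["events"],
            "distinct_paths": len(row["paths"]),
            "purchased": row["purchased"],
        }
        for session, row in sorted(summary.items())
    ]


rows = load(reduce_by_session(RAW_LOG.splitlines()))
for row in rows:
    print(row)
```

In this made-up sample, six raw log lines shrink to one summary row per session before anything is stored. Run close to the data generator, the same filter-then-aggregate step is what keeps the volumes down, even when the velocity cannot be reduced.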

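Analytics Data Store

The Analytics Data Store (which may be a distributed data store in some cloud) supports the data requirements of statistical analysis. Here the data is organised, structured, integrated and enriched to meet the ongoing and occasionally fluctuating needs of the statisticians and data scientists focused on data mining. Data in the Analytics Data Store can be accumulated or completely refreshed. It can have a short life span or a significantly long one.

Analytics data is at the heart of the Analytics Data Store. Not only can it be used to supply data to the statistical analysis process, it can also provide long-term persistent storage for analysis outcomes and scenarios, and for future analysis, hence the ability to ‘feed back’.

The data and information in the Analytics Data Store can also be supplemented by, and derived from, data held in the data warehouse, and it may also benefit from having its own dedicated data mart designed specifically for that purpose.

The outcomes of statistical analysis on the Analytics Data Store may also produce feedback used to tune the data filtering and enrichment rules, whether in the smart data analytics, complex event and discrimination adapters or in the ET(A)L jobs.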

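Summary

This has necessarily been a very brief look at what is currently labelled DW 3.0.

The model does not seek to define statistics or how statistical analysis is to be applied, which has been done more than adequately elsewhere. It seeks to show how statistics can be accommodated in an extended DW 2.0 architecture, without the need to come up with reactionary and ill-fitting solutions to problems that can be solved in better and more effective ways through sensible, sound engineering principles and the judicious application of appropriate methods, technologies and techniques.

Original article: Aligning Big Data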

Contradictions of Big Data

01 Sunday Mar 2015

Posted by Martyn Jones in Ask Martyn, Big Data, Consider this

Tags

Big Data, data management, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


What we’ve been told

We’ve been told that Big Data is the greatest thing since sliced bread, and that its major characteristics are massive volumes (so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it), high variety (not only structured data, but also the whole range of digital data), and high velocity (the speed at which data is generated and transmitted). Also, from time to time, much to the chagrin of some Big Data disciples, a whole slew of new identifying Vs are produced, touted and then dismissed (check out my LinkedIn Pulse article on Big Data and the Vs).

So, beware. Things in Big Data may not be as they may seem.

It’s not about big

I have been waging an uphill battle against the nonsensical and unsubstantiated idea that more data is better data, but now this view is getting some additional support, and from some surprising corners.

In a recent blog piece on IBM’s Big Data and Analytics Hub (Big data: Think Smarter, not bigger), Bernard Marr wrote that “the truth is, it isn’t how big your data is, it’s what you do with it that matters!”

Elsewhere, SAS echoed similar sentiments on their web site: “The real issue is not that you are acquiring large amounts of data. It’s what you do with the data that counts.”

Can we call that ‘strike one’ for Big Data Vs?

It’s not about variety

It is claimed that 20% of digital data is structured, a claim based on the problematic suggestion that structured data is uniquely relational. It is also claimed that unstructured data includes CSV files and XML data, and that such data makes up far more than that 20% of the data generated. But this definition is simply wrong.

If anything, CSV data is structured, and XML data is highly structured, and it’s typically regular ASCII data. So it does not add variety, even though it is not structured in the ways that some people might expect, especially if that someone lacks the required knowledge and experience. Simply stated, CSV data is structured, it’s just that it lacks rich metadata, but that doesn’t make it unstructured.
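A minimal Python sketch makes the point. The sample data and column names below are made up for illustration. The standard library’s `csv` module recovers named fields and rows without any help, so the structure is clearly there; what is missing is metadata such as data types, which the reader has to supply.

```python
import csv
import io

# Made-up sample data, for illustration only.
SAMPLE = """customer_id,country,order_total
1001,ES,250.00
1002,FR,99.50
1003,ES,120.25
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
orders = list(reader)          # every row becomes a dict keyed by column name
columns = reader.fieldnames    # the header row already names the fields

# The structure is all there. What CSV lacks is metadata such as types,
# so every value arrives as text and we apply the type ourselves.
total_by_country = {}
for row in orders:
    country = row["country"]
    total_by_country[country] = total_by_country.get(country, 0.0) + float(row["order_total"])

print(columns)
print(total_by_country)
```

Note that `order_total` arrives as a string and has to be converted explicitly. That is the ‘lacks rich metadata’ part; it has nothing to do with the data being unstructured.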

“But”, I hear you say “what about all the non-textual data such as multi-media, and what about the masses of unstructured textual data?”

Take it from me, most businesses will not be basing their business strategies on the analysis of a glut of selfies, home videos of cute kittens, or the complete works of William Shakespeare or Dan Brown. Almost all business analysis will continue to be carried out on structured data obtained primarily from internal operational systems and external structured data providers.

Strike two! Third time lucky?

It’s not even about velocity

So, if we accept that Big Data isn’t really about the data volumes or data variety, that leaves us with velocity, right? Well no, because if it isn’t about record-breaking VLDB or significant data variety, then for most commercial businesses the management of data velocity becomes either less of an issue or no issue at all. The fact that some software vendors and IT service suppliers set up this ‘straw man’ argument and then knock it down with the ‘amazing powers’ of their products and services is quite another matter.

Strike three, and counting.

It’s not about the manageability of Big Data

We have been told time and again that the major difference between a data scientist and a professional statistician is that the ‘scientists’ know how to cope very well with massive volumes, varieties and velocities of data. Now it turns out that this is also questionable.

According to Bob Violino writing in Information Management (Messy Big Data Overwhelms Data Scientists – 20 February 2015) “Data scientists see messy, disorganized data as a major hurdle preventing them from doing what they find most interesting in their jobs”. So, when it comes to data quality and structure the ‘scientists’ don’t really have an advantage over professional statisticians.

Last year Thomas C. Redman, writing in the Harvard Business Review (Data’s Credibility Problem), noted that when Big Data is unreliable “managers quickly lose faith” and “fall back on their intuition to make decisions, steer their companies, and implement strategy”, and that when this happens there is a propensity to reject potentially “important, counterintuitive implications that emerge from big data analyses.”

Strike four?

The new analytics aren’t new

Data science and Big Data analytics are the new kids on the block, aren’t they?

Well, here are some real life scenarios.

A major banking equipment supplier: A lot of banking equipment is hybrid analogue-digital; a simple example would be a photocopier or a physical document processing device. One major supplier decided to capture the sensor data produced by its devices in order to predict failures and problems. Predictive preventive maintenance rules are created and corroborated using the data generated by sensors on each customer device, and these rules then get incorporated into the devices’ logic.

A major IT vendor: What happens when you create an intersection and convergence between technologies, techniques and methods from areas of mainstream IT, data architecture and management, statistics (quantitative and qualitative analytics) and data visualisation, artificial intelligence/machine learning and knowledge management? This is precisely what one of the main European IT vendors did, and the idea proved to be quite attractive to customers, prospects and investors.

A major integrated circuit supplier: The testing of ICs at the ‘fabs’ (manufacturing plants) generates a serious amount of data. This data is used to detect errors in the IC manufacturing process; it is captured and analysed in as near real-time as possible, which is necessary because over-running the production of faulty ICs is costly. To get around this problem the company uses a combination of fast data capture, transformation and loading of data into a data analytics area to ensure early and precise problem detection.

All Big Data Analytics success stories?

The first happened in 1989, the second in 1993 and the third in 2001. Yes, Big Data and Big Data analytics are sort of newish.

Strike five.

The science is frequently not very scientific

What is science?

According to Vasant Dhar of the Stern School of Business (Data Science and Prediction), Jeff Leek (The key word in “Data Science” is not Data, it is Science), and repeated on Wikipedia, “In general terms, data science is the extraction of knowledge from data”. Well, excuse me if I beg to differ. I have seen data scientists at work, and the word science doesn’t actually jump out and grab you. It’s difficult to make the connection, just as it is to accurately connect some popular science magazines with fundamental scientific research.

If a professional and qualified statistician wants to label themselves a data scientist then I have no issue with that, it’s their problem, but I am not willing to lend credibility to the term ‘data scientist’ when it is merely an interesting job title, with at most a tenuous connection to the actual role, and one that is liberally applied, with the almost customary largesse of IT, to creative code hackers and business-averse dabblers in data.

As Hazelcast VP Miko Matsumura suggested in Data Science is Dead “… put “Data Scientist” on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you’ll be the King or Queen of a rotting whale-carcass of data” and ” Don’t be the data scientist tasked with the crime-scene cleanup of most companies’ “Big Data”—be the developer, programmer, or entrepreneur who can think, code, and create the future.”

Strike six.

And the value is questionable

DATA: “Data is a super-class of a modern representation of an arcane symbology.” – Anon

If I had a dollar for every time I heard someone claim that data has intrinsic positive value then I would be as wealthy as Warren Buffett.

If I have said it once, I have said it a hundred times. In order for data to be more than an operational necessity it requires context.

Providing valid data with valid context turns that data into information.

Data can be relevant and data can be irrelevant. That relevance or irrelevance of data may be permanent or temporary, continuous or episodic, qualitative or quantitative.

Some data is meaningless, and there are cases whereby nobody can remember why it was collected or what purpose it serves.

Taking all this into account we can ask the deadly pragmatic question: what value does this data have? Which is sometimes answered with a pertinent ‘no value whatsoever’.

Strike seven.

So what is it really about?

It is said that Big Data is changing the world, but for all intents and purposes, and shamed by previous Big Data excesses, some people are rapidly changing the definitions and parameters of Big Data in order to position it as something more tangible and down-to-earth, whilst moving it away from its position as an overhyped and dead-ended liability.

Big Data is a dopey term, applied necessarily ambiguously to a surfeit of tenuously connected vagaries, and its time has come and gone. So, let’s drop the Big Data moniker, and embrace the fact that data is data, and long live ‘All Data’, yes, all digital data. Let’s consider all data for what it’s worth to the business, and not for what some chatterers reckon its value is, having as they do little or no insight into the businesses to which they refer, or into the data that these businesses possess.

So, when push comes to shove, is Big Data really about high volumes, high velocity and high variety, or is it in fact about much noise, too much pomposity and abundant similarity leading to unnecessary high anxiety?

Thanks very much for reading.

Being dishonest about honesty

01 Sunday Mar 2015

Posted by Martyn Jones in Consider this

Tags

dishonesty, ethical leadership, honesty, hypocrisy, personal development, psychology


BEING DISHONEST ABOUT HONESTY

I never touched a gun in my life. That and that alone forever doomed me to middle management.

Vincent “Vinnie” Antonelli

From: My Blue Heaven

Okay. For the record, I never lie; just ask my cousins, Rocky 1, 2 and 3.

Ooops… yeah, as I was saying…

Now hold this thought: Have you ever told someone “you are the most beautiful person in the world”?

Put it this way, every so often, there are blog pieces, especially on LinkedIn, exhorting people to be honest, always honest and for their own good, and frankly, to me, this is the sordid and despicable height of dishonesty.

Let me state this up front, in my view, honesty is always the best policy, especially if you have a bad memory. Honesty in the workplace makes a lot of sense. So try to be honest in your chosen or imposed profession or work activity, and enthusiastically so.

Now, should you also temper this view with the realities of working life in market-oriented or capitalist societies? Indeed, you may be someone who has not had such good fortune, but the question still stands. Can you be pragmatic and still maintain a moral compass?

You work in a bank and notice what you believe to be irregularities in the accounting process; you denounce those irregularities, because you are honest, right?

Your best friend runs a satellite dish company and you suspect that they may be doing work off their books to avoid paying taxes; you report them to the IRS, right? Because you are honest.

Your 89-year-old pot-smoking neighbour, a onetime best friend of your parents, is growing marihuana in her bathroom; you report her to the cops, right? Because you are honest.

The caretaker at the school claims to have been mortally wounded in the Great Klingon Wars, and you call them out on their lies. They may be sacked because of your honesty, but that’s okay. Right?

Here’s one close to my heart; really, truly, honestly… Oops, now how did that happen? This is not close to my heart, this is business. So, you are working for a company that is trying to get into the big time and the fast bucks with Big Data. Your peers claim they are the experts, when actually they are not, they claim that the software is new and without equal, even though you know it is based on really old technology in a fancy new package. Do you denounce them for their lies, deception and guile? You know, for the sake of honesty.

A salesperson gilds the lily in a presentation to a client. Even though the client isn’t fazed by this, because they know the game, and they aren’t entirely candid themselves, you still report them for lying, right? Because you are honest, and they are very, very naughty people.

The corporation you are working for, like many others, continually reinvents its history and its product and service line; it is altering the facts to suit the market. You denounce them as well, right? Because you are honest, and they are so wrong.

Well, no.

When it comes to the truth, there are many sanctimonious, two-faced puritanical hypocrites out there.

I expect both people and businesses, especially businesses, to gild the lily, to stretch the point, to exaggerate, to invent histories, anecdotes, success stories or to spin failures as successes, to sex up, back fill, hype up and to offer flim flam as fact.

Which is why we have contracts, pertinent contract clauses, incentives, penalties and lawyers.

Show me someone who says that captains of industry and great leaders have never lied to anyone, or misrepresented, exaggerated or diminished something in some way, shape or form; that leaders have never tricked someone, failed to be entirely candid with all of their management team or fooled an entire organisation, and a long list of etceteras; and I’ll show you someone who is being, to put it politely, naïve. Of course, it could also be that they are simply lying. More to the point, if I had a dollar for every time a management consultant told a porkie, I would be surrounded by mountains of Ben Franklins.

The people I will never trust are those who claim to be above the human condition, a superior form of being, all sacred and without profanity. These people cannot be trusted at all because they are permanently mendacious, and delusional, freakily so, and they either do not actually know it or are unwilling to recognise it.

So remember this: honesty may be the best policy, but reality and pragmatism dictate that it’s not the only best policy, and that we live in an unfair, volatile and competitive world, where the biggest liars are those who pretend they couldn’t possibly lie.

Moreover, before I leave you, just remember this:

[Vincent “Vinnie” Antonelli is questioned about the stolen goods in the trunk of the car he stole]

Hannah Stubbs: The books…

Vinnie: You have something against books?

Hannah Stubbs: I have nothing against books! I am curious about the books in your trunk.

Vinnie: You see, I was thinking of writing my story, so I bought this one on how to do it.

Hannah Stubbs: Why do you need 25 copies of it?

Vinnie: In case I want to read it more than once…

Thank you for reading.

Addendum – Here’s something else to consider:

This is from the back cover of Dan Ariely’s latest book The Honest Truth About Dishonesty: How We Lie to Everyone–Especially Ourselves

The New York Times bestselling author of Predictably Irrational and The Upside of Irrationality returns with a thought-provoking work that challenges our preconceptions about dishonesty and urges us to take an honest look at ourselves.

Does the chance of getting caught affect how likely we are to cheat?
How do companies pave the way for dishonesty?
Does collaboration make us more or less honest?
Does religion improve our honesty?

Most of us think of ourselves as honest, but, in fact, we all cheat. From Washington to Wall Street, the classroom to the workplace, unethical behavior is everywhere. None of us is immune, whether it’s a white lie to head off trouble or padding our expense reports. In The (Honest) Truth About Dishonesty, award-winning, bestselling author Dan Ariely shows why some things are easier to lie about than others; how getting caught matters less than we think in whether we cheat; and how business practices pave the way for unethical behavior, both intentionally and unintentionally. Ariely explores how unethical behavior works in the personal, professional, and political worlds, and how it affects all of us, even as we think of ourselves as having high moral standards.

But all is not lost. Ariely also identifies what keeps us honest, pointing the way for achieving higher ethics in our everyday lives.

With compelling personal and academic findings, The (Honest) Truth About Dishonesty will change the way we see ourselves, our actions, and others.

Big Data in Question – Again

01 Sunday Mar 2015

Posted by Martyn Jones in All Data, Big Data, Consider this

Tags

All Data, Big Data, data management, Good Strat, good strat blog, Good Strategy, Martyn Jones, Martyn Richard Jones


Big Data is now an inhospitable and unhealthy land inhabited by those who, through accident or design, deceive naïve and sentimental bystanders and those who are willingly misled.

When all of this Big Data malarkey started it was sort of funny, humorous and occasionally witty, especially in the affected, bizarre and frequently uninhibited ways that freshly-minted self-appointed gurus and experts would “big it up”.

Doctor Freud would have had a field day with all of that, being as it was, and for that matter still is, a postmodern mishmash of Riefenstahl, Freddie Mercury and Monty Python on steroids. However, after that extended, operatic and high-camp hiatus it all went downhill.

The Big Data scene is fast becoming an outrageous and brash festival of deception, disinformation and obliviousness. Which is a pity, because it does the industry no good whatsoever.

It is telling that Big Data evangelists, gurus and assorted sycophants cannot even define Big Data adequately, never mind discuss (or for that matter, point at) tangible success stories, without falling into contradictions on all of the key defining characteristics of volume, variety and velocity, and resorting to crude debating devices to avoid or finesse the concerns and the questions.

Almost every morning I check out the industry news, and almost invariably, it comes with new mind-boggling examples of Big Data nonsense.

However, it isn’t always nonsense for nonsense’s sake, there are agendas, there are rational explanations why Big Data has become at the same time, one of the most hyped up fads in the history of IT, and one that its supporters find so difficult to actually explain and justify, in any reasonable sort of way.

Therefore, when it comes to Big Data, beyond the surfeit of platitudes, clichés, bluff and bluster, the only things in play are the interests of industry, the patrons, the courtesans and their entourage of the innocent and the beguiled.

One of the biggest deceptions in Big Data is in the misleadingly named ‘success stories’. The thing is that most of these success stories that I have ever read have been:

  • So vague that it’s difficult to know how success is being defined never mind reached.
  • So secretive and obtuse is the avoidance of naming names, locations and other relevant Big Data references that it’s impossible to corroborate if these claims are actually true or not. Disclaimer: I have worked for some of the biggest IT vendors, and in senior roles, and I know what is behind comments such as “the Big Data project is a success, although the client name and project are confidential” and “it’s delivering such major competitive advantages that we are obliged to keep it under wraps”.
  • Stories stolen from elsewhere, such as from Data Warehousing, Business Intelligence, VLDB or Business Application projects.
  • Borderline fantasies and badly contrived technology fan fiction.

However, it doesn’t stop there.

One of the clearest examples of the questionable nature of Big Data evangelism is when it is used to piggyback Big Data hype on simple, tangible and immediately recognisable artefacts or applications that have little in common with Big Data.

This is an extreme illustration, but it works like this: “iPhones are commercially successful, iPhones are part of Big Data, and therefore Big Data is commercially successful.”

As if the mere conjuring up of association, affinity and proximity will convince people of the great and growing value of Big Data.

What I am also referring to are publicity pieces that may as well have been titled:

  • Smith, Galbraith, Mies, Keynes, Homer Simpson and the economic justification of Big Data
  • Lovelace, Babbage, von Neumann, Eckert, Davies, Codd, Knuth, Naur and the technological underpinnings of Big Data
  • Einstein, Freud, Edison, Faraday, Recorde and the intellectual structure of Big Data
  • Socrates, Kant, Hegel, Marx, Adorno and the philosophical correctness of Big Data
  • Great quotes about Big Data, from the Cambrian era to the postmodern époque
  • Great jokes about Big Data, from Mel Brooks to Steve Martin
  • Sportspeople and Big Data, from Lottie Dod and Babe Ruth to Rafa Nadal and CR7
  • Industry support of Big Data, from Henry Ford to Neutron Jack

Do you recognise similarities?

It’s no big deal, just the use of unreliable, misleading and inappropriate fallacies, dressed up as cute, plausible and accessible collateral. People may think that such things are clever and witty, but they aren’t, it’s just misleading.

Let’s continue with something simple.

Evasion is, in ethics, an act that deceives by stating a true statement that is immaterial or leads to a false deduction. For example, citing events, persons or anecdotes from the history of IT to justify the supposed or imaginary value of Big Data. This is close to the notion of a non sequitur, which of course is an argument, the conclusions from which do not follow from its premise. It falls short of being full-on sophistry, purely because the simplistic, puerile and superficial arguments put forward in favour of Big Data do not match those of the true sophist who seeks to reason with clever but fallacious and deceptive arguments. Too many of the Big Data arguments are fallacious and deceptive, but no one, equipped with a reasonable capacity for critical thinking, should take such ‘arguments’ as valid.

Hold this thought: Big Data hype is a viper’s nest of logical fallacies, white lies and disinformation.

Just when I think things could not get any weirder, they do, and the Big Data ceiling of hyperbole rises even higher, up to the rarer atmosphere of extreme tendentiousness.

There is a growing mass of Big Data hoop-la, hyperbole and flim flam that exceeds all previous bounds of overstatement, solecism and confabulation. This is where the real volumes, varieties and velocities are in Big Data; in hokie.

We live, as Oscar Wilde said in his day, in an age of surfaces. Yes, superficiality, puerility and short-termism are the competing orders of the day. However, I am still amazed – and maybe wrongly so – by what ostensibly professional, experienced and knowledgeable people are willing, able and prepared to accept, especially when it comes to Big Data flim flam sauce.

Here are some examples of the nonsense about Big Data that is taken as gospel by ‘adults’:

Data Warehousing is part of Big Data: No comment.

Big Data will replace Enterprise Data Warehousing: People can’t even explain the features and benefits of Big Data. I try to make it as easy as possible: ‘if you can’t say it, point to it’. But, seriously, people can’t even relate tangible and credible Big Data success stories, never mind show how it will replace Enterprise Data Warehousing, whether that’s the Inmon or Kimball flavour, take your pick.

Everyone and every organisation can benefit from Big Data: If people can’t explain this, and they don’t in terms of tangible benefits, then the claim should remain questionable.

Data Scientists will replace Statisticians: Why is that so? It is claimed that Data Scientists are uniquely equipped to handle massive volumes, varieties and velocities of data – well, as it turns out, this isn’t certain either.

Big Data is in its infancy: I think we may be confusing infancy with lack of real traction, and of time and place utility.

You cannot be serious: Just what are people talking about here? I have read vague, naïve and ill-informed pieces about data management, data architecture, data warehousing, reporting, business intelligence and a plethora of etcetera that have been passed off as observations and commentary on Big Data. So, what makes people recycle hackneyed, misleading and badly conceptualised ‘content’?

In the commentary on one of Bernard Marr’s pieces on LinkedIn (a professional networking site) I observed that no one can adequately explain what Big Data is without falling into contradictions and fancies, and no one seems to be capable or willing to provide tangible success stories.

Bernard responded to this comment by pointing out “the reason for that is that Big Data means different things to different people.”

Fair enough. It’s an explanation.

That said, I have always had more than a tenuous dislike of postmodern thinking, in fact most things ‘postmodern’. Call me old fashioned, jaded or cynical, but to me, the idea that everything can mean anything is an aberration that I prefer to leave to others.

I am at a loss to explain why so many reasonable people are willing to embrace the hype surrounding Big Data and Big Data Analytics, including the attendant surfeit of nonsense, incongruences and contradictions, and from my perspective, it defies reason and good sense.

Therefore, I will just end again with a fabulous quote from Ben Goldacre:

“You cannot reason people out of a position that they did not reason themselves into”.

Many thanks for reading.

← Older posts
Newer posts →

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 635 other subscribers








