Blue sky data

Hold this thought: ‘There are big lies, damn big lies and big data science’.

Statistics is a science. Some argue that it is the oldest of sciences. It can be traced back in history to the days of Augustus Caesar, and before.

In 1998, Lynn Billard wrote a paper that laid out the role of the Statistician and Statistics. She stated that “no science began until man mastered the concepts and arts of counting, measuring, and weighting”.[1]

I first became aware of the role of the statistician during my studies. I was studying a combination of philosophy, politics, and economics. Later, my first two managers were enthusiastic members of the Royal Statistical Society (RSS). They were pedagogic as well. The society aims to advance the science and application of statistics. It promotes use and awareness for public benefit.

The RSS do a good job of raising awareness about statistics and statisticians. But maybe they aren’t getting enough people’s attention.

After all, many people seem to think that statistical methods and quantitative analysis were born somewhere around 2001. Which, and sorry for raining on anyone’s parade, is not in fact the case.

To me a statistician is like a true artist.

Let me explain what I mean by that.

Picasso was perhaps the greatest painter of the 20th century.

He is on record as saying, “It took me four years to paint like Raphael.” He added that it took him a lifetime to paint like a child. But that’s not the same as a child painting, with little or no technique, skill or experience.

Picasso projected the visions of a child, through the hands of a genius. Picasso could paint like Raphael, but also as “a child”. He could paint like anyone. Many would argue that he was a true artist.

Which isn’t the same as splodging some abstract and random colourful shapes on canvas. That doesn’t automatically make someone an artist. Not in any modern formal sense. Although, that said, in the age of Postmodern Nonsense[2], anything can be anything. Which however still does not make it a fact.

Those people who watched the American television medical drama House might also make this connection.

In this series, Hugh Laurie played the part of Dr Gregory House.

In entertainment terms, Laurie convinced viewers that he was a credible physician. Only thing is, he wasn’t a physician. He was an actor pretending to be a physician, and he did a great job. He learned his lines well, and he knew how to interpret them to perfection. But as an actor, not as a doctor.

Why do we believe that Big Data is more than just a new name for a collection of old ideas? Why do we think that data science is forward-looking, whereas statistics just deals with the past? Why do we lend more credibility to rebranding than to historical fact?

More to the point, why do people clamour to define themselves as data scientists? Why not choose the more recognizable role of statistician, which is more measurable and manageable? Why not be a modern statistician who can both interpret the past and try to forecast the future correctly?

I am well aware that there has been a proclivity to hire enthusiastic amateurs or certificate-harvesters instead of trained, experienced, and qualified professionals, especially if ‘the price is right’. But it is a proclivity firmly planted in the absurd, incoherent and irrational. It is as absurd as the dialectic notion that two-a-halfpenny qualifications are more important than knowledge and experience.

So, call me old fashioned. But when I need a haircut, I will go to a hairdresser or a barber. I will not go to a hair artiste or a mop-follicle scientist.

When I need someone skilled in a wide range of statistics, I will hire a professional. An experienced Statistician will be my choice.

It’s not exactly rocket surgery.

A good statistician will understand that “not everything that counts can be counted, and not everything that can be counted counts”, a quote variously attributed to either Albert Einstein or William Bruce Cameron.

So, getting down to fundamentals. Why would Statisticians prefer to call themselves Data Scientists? Why are some Data Scientists oblivious to, or misinformed about, the nature of contemporary Statistics and Statisticians?

I think the biggest problem is in the way that the IT industry relentlessly flogs new fads. It’s new lamps for old. No matter how much obfuscation is added, it remains a massive dose of flimflam. The marketing churned into the mixture adds hyperbole.

The other ‘big’ problem is in how so many people are eager to adopt the flimflam trend. They do this in order to wing their way into a ‘data scientist’ niche. Some rebrand themselves as data scientists. This is a reaction to the IT industry’s crude ‘downgrading’ of the role of statistician. This rebranding is often backed by meaningless clichés, logical fallacies, inaccuracies, and blatant misrepresentation.

Using the past to predict or shape the future is nothing new. So why do people pretend that it is new?

Finally, I think it’s clear where this is leading. My prediction for 2016 is that Big Data will not kill the Statistician.

My prediction for 2026 is that the ‘data scientists’ of the day will criticise the next Big Data-like fad, and especially its evangelists. Hopefully, they will point out that the fad concerns something with a long and rich history.

That said, I think the predicament and the ‘challenge’ we face with much of the industry hype is clear. The unquestioning zeal of many big data and data science ‘evangelists’ can be summed up by two absolutely fabulous quotes. Ben Goldacre provides these in Bad Science. “These corporations run our culture, and they riddle it with bullshit.” He also wrote, “You cannot reason people out of a position that they did not reason themselves into.”

Thanks for reading.

[1] Billard, Lynn. The Role of Statistics and the Statistician. The American Statistician, November 1998.

[2] Sokal, Alan, and Bricmont, Jean. Fashionable Nonsense: Postmodern Intellectuals’ Abuse of Science.

As always, please share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link. You can also send me a LinkedIn invite. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.

File under: Good Strat, Good Strategy, Martyn Richard Jones, Martyn Jones, Cambriano Energy, Iniciativa Consulting, Iniciativa para Data Warehouse, Tiki Taka Pro
