
Listen up, Big Data playmates! The ubiquitous Big Data gurus, tied up in their regular chores of astroturfing mega-volumes, velocities and varieties of superficial flim-flam, may not have noticed this, but Hadoop is getting set up for one mighty fall – or a fast-tracked and vertiginous black-run descent. Why do I say that? Well, let’s check the market.

According to Gartner there is “continuing enthusiasm for the big data phenomenon”. However, “demand for Hadoop specifically is not accelerating.” So what’s up doc?

The Hadoop companies have been spending big-time on marketing, and I mean big, and lo and behold, the sales are dipping as fast as the prospects evaporate. You might even have noticed this. Of course, this can’t go on forever. You cannot maximise your marketing spend when your revenues are heading south, and not even the most naïve of angels or other types of investors will keep funding misguided corporate welfare for long. Even if you’re flogging a moribund yellow elephant, you make bank or you die.

So what went wrong?

First of all, the Hadoop people got well ahead of themselves. Their own enthusiasm for the toys they helped to ‘create’ got the better of them. In order to understand the commercial problems, though, it’s necessary to delve into some of the techie stuff.

Hadoop is a child of Agile, and Agile is no way to design a solid, coherent and cohesive data management architecture, never mind build one.

Hadoop grew, feature by feature, as a response to specific technical challenges in specific and somewhat peculiar businesses. When it all kicked off, the developers weren’t thinking about creating a new generic data management architecture, one for handling massive amounts of data. They were thinking of how to solve specific problems. Then it rather got out of hand, and the piecemeal scope grew like Topsy, as did the multifarious ways to address the product backlog. At least that’s the impression I get.

At Sperry and then Unisys, I worked in database technology research and development for a number of years, and learned from the best and mentored some of the best. To be frank, if I had to design an architecture for handling massive data volumes, data velocities and data varieties, it would not look like Hadoop, nor would it have the ‘features’ of Hadoop that are there because of the lack of a coherent up-front architecture, or simple naïveté. Hadoop people, who thought they knew data architecture, actually did not, and they were therefore simply winging it. I mean, what sort of eejit would deliberately design a comprehensive parallel data management system with only one master controller (namenode)? Duh!
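To make the single-master complaint concrete, here is a toy sketch of my own (the class and method names are illustrative, not HDFS internals): a cluster where every read must first consult one master that maps files to the workers holding them. Kill the master and every file becomes unreachable, even though all the workers are perfectly healthy.

```python
# Toy illustration of a single-master design (my own naming, not HDFS code):
# one "namenode" holds the only copy of the file-to-worker mapping.

class ToyNameNode:
    """Sole keeper of the metadata: which worker stores which file."""
    def __init__(self):
        self.block_map = {}   # filename -> list of worker ids
        self.alive = True

    def locate(self, filename):
        if not self.alive:
            raise RuntimeError("master down: no reads possible cluster-wide")
        return self.block_map[filename]

class ToyCluster:
    def __init__(self, num_workers):
        self.master = ToyNameNode()
        self.workers = {i: {} for i in range(num_workers)}  # id -> {file: data}

    def write(self, filename, data):
        worker_id = hash(filename) % len(self.workers)
        self.workers[worker_id][filename] = data
        self.master.block_map[filename] = [worker_id]

    def read(self, filename):
        # Every read consults the master first -- the bottleneck and
        # single point of failure complained about above.
        worker_id = self.master.locate(filename)[0]
        return self.workers[worker_id][filename]

cluster = ToyCluster(num_workers=3)
cluster.write("sales.csv", "q1,q2\n100,200")
print(cluster.read("sales.csv"))   # works while the master is up

cluster.master.alive = False       # master dies...
try:
    cluster.read("sales.csv")
except RuntimeError as e:
    print(e)  # ...and the data is unreachable, healthy workers notwithstanding
```

Early Hadoop mitigated this with a secondary namenode for checkpointing, and high-availability configurations came later, but the point stands: the original design had one metadata brain for the whole cluster.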

What’s the take-away from that? Hadoop clearly demonstrates that Agile is not a good approach for all types of software development projects.

You don’t need to read pieces and comments on LinkedIn’s Big Data channel to know that some people, many people, think that Hadoop is the brand new shiny gizmo that will cure all existential threats, bring about world peace and fix the negative effects of climate change. Read items on the Big Data channel if you must – this piece probably won’t get featured there, but anyway. So there they are, the marketers, Big Data gurus, the chancers, PR people, vendors, management consultants, the wacky and waylaid, all of them, ‘bigging it up for Big Data’, up the wazoo. Clueless, almost to a man. They all proclaim the Emperor’s new distributed file system as the greatest thing since time-slicing my little pony, but they are so dumb, so enamoured with their new-best-fad, and so wilfully superficial that they don’t see or care that they are somewhat wrong.

Hadoop? It’s almost all about old technology, which has been repackaged, and frequently not very well. The only thing truly innovative about it is how a repackaging of Unix primitives on parallel platforms can be marketed in such imaginative, exaggerated and inaccurate ways.

Then things moved over to Skid Row when Big Data and Hadoop people started eating their own marketing dog food.

So, the sorry story goes… if Hadoop can be used for managing all data, all data volumes, all data velocities and all data varieties, it would be a small step to displace existing technologies in the Enterprise Data Warehousing and Business Intelligence space. It’s so cool, so powerful and so remarkable that it can even replace IMS, Teradata, DB2, SQL Server and Oracle, without skipping a beat. Right? Wrong!

The idea must have looked very attractive at the time, but the reality is that Hadoop is far too much like using parallel cat, grep, awk and cut (apologies for the techie talk). These Unix (and Linux) primitives have their uses, but Data Warehousing data management is definitely not one of them. Yet no one called out this mistaken conclusion – Oy! Mate! The Emperor has no clothes! – and the silence helped considerably in inflating the sales projections and the massive hype surrounding Hadoop and Big Data.
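For the non-techie reader, here is what that comparison means in practice. This is a minimal sketch of my own, not Hadoop code: the canonical MapReduce example, word count, is in substance the same job as the old shell pipeline cat input | tr ' ' '\n' | sort | uniq -c, just split into a map phase and a reduce phase and wrapped in cluster machinery.

```python
# The canonical MapReduce "word count", stripped of its cluster machinery --
# in substance the same job as: cat input | tr ' ' '\n' | sort | uniq -c
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map: emit a (word, 1) pair per word -- roughly the 'cat | tr' part."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Shuffle and reduce: sort by key, sum per group -- 'sort | uniq -c'."""
    counts = {}
    for word, group in groupby(sorted(pairs, key=itemgetter(0)),
                               key=itemgetter(0)):
        counts[word] = sum(n for _, n in group)
    return counts

lines = ["to be or not to be", "to see or not to see"]
print(reduce_phase(map_phase(lines)))
# {'be': 2, 'not': 2, 'or': 2, 'see': 2, 'to': 4}
```

Running many copies of the map phase in parallel and merging the results is genuinely useful engineering, but it is distribution of old primitives, not a new data management architecture.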

It also turns out that, in spite of the babbling of the usual suspects, Big Data is not for everyone and not everyone needs it. Even the businesses that do benefit from analysing their data can do smaller Big Data using conventional, rock-solid, high-performance and proven database technologies – well-architected and well-packaged technologies that are already in wide use.

Put it this way, when I hear descriptions of how this revolutionary new Hadoop works, it’s a bit like a sophomore who has read up the previous night on the early architectures of Oracle, and is now recounting what they have read, badly, with the concepts and architecture turned into a potpourri of buzz words, incongruences and baloney.

And, you wonder why the corporate uptake of Hadoop is stalling?

Many thanks for reading.
