Considering the canvas that is the Pacific Ocean. “How on earth” he thought, “can people die of thirst and polluted water, when we have so much fresh, clean and pristine water on this goddam planet?”
The Data Leviathan, Martyn Jones Continue reading
11 Wed Nov 2015
Posted in All Data, Big Data, Data Lake, Inform, educate and entertain.
11 Wed Nov 2015
Posted in Big Data, Data Lake, hadoop, Inform, educate and entertain., Martyn Jones
If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976
Many thanks, Martyn.

If Princess Diana had been alive during the formative years of the Big Data revolution there would have been a plethora of influential Big Data bullshit babblers issuing their gushingly awful pieces in places like Forbes, the WSJ and professional blogging forums about the Big Data humanitarian causes closest to the heart of the people’s princess. And if tragedy had repeated itself, and had been reported not as paparazzi-driven schmaltz or morbid vulgarity, but as something even more rancid and farcical, we would now have a Princess Diana Memorial Data Lake in some regal park in London or Milton Keynes – powered by Hadoop. Because, as the bullshit babblers would have it, “that is what she would have wanted”.
But, is this entirely fair? Should we view the outpourings of the biggest Big Data bullshit babblers on the entire internet as the inevitable result of free will, or is Big Data a message from God, in the same way that hard drugs are a signal to certain rock stars that they have too much available cash?
Which brings me to another issue. In a recent interview, I was given a list of data related terms, and was asked which one I preferred. Big Data, Smart Data, Small Data… you know what I mean. Anyway, I went off on a tangent about domestic pets and anthropomorphism. Okay, so it was logical entrapment, but I wanted to make a point. “Don’t you think that ascribing human behavior and thought to pets is a bit weird?” I asked. “No,” came back the reply. It wasn’t the answer I wanted, because the answer I wanted was “Yes, it certainly is”, not “No, that’s what my mum thinks as well”. I wanted to say: see, people who ascribe human characteristics to dogs strike us as being a bit fanciful, but people who do the same for data? How can a bunch of recognizable symbols embody smartness? I mean, data by itself, of itself, is dumber than a rock.
So why do we pretend that the information, knowledge and the smarts are in the data and that data itself, without the need for any intervention (other than Hadoop, Spark or Hive, etc.), is capable of revealing this smartness?
And the only thing I can think is that we are so desperate to sell useless crap that no one needs or wants, that we are even capable of saying the dopiest of things in order to do so.
Anyway, I was at a Big Data conference recently, and every presenter selling a tool made exactly the same type of pitch: the amazing ways that their tools could establish correlations. Some of the examples of the correlations were so contrived, so obviously the creation of PR rather than the outcome of hands-off automated analysis, that it became seriously embarrassing, not as a professional, but as a human being. What’s more, no one mentioned the absent elephantine concept of causation, so everyone who went in clueless stayed happy in their ignorance throughout the whole wham-bam-thank-you-ma’am dog and pony session.
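For anyone who wants to see how cheaply an impressive-looking correlation can be manufactured, here is a minimal sketch in Python. Everything in it is invented for illustration: the hidden driver (temperature), the two measured series (ice cream sales and sunburn cases), the coefficients and the noise levels are all assumptions, not data from any conference demo. It shows two series that never influence each other correlating strongly because both depend on a third thing, and the correlation collapsing once that driver is removed.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical hidden driver: daily temperature.
temperature = [rng.uniform(5, 35) for _ in range(500)]

# Two made-up series that each depend on temperature, never on each other.
ice_cream = [3.0 * t + rng.gauss(0, 5) for t in temperature]
sunburn = [2.0 * t + rng.gauss(0, 5) for t in temperature]

# The "amazing correlation" a tool would happily report.
r_raw = pearson(ice_cream, sunburn)

# Subtract the driver's contribution (we know the coefficients only
# because we invented them) and correlate what is left.
resid_ice = [s - 3.0 * t for s, t in zip(ice_cream, temperature)]
resid_sun = [s - 2.0 * t for s, t in zip(sunburn, temperature)]
r_resid = pearson(resid_ice, resid_sun)

print(f"raw correlation:      {r_raw:.2f}")
print(f"after removing cause: {r_resid:.2f}")
```

The tool did nothing wrong in computing the first number. What is missing is the question of why the two series move together, and no amount of tooling asks that for you.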
Now, I do think that the sort of data processing associated with Big Data does have a place in the old IT toolkit, but the levels of hype, misappropriation and downright lies are seriously queering the pitch. Just look at some of the Big Data articles in places like Forbes, Information Management and LinkedIn. If you haven’t yet noticed the tendency to use tremendous volumes, varieties and velocities of bullshit to push the Big Data envelope, then you really haven’t been paying enough attention.
Many thanks for reading.
11 Wed Nov 2015
Posted in Big Data, Consider this, Quiz
If you enjoy this piece or find it useful (or something) then please consider joining The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976
Many thanks, Martyn.
For your amusement, delectable enjoyment and delight, I bring you the first in a series of Big Data Quizzes from The Big Data Contrarians – the nicest, most civilised and congenial Big Data community on the entire World Wide Web. Continue reading
11 Wed Nov 2015
As has been stated elsewhere, human resource management is a content and process intensive activity, which makes it somewhat amenable to the deployment of content and process centric IT solutions. In particular, Enterprise Content Management tools that also offer advanced process design and deployment would seem to be an ideal fit for any significant and continuous human resource activity.
Like many other activities in business, the roles and responsibilities embodied in human resource management have emerged, developed and transformed over the years, and with subjective improvements and innovations the field has become more complex, more varied and more concentrated – in a wide range of aspects, but especially in terms of the explosive proliferation of process, business rules and content. Continue reading
11 Wed Nov 2015
Posted in All Data, Big Data, Good Strategy, goodstrat, Martyn Jones
If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976
Many thanks, Martyn.

Pundits far and wide are hailing the end of the period of big data babble, hyperbole and bullshit and are looking forward to an epoch of practical, tangible and verifiable Big Data success stories.
Gartner themselves came out some time ago and declared that Big Data was no longer in the hype cycle. Some took this as a sign that the Big Data bullshit bonanza was over; others were more cynical and suspected a highly orchestrated ruse, a move to the next level in the game plan.
But does this new attitude towards Big Data really ring true?
Accompanying this apparent bold openness, frankness and humility in the ranks of the rehabilitated Big Data bullshit babblers there is an awful lot of what appears to be ‘more of the same’. Or as the people of Thailand might say, “same, same, but different”.
As some of you might know, I am the administrative owner of The Big Data Contrarians community group on LinkedIn, and even I was somewhat taken aback by a recent piece by Bernard Marr entitled 20 Stupid Claims About Big Data. So much so that I wrote a fairly complimentary comment on LinkedIn about it. The thing is, even as I posted it I was thinking to myself “you’ll be sorry”.
Today I read yet another Big Data ‘reformation’ piece on LinkedIn Pulse, this time from Matthew Reaney and with the compelling title of The 5 Myths of Big Data.
Call me naïve, call me illusory, and a believer in humankind’s need for basic decency, but I frequently have the idea that praising moderately acceptable behaviour leads to even more good behaviour. But it was not to be, and as fast as one could say ‘what the hell is going on here?’ back came a surfeit of astroturfed Big Data bananas – from all directions – bigger, brasher and more bogus than ever before.
Make no mistake, Big Data hype hasn’t gone away; it has become more subtle, more cunning and even more misleading.
Leading the charge is the initiative to discredit Data Warehousing by all means possible, and the amount of bullshit, disinformation and blatant lies doing the rounds is beginning to look like Big Data hype reflecting Big Data itself, if only in terms of the vast volumes, varieties and velocities that this Big Data babbling bullshit comes in.
But seriously, we are simply getting more of the same. As the end of the Big Data hype war is declared, we are subject to a bombardment of Big Data baloney via Cloud, IoT, the Hadoop ecosphere (as if using Hadoop was somehow linked to ecology and saving the planet), and especially this incredibly obnoxious and dopey vehicle for Big Data tripe known widely as the Data Lake – more on that stupidity at some other time. But onwards and upwards…
This all reminds me of a joke from many decades ago, retold in part from memory.
A teacher was looking for a subject about which her class pupils could write, to set as a homework exercise.
After much deliberation she decided to ask the children to write about what they thought of the police.
Sure, not a good question, I know, and as I stated, this was many decades ago, when even grown-ups could be innocent and naïve and hopeful.
Anyway, when the children had handed in all their essays, the teacher read the essays and was disappointed to find that most of them were very wishy-washy and that the children were almost all unanimously indifferent or grudgingly respectful of the police, except for one. One of the children, let’s call him Dave, was very critical and had written “I don’t think much of the police.” When the teacher asked Dave why he had written that, he replied “All police is bastards, Miss”. The teacher was vexed by the reply, but being a good and caring teacher she considered how she could change this obviously hostile view of the bobby on the beat and the police detective taking evil doers out of circulation, so she decided to do something about it.
She had a bright idea and took her problem to the police and discussed what could be done to give the children a much more positive view of the police and the work they did, so they would see the police as a necessary part of society, to be respected but not feared.
As a result, the teacher and the police organised a police day at the school. It was a big party, with lots of free goodies, badges and posters, rides in patrol cars, sirens, interesting stories and a movie, and a big discussion with the police dog handler and his faithful and brave police-dog, Ajax. The police took special interest in Dave, he was the one they wanted to convince the most, and he was the one they made the most fuss of.
At the end of the day, the teacher again asked the children to write about what they got from the school police day that she had organised.
The following Monday, after all the essays had been handed in by the children, she sought out and read Dave’s essay, eager with anticipation.
This time it contained the surprising phrase of “I really, really don’t think much of the police.”
Again, the teacher asked Dave why he had written what he had written, especially considering all the effort the police had gone to in order to leave a good and lasting impression with the children in general, and Dave in particular.
He simply replied “the Police is cunning bastards, Miss.”
Personally, I have respect for the professionalism, courage and hard work of many officers in our police forces, but when it comes to my view of certain Big Data pundits – and naming no names, just watch my eyes – the feeling is not the same.
Make of that what you will.
Many thanks for reading.
19 Wed Aug 2015
Posted in Big Data, good start, goodstart, goodstrat
If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:
Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976
Many thanks.
To the layperson anxious for answers to complicated questions, the very idea of bringing together sets of disparate data and turning it into precious insights may seem like magic, a modern day alchemy, a goal placed well beyond the grasp of mere mortals. Fortunately, this is no longer the case, thanks in part to bagatelle-proportioned advances in Big Data and Big Data analytics and massive advances in imagination; we are able to look into the past, the present and the future, with absolute certainty. Continue reading
16 Sun Aug 2015
Posted in Big Data, Consider this, good start, goodstart, Strategy
Tags
Big Data, cynicism, data management, fakes, good start, goodstart, gurus, Martyn Jones, Martyn Richard Jones, Strategy
Why so many ‘fake’ Big Data Gurus?
Where do you all come from?
Where do you all come from?
All your integrity’s gone
Now tell me, where do you all come from?
From ‘Where Do You All Come From’ by Mott the Hoople Continue reading
03 Mon Aug 2015
Posted in Big Data, Consider this, good start, goodstart, Martyn Jones, Strategy
If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:
Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976
Many thanks.
Now everyone is doing Big Data you don’t want to be the odd one out, right? Of course not.
Now, if you are serious about looking at Big Data from a business perspective then I will try and offer you some advice. If you are doing it from an IT or technology perspective, then I wish you good luck, and I hope that your Big Data initiative doesn’t turn into another tech crash-and-burn show.
Now some Big Data pros are telling us that the place to start with Big Data is with strategy. Now, I’m too polite to call this out as abject bullshit, even though it is, and will instead content myself by offering an alternative and simple approach to Big Data.
My first piece of advice is this. DON’T START WITH STRATEGY!
Strategy is a coherent, cohesive and executable response to a significant challenge.
Strategy is not a definition of objective, a wish list of what you are trying to achieve or aspirational goals of a nebulous nature. No, strategy is not the objective but a means of reaching that objective. Strategy is real, tangible and executable. Strategy is doing.
So what is a Big Data strategy?
If a company is looking at the Big Data options, the last place they should want to start out from is strategy. That is as silly an idea as they come. Starting with strategy on the road to formulating viable responses to significant challenges and opportunities is like saying that before you choose strategic options and a realisable strategy, you must have a strategy in place.
Strategy is not working out what you want to achieve. That sort of thing should happen prior to any strategic work. Neither is strategy an exercise in establishing starting points, formulating questions or understanding the challenges. All of this should come well before the major strategy aspects even kick in.
Big Data strategy is a realisable, tangible and manageable response to a significant challenge, one that depends heavily on the availability, usability and credibility of Big Data (or Very Large Data Bases) and the business value of processing that Big Data.
So, a word of advice. If you are thinking of embarking on a Big Data initiative, do not start with strategy. That is a really daft place to start.
Start here instead. With real business imperatives. This is where you are thinking about the big and significant challenges to the business, and how, at a high level of abstraction, you could go about meeting those challenges. Here you identify your challenges and your responses, aligned to your objectives.
If you can identify business imperatives that make it absolutely necessary to include elements of Big Data, then go forward with that mandatory requirement in mind. If not, then don’t try to shoe-horn Big Data into a place where it really isn’t needed or wanted. Because if you go against the grain in this way it may well hurt you and your business, in more ways than you bargained for.
In order to go out looking for data requirements driven by business imperatives, we really need to know what we are looking for.
What we are looking for may be highly tangible or less so. We may have to derive the data we are looking for by refining, aggregating, enriching, filtering and cleansing. Therefore, with those and other aspects in mind, we can go out and find what we need.
From looking at the data requirements, you should have a good idea of potential sources of that data. Agility in this aspect is predicated on the premise that one knows the systems on the IT landscape, the business processes and all the potential sources of data – at a high level at least. So, this is not the sort of work you can do remotely with little or no knowledge of the client’s business, IT setup, processes or culture.
But anyway, after you identify the sources you move on to the next step.
Here you discuss aspects of the data you require with the database / application platform owners to ensure that:
So far so good. Once past these hurdles (and don’t forget this is a super-simplification) we move on to the next.
So, now we know:
Therefore, we go ahead and create a proof of concept or three. Simples!
However, make sure that all prototypes are governed by these simple timeless guidelines:
You run your proof of concept. You analyse, assess and represent your outcomes. You socialise, present and interpret.
When you’ve done that you are now in a good position to estimate the usefulness of the exercise, from both a qualitative and quantitative perspective.
I did not want to touch on specific aspects of technology in this piece, in part, because I did not consider it a central issue in the scheme of things. Of course, as part of creating proofs of concept and pilot schemes you may want to experiment with the swatch (swaith? oh for auto-correction) of technologies out there. So go ahead and evaluate ‘Big Data’ technologies, and don’t forget, the answer to every Big Data technology question isn’t an automatic ‘Hadoop’. There are other valid Big Data technology options around, such as Lustre and GPFS, or even Oracle, Teradata or EXASol. Also, remember this, if all you are working on is a prototype, a proof of concept or a pilot then you can try and negotiate a free license with any of the major DBMS vendors for that initiative. So negotiate, bargain and get the most appropriate technologies with the best deals.
Finally I will leave you with three guidelines to consider:
Many thanks for reading.
In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of novel and innovative techniques, technologies and methods into well architected mainstream information supply frameworks, for primarily strategic and tactical objectives.
As always, please reach out and share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps you will even consider sending me a LinkedIn invite if you feel our data interests coincide. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.
For more on this and other topics, check out some of my other posts:
Absolutely Fabulous Big Data Roles – https://www.linkedin.com/pulse/absolutely-fabulous-big-data-roles-martyn-jones?trk=prof-post
Not banking on Big Data? – https://www.linkedin.com/pulse/banking-big-data-martyn-jones?trk=prof-post
10 amazing reasons to join The Big Data Contrarians – https://www.linkedin.com/pulse/10-amazing-reasons-join-big-data-contrarians-martyn-jones?trk=prof-post
Amazing Data Warehousing with Hadoop and Big Data – https://www.linkedin.com/pulse/cloudera-kimball-dw-building-disinformation-factory-martyn-jones?trk=prof-post
The Big Data Contrarians: The Agora for Big Data dialogue – https://www.linkedin.com/pulse/big-data-contrarians-agora-dialogue-martyn-jones?trk=mp-reader-card
The Big Data Shell Game – https://www.linkedin.com/pulse/big-data-shell-game-martyn-jones?trk=mp-reader-card
Aligning Data Warehousing and Big Data – https://www.linkedin.com/pulse/aligning-data-warehousing-big-martyn-jones?trk=mp-reader-card
Big Data Luddites – https://www.linkedin.com/pulse/big-data-luddites-martyn-jones?trk=mp-reader-card
Data Warehousing Explained to Big Data Friends – https://www.linkedin.com/pulse/data-warehousing-explained-big-friends-martyn-jones?trk=mp-reader-card
Big Data, a promised land where the Big Bucks grow – https://www.linkedin.com/pulse/big-data-promised-land-where-bucks-grow-martyn-jones-6023459994031177728?trk=mp-reader-card
The Big Data Contrarians – https://www.linkedin.com/pulse/big-data-contrarians-martyn-jones?trk=mp-reader-card
Is big data really for you? Things to consider before diving in – https://www.linkedin.com/pulse/big-data-really-you-things-consider-before-diving-martyn-jones?trk=mp-reader-card
Big Data Explained to My Grandchildren – https://www.linkedin.com/pulse/big-data-explained-my-grandchildren-martyn-jones?trk=mp-reader-card

03 Mon Aug 2015
Posted in Big Data, Consider this, good start, goodstart, Martyn Jones, Strategy
Plus ça change, plus c’est la même chose.
Jean-Baptiste Alphonse Karr
I wrote a piece called ‘7 New Big Data Roles for 2015’. I published it on LinkedIn. Many people read it. Some people made suggestions. Others politely ignored it.
I listened to the suggestions, comment and criticisms, and revised the piece as a result.
So here, it is… I hope you like it. And if not, I might try again in six months’ time.
Continue reading
26 Sun Jul 2015
Posted in Big Data, Consider this, Data Warehousing, good start, goodstart, hadoop
Many thanks for reading, and don’t forget, please join The Big Data Contrarians.
Some time back, Bill Inmon, the father of Data Warehousing, took the Hadoop vendor Cloudera to task for putting out some confusing advertising.
In recent times, Cloudera have linked up with Ralph Kimball, who, as some in the data world will know, has been an eternal ‘rival’ of Bill Inmon.
For some, the name of Ralph Kimball has become synonymous with dimensional modelling, and although the Kimball Group once stated that Ralph did not invent the original basic concepts of facts and dimensions, Ralph has contributed much to the development of dimensional modelling and the innovative use of SQL. Subsequently, the Kimball Group reassessed, and are now labelling Ralph as the “Dimensional modelling inventor”.
Kimball and Cloudera have collaborated on a number of initiatives, such as a webinar and slide set, with particular emphasis on the theme of Hadoop and Data Warehousing.
Now, I do not know whether this is intentional or accidental, but this collaboration has produced a lot of disingenuous claims and dubious comparisons, so much so, that I get the impression that building the DW Disinformation Factory is becoming a cottage industry in its own right.
Personally, I can see scenarios in which Big Data complements Enterprise Data Warehousing, and I have explained my vision and possible architectures for these scenarios. However, what some Hadoop vendors are alluding to in the Data Warehousing space is actually quite mischievous and misleading, and is not constructive in the least. In fact, the biggest side-effect is to muddy the Big Data and Data Warehousing waters even further. That is not good, either for the industry or for the customers, or indeed, for the professionals.
In one piece of content from Cloudera, we can read that…
“Dr. Kimball explains how Hadoop can be both:
A destination data warehouse, and also
An efficient staging and ETL source for an existing data warehouse”
On the first point? No, Hadoop will not be replacing Teradata, Oracle, EXASol or any other high-performance relational database management system.
On the second point. Hadoop could support a data source for Data Warehousing, as can many other technologies. However, there is no such animal as an ETL source. There are data sources and data targets, extractions, transformations and loads, and all that cool data management, but ETL is a technology, not a source.
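To make the distinction concrete, here is a minimal sketch in Python of what the three letters actually stand for. It is only an illustration under stated assumptions: the CSV text, the `sales` table and the cleansing rules are all invented, and an in-memory SQLite database stands in for whatever the real target platform would be. The point is simply that the source and the target are things, while extract, transform and load are what you do between them.

```python
import csv
import io
import sqlite3

# Hypothetical data source: a small, messy CSV export.
SOURCE_CSV = "customer,amount\n alice ,10.50\nBOB,3\n,7.25\n"


def extract(text):
    """Extract: read raw rows out of the source, untouched."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse names, drop rows with no customer, type the amounts."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().lower()
        if not name:
            continue  # filter out records we cannot attribute to anyone
        cleaned.append((name, float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleansed rows into the data target."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical data target: an in-memory SQLite database.
target = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), target)

loaded = target.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Swap the CSV string for files sitting in Hadoop and the SQLite connection for a data warehouse and the shape stays the same: Hadoop would be the source, the warehouse the target, and ETL the work done in between.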
I think Big Data may have a big future; it depends on how deeply the internet development culture pervades enterprise application development. A lot of what Big Data addresses is about making up for shortfalls created by badly architected web applications and shoddy application development, in which data use and data persistence were at best workaround bodges, rather than being well designed and coherent approaches to data management.
Maybe this is why some people have a hard time explaining why they are considering using Hadoop technologies for Big Data. What would a CEO say if it was brought to their attention that Hadoop was being used in their business simply to make up for the fact that their internet applications are really shoddy examples of analysis, design, architecture and management? More to the point, what would the shareholders say if they understood the full ramifications behind the need to use Hadoop?
In many cases, I think that Hadoop can be an indication that your IT organisation did something very wrong in the past, and that in these cases Hadoop is the price one pays when one does not want to bite the bullet and admit to screwing up, big time.
In my opinion, it would make more sense to replace applications built on faulty architectures with robust and well-architected applications, rather than fix a problem by overmedicating the patient. This would mean that data generated and used by these applications could simply dovetail into standard decision-support data platforms, such as the Enterprise Data Warehouse.
As for Cloudera and their bizarre and babbling baloney about Hadoop replacing the Data Warehouse? I suggest they read a book on the subject of Building the Data Warehouse, and maybe buck up their ideas a bit. As Bill Inmon stated, “You would think that the executives of Cloudera would have familiarized themselves with what a data warehouse is.”
As for recognised data professionals and influencers who support such Hadoop tripe? The less said the better. Eh, Ralphie?
That stated, maybe Cloudera, Kimball and the Big Data flim-flam merchants simply don’t care.
So go ahead, “turbocharge your Porsche – buy an elephant.”
Many thanks for reading. Don’t forget, please join The Big Data Contrarians. The best Big Data community on the planet.