GOOD STRATEGY

Category Archives: good start

Consider this: The ten key dimensions of Applied Business Knowledge and AI

22 Thursday Jun 2017

Posted by Martyn Jones in 4th generation Data Warehousing, Artificial Intelligence, Ask Martyn, good start, Good Strat, Good Strategy, Good Strategy Radio, knowledge management, Marty does, Martyn does, Martyn Jones, Martyn Richard Jones

AI, KNOWLEDGE, INFORMATION AND DATA

“Knowledge is the capacity to give correct answers to questions.”

“There is no well trodden path that takes a straight line from symbols, through data to knowledge and wisdom. This is just some nonsense invented by the IT industry.” – Martyn Jones

We may define data as being the symbolic representation of value or conversely of something which has no attached value. Data may represent, among other things, time, money, resources or worldly objects.

Continue reading →

Seven Magnificent Big Data Success Stories

31 Wednesday Aug 2016

Posted by Martyn Jones in 4th generation Data Warehousing, All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, Cambriano, dark data, data architecture, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, good start, Good Strat, Good Strategy, goodstart, goodstartegy, goodstrat, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, The Amazing Big Data Challenge, The Big Data Contrarians

Martyn Richard Jones

Lora del Rio, 31st August 2016

Big data has arrived. Big Data is here for keeps. Big Data is the future.

Despite some of the malicious, mendacious and malodorous words of naysayers, sceptics and contrarians, the world of big data and big data analytics is replete with totally amazing and fabulous success stories.

Big Data gurus are often accused of not delivering coherent, cohesive and verifiable accounts of Big Data successes. Which is understandable but at the same time a pity. So here, to illustrate this miraculous and remarkable turnaround, I give you not three but seven of the many Big Data success stories that I could have casually grabbed out of the ether.

First, we take a trip to Glasgow to discover the leveraging of Big Data in alternative investments. Then we pass over to Boston to explore the magic of Big Data at Universal Legal. We venture through Switzerland and innovative marketing. Explore the heights of Dongalong Creek. Have a word with the good folks at Heisenberg Labs. Then round it off with a quick in-depth summary of Big Data at Choppers. So, here we go…

Continue reading →

Big Data on the Roof of the World

29 Monday Aug 2016

Posted by Martyn Jones in 4th generation Data Warehousing, All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, business, Business Intelligence, business strategy, dark data, data architecture, Data governance, Data Lake, data management, good start, Good Strat, Good Strategy, goodstart, goodstartegy, goodstrat, Martyn does, Martyn Jones, Martyn Richard Jones

Once upon a time, there was a mountain known as Peak 15. Very little was known about it. Then in 1852, surveyors found it was the highest in the world, and they named it Everest.

As with other significant challenges that we can identify in life, many people have been driven by a passionate desire to conquer peaks all around the world. This is just one illustration of those of us who can identify their significant challenges and rise to them. This sharp focus, determination and courage turns ordinary citizens into people who are invariably on a mission. People who know what they want. Continue reading →

The Digital Document Lifecycle

01 Tuesday Dec 2015

Posted by Martyn Jones in data architecture, Data governance, data management, ECM, good start, Good Strat, Good Strategy, governance, Management, Martyn Jones, Martyn Richard Jones, Uncategorized

Tags

Content Management, ECM, Good Strat, Martyn Jones, Strategy

The Digital Document Lifecycle

MARTYN RICHARD JONES

To begin at the beginning

This is a story of the life of a digital document. Its purpose is to explain the process of analysing, designing, building, testing and delivering content rich business artefacts in today’s digital age.

Continue reading →

Big Data, ESP and Transubstantiation

19 Wednesday Aug 2015

Posted by Martyn Jones in Big Data, good start, goodstart, goodstrat

Tags

Big Data, Consider this, good start, goodstart, goodstrat, Martyn Jones

If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:

Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976

Many thanks.

To the layperson anxious for answers to complicated questions, the very idea of bringing together sets of disparate data and turning it into precious insights may seem like magic, a modern day alchemy, a goal placed well beyond the grasp of mere mortals. Fortunately, this is no longer the case, thanks in part to bagatelle-proportioned advances in Big Data and Big Data analytics and massive advances in imagination; we are able to look into the past, the present and the future, with absolute certainty. Continue reading →

Why so many ‘fake’ Big Data Gurus?

16 Sunday Aug 2015

Posted by Martyn Jones in Big Data, Consider this, good start, goodstart, Strategy

Tags

Big Data, cynicism, data management, fakes, good start, goodstart, gurus, Martyn Jones, Martyn Richard Jones, Strategy

Why so many ‘fake’ Big Data Gurus?

Where do you all come from?

Where do you all come from?

All your integrity’s gone

Now tell me, where do you all come from?

From ‘Where Do You All Come From‘ by Mott the Hoople

A note from the editor:

Readers should be well aware that the comedian who wrote this piece is the self-styled founder of The Big Data Contrarians, which is quite possibly the most belligerently intelligent Big Data group you will ever come across in your entire life. You have been warned. If you need to verify these facts for yourself, then take a look here at your own risk: https://www.linkedin.com/grp/home?gid=8338976

And now for something completely different…

You may have noticed the massive relative-growth in the number of people who are describing themselves as Big Data gurus, data science Kaisers or analytics evangelists. Okay, I exaggerate to evidence the trend, but you’ll hopefully get the gist. Your fellow comrade on the picket line, that sweetie you met at the Pitt Club — even your darling masseuse has had their carte de visite transformed according to the prevailing désir de Jour.

Many people out there in the big data world suddenly call themselves ‘Big Data gurus’ simply because it is the latest vogue. The Caerfyrddin Good Pub Guide even went so far as to say that, “adding Big Data to your job title was the equivalent of sexing up a dodgy dossier”. They also later suggested that boosting your resume with the judicious incorporation of a titillating title, such as Big Data analytic pole-dancer, may get you a few chuckles, even if most people don’t understand and appreciate its broad multi-faceted and humoristic ramifications.

That stated, the cruel and harsh reality is that many who call themselves Big Data gurus appear to be lacking the full Big Data picnic by quite a few sandwiches when it comes down to looking at the nitty gritty of their visible resumes.  Moreover, and to be Frank and Earnest – Frank in Zurich and Earnest in Pontypridd – if I was hiring a top notch Big Data guru I wouldn’t even know where to start.

What I see are quite a number of courageous fellows who don’t really have a Scooby[1] (that’s a ‘clue’ for readers overseas) about Big Data, whose ranks are enriched and swelled by others of like mind, who know just enough to be dangerous.

What I also see are Big Data hacks who cannot bring themselves to articulate one coherent, cohesive and verifiable Big Data success story. They are calling themselves Big Data gurus, presumably because of the incredible value accruable from their unrevealed bullion-class information.

The best offenders amongst the Big Data gurus are incredibly precocious with the hype and amazingly prudish with the facts.

Why, only the other day, one of these Big Data hype chappies – whose name escapes me for the moment – was lamenting the dearth of true data scientists whilst simultaneously lambasting and misrepresenting the profession of the contemporary statistician.

Now, I am not fundamentally averse to a bit of hype, so long as it’s in moderation, and it doesn’t frighten the horses. Take my Dad, for example, he never curses, and he never has. Me, I use it for dramatic effect and emphasis – occasionally. However, a lot of the Big Data ‘hackery’ that we are entertained with is like having Big Data Derek and Clive playing in your ear, 24x7x52. It’s just too much, and most of the time it should be toned down or turned off. You know “So, this bloke comes up to me and says ‘Hello!’ ”

Then there are the people from the consultancies, the IT vendors and the service providers (alright, not all, just a few) who have a good grasp of the superficialities, the business, analytics and Big Data terminology, even if  they have at best a tenuous grasp of the underlying structures, concepts and relationships that these terms relate to. Of course, it’s all hooked on context.

It’s not important if it’s just about a bit of a chit-chat down ‘The Black’ or a bit of banter over Kaffee und Kuchen, or a passing comment from the umpire at the crease. As Afilonius Basto put it to me “in a seriously professional business setting, many of the predominant Big Data gurus who rock the Big Data bull**** babble, just lack a certain creativity, knowledge and experience in order to act as truly reliable informers and trusted Big Data advisers.” So I ask you, who am I to argue such matters with a highly intelligent Gos D’atura?

Part of the problem here is, to borrow the unwritten style guide of Big Data ‘hackdom’, simple supply and demand voodoo and superfluous use of important and amorphous terms tagged on to such examples. To wit, the incongruous use of terms such as ‘economics’, ‘performance’ and ‘science’. To say nothing of the irritatingly banal use of ‘amazing’, ‘amazing’ and ‘amazing’. In addition, one thing that has been puzzling me is this. Why do some Big Data pundits have to overegg their verbosely literal output with US slang that even professional bloggers in the USA would tend to use very sparingly? I know, I know what you’re thinking. “Gerroutta heres”, right? But for me it is like the IT equivalent of the mock cockney accents that are used by some celebrity chefs. I kid you not! It’s awful and it’s going viral, just like the bubonic plague and Greensleeves.

Now back to the main story thread. There simply are not enough true Big Data gurus out there to fill the demand, and so barely qualified (or ‘make it up as you go along’) aspirants make it into the higher stratified stratosphere of the Big Data saloon.

Second, just like many Big Data success stories themselves, the role of a Big Data guru is often poorly demarcated within the ambit of association and relation and even indeed within a solitary business. People bandy around terms such as Big Data guru, Big Data whizz and Big Data who’s your daddy, willy-nilly postman style. These are terms that can mean everything and anything. From “he’s hot on Big Data that chap is” to “there goes thick Jack the Spratt the densest Big Data guru in Christendom”, and upwards and onwards to “did you see Marty, the Big Data party? Got Carmen Miranda from Data Analytics a leave of absence, although rumour has it that she is down with predictive impregnation after reading a particularly sordid Big Data hype-piece on LinkedIn”.

A true Big Data guru/expert is so much more, so much more than what we have now. In my opinion, a Big Data guru/expert is about:

Facts. The Big Data guru should stick to the facts. For example, if a Big Data guru does not understand the roles and responsibilities of a statistician then they should keep mum, and be admired for their discretion, rather than opening the floodgates of mental garbage and thereby inviting questions regarding credibility, and introducing the risk of being viewed as a buffoon. Exempli gratia, if you do not know what a statistician does, then ask, rather than simply making things up. Also ensure that you never get into a position of belief and thought that you did not rationalise yourself into. Because you will be stuck there and no one will be able to reason you out of that belief, because it’s a position that is not in itself based on reason, truth and logic.

As Malcolm X put it “Despite my firm convictions, I have always been a man who tries to face facts, and to accept the reality of life as new experience and new knowledge unfolds. I have always kept an open mind, a flexibility that must go hand in hand with every form of the intelligent search for truth.”

So, aspiring Big Data boys and girls? Please stick to the facts! Your Big Data god, friends or family will thank you for it.

Integrity. What Big Data definitely does not need is more hype-schlepping hypocrites, bamboozling babblers and conniving charlatans. What is needed are people who exude virtuous truthfulness, candour and pedagogical ethics.

According to Integrity Action, integrity is “the set of characteristics that justify trustworthiness and generate trust among stakeholders. Integrity creates the conditions for organisations to intelligently resist corruption and to be more trusted and efficient.” More broadly, Wikipedia puts it this way: “Integrity is the quality of being honest and having strong moral principles; moral uprightness. It is generally a personal choice to uphold oneself to consistently moral and ethical standards.”

Trust. A Big Data guru/expert should be trustworthy, and seen to be trusty. Just like Caesar’s wife or the quality of brunch bar at Tiffany’s, the Big Data guru must be beyond reproach. Not that they aren’t allowed a little journalistic license, simply that the gaping abyss that separates barefaced porkies[2] from simple embellishments, is frankly enormous.

Knowledge. A Big Data guru/expert should be knowledgeable in all things data, and not just Big Data. Knowledge means you know what Data Warehousing is and don’t fib about it or grossly misrepresent it in order to score ill-gotten brownie points for Big Data babble. Knowledge means you know, not that you have a bit of an idea, know a friend of a friend of a friend, or can ask the audience, in cases where one is caught out, ill-advisedly pretending to know something one doesn’t know. My advice is this. Temper knowledge with humility, honesty and decency, and you won’t go far wrong.

Experience. A Big Data guru/expert must have walked the talk. Knowledge must go hand in hand with experience. Clearly a few self-labelled Big Data gurus doing the rounds these days do not fit the bill in this respect (or for that matter in many of the other ‘respects’). Alas, not all is lost. You too can acquire both the data knowledge and data experience to become a Big Data guru. How? Try working at it for a while.

There is another way of looking at experience, in a half-comic and half-serious way. To paraphrase a popular joke “Do not argue with an idiotic Big Data guru. He will drag you down to his level and beat you with experience.” And it happens…

Vocation. No, this is no time for a holiday. The Big Data guru must assume the mantle of Big Data stardom as a vocation and not just as an early-adopting fashion follower. As Voltaire put it, speaking of Newton but also commenting more broadly on education and the Enlightenment: “I have seen a professor of mathematics only because he was great in his vocation, buried like a king who had done well by his subjects.”

Simplicity. A Big Data guru/expert must be able to explain complex Big Data ideas in simple ways, but without losing the essence or the credibility of what requires conveying. Tesco had a slogan, “you cannot bullshit simplicity”, which tried to convey this essence, and a German retailer took this even further with “Every Lidl helps”. So remember, keep it simple. “Simplicity is the ultimate sophistication” – Leonardo da Vinci.

Journalism. Although not necessarily a master of joined up handwriting and maths 101, a Big Data guru must also write like a journalist. He or she does not have to be a great scribe, simply being competent with words, concepts and numbers is a high enough bar. Okay, there is a little more to it than that – or at least, there should be. So I will explain.

These are a few things (also mentioned elsewhere in this piece, and influenced by the World Journalism Institute), that a good Big-Data-guru journalist should possess or be aware of:

  • It’s mainly about people. What we write about will influence and affect people, so we should remember that, and act accordingly, with empathy and with compassion.
  • Don’t ever put-up and shut-up when presented with bullshit.
  • Be sceptical and be prepared to verify.
  • Have great and reliable sources.
  • Continually check your biases.
  • Be adaptable and welcome change.
  • Don’t be intimidated.
  • Be tenacious.
  • Be open minded.
  • Always maintain one’s own integrity.

A good story. According to the Writers Store there are seven elements that make a good story: “the change of fortune, the problem of the story, the complications, crisis, climax and resolution of the classical structure, and the threat, which is by far the most important.” A truly great Big Data guru will be able to take this advice and apply it to the field of Big Data with little difficulty. It would make a great change from wading through Data Lakes of ‘bleh’ in search of the Holy Big Data grail. So, having set the scene, let’s take a closer look.

A real Big Data guru must be able to tell a compelling and credible story. However, they must also show as well as tell. Telling alone is fiction, which is fine, but a Big Data guru needs to back great fiction up with fact. A Big Data guru worth their salt must be able to tell and show. Nothing less than a verifiable story is acceptable. Don’t be promiscuous with the facts in a story, especially when one heralds a Big Data success, without providing any hard evidence to back it up. Remember this, just because some people are incredibly gullible when it comes to Big Data, it doesn’t mean you should lead them in ignorance down the garden path, like so many innocent lambs to the slaughter. That sort of thing is quite despicable even by our regular standards. So don’t go out of your way to prove Dan Ariely right yet again, especially with regards to his very accurate comment that “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”

Technology. A Big Data guru should know the technology. They should also know the origins of the technology and its influences. For example, a Big Data guru should have a sound grasp of the principles of the following:

  • Distributed file stores and the many varieties that exist and have existed. Examples of these are Lustre, GPFS and HDFS.
  • Database models, architectures and technologies, from 1960 to today: flat files and hierarchical, network, relational, object and dimensional models. There are more, but this is the place to start.
  • In-memory relational database technologies. E.g. EXASol and Vertica.
  • Distributed processing and orchestration.
  • Shared everything, shared nothing and conditional sharing.
  • Function shipping.
  • Parallel search, count and merge.
  • Extract, transport, transform and load technologies and techniques.
  • The history, architectures and technologies of Information Management, Information Centres, Executive Information Systems, Decision Support Systems, Management Information Systems, Report generators, 4GLs, End-user computing, Business Performance Management and even operational reporting.
  • Enterprise Data Warehousing architecture, process and technology. (See Bill Inmon for this aspect).
  • Statistics, data analytics, visualisation and presentation.

This is far from being an exhaustive list, but it should give you a flavour for what is required.
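To give one of those items a concrete shape, here is a minimal sketch of the "parallel count and merge" idea: split the data into partitions, count each partition independently, then merge the partial results. The function names, the round-robin partitioning and the use of a thread pool are all assumptions made for illustration; a real shared-nothing system would ship the counting function to the nodes that already hold the data.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Deal items round-robin into `parts` lists (a hypothetical scheme)."""
    return [items[i::parts] for i in range(parts)]


def count_partition(words):
    """Count the words held in one partition."""
    return Counter(words)


def parallel_count(words, parts=3):
    """Count each partition on its own worker, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(count_partition, partition(words, parts)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return merged
```

Threads are used here only to keep the sketch short and self-contained; the point is the shape of the technique (independent partial counts followed by a merge), not raw speed.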

Also, take heed of the words of Pablo Picasso, who stated, “Computers are useless. They can only give you answers.”

Agnostic. An ethical and professional Big Data pundit that really rocks the Kasbah, must also be as agnostic as it is possible to be, yet without letting an insistence on ‘fair’ and ‘balance’ upset the civic order of things. This simply means that you do not have to give equal merit to all and sundry, especially when equal merit is palpably not inherently the case in the broad range of technical, project and business offerings that wash over the decks of the SS Big Data. At the end of the day, in all cases the postmodern interpretation of fair and balanced is also a massive contradiction in terms.

The Swiss philosopher, poet and critic Henri Frédéric Amiel once wrote that “A belief is not true because it is useful”, and this is why we should take special care in taking an agnostic stance with regards to Big Data and its technologies. There is a tendency for Big Data gurus to big up Hadoop and ignore the rest of the field. Not only is this narrow-minded view an injustice, but it is ultimately detrimental to those who seek to understand and obtain benefit from deploying Big Data solutions.

Versatile. I have seen many people try to accommodate the meagre competence of some self-anointed Big Data gurus within far too wide an area of acceptance. This is not an issue if people are aware of the rampant bias, babble and boloney factors, but nonetheless extreme caution needs to be exercised if awkward unintended consequences are to be avoided. Remember a good Big Data guru can come from a variety of backgrounds – and not all of them will necessarily require a degree from a prestigious centre of educational excellence such as Oxford, the LSE or Prifysgol Cymru, Y Drindod Dewi Sant. In fact, one could say that attendance at St Trinian’s School for Young Ladies, Hogwarts School of Witchcraft and Wizardry, or Cambridge Finishing School could quite possibly be a negative factor, although I would not associate a certainty factor with any of these bets. Although to be fair, I am reliably informed by the Headmistress that Saint Trinian has an excellent Data Science faculty.

Business understanding. The ideal Big Data guru must be business aware, savvy to the point of shrewdness, and cunning, preferably “as cunning as a fox who’s just been appointed Professor of Cunning at Oxford University”[3]. They must understand business process, the commonalities and differences of business sectors and players, and the motivations, competitive forces and key influencers in business. They must also understand the meaning of business-strategy, how it is developed, chosen and executed. Finally, the Big Data guru must have a good handle on irrationality and its degrees of predictability. So, know your business, and understand the business of others. So when someone gives you a bucket and tells you to go down to Tesco’s to buy a petabyte of data and a Euro Millions Lottery ticket, you’ll know what it’s all about.

Analytical. A good Big Data guru must be naturally analytical, but not to the point of being anally analytical, and should possess an ability to spot patterns in behaviour as well as in data. CIA veteran Dick Heuer put it this way: “Thinking analytically is a skill like carpentry or driving a car. It can be taught, it can be learned, and it can improve with practice. But like many other skills, such as riding a bike, it is not learned by sitting in a classroom and being told how to do it. Analysts learn by doing.”

That’s all folks

If you encounter a candidate Big Data guru with all of these traits — or have a candidate who ticks most of the boxes but is willing to acquire more ticks — then you’ve found someone who might deliver unbelievable value to the cause of Big Data, your struggle, your reason, your content, and your living-room. So, delight in your find.

However, be sparing with candidates on any of these inherent individualities, and you run the risk of acquiring a graded and coarse-grained pretender, someone just hoping to travel the Big Data bullshit babbling bubble until it bursts in their brazen time-pieces[4].

I know, I know, I hear you say “but does this all really matter?” Probably not. Probably in the grand scheme of things this is yet another boom and bust fad destined to become matter of fact in some reduced circles and a plain waste of time, money and patience, in others. No doubt we shall see, as the story unfolds and the ‘sublimely absurd’ metamorphoses into the ‘I don’t bloody believe it’, or not.

So, now over to you. What would you add to this compendium of convenient characteristics? I would really love to receive your views, opinions and perspectives in the comments section that follows on from this piece.

Many thanks for reading.

A note from the Prime Minister:

Data is only as good as its time and place utility. If it has none, it has no present value, unless of course, someone wants to pay something for nothing, but that is constructing a con not an economy, an aberration destined to be hated and then forgotten. Don’t only think about how to use the data you have, but also about what data should be captured and how it should be used. By the way, join the Big Data contrarians here on LinkedIn: https://www.linkedin.com/grp/home?gid=8338976

Vote for the Big Data Contrarians. Vote early! Vote often!

[1] Scooby, Scooby doo, rhyming slang for ‘clue’

[2] Porkies, pork pies, lies

[3] Blackadder Goes Forth, Richard Curtis and Ben Elton

[4] Clock, face.

Big Data, the promised land where ‘smart’ is the new doh!

03 Monday Aug 2015

Posted by Martyn Jones in Big Data, Consider this, good start, goodstart, Martyn Jones, Strategy

Tags

Big Data, Consider this, goodstart, Martyn Jones, Strategy

So you want to ‘do’ Big Data

Now everyone is doing Big Data you don’t want to be the odd one out, right? Of course not.

Now, if you are serious about looking at Big Data from a business perspective then I will try and lend you some advice. If you are doing it from an IT or technology perspective, then I wish you good luck, and I hope that your Big Data initiative doesn’t turn into another tech crash-and-burn show.

Now some Big Data pros are telling us that the place to start with Big Data is with strategy. Now, I’m too polite to call this out as abject bullshit, even though it is, and will instead content myself by offering a simple alternative way of approaching Big Data.

My first piece of advice is this. DON’T START WITH STRATEGY!

Don’t start with Strategy

Strategy is a coherent, cohesive and executable response to a significant challenge.

Strategy is not a definition of objective, a wish list of what you are trying to achieve or aspirational goals of a nebulous nature. No, strategy is not the objective but a means of reaching that objective. Strategy is real, tangible and executable. Strategy is doing.

So what is a Big Data strategy?

If a company is looking at the Big Data options, the last place they should want to start out from is strategy. That is as silly an idea as they come. Starting with strategy on the road to formulating viable responses to significant challenges and opportunities is like saying that before you choose strategic options and a realisable strategy, you must have a strategy in place.

Strategy is not working out what you want to achieve. That sort of thing should happen prior to any strategic work. Neither is strategy an exercise in establishing starting points, formulating questions or understanding the challenges. All of this should come well before the major strategy aspects even kick in.

Big Data strategy is a realisable, tangible and manageable response to a significant challenge, one that depends heavily on the availability, usability and credibility of Big Data (or Very Large Data Bases) and the business value of processing that Big Data.

So, a word of advice. If you are thinking of embarking on a Big Data initiative, do not start with strategy. That is a really daft place to start.

Start with business imperatives

Start here instead. With real business imperatives. This is where you are thinking about the big and significant challenges to the business, and how, at a high level of abstraction, you could go about meeting those challenges. Here you identify your challenges and your responses, aligned to your objectives.

If you can identify business imperatives that make it absolutely necessary to include elements of Big Data, then go forward with that mandatory requirement in mind. If not, then don’t try to shoe-horn Big Data into a place where it really isn’t needed or wanted. Because if you go against the grain in this way it may well hurt you and your business, in more ways than you bargained for.

Know what you are looking for

In order to go out looking for data requirements driven by business imperatives, we really need to know what we are looking for.

What we are looking for may be highly tangible or less so. We may have to derive the data we are looking for by refining, aggregating, enriching, filtering and cleansing. Therefore, with those and other aspects in mind, we can go out and find what we need.
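As a rough illustration of what "deriving" data can mean in practice, the sketch below chains cleansing, filtering, enriching and aggregating over a handful of records. Everything in it is hypothetical: the field names, the sample rows and the 100.0 threshold are invented for the example and are not taken from any real system.

```python
from collections import defaultdict

# Hypothetical raw records, invented purely for illustration.
RAW_ROWS = [
    {"region": " north ", "amount": "120.50"},
    {"region": "South", "amount": "80"},
    {"region": "north", "amount": None},  # missing value
    {"region": "SOUTH", "amount": "19.50"},
]


def cleanse(rows):
    """Normalise region names and parse amounts into numbers."""
    for row in rows:
        region = (row.get("region") or "").strip().lower()
        amount = row.get("amount")
        yield {"region": region,
               "amount": float(amount) if amount is not None else None}


def keep_usable(rows):
    """Filter out rows that lack a region or an amount."""
    return (r for r in rows if r["region"] and r["amount"] is not None)


def enrich(rows, large_from=100.0):
    """Add a derived 'size' label; the threshold is an arbitrary assumption."""
    for row in rows:
        size = "large" if row["amount"] >= large_from else "small"
        yield {**row, "size": size}


def aggregate(rows):
    """Total the amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def derive(rows):
    """Run the whole chain: cleanse, filter, enrich, then aggregate."""
    return aggregate(enrich(keep_usable(cleanse(rows))))
```

Keeping each step as its own small function makes it easy to swap one out once the business imperative tells you what the derived data actually needs to look like.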

How to find what you are looking for

From looking at the data requirements, you should have a good idea of potential sources of that data. Agility in this aspect is predicated on the premise that one knows the systems on the IT landscape, the business processes and all the potential sources of data – at a high level at least. So, this is not the sort of work you can do remotely with little or no knowledge of the client’s business, IT setup, processes or culture.

But anyway, after you identify the sources you move on to the next step.

Check data availability

Here you discuss aspects of the data you require with the database / application platform owners to ensure that:

  1. they have the data you are looking for
  2. the quality of the data is known and data quality can be addressed
  3. the data is relevant for what is needed
  4. the cost of providing this data is not prohibitive
  5. this data can be made available to you
  6. service levels could be put in place, if and when required
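If it helps to keep track of these conversations, the six checks can be recorded per source in a small structure like the one below. This is only a sketch: the class name, field names and labels are assumptions made for illustration, and the answers themselves still have to come from the platform owners.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceCheck:
    """Answers gathered from one data source owner (all names hypothetical)."""
    name: str
    has_data: bool
    quality_known: bool
    relevant: bool
    cost_acceptable: bool
    can_be_made_available: bool
    service_levels_possible: bool

    def failed_checks(self) -> List[str]:
        """Return the labels of the checks this source does not pass."""
        checks = {
            "has the data": self.has_data,
            "quality known and addressable": self.quality_known,
            "relevant to the need": self.relevant,
            "cost not prohibitive": self.cost_acceptable,
            "can be made available": self.can_be_made_available,
            "service levels possible": self.service_levels_possible,
        }
        return [label for label, ok in checks.items() if not ok]

    def is_viable(self) -> bool:
        """A source is viable only if every check passes."""
        return not self.failed_checks()
```

For example, `SourceCheck("web logs", True, False, True, False, True, True).failed_checks()` lists the quality and cost checks as the open issues for that (made-up) source.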

So far so good. Once past these hurdles (and don’t forget this is a super-simplification) we move on to the next.

Make proofs of concept

So, now we know:

  1. What data we need
  2. Where we can get it from
  3. How we get it
  4. What we need to do to make it usable
  5. How we need to analyse it

Therefore, we go ahead and create a proof of concept or three. Simples!

However, make sure that all prototypes are governed by these simple timeless guidelines:

  1. The proof of concept should be small enough to be doable in a reasonable time-frame. I would be rather generous for the very first pilot of its type in a company, but would set that limit at 90 days, tops.
  2. Make sure that the proof of concept is big enough to be significant. Again, ‘simple enough to be realisable’ and ‘large enough to be significant’, should go hand in hand.
  3. Arrange your proof of concept execution into sprints. So your 90 days may be made up of nine 10 day sprints.
  4. Don’t try and shoe-horn infrastructure aspects of your initiative into sprints, it just doesn’t work, and simply pisses people off.
  5. If a proof of concept looks like it will fail, then make sure it fails early. There’s nothing worse than having people insist on pushing a dead project to live the full length of its planned term. Failing early means that business doesn’t take a dim view of the pilot, and will be more open to new proof of concept initiatives.
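The arithmetic in guideline 3 (a 90-day limit made up of nine 10-day sprints) can be sketched as a simple schedule. This is only an illustration: the function name and the start date are hypothetical, and days are counted as calendar days, which is my assumption rather than anything stated above.

```python
from datetime import date, timedelta


def plan_sprints(start: date, total_days: int = 90, sprint_days: int = 10):
    """Split a time-boxed proof of concept into consecutive sprints.

    Returns a list of (sprint_number, first_day, last_day) tuples.
    Days are calendar days here (an assumption made for simplicity).
    """
    if total_days % sprint_days != 0:
        raise ValueError("total_days must be a whole number of sprints")
    sprints = []
    for i in range(total_days // sprint_days):
        first = start + timedelta(days=i * sprint_days)
        last = first + timedelta(days=sprint_days - 1)
        sprints.append((i + 1, first, last))
    return sprints


# Hypothetical start date, chosen only for the example.
plan = plan_sprints(date(2015, 9, 1))
print(len(plan))  # 9
```

Keeping the time box as an explicit parameter makes the 'fail early' guideline easier to apply: each sprint boundary is a natural point at which to ask whether the proof of concept should continue.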

Analyse the outcomes

You run your proof of concept. You analyse, assess and represent your outcomes. You socialise, present and interpret.

Revise your strategic outlook accordingly

When you’ve done that you are now in a good position to estimate the usefulness of the exercise, from both a qualitative and quantitative perspective.

Did I mention technology?

I did not want to touch on specific aspects of technology in this piece, in part because I did not consider it a central issue in the scheme of things. Of course, as part of creating proofs of concept and pilot schemes you may want to experiment with the swatch (swaith? oh for auto-correction) of technologies out there. So go ahead and evaluate ‘Big Data’ technologies, and don’t forget, the answer to every Big Data technology question isn’t an automatic ‘Hadoop’. There are other valid Big Data technology options around, such as Lustre and GPFS, or even Oracle, Teradata or EXASol. Also, remember this: if all you are working on is a prototype, a proof of concept or a pilot, then you can try to negotiate a free license with any of the major DBMS vendors for that initiative. So negotiate, bargain and get the most appropriate technologies with the best deals.

That’s all folks

Finally I will leave you with three guidelines to consider:

  1. Don’t ask ‘how can I do Big Data?’ but ‘what data do we need?’
  2. You don’t need to seek out Big Data. If you really need it, and it’s available, and it’s adequate and appropriate, then you’ll be getting it soon enough.
  3. Avoid searching for a Big Data problem you don’t have, which can only be solved by Big Data technology you don’t need.

Many thanks for reading.

In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of novel and innovative techniques, technologies and methods into well architected mainstream information supply frameworks, for primarily strategic and tactical objectives.

As always, please reach out and share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps you will even consider sending me a LinkedIn invite if you feel our data interests coincide. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.

For more on this and other topics, check out some of my other posts:

Absolutely Fabulous Big Data Roles – https://www.linkedin.com/pulse/absolutely-fabulous-big-data-roles-martyn-jones?trk=prof-post

Not banking on Big Data? – https://www.linkedin.com/pulse/banking-big-data-martyn-jones?trk=prof-post

10 amazing reasons to join The Big Data Contrarians – https://www.linkedin.com/pulse/10-amazing-reasons-join-big-data-contrarians-martyn-jones?trk=prof-post

Amazing Data Warehousing with Hadoop and Big Data – https://www.linkedin.com/pulse/cloudera-kimball-dw-building-disinformation-factory-martyn-jones?trk=prof-post

The Big Data Contrarians: The Agora for Big Data dialogue – https://www.linkedin.com/pulse/big-data-contrarians-agora-dialogue-martyn-jones?trk=mp-reader-card

The Big Data Shell Game – https://www.linkedin.com/pulse/big-data-shell-game-martyn-jones?trk=mp-reader-card

Aligning Data Warehousing and Big Data – https://www.linkedin.com/pulse/aligning-data-warehousing-big-martyn-jones?trk=mp-reader-card

Big Data Luddites – https://www.linkedin.com/pulse/big-data-luddites-martyn-jones?trk=mp-reader-card

Data Warehousing Explained to Big Data Friends – https://www.linkedin.com/pulse/data-warehousing-explained-big-friends-martyn-jones?trk=mp-reader-card

Big Data, a promised land where the Big Bucks grow – https://www.linkedin.com/pulse/big-data-promised-land-where-bucks-grow-martyn-jones-6023459994031177728?trk=mp-reader-card

The Big Data Contrarians – https://www.linkedin.com/pulse/big-data-contrarians-martyn-jones?trk=mp-reader-card

Is big data really for you? Things to consider before diving in – https://www.linkedin.com/pulse/big-data-really-you-things-consider-before-diving-martyn-jones?trk=mp-reader-card

Big Data Explained to My Grandchildren – https://www.linkedin.com/pulse/big-data-explained-my-grandchildren-martyn-jones?trk=mp-reader-card

If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:

Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976

Many thanks.

Absolutely Fabulous Big Data Roles

03 Monday Aug 2015

Posted by Martyn Jones in Big Data, Consider this, good start, goodstart, Martyn Jones, Strategy

≈ 1 Comment

Tags

Big Data, Consider this, goodstart, Martyn Jones, Strategy

Plus ça change, plus c’est la même chose.

Jean-Baptiste Alphonse Karr

Prologue

I wrote a piece called ‘7 New Big Data Roles for 2015’. I published it on LinkedIn. Many people read it. Some people made suggestions. Others politely ignored it.

I listened to the suggestions, comments and criticisms, and revised the piece as a result.

So, here it is… I hope you like it. And if not, I might try again in six months’ time.

To begin at the beginning

I have been involved (all afternoon as a matter of fact) in an in-depth study of the changing face of IT, data architecture and data management, and the challenges that the profession faces.

In particular I have tried to focus on emerging and evolving roles and responsibilities, and on their significance, synergies and collaborative potential in a predictably high-speed, volatile and exotic future.

I know that many people will question the need to create new roles in statistical analysis, qualitative analysis, and data architecture and management. Indeed, I must admit that I also shy away from the invention of new terms, especially when they may seem to be superfluous and misleading. However, I feel that the spirit of the times is calling out for a revolution in how we view and appreciate the world of data professionals and the place of Big Data in the rich tapestry of life.

Some of the new roles detailed here may not be immediately familiar or intuitive, and some of the responsibilities may seem to be somewhat onerous or even trivial. Nevertheless, this is not accidental, as what has led me here is the desire to formulate a coherent and cohesive response to the IT industry’s sea change with respect to disruptive and game-changing innovations such as Cloud data centres, the Internet of Things and Big Data.

Therefore, here is my take on what I see as being the new roles – 7+3 in all – and responsibilities within many if not all of the Next Generation Mega-Mega Data projects coming our way. The roles for discussion are:

  • Data Trader
  • Data Hound
  • Data Plumber
  • Data Butcher
  • Data Miner
  • Data Canary
  • Data Janitor
  • Data Cleaner
  • Data Pharmacist
  • Data Chef
  • Data Taster
  • Data Server
  • Data Whisperer
  • Data Czar
  • Data Shouterer

The roles, the responsibilities

Data Trader – The Data Trader is the highflying, shakin’, takin’ market maker of the wham-bam-thank-you-spam alternative-data universe. They are essentially the gears and oil of the data market, the oxygen of oxygen, the wheelers and dealers, introducing market providers of data to market consumers of data.

The Data Trader identifies potentially undervalued data, and price and quality discrepancies in alternative data sources, and then seeks to leverage these discrepancies in order to ‘monetise’ their valuable role in keeping the data market healthy. Data Traders also seek out data instruments on the instructions of a client. They may also issue and buy options and futures contracts on commoditised data, executed optionally and delivered later. Although it is technically feasible, Data Traders will rarely trade on their own account – especially if anyone is watching.

Official Endorsement: Gordon Gekko, featured above, supports The Big Data Contrarians (the professional group on LinkedIn for data, information, intellectual-capital and analytics professionals, from new recruits to chief executive officers). Gordon is on record as stating, “Before ‘The Big Data Contrarians’ came along, Big Data was all hat and no cattle! You know those hyperventilating hype guys? What schmucks! Who really needs more?” We heartily agree. So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Hound – Although the Data Hound is a special pedigree breed of data management role, the job of the Data Hound is essential to the work of the Data Trader.

When the Data Trader gets a new requirement for novel, fresh or new data, the job of the Data Hound is to search out the best, cheapest and most reliable sources for that data, and to identify the owners and vendors of that data.

Essentially, they assist in the data market-making responsibilities of the Data Trader.

However, there is more to the role than that. Only a Data Hound can bring infectious enthusiasm to a long saunter on the data landscape. Only a Data Hound can be such a perfect, patient distraction for the knowledge workers. Only a Data Hound can dispel all gloom, tension and work-stress with a single explosion of excitement every time you walk through the data portal.

Not for nothing shall the motto of the Data Hounds be Ad grandior data, Winalot!

Official Endorsement: Data Hound Coco, featured above, supports The Big Data Contrarians (the professional group on LinkedIn for data, information, intellectual-capital and analytics professionals, from new recruits to chief executive officers). Coco is on record as stating, “Before ‘The Big Data Contrarians’ came along, the Big Data groups on LinkedIn were kinda dog’s dinnerish, but now with one great Big Data group touching all the Big Data bases and keeping out the hype, who really needs more?” I heartily agree. So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Plumber – The Data Plumber designs, builds and maintains the infrastructure to ensure that any validly supplied data reaches the data preparation stage prior to its selection, analysis and consumption. The Data Plumber is charged with ensuring that the required data correctly gets from the data provider to the data consumer, first time, every time.

Typical responsibilities of the Data Plumber may include:

Reading drawings and specifications to determine the layout of data supply, information repositories and knowledge systems.

Detecting faults in data plumbing appliances and systems, and correctly diagnosing their causes.

Locating and marking positions for data pipe adapters, ports and channels, and fixtures in data centre walls, ceilings and floors.

Official Endorsement: Data Plumber Sam, featured above, supports The Big Data Contrarians, and is on record as stating, “Before ‘The Big Data Contrarians’ came along, Big Data was a mere overfed bagatelle.” I heartily agree. So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Butcher – The Data Butcher works in conjunction with the Data Chef. The Data Butcher selects and prepares the desired parts of the supplied data, which they then pass on to the Data Chef for data mining, ad-hoc predictive analysis and visualisation. The Data Butcher removes the fat data from the lean data, and provides quality data that can then be subsequently ‘sliced, diced and spiced’ in downstream analytics applications.

Years from now, IT archaeologists will marvel at the Tao influences inherent in the role of the Data Butcher in particular, and in data architecture and management in general. By way of evidence, the following is a philosophical anecdote from the future:

Once a Data Butcher was preparing a piece of Big Data for a customer who had been coming to the establishment for many years.

“Pardon me, Sir” the customer asked, “But isn’t that the same ETL you used last year?”. “Why, I do believe it is” came the reply, “Why do you ask?” “Well” said the customer, “Don’t you ever need to upgrade it or maybe go for a more sophisticated and sharper solution?”

“No…” replied the kindly Data Butcher “It’s the same ETL I’ve been using for the last 17 years”. He stared wistfully into the distance for a few moments, looking for inspiration, and then continued. “And I haven’t had to upgrade it, sunset it or change it even once. For, when I select, transform and integrate raw data, I allow the trusty ETL to find its own way through it without effort or stress. Just like Bill told me. And when I come to a tricky bit with lots of disconnected, superfluous and erroneous data, I just slow down and allow the mystery to solve itself and in no time the good data comes right through the process.”

Adapted from Chuang Tzu: The Basic Writings, 1964

Official Endorsement: Data Sources close to the government are off record as stating, “Before ‘The Big Data Contrarians’ came along, everyone was all pissy about pig data, and no one really knew what it was, never mind how to do it! But now that has all changed.” Well Said, Dave!

I heartily agree with David Cameron’s sentiments on this subject.

So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Miner – Without any doubt, this is the hardest, most arduous and most intense job in the whole collection of our new Big Data roles. The Data Miner’s job lies in logically and physically discovering, revealing and extracting the data that is most difficult to get at, which often involves very risky situations. The Data Miner will be the person who extracts the data that has the highest information value. They do not plough the Big Data fields, climb the Big Data peaks, nor navigate the Big Data lakes, but they do know their way around the Big Data Pit. The data that the Data Miner extracts is difficult to ignite but once it gets going will deliver concentrated and deep business value over a much longer time and at a much slower depletion rate than any other type of data. The Big Data Miner delivers the south Wales anthracite of all data, the Kuwaiti oil of all data and the strongest trade winds of all data; this is why the work of the Data Miner will always have a certain cachet in the world of data.

Official Endorsement: Dai Bando had this to say of the Data Miners: “In essence, the Data Miners are the nobility of the Big Data. But they are also the ethical side of Big Data, the human side, the diligent side, and the big-hearted side. They are also the biggest contrarians in Big Data.”

I heartily agree with Dai’s hearty sentiments on this most hearty of subjects.

So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Canary – Where would a Data Miner be without his Data Canary? The Data Canary is the arbiter of quality in the arduous and ambient world of Big Data Mining. We have all heard the phrase “canary in the coal mine”; well, the Data Canary is similar. To paraphrase Wikipedia, the “canary in the coal mine” is “an allusion to caged canaries… that miners would carry underground with them. If dangerous gases such as carbon monoxide collected in the mine, it would affect the canaries well before affecting the miners, so it provided a type of early warning system.” Data Canaries are in no such mortal danger, but they still share the same arduous and inhospitable corporate working environments as those experienced by the Data Miners.

Official Endorsement: The International League of Big Data Canary Workers (ILBDCW) says, “Join the Big Data Contrarians today!”

So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Pharmacist – The Data Pharmacist provides data remedies for data-related ailments. Have you ingested more Big Data than you could manage? Have you put toxic Big Data through a viral business process? Are you suffering from a Big Data hangover? Then the Data Pharmacist will be at hand to sort things out. Data Pharmacists need strong mathematics skills to precisely prepare data medicines and explain dosing to the Big Data clients. Accuracy is critical to success in data pharmacies. Data Pharmacists need to accurately count data medicine and label data remedies correctly. Even minor errors can lead to incorrect usage. Data Pharmacists commonly multitask and deal with overwhelming activity. They continually take prescriptions from Big Data patients, fill them and consult with Big Data clients when they pick them up. Data Pharmacists need strong communication skills to interact with a variety of Big Data people in a typical day. With Big Data customers, they have to listen, answer questions about prescriptions and clearly communicate proper use and side effects of Big Data medication. With data doctors, they need to listen well and make certain they get accurate Big Data prescription information. Therefore, Data Pharmacists need good maths skills, great attention to detail, a huge reserve of patience and excellent communication skills.

Join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Caretaker – Also known as a Data Janitor or Data Custodian.

Data Caretakers look after installations such as data centres, clouds and data lakes. They make sure the installations and data are secure, clean and well maintained. If you like fixing things and enjoy Big Data DIY, you might consider becoming a Data Caretaker.

To become a Data Caretaker, you will need practical skills to carry out minor data centre repairs. You’ll need to be able to manage your own data caretaking workload. You’ll also need a good awareness of data governance health, data safety, data security and data hygiene issues.

Your skills and ability to do the job will often be more important than qualifications. Practical skills such as Python hacking, data scrambling and DIY data modelling would be useful. It could also be an advantage if you have relevant work experience.

Official Endorsement: The Union of Data Caretakers and Janitors (UDCJ) says, “Don’t delay! Join the Big Data Contrarians today!”

So, join The Big Data Contrarians here: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Cleaner – In many ways, the Data Cleaner role is a subset of the Data Janitor role. However, it is more than that. Behind the jovial nature, carefree exterior and personally disinterested motivation of the Data Cleaner lurks a surprise and a heart of gold. A thoroughly dedicated professional whose sole aim in their working life is to rid data, and the habitats in which it thrives, of toxic and viral elements that might otherwise place in peril the order of data things, unbalance the status data quo and disturb the nature of the data elements. The Data Cleaner ensures that the data is clean, respectable and fit for work.

Data Chef – If you’ve ever seen a great Chef working, up close and intimate (to use the west coast vernacular), then you will appreciate the need for the role of Data Chef.

First and foremost (or ‘primarily’, as Word tells me) the Data Chef is the curator of all the organisation’s data analytics ‘recipes’. They have the data analytics ‘knowledge’. Ideally, the Data Chef has a solid grounding in formal statistical methods and a solid appreciation of data architecture. A wide range of other skills may also augment this profile, such as an open attitude to Nouvelle analyse des données. The Data Chef also works in conjunction with the Data Trader and the Data Butcher to determine and identify prime data material in the data markets.

Based on the available prime data the Data Chef is able to determine a menu of data analytics approaches, even though these will change dynamically depending on what other accompanying data is also in season.

Our resident Big Data Chef, Erik Ysbyty von Pimpollo says… “For me, The Big Data Contrarians is like the Michelin red guide to data. Indeed, what gourmet would be without that masterly tome of culinary references? So, you can join The Big Data Contrarians now, now or now”.

We agree! Thanks for the heads up, Erik! … Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Taster – A Data Taster is a person who samples data (or information) that is to be provided to a person or entity, to confirm that it is safe to issue. This is perhaps one of the oldest professions in data, coming, as it does, from the ancient Roman role of praegustator or data unini. The person or entity to whom the data is going to be issued is usually an important person or body (for example, a regulatory reporting body or an organisational strategy group) or any person or body that could possibly be placed at risk if the data (or information) is erroneous, misleading or compromised. For example, the Data Taster verifies the outcomes of Big Data Analytics and confirms that the data is plausible and that the models used are valid so that they do not permit either the accidental or the intentional introduction of data contamination. The Data Taster may also be accountable for the preparation and provision of data. The hope is that the Data Taster will be conscientious and meticulous in preventing contamination from being introduced into data, in order to safeguard their own reputation and that of their organisation.

Welsh Hollyweird wizard and Spartacus ‘clanner’ Catherine Alpha Omega Zeta Beta Jones (not featured) says… “I like data, I like Wales, I like lovely-jubbly, rolly-polly, downy-clowny, bed-clothes snuggling data… like buttered crumpets in a blanket… and Marty… so I say… Join The Big Data Contrarians, or be a right dummy, innit! Result!”. We agree! Thanks for the heads up, Cath! … Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Server – The Data Server is a role that is closely tied to the roles of Data Whisperer and Data Czar.

At a superficial level, the Data Server presents the data menu and takes the data orders, then serves what has been ordered. The Data Server may also advise data clients on the optimal choices of data, based on the data that is available and the data preferences of other clients.

Because the role of the Data Server requires that they know a little about everything and a lot about something, the two most popular career progression paths for Data Servers are moves into the role of Data Whisperer or Data Shouterer.

The Big Data Contrarians “No Data Server left behind” – Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Whisperer – Is integrally associated with the roles of Data Server and Data Czar. This role is an extremely important keystone position within an organisation.

The Data Whisperer is an explainer, a storyteller and a stand-up philosopher. The primary responsibility of the Data Whisperer is to prevent any senior executive or regulatory body from throwing a ‘wobbly’ when they fail to correctly interpret the data that they are provided with. Therefore, the responsibility of the Data Whisperer is to correctly socialise data analysis outcomes with the intended audiences for those outcomes, and to jointly present and explain those outcomes in plain and simple language. They are required to have courage, strength and a high degree of empathy both with the data and with the consumers of that data.

The Big Data Contrarians “Data Whisperers are us” – Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Czar – Typically this is a senior board role, comparable to that of CFO. Indeed, the role of Data Czar (or Data Tsar, for the British) may also be held by the CFO. The role itself is that of visible figurehead of all data architecture and management activities within an organisation. Although it bears some striking resemblance to the now widely discredited role of bygone days, which I will delve into in subsequent articles, its remit goes much further. The Data Czar has the ability and the empowerment to break down barriers, cut through red tape and knock down walls that create organisational silos. They can freely and easily engage with and involve senior organisation players in their data campaigns and battles, gaining commitment, trust and willing complicity along the way. Naturally, the Data Czar can also call on the skills, talents, knowledge and experience of the other 9 roles identified here.

The Big Data Contrarians “A noble cause” – Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Data Shouterer (aka Data Shouter) – Finally, we come to the last of the roles. The Data Shouterer is primarily the role of the data evangelist, the extoller of grand data ‘truths’, the purveyor of serendipitous comfort, and the herald of a brave new data world.

When things go well the Data Shouterer is called upon to holler out the successes of data analysis from the rooftops.

When success is reluctant to come forth, they must be there to ‘big up’ the inherent potential for success, with brave tales of data buccaneering, ace information pilots and glorious exponents of the Art of Data.

The Big Data Contrarians “Shouting it from the rooftops in Finsbury Park” – Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

To Conclude

There you have a brief explanation of the revised and expanded 7(+n) new Big Data roles for 2015. So, to conclude…

Necessity is the mother of invention and mother is the invention of necessity and just as the judicious use of parallel grep, awk and bash could have been reinvented, rebadged and released as perhaps justifiably the next best thing in data, so too must there be concomitant and relevant roles to suit the revolutionary data spirit of the times. No?

However, to paraphrase John McEnroe, “surely this piece cannot be serious?” To which I might reply, maybe yes, or maybe no, it simply ‘depends’. But depends on what?

The English writer George Orwell once mused that “The most effective way to destroy people is to deny and obliterate their own understanding of their history”, to which I could add, “This may occur whether the act is intentional, accidental or systemic”. I think it is important that when we look at any new IT industry trend or fad, that we do so with a reasonable knowledge of IT history and the evolution of IT technology, and with a good understanding of contemporary and legacy technologies and architectures. This, to my mind, is how we respect both the IT/Information Architecture and Management profession and those whom we seek to help.

Finally, dear reader, although as I state my intention is quite serious, please do take this piece with a modicum of sodium chloride and a pinch of reality.

Please don’t forget to check out my Big Data predictions for 2015.

Also, join The Big Data Contrarians “Simply the best Big Data group and community on LinkedIn” – Here we are: https://www.linkedin.com/grp/home?gid=8338976 (link to group home page).

Acknowledgements

Many thanks to Richard Ordowich, Cari Jaquet, Claudia Pagliari, Error Warner, Rebecca Shomair, Almarie Meyer, Jennifer Christos, Joseph Adams, Terry Shen, Stephanie Vilner – Sheppard, and many others, for their valuable suggestions and support.

Amazing Data Warehousing with Hadoop and Big Data

26 Sunday Jul 2015

Posted by Martyn Jones in Big Data, Consider this, Data Warehousing, good start, goodstart, hadoop

≈ Leave a comment

Tags

Big Data, cloudera, enterprise data warehousing, goodstart, hadoop

Many thanks for reading, and don’t forget, please join The Big Data Contrarians.

Some time back, Bill Inmon, the father of Data Warehousing, took the Hadoop vendor Cloudera to task for putting out some confusing advertising.

In recent times, Cloudera have linked up with Ralph Kimball, who, as some in the data world will know, has been an eternal ‘rival’ of Bill Inmon.

For some, the name of Ralph Kimball has become synonymous with dimensional modelling, and although the Kimball Group once stated that Ralph did not invent the original basic concepts of facts and dimensions, Ralph has contributed much to the development of dimensional modelling and the innovative use of SQL. Subsequently, the Kimball Group reassessed, and are now labelling Ralph as the “Dimensional modelling inventor”.

Kimball and Cloudera have collaborated on a number of initiatives, such as a webinar and slide set, with particular emphasis on the theme of Hadoop and Data Warehousing.

Now, I do not know whether this is intentional or accidental, but this collaboration has produced a lot of disingenuous claims and dubious comparisons, so much so that I get the impression that building the DW Disinformation Factory is becoming a cottage industry in its own right.

Personally, I can see scenarios in which Big Data complements Enterprise Data Warehousing, and I have explained my vision and possible architectures for these scenarios. However, what some Hadoop vendors are alluding to in the Data Warehousing space is actually quite mischievous and misleading, and is not constructive in the least. In fact, the biggest side-effect is to muddy the Big Data and Data Warehousing waters even further. That is not good, either for the industry or for the customers, or indeed, for the professionals.

In one piece of content from Cloudera, we can read that…

“Dr. Kimball explains how Hadoop can be both:

A destination data warehouse, and also

An efficient staging and ETL source for an existing data warehouse”

On the first point? No, Hadoop will not be replacing Teradata, Oracle, EXASol or any other high-performance relational database management system.

On the second point. Hadoop could support a data source for Data Warehousing, as can many other technologies. However, there is no such animal as an ETL source. There are data sources and data targets, extractions, transformations and loads, and all that cool data management, but ETL is a technology, not a source.
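To make the distinction concrete, here is a minimal sketch in Python. It is purely illustrative: the class names, the field names and the sample records are all made up for this example and are not taken from Cloudera, Kimball or any real product. The point it tries to show is only that the source and the target are things, while ETL is the process that runs between them.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical stand-in for any system that data is read from."""
    name: str
    rows: list


@dataclass
class DataTarget:
    """Hypothetical stand-in for the warehouse table being loaded."""
    name: str
    rows: list = field(default_factory=list)


def extract(source):
    # Extraction: read the raw records out of the source.
    return list(source.rows)


def transform(records):
    # Transformation: tidy names, cast amounts, drop incomplete records.
    cleaned = []
    for record in records:
        if record.get("amount") is None:
            continue
        cleaned.append({
            "customer": record["customer"].strip().title(),
            "amount": float(record["amount"]),
        })
    return cleaned


def load(records, target):
    # Load: append the transformed records to the target.
    target.rows.extend(records)
    return len(records)


def run_etl(source, target):
    # ETL is the process that connects source to target; it is neither of them.
    return load(transform(extract(source)), target)


# Made-up sample data: a Hadoop-held extract acting as one source among many.
hadoop_extract = DataSource(
    name="hadoop_extract",
    rows=[
        {"customer": "  alice smith ", "amount": "10.50"},
        {"customer": "bob jones", "amount": None},
    ],
)
warehouse_sales = DataTarget(name="warehouse_sales")
loaded = run_etl(hadoop_extract, warehouse_sales)
```

In this sketch Hadoop appears only as a DataSource. Swap it for a flat file or an operational database and run_etl does not change, which is rather the point.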

I think Big Data may have a big future; it depends on how deeply the internet development culture pervades enterprise application development. A lot of what Big Data addresses is about making up for shortfalls created by badly architected web applications and shoddy application development, in which data use and data persistence were at best workaround bodges, rather than being well designed and coherent approaches to data management.

Maybe this is why some people have a hard time explaining why they are considering using Hadoop technologies for Big Data. What would a CEO say if it was brought to their attention that Hadoop was being used in their business simply to make up for the fact that their internet applications are really shoddy examples of analysis, design, architecture and management? More to the point, what would the shareholders say if they understood the full ramifications behind the need to use Hadoop?

In many cases, I think that Hadoop can be an indication that your IT organisation did something very wrong in the past, and that in these cases Hadoop is the price one pays when one does not want to bite the bullet and admit to screwing up, big time.

In my opinion, it would make more sense to replace applications built on faulty architectures with robust and well-architected applications, rather than fix a problem by overmedicating the patient. This would mean that data generated and used by these applications could simply dovetail into standard decision-support data platforms, such as the Enterprise Data Warehouse.

As for Cloudera and their bizarre and babbling baloney about Hadoop replacing the Data Warehouse? I suggest they read a book on the subject of Building the Data Warehouse, and maybe buck up their ideas a bit. As Bill Inmon stated, “You would think that the executives of Cloudera would have familiarized themselves with what a data warehouse is.”

As for recognised data professionals and influencers who support such Hadoop tripe? The less said the better. Eh, Ralphie?

That stated, maybe Cloudera, Kimball and the Big Data flim-flam merchants simply don’t care.

So go ahead, “turbocharge your Porsche – buy an elephant.”

Many thanks for reading. Don’t forget, please join The Big Data Contrarians. The best Big Data community on the planet.

You can’t hide your lyin’ Big Data

22 Wednesday Jul 2015

Posted by Martyn Jones in Big Data, Consider this, good start, Good Strat, goodstart, goodstrat

Tags

Big Data, good start, Good Strat, goodstart, goodstrat

As a child, I adored the USA rock band the Eagles, especially the musical talents of Joe Walsh. This explains the inspiration behind the title of this piece.

So, what’s going down at Ashley Madison?

Never heard of them? Off your radar? Surely not?

That stretches the bounds of credulity, as even the people in Singapore’s Media Development Authority have heard of them. They described the site this way: “it promotes adultery and disregards family values”, and consequently will not allow it to operate in Singapore. Well, what a turn-up for the books.

On a more serious note, and as you might know, (from Wikipedia or some other ‘sites’,) Ashley Madison is a Canadian-based online dating service and social networking service marketed to people who are married or in a committed relationship. Its slogan is “Life is short. Have an affair.” It seems, if we are to believe various reports doing the rounds, that their Big Data has been compromised, big time.

Yes, I know, how could that possibly have happened, right?

According to some reports, Adison Mashley have around 37 million clients in the Big Data pool, and large caches of it have allegedly been stolen after an apparently successful hacking attempt was carried out. According to Krebs On Security, data stolen from the web site in question “have been posted online by an individual or group that claims to have completely compromised the company’s user databases, financial records and other proprietary information.”

But, again I ask, how can this happen?

I am not an avid fan of Big Data technology for core business use, and given the level of Big Data technology maturity, it sounds like a dopey idea. But each to their own.

What I will state is that my database management experience has tended to be associated with database technologies that can only be hacked as part of an inside job i.e. where people either know user IDs, passwords, IP addresses and layers of protection etc. or know of someone who does. Either someone who is a friend, part of the family (no, not that type of ‘family’) or someone who can be blackmailed into divulging the required access paths and security check workarounds.

However, taking a broader and more permissive view of this alleged hackerisation of Big Data, do we write it up as a Big Data success, i.e. The Amazing Big Data Affair? Put it down to a technical glitch and community faux pas? Or do we take a jaundiced view of the whole thing and keep it real? I await with bated breath the enlightened opinions of the Big Data gurus.

Mitch ‘n’ Andy are not unfamiliar with ‘issues’ related to the use of people’s data. The Daily Dot carried a piece from contributing writer S. E. Smith with the headline ‘Why Ashley Madison is cheating on its users with Big Data’. In that piece, Smith states that “Like pretty much every other website on Earth, Ashley Madison spies on its users and crunches the data in a variety of ways to increase the bottom line.”

Belinda Luscombe writing in Time confirmed these suspicions with a piece titled ‘Cheaters’ Dating Site Ashley Madison Spied on Its Users’. She wrote:

In a study to be presented at the 109th Annual Meeting of the American Sociological Association in San Francisco on Saturday Aug. 16, Eric Anderson, a professor at the University of Winchester in England claims that women who seek extra-marital affairs usually still love their husbands and are cheating instead of divorcing, because they need more passion. “It is very clear that our model of having sex and love with just one other person for life has failed— and it has failed massively,” says Anderson.

“How does he know this? Because he spied on the conversations women were having on Ashley Madison, a website created for the purpose of having an affair. Professor Anderson, who as it turns out is the “chief science officer” at Ashley Madison, looked at more than 4,000 conversations that 100 women were having with potential paramours. “I monitored their conversation with men on the website, without their knowing that I was monitoring and analyzing their conversations,” he says. “The men did not know either.”

Elsewhere, and as reported on Wikipedia, “Trish McDermott, a consultant who helped found Match.com, accused Ashley Madison of being a “business built on the back of broken hearts, ruined marriages, and damaged families.”

Wow, wow, and triple wow! What a way to run a dance hall!

Maybe they should reconsider their slogan, making it more snappy and apposite. How about “Life is short, we pimp your Big Data” as a starter? So go ahead, make your own and post it below. Have fun.

Many thanks for reading.

Oh, and one last thing before I go… GOOD-AD: Join The Big Data Contrarians https://www.linkedin.com/grp/home?gid=8338976
