GOOD STRATEGY

~ DATA, INFORMATION & KNOWLEDGE

Category Archives: goodstart

Amazing Data Warehousing with Hadoop and Big Data

26 Sun Jul 2015

Posted by Martyn Jones in Big Data, Consider this, Data Warehousing, good start, goodstart, hadoop

Tags

Big Data, cloudera, enterprise data warehousing, goodstart, hadoop


Many thanks for reading, and don’t forget, please join The Big Data Contrarians.

Some time back, Bill Inmon, the father of Data Warehousing, took the Hadoop vendor Cloudera to task for putting out some confusing advertising.

In recent times, Cloudera have linked up with Ralph Kimball, who, as some in the data world will know, has been an eternal ‘rival’ of Bill Inmon.

For some, the name of Ralph Kimball has become synonymous with dimensional modelling, and although the Kimball Group once stated that Ralph did not invent the original basic concepts of facts and dimensions, Ralph has contributed much to the development of dimensional modelling and the innovative use of SQL. Subsequently, the Kimball Group reassessed, and are now labelling Ralph as the “Dimensional modelling inventor”.

Kimball and Cloudera have collaborated on a number of initiatives, such as a webinar and slide set, with particular emphasis on the theme of Hadoop and Data Warehousing.

Now, I do not know whether this is intentional or accidental, but this collaboration has produced a lot of disingenuous claims and dubious comparisons, so much so, that I get the impression that building the DW Disinformation Factory is becoming a cottage industry in its own right.

Personally, I can see scenarios in which Big Data complements Enterprise Data Warehousing, and I have explained my vision and possible architectures for these scenarios. However, what some Hadoop vendors are alluding to in the Data Warehousing space is actually quite mischievous and misleading, and is not constructive in the least. In fact, its biggest side-effect is to muddy the Big Data and Data Warehousing waters even further. That is not good, either for the industry or for the customers, or indeed, for the professionals.

In one piece of content from Cloudera, we can read that…

“Dr. Kimball explains how Hadoop can be both:

A destination data warehouse, and also

An efficient staging and ETL source for an existing data warehouse”

On the first point? No, Hadoop will not be replacing Teradata, Oracle, EXASol or any other high-performance relational database management system.

On the second point: Hadoop could support a data source for Data Warehousing, as can many other technologies. However, there is no such animal as an ETL source. There are data sources and data targets, extractions, transformations and loads, and all that cool data management, but ETL is a technology, not a source.

I think Big Data may have a big future; it depends on how deeply the internet development culture pervades enterprise application development. A lot of what Big Data addresses is about making up for shortfalls created by badly architected web applications and shoddy application development, in which data use and data persistence were at best workaround bodges, rather than being well designed and coherent approaches to data management.

Maybe this is why some people have a hard time explaining why they are considering using Hadoop technologies for Big Data. What would a CEO say if it was brought to their attention that Hadoop was being used in their business simply to make up for the fact that their internet applications are really shoddy examples of analysis, design, architecture and management? More to the point, what would the shareholders say if they understood the full ramifications behind the need to use Hadoop?

In many cases, I think that Hadoop can be an indication that your IT organisation did something very wrong in the past, and that in these cases Hadoop is the price one pays when one does not want to bite the bullet and admit to screwing up, big time.

In my opinion, it would make more sense to replace applications built on faulty architectures with robust and well-architected applications, rather than fix a problem by overmedicating the patient. This would mean that data generated and used by these applications could simply dovetail into standard decision-support data platforms, such as the Enterprise Data Warehouse.

As for Cloudera and their bizarre and babbling baloney about Hadoop replacing the Data Warehouse? I suggest they read a book on the subject of Building the Data Warehouse, and maybe buck up their ideas a bit. As Bill Inmon stated, “You would think that the executives of Cloudera would have familiarized themselves with what a data warehouse is.”

As for recognised data professionals and influencers who support such Hadoop tripe? The less said the better. Eh, Ralphie?

That stated, maybe Cloudera, Kimball and the Big Data flim-flam merchants simply don’t care.

So go ahead, “turbocharge your Porsche – buy an elephant.”

Many thanks for reading. Don’t forget, please join The Big Data Contrarians. The best Big Data community on the planet.

You can’t hide your lyin’ Big Data

22 Wed Jul 2015

Posted by Martyn Jones in Big Data, Consider this, good start, Good Strat, goodstart, goodstrat

Tags

Big Data, good start, Good Strat, goodstart, goodstrat


As a child, I adored the USA rock band the Eagles, especially the musical talents of Joe Walsh. This explains the inspiration behind the title of this piece.

So, what’s going down at Ashley Madison?

Never heard of them? Off your radar? Surely not?

That stretches the bounds of credulity, as even the people in Singapore’s Media Development Authority have heard of them. They even described the business’s site this way: “it promotes adultery and disregards family values”, and consequently will not allow it to operate in Singapore. Well, what a turn-up for the books.

On a more serious note, and as you might know (from Wikipedia or some other ‘sites’), Ashley Madison is a Canadian-based online dating service and social networking service marketed to people who are married or in a committed relationship. Its slogan is “Life is short. Have an affair.” It seems, if we are to believe various reports doing the rounds, that their Big Data has been compromised, big time.

Yes, I know, how could that possibly have happened, right?

According to some reports, Adison Mashley have around 37 million clients in the Big Data pool, and large caches of it have allegedly been stolen after an apparently successful hacking attempt was carried out. According to Krebs On Security, data stolen from the web site in question “have been posted online by an individual or group that claims to have completely compromised the company’s user databases, financial records and other proprietary information.”

But, again I ask, how can this happen?

I am not an avid fan of Big Data technology for core business use, and given the level of Big Data technology maturity, it sounds like a dopey idea. But each to their own.

What I will state is that my database management experience has tended to be associated with database technologies that can only be hacked as part of an inside job i.e. where people either know user IDs, passwords, IP addresses and layers of protection etc. or know of someone who does. Either someone who is a friend, part of the family (no, not that type of ‘family’) or someone who can be blackmailed into divulging the required access paths and security check workarounds.

However, taking a broader and more permissive view of this alleged hackerisation of Big Data, do we write it up as a Big Data success, i.e. The Amazing Big Data Affair? Put it down to a technical glitch and community faux pas? Or do we take a jaundiced view of the whole thing and keep it real? I await with bated breath the enlightened opinions of the Big Data gurus.

Mitch ‘n’ Andy are not unfamiliar with ‘issues’ related to the use of people’s data. The Daily Dot carried a piece from contributing writer S. E. Smith with the headline ‘Why Ashley Madison is cheating on its users with Big Data’. In that piece, Smith states that “Like pretty much every other website on Earth, Ashley Madison spies on its users and crunches the data in a variety of ways to increase the bottom line.”

Belinda Luscombe writing in Time confirmed these suspicions with a piece titled ‘Cheaters’ Dating Site Ashley Madison Spied on Its Users’. She wrote:

In a study to be presented at the 109th Annual Meeting of the American Sociological Association in San Francisco on Saturday Aug. 16, Eric Anderson, a professor at the University of Winchester in England claims that women who seek extra-marital affairs usually still love their husbands and are cheating instead of divorcing, because they need more passion. “It is very clear that our model of having sex and love with just one other person for life has failed— and it has failed massively,” says Anderson.

“How does he know this? Because he spied on the conversations women were having on Ashley Madison, a website created for the purpose of having an affair. Professor Anderson, who as it turns out is the “chief science officer” at Ashley Madison, looked at more than 4,000 conversations that 100 women were having with potential paramours. “I monitored their conversation with men on the website, without their knowing that I was monitoring and analyzing their conversations,” he says. “The men did not know either.”

Elsewhere, and as reported on Wikipedia, “Trish McDermott, a consultant who helped found Match.com, accused Ashley Madison of being a “business built on the back of broken hearts, ruined marriages, and damaged families.”

Wow, wow, and triple wow! What a way to run a dance hall!

Maybe they should reconsider their slogan, making it more snappy and apposite. How about “Life is short, we pimp your Big Data” as a starter? So go ahead, make your own and post it below. Have fun.

Many thanks for reading.

Oh, and one last thing before I go… GOOD-AD: Join The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976

Consider this: Big Data Luddites

21 Tue Jul 2015

Posted by Martyn Jones in Big Data, Consider this, good start, Good Strat, goodstart, goodstrat, Strategy

Tags

big dada, Consider this, good start, Good Strat, goodstart, goodstrat


Bore da, pobl dda. A hyfryd dydd ‘Big Data’* i bawb. (Good morning, good people. And a lovely ‘Big Data’ day to everyone.)

When it comes to Big Data, some people accuse me of being akin to a Luddite. Nothing could be further from the truth. Not that the facts matter. In the age of superficiality and surfaces there is as much wilfully cultivated obliviousness as there is unashamed and unabashed term abuse. Add the prevailing underlying current of anti-intellectualism into the mix, and we have an explosive combination that manifests itself in the alliterative combination of bluff, bluster and banality.

— JOIN THE BIG DATA CONTRARIANS: http://www.linkedin.com/grp/home?gid=8338976

I was reticent about writing this article, because it’s a bit like arguing against the irrational, self-interested and wilfully obtuse. Or as Ben Goldacre would have it, “You cannot reason people out of a position that they did not reason themselves into.” Therefore, a lot of care needed to be exercised. Indeed, Mark Twain once stated, “Never argue with stupid people, they will drag you down to their level and then beat you with experience.” Now, I wouldn’t go that far, and I do try to be nicely diplomatic, most of the time, but I can see where he was coming from.

Anyway, without more ado let’s get a handle on what a Luddite is, in terms I hope that most will understand.

According to Wikipedia (yes, I know), the Luddites were:

“19th-century English textile workers who protested against newly developed labour-economizing technologies from 1811 to 1816. The stocking frames, spinning frames and power looms introduced during the Industrial Revolution threatened to replace the artisans with less-skilled, low-wage labourers, leaving them without work.”

So why do I get a feeling that some people think that I am a Big Data Luddite?

Here is Peter Powell of PDP Consulting Pty Ltd putting me in my place below the line on my piece titled 7 Amazing Big Data Myths:

“With all due respect – your post does sound a little like what I could envisage an exchange between a man riding a horse and a man driving one if the first automobiles….sorry.”

Although a respectable knowledge of the technology and its evolution would inform otherwise, I assume that this means that I would be the “man riding a horse”… An interesting piece of conjecture indeed, even if what it lacks in accuracy is made up for by the inexplicable certainty of belief. Still, it’s fascinating to discover just how many ‘experts’ think that this stuff – the sort of stuff I was doing in the mid to late eighties at Sperry and later Unisys – is bleeding edge innovation.

Sassoon Kosian, a Sr. Director of Data Science at AIG, had this to tell me on my piece entitled Amazing Big Data Success Stories:

“Yes, cynical indeed… here is another amazing Big Data success story. You go on your computer, type in any search phrase and get instantaneous and highly relevant results. It is so amazing that a word has been coined. Guess what that is…”

What to say? There goes a person who seems to believe that the history of search starts and ends with the Google web search engine. Something slightly less than a munificently inapposite comment, only outdone by its tragically disconnected banality.

More recently, Bernice Blaar had this to say about my take on Big Data in general and The Big Data Contrarians in particular.

“Master Jones may well be the great and ethical strategy data architecture and management guru that the chattering-class Guardian-reading wine-sipping luvvies drool over, but he is also a brazen Big Data Luddite. No, actually far worse than a Luddite, he’s a Neddite, because with his ‘facts’ and ‘logic’ (what a laugh, you can prove anything with facts, can’t you???) he is undermining the very foundation of the Big Data work, shirk and skive ethic that has been so hardily fought for by the likes of self-sacrificing champions and evangelists of the Big Data revolution, to wit, such as those bold, proud and fine upstanding members Bernard Marr, Martin Fowler and Tom Davenport, for example, and the brave sycophants that worship at their feet. Martyn is worse than Bob Hoffman, Dave Trott, Jeremy Hardy, Mark Steel, Tab C Nesbitt and Bill Inmon, all rolled into one. He may be a great strategist, but I wouldn’t hire him. Contrarian Luddite!”

And then followed it up with this broadside:

“The Big Data Contrarians group are nothing more than a bunch of over-educated clown-shoes who are trying to scupper the hard-work of decent people out to earn a crust from leveraging the promise of a bright future. In a decent society of capital and consumers, they would be banned off the face of the internets.”

How does one reciprocate such flattering flatulence? How can one possibly respond to such a long concatenation of meaningless clichés? Though to be fair, I quite liked being referred to as a Neddite, whatever that is.

Anyway, to set the record straight, this is where I stand.

A contrarian is a person who takes up a contrary position, especially a position that is opposed to that of the majority, regardless of how unpopular it may be.

Like others, I am a Big Data Contrarian, not because I am contrary to the effective use of large volumes, varieties and velocities of data, but because I am contrary to the vast quantities of hype, disinformation and biased mendaciousness surrounding aspects of Big Data and some of the attendant technologies and service providers that go with the terrain. I don’t mind people gilding the lily (to use an English idiom for exaggeration), but I do draw the line at straight-out deception, which could lead to unintended consequences, such as creating false expectations, diverting scarce resources to wasteful projects or doing people out of a livelihood. That’s just not right.

Does that make me a Luddite (or a Neddite)? I don’t think so, but do make sure that your opinion is your own and is arrived at through reason, not some other person’s bullying hype. As I wrote elsewhere some moments ago, “If you have to lie like an ethically challenged weasel to sell Big Data then clearly there is something amiss.”

As always I would love to hear your opinions and comments on this subject and others, and also please feel free to reach out and connect, so we can keep the conversation going, here on LinkedIn or elsewhere (such as Twitter).

Many thanks for reading.

 

Photograph: Delegates at my Big Data Summer Camp in Carmarthen (Wales).

*Data mawr

Data Warehousing Explained to Big Data Friends

20 Mon Jul 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, Data Warehousing, good start, Good Strat, goodstart, goodstrat

Tags

Big Data, enterprise data warehousing, good start, Good Strat, goodstart, goodstrat


Okay, before we get started I have to declare the real intent for posting this piece. It is to get you to join The Big Data Contrarians professional group here on LinkedIn.

To apply to join the best Big Data community on the web, simply navigate to this address http://www.linkedin.com/grp/home?gid=8338976 (or paste it into your browser) and request membership. The process is quick and painless and well worth the effort.

Now for the rest of the news…

There are many common misconceptions amongst the Big Data collective about Data Warehousing. There are common fallacies that need clearing up in order to avoid unnecessary confusion, avoidable risks and the damaging perpetuation of disinformation.

Big Picture

In the dim and distant past of business IT, the best information that senior executives could expect from their computer systems was operational reports, typically indicating what went right or wrong or somewhere in between. Applied statistical brilliance made up for what data processing lacked in processing power, up to a point, because even heavy lifting statistics requires computing horsepower, which in those days was really a question of serious capital expenditure, which not all companies were willing to commit to.

Then, and curiously coincidentally, people around the world started to posit the need for using data and information to address significant business challenges, to act as input into the processes of strategy formulation, choice and execution. Reports would no longer just be for the Financial Directors or the paper collectors, but would support serious business decision making.

Many initiatives sprang up to meet the top-level decision-making data requirements; they were invariably expensive attempts, with variable outcomes. Some approaches were quite successful, but far too many failed, until the advent of Data Warehousing.

Back then, most of the data that could potentially aid decision-making was in operational systems. Both an advantage and a problem. Data in operational systems was like having data in gaol. Getting data into operational systems was relatively easy, getting it out and moving it around was a nightmare. However, one of the advantages of operational data is that it was generally stored in a structured format, even if data quality was frequently of a dubious nature, and ideas such as subject orientation and integration were far from being widespread.

Of course, data also came in from external sources, but usually via operational databases as well. An example of such data is instrument pricing in financial services.

Therefore, briefly, a lot of Data Warehousing started as a means to provide data to support strategic decision-making. Data Warehousing was not about counting cakes, widgets or people, which was the purview of operational reporting, nor about measuring sentiment, likes or mouse behaviour, but about helping senior executives address the significant business challenges of the day.

Who’s your Daddy?

Bill Inmon, the father of Data Warehousing, defines it as being “a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.”

Subject Oriented: The data in the Data Warehouse is organised conceptually (the big canvas), logically (detailing the big picture) and physically (detailing how it is implemented) by subjects of interest to the business, such as customer and product.

The thing to remember about subject areas is that they are not created ad-hoc by IT according to the sentiments of the time, e.g. during requirements gathering, but through a deeper understanding of the business, its processes and its pertinent business subject areas.

Integrated: All data entering the data warehouse is subject to normalisation and integration rules and constraints to ensure that the data stored is consistently and contextually unambiguous.

Time Variant:  Time variance gives us the ability to view and contrast data from multiple viewpoints over time. It is an essential element in the organisation of data within the data warehouse and dependent data marts.

Non-Volatile:  The data warehouse represents structured and consistent snapshots of business data over time. Once a data snapshot is established, it is rarely if ever modified.

Management Decision Making: This is the principal focus of Data Warehousing, although Data Warehouses have secondary uses, such as complementing operational reporting and analysis.

In plain language, if what your business has or is planning to have does not fully satisfy the Inmon criteria then it probably is not a Data Warehouse, but another form of data-store.
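
To make the time-variant and non-volatile criteria a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `CustomerSnapshot` and `SnapshotStore` and the sample data are hypothetical, and nothing here is taken from Inmon's own material or from any particular product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stored snapshot cannot be modified in place
class CustomerSnapshot:
    customer_id: int
    segment: str
    as_of: date  # the point in time this snapshot describes


class SnapshotStore:
    """Hypothetical append-only store: snapshots are added, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        # Non-volatile: loading only ever appends a new snapshot.
        self._rows.append(snapshot)

    def as_of(self, customer_id, when):
        """Time-variant: latest snapshot for the customer on or before `when`."""
        candidates = [
            row for row in self._rows
            if row.customer_id == customer_id and row.as_of <= when
        ]
        return max(candidates, key=lambda row: row.as_of) if candidates else None


store = SnapshotStore()
store.load(CustomerSnapshot(1, "retail", date(2015, 1, 31)))
store.load(CustomerSnapshot(1, "premium", date(2015, 6, 30)))

print(store.as_of(1, date(2015, 3, 1)).segment)   # retail
print(store.as_of(1, date(2015, 7, 20)).segment)  # premium
```

The same customer can be viewed as they were in March or as they were in July, and neither view overwrites the other.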

The thing to remember about informed management decision making is that it needs to be as good as required but it does not need to achieve technical perfection. This observation underlies the fact that Data Warehousing is a business process, and not an obsessive search for zero defects or the application of so called ‘leading edge’ technologies – faddish, appropriate or not.

Some Basic Terms

Before we delve into the meaning of Data Warehousing, there are a couple of terms that need to be understood first, so, by way of illustration:

Let’s follow the numbers in the simplification of the process.

  1. We gather specific and well-bound data requirements from a specific business area. We gather these requirements by talking to business people and by understanding their requirements from a business as well as a data sourcing and data logistics perspective. Here we must remember at all times not to over-promise or to set expectations too high. Be modest.
  2. These business requirements are typically captured in a dimensional data model and supporting documentation. Remember that all requirements are subject to revision at a later date, usually in a subsequent iteration of a requirements gathering to implementation cycle.
  3. We identify the best source(s) for the required data and we record basic technical, management and quality details. We ensure that we can provide data to the quality required. Note that data quality does not mean perfection but data to the required quality tolerance levels.
  4. Data Warehouse data models are modified as required to accommodate any new data at the atomic level.
  5. We define, document and produce the means (ETL) for getting data from the source and into the target Data Warehouse. Here we also pay especial attention to the four characteristics of Data Warehousing. ETL is an acronym for Extract (the data from source / staging), Transform (the data, making it subject oriented, integrated, and time-variant) and Load (the data into the Data Warehouse and Data Mart).
  6. We define, document and produce the means for getting data from the Data Warehouse into the Data Mart. In short, a bit more ETL.
  7. User acceptance testing. NB Users must ideally be involved in all parts of the end-to-end process that involves business requirements, participation and validation.

This is a very simplified view, but it serves to convey the fundamental chain of events. The most important aspect being that we start (1) and end (7) with the user, and we fully involve them in the non-technical aspects of the process.
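
Steps 5 and 6 above can be sketched in a few lines of Python. This is a toy under loud assumptions: the source rows, field names and functions (`extract`, `transform`, `load`, `build_mart`) are all hypothetical, and a real ETL process would involve proper tooling, data quality checks and far more care.

```python
from collections import defaultdict
from datetime import date

# Hypothetical source rows, as they might arrive from two operational systems.
SOURCE_ROWS = [
    {"cust": " Acme Ltd ", "amount": "100.50", "system": "billing"},
    {"cust": "ACME LTD", "amount": "20", "system": "web_shop"},
    {"cust": "Bolt plc", "amount": "7.25", "system": "billing"},
]


def extract(rows):
    """Extract: read the rows from the source / staging area."""
    return list(rows)


def transform(rows, load_date):
    """Transform: integrate names and types, and stamp each row with a date."""
    return [
        {
            "customer": row["cust"].strip().upper(),  # one spelling per customer
            "amount": float(row["amount"]),
            "source_system": row["system"],
            "load_date": load_date,  # supports time-variant analysis later
        }
        for row in rows
    ]


def load(warehouse, rows):
    """Load: append to the warehouse; existing rows are left untouched."""
    warehouse.extend(rows)


def build_mart(warehouse):
    """A bit more ETL: aggregate atomic rows into a customer-level Data Mart."""
    totals = defaultdict(float)
    for row in warehouse:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


warehouse = []
load(warehouse, transform(extract(SOURCE_ROWS), date(2015, 7, 20)))
mart = build_mart(warehouse)
print(mart)  # {'ACME LTD': 120.5, 'BOLT PLC': 7.25}
```

The point of the sketch is the shape of the chain, not the code itself: the warehouse keeps the atomic, integrated rows, and the Data Mart is derived from them rather than straight from the sources.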

Business, Enterprise and Technology

Essentially, a Data Warehouse is a business driven, enterprise centric and technology based solution for continual quality improvement in the sourcing, integration, packaging and delivery of data for strategic, tactical and operational modelling, reporting, visualisation and decision-making.

Business Driven

A data warehouse is business centric and nothing happens unless there is a business imperative for doing so. This means that there is no second-guessing the data requirements of the business users, and every piece of data in the data warehouse should be traceable to a tangible business requirement. This tangible business requirement is usually a departmental or process specific dimensional data model produced together in requirements workshops with the business. We build the Data Warehouse over time in iterative steps, based on the criteria that the requirements should be small enough to be delivered in a short timeframe and large enough to be significant.

Typically, a Data Warehouse iteration results in a new Data Mart or the revision of an existing Data Mart.

Enterprise Centric

As we build up the collection of Data Marts, we are also building up the central logical store of data known as the Enterprise Data Warehouse that serves as a structured, coherent and cohesive central clearing area for data that supports enterprise decision making. Therefore, whilst we are addressing specific departmental and process requirements through Data Marts we are also building up an overall view of the enterprise data.

Technology Based

By technology, I mean technology in the broadest sense of techniques, methods, processes and tools, and not just a question of products, brands or badges.

Unfortunately, there is a popular misconception that Data Warehousing is primarily about competing popular and commercially available technology products. It isn’t, but they do play an important role.

Architecture

The following is an example of a very high-level Data Warehouse architecture diagram.

Methodologies

Various methodologies support the building, expansion and maintenance of a Data Warehouse. Here is one example of a professional data integration methodology, produced, maintained and used by Cambriano Energy.

And here is an information value-chain map as used by Cambriano Energy as part of its Iter8 process management. There are alternatives, many of which do a satisfactory job.

Last but not least, this was (from memory) the way that Bill Inmon’s Prism Solutions ETL company used to view the iterative EDW building process.

Keeping it Shortish

At this point, I decided to cut short further explanations of aspects of Data Warehousing. However, if you have any questions then please address them to me and I will do my best (or something close) to answer them.

That’s all folks

Hold this thought for another time: If you think you can replace a Data Warehouse, that is not a Data Warehouse, with another approach to ‘Data Warehousing’ that doesn’t produce a Data Warehouse, for as fast and cheap as one can do it, then you still don’t have a Data Warehouse to show for all of your efforts. That is not a great place to be.

Therefore, you see, Data Warehousing was never about a haphazard approach to providing random structured, semi-structured and unstructured data of various qualities, provenance, volumes, varieties and velocities, to whomever was of a mind to want it.

Many thanks for reading.

If you want to connect then please send a request. If you have any questions or comments then fire them off below. Cheers :-)

Oh… and one last thing before I go… DON’T FORGET TO JOIN THE BIG DATA CONTRARIANS: http://www.linkedin.com/grp/home?gid=8338976

 

The Big Data Contrarians – New Big Data Community

03 Fri Jul 2015

Posted by Martyn Jones in Big Data, community, Good Strat, goodstart, goodstrat

Tags

Big Data, community, Good Strat, goodstart, goodstrat


Friends, peers and colleagues, lend me your bandwidth and 10 minutes of your time.  Gather around and let me tell you about the greatest, most interesting and fantastically diverse Big Data and Data community right here in our very midst on this amazing LinkedIn community.

We have a new Big Data/Data group, and the group is aptly named The Big Data Contrarians, and yet it is neither a ‘me too’ group, of which there are too many to mention, nor a ‘belief circle’, of which the less said, the better. No, The Big Data Contrarians group is a place for cool opinion pieces, creative abrasion, practical insight and (within the realms of the possible) BS free comment.

However, before going into more detail about the group, I would like to digress for a moment.

Like many people, I take a lot of inspiration from outside my own professional spheres of practice, principles and technologies, and this is no less true when it comes to advertising.

Two of my real influencers – the real kind not the LinkedIn kind – are advertising legends Dave Trott (also author of Predatory Thinking) and Bob Hoffman (the Ad Contrarian), who are exceptionally experienced, talented and creative people, of the NoBS (no flim-flam) kind. Indeed, it was after reading some of Bob’s and Dave’s recent articles that I decided to get this group registered on LinkedIn, which, love it or loathe it, is where many of us connect.

So, I hear you ask “What’s The Big Data Contrarians, Mart?”

Okay, to be fair, The Big Data Contrarians group is about far more than just being contrarian and a legitimate means of inciting discussion, reasonable as that is. It’s also about arguing against or openly rejecting mistakenly cherished and contrived Big Data beliefs and ‘institutions’ and established Big Data hype, speculation and opinion. It’s about separating Big Data fads, fantasies and folk-tales from Big Data reality.

What we seek to understand and convey is where, when, how and for what ends data (including Big Data) can be used to derive legitimate benefits. Moreover, stated from a position of reason and facts, and not simply projected as an issue of Big Data faith, speculation and clairvoyance.

On the other side, we can call out the Big Data hype for what it is, and just as Bob Hoffman calls out the social media and advertising BS babblers in his trade, this too lends a platform for people to do the same with the disreputable and dubious practices of Data gurus, courtesans and ‘influencers’.

“So, Mart, is being a Big Data Contrarian a bit like being a Big Data Luddite?”

Well, not really, but the problem with having so many people who are new to IT is that the past is a mystery to them, so anything that is new to them is actually taken as new, whether it is new or not.

Those who know will know that technologies of distributed file stores and search over unstructured data have been around for quite some time, and some of the “new” technologies that we big-up today are actually simple developments of data technologies that go back to the seventies and eighties, or maybe even before.

However, this is not essentially about being anti-technology or even against advances in the application of technology, but about understanding that it isn’t helpful for the media, the big industry players and their indentured acolytes to railroad, cajole and bully businesses into buying Big Data technology they don’t need, to solve Big Data problems and opportunities they don’t have.

That said, it’s up to the members of The Big Data Contrarians to decide on what shape the community should take, and as it is an open forum in democratic terms, the members have equal rights in presenting their own opinions, lessons learned and other insights.

So, if you haven’t yet drunk the Big Data kool-aid, come on down to The Big Data Contrarians, the place for everyone interested in Big Data/Data and its many potential uses.

Many thanks for reading.

Of course, this piece will also not feature on LinkedIn’s Big Data channel, because apparently that channel editor (naming no names) doesn’t like anyone raining on their particular Big Data flim-flam parade.

#BigData #BigDataChannel

On Not Knowing Sentiment Analysis

12 Tue May 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, good start, goodstart, sentiment analysis

≈ Leave a comment

Tags

All Data, Analytics, aspiring tendencies in IM, awareness, good start, Good Strat, goodstart, Martyn Jones, Strategy


If you know all about Sentiment Analysis, you’ve come to the right place. Because I don’t have a clue if what I know about it is accurate or not.

I started to do a bit of research into this Sentiment Analysis lark, in particular with the theoretical idea of using it to analyse and draw conclusions from comments on Pulse – assuming that this is what it can be used for.

To begin at the beginning, which is a good place to start, I read the piece on Wikipedia, and this was how it began:

“Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials.

Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).” Source: Wikipedia. Link: http://en.wikipedia.org/wiki/Sentiment_analysis

Well, that’s a fairly intuitive description. I could almost have guessed as much.

But, back to the aim of analysing sentiment in Pulse comments, where to start and what to do.

What would sentiment analysis make of these:

On the death of an IT-business celebrity. What would sentiment analysis make of the very emotive comments of desolation, sadness and poignancy of people who didn’t personally know the departed, even remotely, or maybe didn’t even know of them until after they had ‘shuffled off life’s mortal coil’? How would that work? What would sentiment analysis make of the maudlin aphorisms, surrogate grief and bizarre sorrow of people separated by more degrees than Kofi Annan and Mork from Ork? What additional insight does sentiment analysis give us when these comments are analysed along with the body of the text and the other comments that trigger them?

In a similar vein, how does sentiment analysis catch instances of sycophancy? Especially considering the fact that some of it is so ‘in your face’ and blatant that it often times seems to be a bad parody of a bad parody. “Oh, Ricky, why are you such a sexy brainbox?” How does it work in those situations?

Worse than that is the preening, gushing and obtuse texts of massive, errm… fabulators[i]. If it wasn’t about Big Data or Strategy or IT, it would be about something else, usually about the writer themselves. “I give Rafa and Rodge tips on tennis! I went to the University of the Universe and got a first! I challenged Superman to a race, and won! I have read the entire works of Dan Brown, 25 times…Neeeh!” What would sentiment analysis do with that sort of gold?

Also, what does sentiment analysis do with texts so ambiguously daft that they could mean anything? Okay, it might be able to pick up a few trigger words here or there, “rubbish”, “of”, “load”, “a”, “what”, etc. However, how does it know when “excellent” is being used in a way that means anything but excellent? For example, “Excellent Big Data job there”, with the silent “if you want a job doing properly then do it yourself”.
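To make the trigger-word worry concrete, here is a minimal sketch of a naive word-list scorer in Python. It is not how any particular sentiment analysis product works; the `LEXICON` table, its weights and the `naive_sentiment` function are all hypothetical, invented purely for illustration.

```python
import re

# Hypothetical, hand-picked lexicon (an assumption for this sketch):
# each trigger word maps to a polarity weight.
LEXICON = {
    "excellent": 1.0,
    "great": 1.0,
    "love": 1.0,
    "rubbish": -1.0,
    "terrible": -1.0,
    "disaster": -1.0,
}


def naive_sentiment(text: str) -> float:
    """Sum the lexicon weights of every word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0.0) for word in words)


# The scorer only sees trigger words, never the intent behind them.
print(naive_sentiment("What a load of rubbish"))        # -1.0
print(naive_sentiment("Excellent Big Data job there"))  # 1.0
```

A scorer like this gives the sarcastic “Excellent Big Data job there” exactly the same positive score as sincere praise, because the silent second half of the sentence never appears in the text.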

Finally, for the purpose of this little piece, what would sentiment analysis do with term abuse, if it could actually identify it? Going back to the use of the terms such as Big Data or Strategy, how can sentiment analysis discern between the dopey and wrong-headed use of the term, and when it is actually being used in a coherent, cohesive and consistent way, in line more or less with its formal definition? I suppose we can always write a mountain of rules to help us out:

If topic in focus of piece is strategy

And context of topic is business

And author of piece is Richard Rumelt

Then the credibility of text is good (with a certainty of 100%)

But you try and maintain a rule base with instances like that. It soon becomes a management nightmare.
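For what it’s worth, the single rule above might be sketched in Python along the following lines. The `Piece` and `Rule` classes, the `RULES` list and the `assess` function are hypothetical names made up for this sketch, not part of any real rules engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Piece:
    """The facts the rule looks at (hypothetical structure)."""
    topic: str
    context: str
    author: str


@dataclass
class Rule:
    condition: Callable[[Piece], bool]
    credibility: str
    certainty: float  # 0.0 to 1.0


# One entry per hand-written rule; this list is the part that grows.
RULES: List[Rule] = [
    Rule(
        condition=lambda p: (
            p.topic == "strategy"
            and p.context == "business"
            and p.author == "Richard Rumelt"
        ),
        credibility="good",
        certainty=1.0,
    ),
]


def assess(piece: Piece, rules: List[Rule] = RULES) -> Optional[Rule]:
    """Return the first rule whose condition matches, or None."""
    for rule in rules:
        if rule.condition(piece):
            return rule
    return None


hit = assess(Piece("strategy", "business", "Richard Rumelt"))
print(hit.credibility if hit else "no rule fired")
```

Every new author, topic or context needs another hand-written entry in `RULES`, and any piece that matches no rule simply comes back as `None`, which is where the maintenance burden comes from.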

Alternatively, maybe it could be used to analyse this text. It’ll have its work cut out, that’s for sure. Does sentiment analysis do sarcasm and cynicism?

Anyway! I bet you might know how this sentiment analysis works, don’t you? On the other hand, if not, then it will be someone else who ‘knows’. But of course, all will not be revealed, because it’s a secret so powerful, that in the wrong hands it could be used to dominate the entire galaxy.

Only joking; and many thanks for reading.

[i]To engage in the composition of fables or stories, especially those featuring a strong element of fantasy: “a land which … had given itself up to dreaming, to fabulating, to tale-telling” (Lawrence Durrell).

Big Data Tales: Bernice and the Martians

12 Tue May 2015

Posted by Martyn Jones in Big Data, Consider this, good start, goodstart, goostart, Martyn Jones, Stories

≈ Leave a comment

Tags

Big Data, big data analytics, good start, goodstart


Bernice and the Martians, BATM for short, were an incredibly popular progressive-rock band.

Their first big commercial success came with the release of their first album and their planned promotional tour, which took in all continents.

The manager of the band was none other than effable polymath, Renaissance man and good all-round rogue, Ricky Jonesy – an obsessive control freak, lover of fine wines and darling of predictive analytics. He really loved his numbers, his social media and his sentiment analysis.

In fact, much of the early success of BATM came about due to Ricky’s unparalleled passion for the ‘Big Data’.

Ricky was the band’s architect. He had major input into their material: what they composed; how they composed; their stage sets and lighting; where they performed; the way they played; how they dressed; were photographed; spoke; walked; and, ate and drank. In short, he controlled the whole BATM enchilada. It was like being in data-driven heaven.

As I said, their first album, a progressive-rock masterpiece called ‘Your Hole’, achieved major critical acclaim even before it was bolting out of the stalls and across the interwebs. Overnight the band became big property, and their notional market value ran higher than Twitter on steroids.

The band members were really pleased. The press interviewed Bernice right, left and centre and he made no bones about the fact that a major part of their success was due to Ricky and his Big Data mojo.

Articles about the phenomenon appeared on all the major social media sites: Facebook, LinkedIn and BubbaToons. Ricky was named Supreme Data Scientist of the year by the Gardener Group, hailed as a messiah by the Big Data Front and lauded by all and sundry.

Then the band went on tour. Blazing a trail of ones and zeros across the face of the planet.

They were 5 gigs into their tour and Ricky decided to call a band meeting.

“Hi, guys” said Ricky “I’ve been analysing the stats, and I see that those yokes Big Blokes in Tights are trending strongly on the social media, coinciding with the release of their new single Never Stick A Banger In Your Ear”.

“Oh, whoa” chimes in Bernice, “tell us what we gotta do then, Ricky”.

Back comes Ricky. “Well, this is what I thought we might do”

“We take the old Fester and Ailin song Tropical Diseases, we practice it as much as we can, and then we play it at the next gig in Birmingham, this weekend”

“But, Ricky!” pipes up Marty Smarty, “it’s an Irish country and western song. It doesn’t fit in with what we do, does it? And, anyway, we only have three days to get it prepared.”

Ricky responds. “Ah, you don’t want to be worrying your little head over that. Trust me. Learn the song. It’ll be great. The public will love it.”

So, BATM learn the song. It’s perfect. At the Saturday gig, they play it as the encore. The fans love it to bits and there’s not a cold cigarette lighter in the place.

Then they fly off to Palma de Mallorca for a bit of a rest before their next gig in Madrid.

The guys and gals are lounging at the poolside at the legendary Don Pimpón Espinete Plaza complex. The weather is glorious, the food is glorious, the scenery is glorious, and even the orchestra is glorious.

Then along comes Ricky, calling yet another band meeting.

“Hi, guys” said Ricky “I’ve been analysing the stats again, and I see that those yokes Spanky’s Magic Piano are trending strongly on the social media with their cover version of Engel Humpadink’s The Monkey Song”

“Oh yeah, what’s that mean for us, Ricky” chimes in Halo Popette, the band’s keyboardist.

Back comes Ricky. “Well, this is what I thought we might do”

“We take the old Fester and Ailin song There’s A Dead Man Up The Chimney, and we rewrite it in the style of Tom Jones when he made that album of his, Little Fockers, was it? Then we practice it as much as we can, until it’s perfect, and then we play it at the next gig in Madrid, this weekend”

“But, Ricky!” pipes up Brian McGarsical, “It’s a bit of an odd one isn’t it? I mean to say, it doesn’t fit in with what we do, does it? And, anyway, we only have four days to get it prepared.”

Ricky responds as fast as a chalked-up cat going down a drainpipe. “Ah, you don’t want to be worrying your little head over that. Trust me. Learn the song. It’ll be great. The public will love it. And anyways, it will fit nicely on the playlist, up there with Tropical Diseases.”

The band rewrite the song, and practice the Bedejaysus out of it. Ricky likes it so much that he gets the stats to confirm that this has to be number one on the next gig playlist.

Come the day of the gig, and BATM kick off, not with a progressive-rock anthem, but with There’s A Dead Man Up The Chimney. A group of young people at the front clearly are loving this new sound, but quite a few people are staring at the stage in fright, and it’s not from skunk-induced paranoia either.

Two guys are having a conversation at the back of the hall.

“Yo, lunchbox, hurry this gig up, I thought this band was all progressive-rock and stuff, not this wiener schnitzel stuff.”

“No comment.”

Having divided the crowd with their first song, they play songs from their album. Again, they encore with Tropical Diseases. The crowd at the front go wild. The progressive rockers look on, bemused.

“Well, that was a mixed bag” says Bernice.

“Take it from your man Ricky. It all went fine lads. Just needs some fine-tuning of the songs and the analytics need to be a bit more real time. Take me word for it.”

Back comes a unison of “Okay, Ricky. We believe yas!”

So, off they go to Bonn, to prepare for the following week’s gig at the Live Music Hall in Cologne.

The band goes out visiting the museums, they have lunch at Brauhaus Bonnsch, and after a leisurely walk along the banks of the Rhine they are taking a beer or three in a lovely little beer garden close to the United Nations campus.

Then out of the blue, a familiar voice can be heard.

“Hi, guys! We’re all goin’ on a summer ‘oliday”. It’s the voice of Ricky. “Anyway, good news guys. I’ve been analysing the amazin’ Big Data stats again, and I see that those mensch Die Zahnarzt are trending strongly on the social media, especially on Swotter and Titter, with their amazin’ cover version of Podge and Rodge’s chillout mix of Currywurst and Microchips.”

Silence. No one says a word for the best part of infinity.

Ricky continues… “As you’re not going to ask, lads, I’ll tell you. We take the old song A Great Day for the Washing, and we rewrite it in the style of techno-Buddha-bar-chill. Then we practice it as much as we can, until it’s perfect, and then we bang it out at the next gig in Cologne, this Friday. Innit. Come on lads, it’s 20 minutes of stage magic, and it’s a breeze.”

Come the day of the gig, and the band arrive early at the hall. Ricky is already there. He’s changed the stage set completely and has a new wardrobe for the lads – Bavarian romantic. They’ll soon be all Princed and Smiley Virused up to the eyeballs, wrecking ball included.

BATM kick off, not with a progressive-rock anthem or chill, but again with There’s A Dead Man Up The Chimney. Again, a group of young people at the front clearly are loving this new sound, but quite a few people are staring at the stage in drug-induced awe. Then they follow that up with A Great Day for the Washing. By the time they get to the encore of There’s A Dead Man Up The Chimney, boisterous arguments are breaking out everywhere and empty crisp packets and used sticks of chalk are being thrown at the stage. It’s a disaster.

Four guys are having a conversation at the back of the hall.

“I liked the first song”

“No! The first was terrible. Minging! I want my prog rock back.”

“It’s like the choice of leprosy or the plague.”

“Down with this sort of thing.”

Next day Bernice calls an urgent meeting of the band.

Ricky kicks off.

“Well, lads, bit of a mid-week game yesterday, wasn’t it?”

Bernice comes back with a “You can say that again, Rick”

“Don’t worry, I have analysed the social-media Big Data from all of the concerts, and we’re doing good guys. It’s in the analytics”

“We have to go back to our roots and drop all the changes we made”

A stranger in the lounge where they are having the meeting walks up to them and in simple language explains to them what has happened.

“You created a great product, a great brand, with some interesting progressive music”

“Your music was acclaimed and your world tour was eagerly anticipated by all your fans”

“But then you went wrong”

“You became data driven, dopey and data driven”

“You chased fads, tendencies and styles, and it became a mish-mash”

“People don’t want mish-mashes. Not your base. They wanted good progressive music”

“You’ve lost all credibility. Now, you’re just an eccentric band of brothers and sisters that no one will really want to see more than once, if at all”

“Your former fan base is acutely embarrassed by you. That’s your bread, butter, vodka and caviar… in your terms”

“Data driven, Big Data, Big Data analytics in real time?”

“You people have no idea the damage you can do, and so easily”

To be continued…

Many thanks for reading.

Consider this: The Big Data Workout

01 Fri May 2015

Posted by Martyn Jones in Big Data, Consider this, good start, goodstart, Stories

≈ Leave a comment

Tags

Big Data, Consider this, data architecture, data management, good start, goodstart, Martyn Richard Jones


To begin at the beginning

Miss Piggy said, “Never eat more than you can lift”. That statement is no less true today, especially when it comes to Big Data. Continue reading →

Consider this: Taming Big Data

01 Fri May 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, Good Strat, goodstart

≈ 2 Comments

Tags

accountability, Consider this, good start, goodstart, Martyn Richard Jones


Simply stated, the best application of Big Data is in systems and methods that will significantly reduce the data footprint.

Why would we want to reduce the data footprint?

  • Years of knowledge and experience in information management strongly suggest that more data does not necessarily lead to better data.
  • The more data there is to generate, move and manage, the greater the development and administrative overheads.
  • The more data we generate, store, replicate, move and transform, the bigger the data, energy and carbon footprints will become.

Continue reading →

Consider this: Care to Listen

30 Thu Apr 2015

Posted by Martyn Jones in Consider this, good start, goodstart

≈ Leave a comment

Tags

Consider this, good start, goodstart


The importance of listening well

I joined Sperry Univac in March of 1980. The previous year the Sperry Corporation had embarked on the most revolutionary and innovative programme of coordinated advertising, PR and training ever seen in IT. Continue reading →
