GOOD STRATEGY

~ for every significant challenge

Tag Archives: Good Strat

Too much information

16 Wednesday Mar 2016

Posted by Martyn Jones in 4th generation Data Warehousing, All Data, Ask Martyn, Big Data, Big Data 7s, Big Data Analytics, business strategy, dark data, Data governance, Data Lake, data management, data science, Data Supply Framework, Data Warehouse, Data Warehousing, Good Strat, Good Strategy, goodstrat, IT strategy, Marty does, Martyn does, Martyn Jones, Martyn Richard Jones, pig data, Strategy, The Amazing Big Data Challenge, The Big Data Contrarians

Tags

Big Data, Business Enablement, business intelligence, Business Management, Data Warehouse, Good Strat, Information Technology, Martyn Jones, Martyn Richard Jones, Organisational Autism, Strategy

Martyn Richard Jones

I have questions about data.

Most of us who have more than a cursory knowledge of the English language have heard of the phrase ‘too much information’. We know what it means, even if we don’t always know when to apply it.

For those who don’t know, or are unsure, the Urban Dictionary describes ‘too much information’ as “An expression of exasperation and disgust when a person is divulging personal details of his sex life, toilet habits, or anything the listener finds disgusting, uninteresting, and unwelcome.”[1]

Sum, sum. Just because we know it, doesn’t mean we should share it or even try and remember it, never mind go about analysing the hell out of it.

This is where Big Data comes in.

The Digital Document Lifecycle

01 Tuesday Dec 2015

Posted by Martyn Jones in data architecture, Data governance, data management, ECM, good start, Good Strat, Good Strategy, governance, Management, Martyn Jones, Martyn Richard Jones, Uncategorized

Tags

Content Management, ECM, Good Strat, Martyn Jones, Strategy

The Digital Document Lifecycle

MARTYN RICHARD JONES

To begin at the beginning

This is a story of the life of a digital document. Its purpose is to explain the process of analysing, designing, building, testing and delivering content rich business artefacts in today’s digital age.

You can’t hide your lyin’ Big Data

22 Wednesday Jul 2015

Posted by Martyn Jones in Big Data, Consider this, good start, Good Strat, goodstart, goodstrat

Tags

Big Data, good start, Good Strat, goodstart, goodstrat

As a child, I adored the USA rock band the Eagles, especially the musical talents of Joe Walsh. This explains the inspiration behind the title of this piece.

So, what’s going down at Ashley Madison?

Never heard of them? Off your radar? Surely not?

That stretches the bounds of credulity, as even the people in Singapore’s Media Development Authority have heard of them. They even described their business site this way: “it promotes adultery and disregards family values”, and subsequently will not allow them to operate in Singapore. Well, what a turn-up for the books.

On a more serious note, and as you might know (from Wikipedia or some other ‘sites’), Ashley Madison is a Canadian-based online dating service and social networking service marketed to people who are married or in a committed relationship. Its slogan is “Life is short. Have an affair.” It seems, if we are to believe various reports doing the rounds, that their Big Data has been compromised, big time.

Yes, I know, how could that possibly have happened, right?

According to some reports, Adison Mashley have around 37 million clients in the Big Data pool, and large caches of it have allegedly been stolen after an apparently successful hacking attempt was carried out. According to Krebs On Security, data stolen from the web site in question “have been posted online by an individual or group that claims to have completely compromised the company’s user databases, financial records and other proprietary information.”

But, again I ask, how can this happen?

I am not an avid fan of Big Data technology for core business use, and given the level of Big Data technology maturity, it sounds like a dopey idea. But each to their own.

What I will state is that my database management experience has tended to be associated with database technologies that can only be hacked as part of an inside job i.e. where people either know user IDs, passwords, IP addresses and layers of protection etc. or know of someone who does. Either someone who is a friend, part of the family (no, not that type of ‘family’) or someone who can be blackmailed into divulging the required access paths and security check workarounds.

However, taking a broader and more permissive view of this alleged hackerisation of Big Data, do we write it up as a Big Data success, i.e. The Amazing Big Data Affair? Put it down to a technical glitch and community faux pas? Or do we take a jaundiced view of the whole thing and keep it real? I await with bated breath the enlightened opinions of the Big Data gurus.

Mitch ‘n’ Andy are not unfamiliar with ‘issues’ related to the use of people’s data. The Daily Dot carried a piece from contributing writer S. E. Smith with the headline ‘Why Ashley Madison is cheating on its users with Big Data’. In that piece, Smith states that “Like pretty much every other website on Earth, Ashley Madison spies on its users and crunches the data in a variety of ways to increase the bottom line.”

Belinda Luscombe writing in Time confirmed these suspicions with a piece titled ‘Cheaters’ Dating Site Ashley Madison Spied on Its Users’. She wrote:

In a study to be presented at the 109th Annual Meeting of the American Sociological Association in San Francisco on Saturday Aug. 16, Eric Anderson, a professor at the University of Winchester in England claims that women who seek extra-marital affairs usually still love their husbands and are cheating instead of divorcing, because they need more passion. “It is very clear that our model of having sex and love with just one other person for life has failed— and it has failed massively,” says Anderson.

“How does he know this? Because he spied on the conversations women were having on Ashley Madison, a website created for the purpose of having an affair. Professor Anderson, who as it turns out is the “chief science officer” at Ashley Madison, looked at more than 4,000 conversations that 100 women were having with potential paramours. “I monitored their conversation with men on the website, without their knowing that I was monitoring and analyzing their conversations,” he says. “The men did not know either.”

Elsewhere, and as reported on Wikipedia, “Trish McDermott, a consultant who helped found Match.com, accused Ashley Madison of being a ‘business built on the back of broken hearts, ruined marriages, and damaged families.’”

Wow, wow, and triple wow! What a way to run a dance hall!

Maybe they should reconsider their slogan, making it more snappy and apposite. How about “Life is short, we pimp your Big Data” as a starter? So go ahead, make your own and post it below. Have fun.

Many thanks for reading.

Oh, and one last thing before I go… GOOD-AD: Join The Big Data Contrarians: https://www.linkedin.com/grp/home?gid=8338976

Consider this: Big Data Luddites

21 Tuesday Jul 2015

Posted by Martyn Jones in Big Data, Consider this, good start, Good Strat, goodstart, goodstrat, Strategy

Tags

big dada, Consider this, good start, Good Strat, goodstart, goodstrat

Good morning, good people. And a lovely ‘Big Data’* day to all.

When it comes to Big Data, some people accuse me of being akin to a Luddite. Nothing could be further from the truth. Not that the facts matter. In the age of superficiality and surfaces there is as much wilfully cultivated obliviousness as there is unashamed and unabashed term abuse. Add the prevailing underlying current of anti-intellectualism into the mix, and we have an explosive combination that manifests itself in the alliterative combination of bluff, bluster and banality.

— JOIN THE BIG DATA CONTRARIANS: http://www.linkedin.com/grp/home?gid=8338976

I was reticent about writing this article, because it’s a bit like arguing against the irrational, self-interested and wilfully obtuse. Or as Ben Goldacre would have it, “You cannot reason people out of a position that they did not reason themselves into.” Therefore, a lot of care needed to be exercised. Indeed, Mark Twain once stated, “Never argue with stupid people, they will drag you down to their level and then beat you with experience.” Now, I wouldn’t go that far, and I do try to be nicely diplomatic, most of the time, but I can see where he was coming from.

Anyway, without more ado let’s get a handle on what a Luddite is, in terms I hope that most will understand.

According to Wikipedia (yes, I know), the Luddites were:

“19th-century English textile workers who protested against newly developed labour-economizing technologies from 1811 to 1816. The stocking frames, spinning frames and power looms introduced during the Industrial Revolution threatened to replace the artisans with less-skilled, low-wage labourers, leaving them without work.”

So why do I get a feeling that some people think that I am a Big Data Luddite?

Here is Peter Powell of PDP Consulting Pty Ltd putting me in my place below the line on my piece titled 7 Amazing Big Data Myths:

“With all due respect – your post does sound a little like what I could envisage an exchange between a man riding a horse and a man driving one if the first automobiles….sorry.”

Although a respectable knowledge of the technology and its evolution would inform otherwise, I assume that this means that I would be the “man riding a horse”… An interesting piece of conjecture indeed, even if what it lacks in accuracy is made up for by the inexplicable certainty of belief. Still, it’s fascinating to discover just how many ‘experts’ think that this stuff – the sort of stuff I was doing in the mid to late eighties at Sperry and later Unisys – is bleeding edge innovation.

Sassoon Kosian, a Sr. Director of Data Science at AIG, had this to tell me on my piece entitled Amazing Big Data Success Stories:

“Yes, cynical indeed… here is another amazing Big Data success story. You go on your computer, type in any search phrase and get instantaneous and highly relevant results. It is so amazing that a word has been coined. Guess what that is…”

What to say? There goes a person who seems to believe that the history of search starts and ends with the Google web search engine. Something slightly less than a munificently inapposite comment, only outdone by its tragically disconnected banality.

More recently, Bernice Blaar had this to say about my take on Big Data in general and The Big Data Contrarians in particular.

“Master Jones may well be the great and ethical strategy data architecture and management guru that the chattering-class Guardian-reading wine-sipping luvvies drool over, but he is also a brazen Big Data Luddite. No, actually far worse than a Luddite, he’s a Neddite, because with his ‘facts’ and ‘logic’ (what a laugh, you can prove anything with facts, can’t you???) he is undermining the very foundation of the Big Data work, shirk and skive ethic that has been so hardily fought for by the likes of self-sacrificing champions and evangelists of the Big Data revolution, to wit, such as those bold, proud and fine upstanding members Bernard Marr, Martin Fowler and Tom Davenport, for example, and the brave sycophants that worship at their feet. Martyn is worse than Bob Hoffman, Dave Trott, Jeremy Hardy, Mark Steel, Tab C Nesbitt and Bill Inmon, all rolled into one. He may be a great strategist, but I wouldn’t hire him. Contrarian Luddite!”

And then followed it up with this broadside:

“The Big Data Contrarians group are nothing more than a bunch of over-educated clown-shoes who are trying to scupper the hard-work of decent people out to earn a crust from leveraging the promise of a bright future. In a decent society of capital and consumers, they would be banned off the face of the internets.”

How does one reciprocate such flattering flatulence? How can one possibly respond to such a long concatenation of meaningless clichés? Though to be fair, I quite liked being referred to as a Neddite, whatever that is.

Anyway, to set the record straight, this is where I stand.

A contrarian is a person who takes up a contrary position, especially a position that is opposed to that of the majority, regardless of how unpopular it may be.

Like others, I am a Big Data Contrarian, not because I am contrary to the effective use of large volumes, varieties and velocities of data, but because I am contrary to the vast quantities of hype, disinformation and biased mendaciousness surrounding aspects of Big Data and some of the attendant technologies and service providers that go with the terrain. I don’t mind people gilding the lily (to use an English idiom for exaggeration), but I do draw the line at straight-out deception, which could lead to unintended consequences, such as creating false expectations, diverting scarce resources to wasteful projects or doing people out of a livelihood. That’s just not right.

Does that make me a Luddite (or a Neddite)? I don’t think so, but do make sure that your opinion is your own and is arrived at through reason, not some other person’s bullying hype. As I wrote elsewhere some moments ago, “If you have to lie like an ethically challenged weasel to sell Big Data then clearly there is something amiss.”

As always I would love to hear your opinions and comments on this subject and others, and also please feel free to reach out and connect, so we can keep the conversation going, here on LinkedIn or elsewhere (such as Twitter).

Many thanks for reading.

 

Photograph: Delegates at my Big Data Summer Camp in Carmarthen (Wales).

*Data mawr (Welsh for ‘Big Data’)

Data Warehousing Explained to Big Data Friends

20 Monday Jul 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, Data Warehousing, good start, Good Strat, goodstart, goodstrat

Tags

Big Data, enterprise data warehousing, good start, Good Strat, goodstart, goodstrat

Okay, before we get started I have to declare the real intent for posting this piece. It is to get you to join The Big Data Contrarians professional group here on LinkedIn.

To apply to join the best Big Data community on the web simply navigate to this address http://www.linkedin.com/grp/home?gid=8338976 (or paste it into your browser) and request membership, the process is quick and painless and well worth the effort.

Now for the rest of the news…

There are many common misconceptions amongst the Big Data collective about Data Warehousing. There are common fallacies that need clearing up in order to avoid unnecessary confusion, avoidable risks and the damaging perpetuation of disinformation.

Big Picture

In the dim and distant past of business IT, the best information that senior executives could expect from their computer systems was operational reports, typically indicating what went right or wrong or somewhere in between. Applied statistical brilliance made up for what data processing lacked in processing power, up to a point, because even heavy-lifting statistics requires computing horsepower. In those days that was really a question of serious capital expenditure, which not all companies were willing to commit to.

Then, and curiously coincidentally, people around the world started to posit the need for using data and information to address significant business challenges, to act as input into the processes of strategy formulation, choice and execution. Reports would no longer just be for the Financial Directors or the paper collectors, but would support serious business decision making.

Many initiatives sprang up to meet the top-level decision-making data requirements; they were invariably expensive attempts, with variable outcomes. Some approaches were quite successful, but far too many failed, until the advent of Data Warehousing.

Back then, most of the data that could potentially aid decision-making was in operational systems. This was both an advantage and a problem. Data in operational systems was like having data in gaol. Getting data into operational systems was relatively easy; getting it out and moving it around was a nightmare. However, one of the advantages of operational data was that it was generally stored in a structured format, even if data quality was frequently of a dubious nature, and ideas such as subject orientation and integration were far from being widespread.

Of course, data also came in from external sources, but usually via operational databases as well. An example of such data is instrument pricing in financial services.

Therefore, briefly, a lot of Data Warehousing started as a means to provide data to support strategic decision-making. Data Warehousing was not about counting cakes, widgets or people, which was the purview of operational reporting, nor about measuring sentiment, likes or mouse behaviour, but about assisting senior executives in addressing the significant business challenges of the day.

Who’s your Daddy?

Bill Inmon, the father of Data Warehousing, defines it as being “a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.”

Subject Oriented: The data in the Data Warehouse is organised conceptually (the big canvas), logically (detailing the big picture) and physically (detailing how it is implemented) by subjects of interest to the business, such as customer and product.

The thing to remember about subject areas is that they are not created ad-hoc by IT according to the sentiments of the time, e.g. during requirements gathering, but through a deeper understanding of the business, its processes and its pertinent business subject areas.

Integrated: All data entering the data warehouse is subject to normalisation and integration rules and constraints to ensure that the data stored is consistently and contextually unambiguous.

Time Variant:  Time variance gives us the ability to view and contrast data from multiple viewpoints over time. It is an essential element in the organisation of data within the data warehouse and dependent data marts.

Non-Volatile:  The data warehouse represents structured and consistent snapshots of business data over time. Once a data snapshot is established, it is rarely if ever modified.

Management Decision Making: This is the principal focus of Data Warehousing, although Data Warehouses have secondary uses, such as complementing operational reporting and analysis.
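For the more code-minded Big Data friends, here is a minimal Python sketch of what time-variant and non-volatile can mean in practice. It is an illustration under assumptions, not how any particular Data Warehouse product works: the names CustomerSnapshot, SnapshotStore and as_of, and the sample data, are all invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stored snapshot row cannot be modified in place
class CustomerSnapshot:
    customer_id: int
    segment: str
    valid_from: date  # the date from which this version of the facts applies


class SnapshotStore:
    """Hypothetical append-only store: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        self._rows.append(row)

    def as_of(self, customer_id, when):
        """Return the latest snapshot for the customer valid on the given date, or None."""
        candidates = [
            r for r in self._rows
            if r.customer_id == customer_id and r.valid_from <= when
        ]
        return max(candidates, key=lambda r: r.valid_from, default=None)


store = SnapshotStore()
store.append(CustomerSnapshot(1, "retail", date(2014, 1, 1)))
store.append(CustomerSnapshot(1, "corporate", date(2015, 6, 1)))

# Time variance: the same question asked "as of" two different dates.
print(store.as_of(1, date(2014, 12, 31)).segment)  # retail
print(store.as_of(1, date(2015, 7, 20)).segment)   # corporate
```

The design choice worth noticing is that a change of segment is recorded as a new row rather than an update, so the earlier view of the customer remains available for comparison over time.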

In plain language, if what your business has or is planning to have does not fully satisfy the Inmon criteria then it probably is not a Data Warehouse, but another form of data-store.

The thing to remember about informed management decision making is that it needs to be as good as required but it does not need to achieve technical perfection. This observation underlies the fact that Data Warehousing is a business process, and not an obsessive search for zero defects or the application of so called ‘leading edge’ technologies – faddish, appropriate or not.

Some Basic Terms

Before we delve into the meaning of Data Warehousing, there are a couple of terms that need to be understood first, so, by way of illustration:

Let’s follow the numbers in the simplification of the process.

  1. We gather specific and well-bound data requirements from a specific business area. These requirements are gathered by talking to business people and by understanding their needs from a business as well as a data sourcing and data logistics perspective. Here we must remember at all times not to over-promise or to set expectations too high. Be modest.
  2. These business requirements are typically captured in a dimensional data model and supporting documentation. Remember that all requirements are subject to revision at a later date, usually in a subsequent iteration of a requirements gathering to implementation cycle.
  3. We identify the best source(s) for the required data and we record basic technical, management and quality details. We ensure that we can provide data to the quality required. Note that data quality does not mean perfection but data to the required quality tolerance levels.
  4. Data Warehouse data models are modified as required to accommodate any new data at the atomic level.
  5. We define, document and produce the means (ETL) for getting data from the source and into the target Data Warehouse. Here we also pay especial attention to the four characteristics of Data Warehousing. ETL is an acronym for Extract (the data from source / staging), Transform (the data, making it subject oriented, integrated, and time-variant) and Load (the data into the Data Warehouse and Data Mart).
  6. We define, document and produce the means for getting data from the Data Warehouse into the Data Mart. In short, a bit more ETL.
  7. User acceptance testing. NB Users must ideally be involved in all parts of the end-to-end process that involves business requirements, participation and validation.
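Steps 5 and 6 above can be sketched in a few lines of Python using the standard sqlite3 module. This is a toy under stated assumptions, not a professional ETL tool or anyone's production method: the source rows, the country-code mapping, the table names (dw_sales, mart_sales_by_country) and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical rows from two operational systems that code the same
# customer and the same country in different ways.
SOURCE_ROWS = [
    {"system": "billing", "cust": "C-001", "country": "UK", "amount": "120.50", "day": "2015-07-01"},
    {"system": "orders", "cust": "c001", "country": "United Kingdom", "amount": "79.50", "day": "2015-07-01"},
    {"system": "orders", "cust": "c002", "country": "ES", "amount": "40.00", "day": "2015-07-02"},
]

# Assumed integration rule: map every source spelling to one agreed code.
COUNTRY_CODES = {"UK": "GB", "United Kingdom": "GB", "ES": "ES"}


def extract():
    """Extract: a real job would read from source or staging; here we copy the sample rows."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: integrate keys and codes, convert types, keep the date of each fact."""
    facts = []
    for r in rows:
        customer_key = r["cust"].upper().replace("-", "")  # C-001 and c001 both become C001
        facts.append((customer_key, COUNTRY_CODES[r["country"]], float(r["amount"]), r["day"]))
    return facts


def load(conn, facts):
    """Load: insert-only into the atomic-level warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_sales "
        "(customer_key TEXT, country TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO dw_sales VALUES (?, ?, ?, ?)", facts)


def build_mart(conn):
    """Step 6: a bit more ETL, deriving a small Data Mart table from the warehouse."""
    conn.execute("DROP TABLE IF EXISTS mart_sales_by_country")
    conn.execute(
        "CREATE TABLE mart_sales_by_country AS "
        "SELECT country, sale_date, SUM(amount) AS total "
        "FROM dw_sales GROUP BY country, sale_date"
    )


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
build_mart(conn)
mart = conn.execute(
    "SELECT country, sale_date, total FROM mart_sales_by_country ORDER BY country, sale_date"
).fetchall()
print(mart)  # [('ES', '2015-07-02', 40.0), ('GB', '2015-07-01', 200.0)]
```

Note that the two differently coded records for the same UK customer only add up correctly in the mart because the transform step integrated them first.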

This is a very simplified view, but it serves to convey the fundamental chain of events. The most important aspect being that we start (1) and end (7) with the user, and we fully involve them in the non-technical aspects of the process.

Business, Enterprise and Technology

Essentially, a Data Warehouse is a business driven, enterprise centric and technology based solution for continual quality improvement in the sourcing, integration, packaging and delivery of data for strategic, tactical and operational modelling, reporting, visualisation and decision-making.

Business Driven

A data warehouse is business centric and nothing happens unless there is a business imperative for doing so. This means that there is no second-guessing the data requirements of the business users, and every piece of data in the data warehouse should be traceable to a tangible business requirement. This tangible business requirement is usually a departmental or process specific dimensional data model produced together in requirements workshops with the business. We build the Data Warehouse over time in iterative steps, based on the criteria that the requirements should be small enough to be delivered in a short timeframe and large enough to be significant.

Typically, a Data Warehouse iteration results in a new Data Mart or the revision of an existing Data Mart.

Enterprise Centric

As we build up the collection of Data Marts, we are also building up the central logical store of data known as the Enterprise Data Warehouse that serves as a structured, coherent and cohesive central clearing area for data that supports enterprise decision making. Therefore, whilst we are addressing specific departmental and process requirements through Data Marts we are also building up an overall view of the enterprise data.

Technology Based

By technology, I mean technology in the broadest sense of techniques, methods, processes and tools, and not just a question of products, brands or badges.

Unfortunately, there is a popular misconception that Data Warehousing is primarily about competing popular and commercially available technology products. It isn’t, but they do play an important role.

Architecture

The following is an example of a very high-level Data Warehouse architecture diagram.

Methodologies

Various methodologies support the building, expansion and maintenance of a Data Warehouse. Here is one example of a professional data integration methodology, produced, maintained and used by Cambriano Energy.

And here is an information value-chain map as used by Cambriano Energy as part of its Iter8 process management. There are alternatives, many of which do a satisfactory job.

Last but not least, this was (from memory) the way that Bill Inmon’s Prism Solutions ETL company used to view the iterative EDW building process.

Keeping it Shortish

At this point, I decided to cut short further explanations of aspects of Data Warehousing. However, if you have any questions then please address them to me and I will do my best (or something close) to answer them.

That’s all folks

Hold this thought for another time: If you think you can replace a Data Warehouse, that is not a Data Warehouse, with another approach to ‘Data Warehousing’ that doesn’t produce a Data Warehouse, however fast and cheaply one can do it, then you still don’t have a Data Warehouse to show for all of your efforts. That is not a great place to be.

Therefore, you see, Data Warehousing was never about a haphazard approach to providing random structured, semi-structured and unstructured data of various qualities, provenance, volumes, varieties and velocities, to whoever was of a mind to want it.

Many thanks for reading.

If you want to connect then please send a request. If you have any questions or comments then fire them off below. Cheers 🙂

Oh… and one last thing before I go… DON’T FORGET TO JOIN THE BIG DATA CONTRARIANS: http://www.linkedin.com/grp/home?gid=8338976

 

The Big Data Contrarians – New Big Data Community

03 Friday Jul 2015

Posted by Martyn Jones in Big Data, community, Good Strat, goodstart, goodstrat

Tags

Big Data, community, Good Strat, goodstart, goodstrat

Friends, peers and colleagues, lend me your bandwidth and 10 minutes of your time.  Gather around and let me tell you about the greatest, most interesting and fantastically diverse Big Data and Data community right here in our very midst on this amazing LinkedIn community.

We have a new Big Data/Data group, and the group is aptly named The Big Data Contrarians, and yet it is neither a ‘me too’ group, of which there are too many to mention, nor a ‘belief circle’, of which the less said, the better. No, The Big Data Contrarians group is a place for cool opinion pieces, creative abrasion, practical insight and (within the realms of the possible) BS free comment.

However, before going into more detail about the group, I would like to digress for a moment.

Like many people, I take a lot of inspiration from outside my own professional spheres of practice, principles and technologies, and this is no less true when it comes to advertising.

Two of my real influencers – the real kind not the LinkedIn kind – are advertising legends Dave Trott (also author of Predatory Thinking) and Bob Hoffman (the Ad Contrarian), who are exceptionally experienced, talented and creative people, of the NoBS (no flim-flam) kind. Indeed, it was after reading some of Bob’s and Dave’s recent articles that I decided to get this group registered on LinkedIn, which, love it or loathe it, is where many of us connect.

So, I hear you ask “What’s The Big Data Contrarians, Mart?”

Okay, to be fair, The Big Data Contrarians group is about far more than just being contrarian and a legitimate means of inciting discussion, as reasonable as that is. It’s also about arguing against or openly rejecting mistakenly cherished and contrived Big Data beliefs and ‘institutions’ and established Big Data hype, speculation and opinion. It’s about separating Big Data fads, fantasies and folk-tales from Big Data reality.

What we seek to understand and convey is where, when, how and for what ends data (including Big Data) can be used to derive legitimate benefits. Moreover, we want this stated from a position of reason and facts, and not simply projected as an issue of Big Data faith, speculation and clairvoyance.

On the other side, we can call out the Big Data hype for what it is, and just as Bob Hoffman calls out the social media and advertising BS babblers in his trade, this too lends a platform for people to do the same with the disreputable and dubious practices of Data gurus, courtesans and ‘influencers’.

“So, Mart, is being a Big Data Contrarian a bit like being a Big Data Luddite?”

Well, not really, but the problem with having so many people who are new to IT is that the past is a mystery to them, so anything that is new to them is actually taken as new, whether it is new or not.

Those who know will know that technologies of distributed file stores and search over unstructured data have been around for quite some time, and some of the “new” technologies that we big-up today are actually simple developments of data technologies that go back to the seventies and eighties, or maybe even before.

However, this is not essentially about being anti-technology or even against advances in the application of technology, but about understanding that it isn’t helpful for the media, the big industry players and their indentured acolytes to railroad, cajole and bully businesses into buying Big Data technology they don’t need, to solve Big Data problems and opportunities they don’t have.

That said, it’s up to the members of The Big Data Contrarians to decide on what shape the community should take, and as it is an open forum in democratic terms, the members have equal rights in presenting their own opinions, lessons learned and other insights.

So, if you haven’t yet drunk the Big Data kool-aid, come on down to The Big Data Contrarians, the place for everyone interested in Big Data/Data and its many potential uses.

Many thanks for reading.

Of course, this piece will also not feature on LinkedIn’s Big Data channel, because apparently that channel editor (naming no names) doesn’t like anyone raining on their particular Big Data flim-flam parade.

#BigData #BigDataChannel

Let’s talk strat! Business Strategy and IT

22 Friday May 2015

Posted by Martyn Jones in business strategy, Consider this, Good Strat, goodstrat, IT strategy, Strategy

Tags

business strategy, Good Strat, IT Strategy, Strategy

I used to work for an affable person from Chicago. His two favourite phrases were “Let’s talk strat” and “Brought your cheque book with you?”

There are many misconceptions about strategy. But, I particularly want to address two things:

  • What is business strategy?
  • What is IT (information technology) strategy?

So, without more ado, let’s get the baby off the ground.

On Not Knowing Sentiment Analysis

12 Tuesday May 2015

Posted by Martyn Jones in Big Data, Big Data Analytics, Consider this, good start, goodstart, sentiment analysis

Tags

All Data, Analytics, aspiring tendencies in IM, awareness, good start, Good Strat, goodstart, Martyn Jones, Strategy

If you know all about Sentiment Analysis, you’ve come to the right place. Because I don’t have a clue if what I know about it is accurate or not.

I started to do a bit of research into this Sentiment Analysis lark, in particular with the theoretical idea of using it to analyse and draw conclusions from comments on Pulse – assuming that this is what it can be used for.

To begin at the beginning, which is a good place to start, I read the piece on Wikipedia, and this was how it began:

“Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials.

Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).” Source: Wikipedia. Link: http://en.wikipedia.org/wiki/Sentiment_analysis

Well, that’s a fairly intuitive description. I could almost have guessed as much.

But, back to the aim of analysing sentiment in Pulse comments: where to start and what to do?

What would sentiment analysis make of these:

On the death of an IT-business celebrity. What would sentiment analysis make of the very emotive comments of desolation, sadness and poignancy of people who didn’t personally know the departed, even remotely, or maybe didn’t even know of them until after they had ‘shuffled off life’s mortal coil’? How would that work? What would sentiment analysis make of the maudlin aphorisms, surrogate grief and bizarre sorrow of people separated by more degrees than Kofi Annan and Mork from Ork? What additional insight does sentiment analysis give us when these comments are analysed along with the body of the text and the other comments that trigger them?

In a similar vein, how does sentiment analysis catch instances of sycophancy? Especially considering the fact that some of it is so ‘in your face’ and blatant that it often times seems to be a bad parody of a bad parody. “Oh, Ricky, why are you such a sexy brainbox?” How does it work in those situations?

Worse than that is the preening, gushing and obtuse texts of massive, errm… fabulators[i]. If it wasn’t about Big Data or Strategy or IT, it would be about something else, usually about the writer themselves. “I give Rafa and Rodge tips on tennis! I went to the University of the Universe and got a first! I challenged Superman to a race, and won! I have read the entire works of Dan Brown, 25 times…Neeeh!” What would sentiment analysis do with that sort of gold?

Also, what does sentiment analysis do with texts so ambiguously daft that they could mean anything? Okay, it might be able to pick up a few trigger words here or there, “rubbish”, “of”, “load”, “a”, “what”, etc. However, how does it know when “excellent” is being used in a way that means anything but excellent? For example, “Excellent Big Data job there”, with the silent “if you want a job doing properly then do it yourself”.
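To make the trigger-word worry concrete, here is a minimal sketch of the crudest kind of scorer, one that simply adds up word weights. The lexicon and its weights are invented for illustration and are not taken from any real sentiment tool; real systems are considerably more elaborate, but the blind spot is easier to see in a toy.

```python
import re

# Hypothetical toy lexicon: the words and weights are made up for illustration.
LEXICON = {"excellent": 2, "good": 1, "rubbish": -2, "daft": -1}


def polarity(text: str) -> int:
    """Sum the lexicon weights of the words in text; unknown words score 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


sincere = "Excellent Big Data job there"
sarcastic = (
    "Excellent Big Data job there, "
    "if you want a job doing properly then do it yourself"
)

# Both sentences contain exactly one lexicon word ("excellent"),
# so a word-counting scorer cannot tell them apart.
print(polarity(sincere))
print(polarity(sarcastic))
print(polarity("What a load of rubbish"))
```

In this sketch the sincere and the sarcastic sentence get the same score, because the silent half of the remark carries no trigger words at all.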

Finally, for the purpose of this little piece, what would sentiment analysis do with term abuse, if it could actually identify it? Going back to the use of the terms such as Big Data or Strategy, how can sentiment analysis discern between the dopey and wrong-headed use of the term, and when it is actually being used in a coherent, cohesive and consistent way, in line more or less with its formal definition? I suppose we can always write a mountain of rules to help us out:

If topic in focus of piece is strategy

And context of topic is business

And author of piece is Richard Rumelt

Then the credibility of text is good (with a certainty of 100%)

But you try and maintain a rule base with instances like that. It soon becomes a management nightmare.

Alternatively, maybe it could be used to analyse this text. It’ll have its work cut out, that’s for sure. Does sentiment analysis do sarcasm and cynicism?
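For the sake of argument, the rule above could be written down along the following lines. This is only a sketch: the `Piece` and `Rule` classes, the `assess` function and the exact-match lookup are all my own assumptions, not the workings of any actual rule engine.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Piece:
    """A text described by the three attributes the rule looks at."""
    topic: str
    context: str
    author: str


@dataclass(frozen=True)
class Rule:
    """One hand-written rule: if all three attributes match, assign a verdict."""
    topic: str
    context: str
    author: str
    credibility: str
    certainty: float


# The single rule from the text; every further author, topic or context
# needs another hand-written entry here.
RULES = [Rule("strategy", "business", "Richard Rumelt", "good", 1.0)]


def assess(piece: Piece, rules=RULES) -> Optional[Tuple[str, float]]:
    """Return (credibility, certainty) from the first matching rule, else None."""
    for rule in rules:
        if (rule.topic, rule.context, rule.author) == (
            piece.topic, piece.context, piece.author
        ):
            return rule.credibility, rule.certainty
    return None


print(assess(Piece("strategy", "business", "Richard Rumelt")))
print(assess(Piece("strategy", "business", "Someone Else")))
```

Each new combination of topic, context and author needs its own entry, and anything not covered comes back as `None`, which is roughly where the nightmare begins.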

Anyway! I bet you might know how this sentiment analysis works, don’t you? On the other hand, if not, then it will be someone else who ‘knows’. But of course, all will not be revealed, because it’s a secret so powerful, that in the wrong hands it could be used to dominate the entire galaxy.

Only joking; and many thanks for reading.

[i]To engage in the composition of fables or stories, especially those featuring a strong element of fantasy: “a land which … had given itself up to dreaming, to fabulating, to tale-telling” (Lawrence Durrell).

Big Data’s Virtuous Circus

20 Friday Mar 2015

Posted by Martyn Jones in Big Data, Consider this, data management, good start, goodstart

Tags

Big Data, data architecture, data management, good start, Good Strat, Good Strategy, goodstart, Martyn Jones, Martyn Richard Jones

Many people come up to me in the street and ask me what Big Data is all about. It has happened to me so many times in the past that I am convinced that it might just happen to you as well. I know this sort of thing, I read the Big Data tea leaves. Nothing gets past me.

The first time a complete stranger came up to me in public and said “Hello, will you tell me what this Big Data lark is all about then?” I was lost for words, you just ask my Aunt Dolly, he can vouch for that, no problem. Later that day I read a book – it was my dad’s book – and I then decided to adopt a strategy.

Therefore, in the spirit of springtime goodwill to all men and women, I have put together this blog piece in the hope that it will enlighten, help and entertain.

What is big data?

Big Data can be characterised by the 10 Vs – yes, 10, not 4. Which, in my book, is more than enough to bring up-to-speed the average Big Data John or Jane that one meets on the street, and who naturally wish to be informed of such matters.

In layperson’s terms this is a series of landmarks and pointers in the analytics space used to frame and guide the didactic aspects of Big Data.

The fundamental Vs of the Big Data canon are these:

  • Vagueness
  • Volume
  • Variety
  • Virility
  • Velocity
  • Vendible
  • Vaticination
  • Voracity
  • Veracity
  • Vanity

So, let me now explain what each of these characteristics means to those who might know and for those who might want to know.

Vagueness – This is perhaps the trickiest of questions to address, given the vast panorama that is cast before this incredibly complex yet easily graspable concept. But let me state this, and let there be no mistake about it. At this point in time, what makes Big Data vague is also what makes Big Data specific, explicit and certain. That is to say, in order to ‘come to an understanding’ of Big Data, it is necessary to completely embrace the dialectic of knowing the unknowable. So belief is an absolute essential element – belief and data, that is.

Volume – If there ever was a time to “pump up the volume”, we have it here with Big Data.

Big, voluminous, gorgeously rotund and infinite. Big Data is called Big Data because there is a lovely, roly-poly, likeable never-ending load of it. Its volumes can be measured in zettabytes, which, you can be assured, is a helluva lot of data.

Variety – As they might say down my way, “variety is the spice of life, innit”. This is what makes Big Data so special. So appealing.

Because before Big Data there was absolutely no variety in anything, at all. We lived in a bland world, bereft of detail, nuance and diversity. Nothing could be measured, analysed or explained, because we lacked Big Data. We were ignorant. So ignorant and stupid that we couldn’t see the sense of putting the diapers next to the beer, or of offering three for the price of two.

Fortunately, today this is no longer the case if we don’t want it to be, and thanks to Big Data we have a veritable sensorial explosion. No longer is IT just a couple of symbols scribbled in crayon on someone’s school notebook.

Virility – Move over Smart Data, the new kid on the block is Big Data.

If Big Data were described in the manner of a religious text, it would be accompanied by a never-ending narrative of begets.

So, what does that mean?

Simply stated, Big Data creates itself, in and of itself. The more Big Data you have, the more Big Data gets created. It’s like a self-fulfilling prophecy in 360-degree, high-definition, poly-faceted and all-encompassing knowing. The sort of thing that governments would pay an arm and a leg to get their mitts on.

Velocity – Velocity is of the essence. Velocity kills the competition. More velocity, less haste.

We demand that service is ‘velocious’. ‘Everything’ must be ‘now’, or it’s too late.

This means we need to be able to handle Big Data at velocity – at the speed of need.

Charles Babbage once stated (or maybe it was more than once) that “whenever the work is itself light, it becomes necessary, in order to economize time, to increase the velocity.”

But remember, we are dealing with mega-velocity here, so don’t drink and drive the Big Data Steamship, Star-ship or Mustang.

Vendible – If you can sell it, and sell it as Big Data, then it ‘is’ Big Data. If you can’t, then it’s not. The saleability of Big Data proves its existence.

So, what are the vendible aspects of Big Data?

Let’s leave that easy question for another day. But for now I can confidently state that it is used to mobilise armies of commentators, industry analysts, publicists, punters, writers, bloggers, gurus, futurologists, conference organisers, conference speakers, educators, customer relationship managers, salespeople, marketers and admen.

Vaticination – Edmund Burke is down on record as stating that “you can never plan the future by the past”. Now Burke may have been a clever person when it came to many things, but he wasn’t exactly a whiz when it came to Big Data.

There are people in the world who are in no doubt that Big Data provides the sort of visionary and predictive powers only previously obtainable through ritual sacrifice, magic potions and the casting of spells. Others are highly critical of the understatement implicit in this belief.

For many, Big Data will make the Oracle of Delphi look like a mere call centre.

This is why the power of vaticination plays a characteristically important role in the world of Big Data.

Voracity – This is based on the quasi-rationalist argument that Big Data is big and it has an omnipresent and insatiable self-fulfilling desire.

Big Data comes with an attendant requirement for hardware, even if it is a whole load of consumer hardware tacked together in a magnificent and miraculous mesh of magic.

Big Data can be characterised by voracity, but this comes hand in hand with the ‘ventripotent’ IT industry.

Veracity – The eminence of the data being captured for Big Data handling can vary significantly. The quality or lack of quality of the data naturally has the potential to impact the accuracy of analysis using that data.

Before Big Data arrived on the scene we knew nothing about Data Quality or data verification. This is why ETL and Data Cleansing tools lacked the power to effectively quality check and verify data, to ensure that any erroneous or anomalous data was rejected or flagged.

But now, with the sophistication of tools such as ‘grep’ and ‘awk’ at our disposal, we have the power in our hands to ensure nothing ‘dodgy’ gets into the analytical mix.
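For what it is worth, the reject-or-flag check mentioned above needs very little machinery. The sketch below is a hypothetical illustration only: the field names (`customer_id`, `amount`) and the validation rules are assumptions of mine, not taken from any particular ETL or data cleansing product.

```python
def check_record(record: dict) -> list:
    """Return a list of problems found in one record (empty list means clean)."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)):
        problems.append("amount is not numeric")
    elif amount < 0:
        problems.append("amount is negative")
    return problems


def split_clean_and_flagged(records):
    """Separate records that pass every check from those that need a look."""
    clean, flagged = [], []
    for record in records:
        problems = check_record(record)
        if problems:
            flagged.append((record, problems))
        else:
            clean.append(record)
    return clean, flagged


# Made-up sample records, purely for illustration.
sample = [
    {"customer_id": "C001", "amount": 12.5},
    {"customer_id": "", "amount": 3},
    {"customer_id": "C003", "amount": "lots"},
    {"customer_id": "C004", "amount": -1},
]

clean, flagged = split_clean_and_flagged(sample)
print(len(clean), "clean;", len(flagged), "flagged")
for record, problems in flagged:
    print(record, problems)
```

Nothing here depends on the data being Big; the same check works on four records or four billion, given enough patience.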

Vanity – In my opinion, to fully grasp the underlying and profound meaning of Big Data, it is essential for us to understand the difference between vanity and conceit. Max Counsell claimed that “Vanity is the flatterer of the soul”. Goethe characterised vanity as being “a desire for personal glory”. After an incident with an Anarchist (presumably a Big Data Anarchist), Blackadder remarked to Baldrick that “The criminal’s vanity always makes them make one tiny but fatal mistake. Theirs was to have their entire conspiracy printed and published in plain manuscript”.

That’s all folks!

So that ends the brief rundown of the defining characteristics of Big Data.

So, to summarise. That, which has passed before, necessarily divulges both the upside and downside of Big Data. By reaching out, opening up the kimono and relating the 10 Vs we are disclosing that which cannot be disclosed, exhibiting the absence of essential essence, and thereby opening up the entire field, discipline, profession, science and art to examination, questioning and ridicule.

Many thanks for reading.

7 Signals that someone has quit

14 Saturday Mar 2015

Posted by Martyn Jones in Consider this, good start, Good Strat, goodstart, Martyn Richard Jones

Tags

careers, Consider this, good start, Good Strat, Good Strategy, goodstart, Martyn Jones, Martyn Richard Jones, quit

You are the boss. You are the leader, coach and manager, and there are some things that you have just got to learn, like it or not. One of these skills is to be able to identify when someone has quit. “How dare they?” I hear you ask.

The first time I quit a job and didn’t tell anybody was when I was in the RAF working as a fighter pilot in World War 2, and I accidentally bombed Newport in South Wales, and was given a stern talking to for my troubles. Well, I didn’t actually quit and I was never in the armed forces and I was born into the era of the Beat Generation, but that’s by the by, it’s just there for effect, to create some artificial empathy between me and those who have actually quit a job and not told anyone about it. Myself, I would never do such a thing. Although to be fair, Newport has looked like it has been freshly bombed with dark green, brown and grey shades of poster paints and self-raising flour, since forever. Continue reading →
