GOOD STRATEGY

~ for every significant challenge

Category Archives: Good Strat

7 Signals that someone has quit

14 Saturday Mar 2015

Posted by Martyn Jones in Consider this, good start, Good Strat, goodstart, Martyn Richard Jones

Tags

careers, Consider this, good start, Good Strat, Good Strategy, goodstart, Martyn Jones, Martyn Richard Jones, quit


You are the boss. You are the leader, coach and manager, and there are some things that you have just got to learn, like it or not. One of these skills is to be able to identify when someone has quit. “How dare they?” I hear you ask.

The first time I quit a job and didn’t tell anybody was when I was in the RAF working as a fighter pilot in World War 2, and I accidentally bombed Newport in South Wales, and was given a stern talking to for my troubles. Well, I didn’t actually quit and I was never in the armed forces and I was born into the era of the Beat Generation, but that’s by the by, it’s just there for effect, to create some artificial empathy between me and those who have actually quit a job and not told anyone about it. Myself, I would never do such a thing. Although to be fair, Newport has looked like it has been freshly bombed with dark green, brown and grey shades of poster paints and self-raising flour, since forever. Continue reading →

What’s all the fuss about Dark Data? Big Data’s New Best Friend

10 Tuesday Mar 2015

Posted by Martyn Jones in All Data, Big Data, Consider this, dark data, Good Strat

Tags

All Data, Big Data, dark data, data architecture, data management, Good Strat, Martyn Jones, Martyn Richard Jones


What is Dark Data?

Dark data, what is it and why all the fuss?

First, I’ll give you the short answer. The right dark data, just like its brother right Big Data, can be monetised – honest, guv! There’s loadsa money to be made from dark data by ‘them that want to’, and as value propositions go, seriously, what could be more attractive?

Let’s take a look at the market.

Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes” (IT Glossary – Gartner)

Techopedia describes dark data as being data that is “found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making.” (Techopedia – Cory Janssen)

Cory also wrote that “IDC, a research firm, stated that up to 90 percent of big data is dark data.”

In an interesting whitepaper from C2C Systems it was noted that “PST files and ZIP files account for nearly 90% of dark data by IDC Estimates.” and that dark data is “Very simply, all those bits and pieces of data floating around in your environment that aren’t fully accounted for:” (Dark Data, Dark Email – C2C Systems)

Elsewhere, Charles Fiori defined dark data as “data whose existence is either unknown to a firm, known but inaccessible, too costly to access or inaccessible because of compliance concerns.” (Shedding Light on Dark Data – Michael Shashoua)

Not quite the last insight, but in a piece published by Datameer, Joe Nicholson wrote that “Research firm IDC estimates that 90 percent of digital data is dark” and went on to state that “This dark data may come in the form of machine or sensor logs” (Shine Light on Dark Data – Joe Nicholson via Datameer)

Finally, Luc Burgelman of NGDATA wrote this in a sponsored piece in Wired: “It” – dark data – “is different for each organization, but it is essentially data that is not being used to get a 360 degree view of a customer.”

Say what?

Okay, let’s see if we can be a bit more specific about the content of dark data.

Items on the dark data ticket include: email; instant messages; documents; SharePoint content; content of collaboration databases; ZIP files; log files; archived sensor and signal data; archived web content; aged audit trails; operational database backups – full and incremental; roll-back, redo and spooled data files; sunsetted applications (code and documentation); partially developed and then abandoned applications; and code snippets.

Most importantly, dark data is data that is not actively in use, is underutilised, or is something else. Seriously.

What can you do with it?

So, the conclusion that some have come to is this: there is a vast collection of data in various formats waiting to be monetised.

Personally, the idea that really grabs my attention is the potential ability to do novel forensic research on email. If only to find out what happened in the past.

For example, maybe it would be fascinating to see how significant challenges were identified, flagged and discussed; how strategic responses to those challenges were formulated, chosen and executed; and, how the outcomes of all of that process were reflected in email communications.

I think that this line of work can be very interesting for some people, and that interesting insights may be uncovered, but I would hate to have to put a tangible value on it, if only to avoid adding to the already galactic magnitudes of nonsense and hype surrounding certain data topics.

There are other more mundane uses of dark data.

Imagine that you are just about to embark on a Data Warehouse project (you really are a late adopter, aren’t you?), and you want to establish a base collection of historical data. Where do you get that historical data from?

Right! Operational databases are not characteristically used to store significant amounts of historical reference data and historical transactions beyond a certain time window; there are performance and other reasons for keeping OLTP systems as lean as possible. So, initial loads of historical data are typically recreated in the Data Warehouse from backups, audit trails or logs.

Dark data and data governance

You don’t need a Chief Data Officer in order to be able to catalogue all your data assets. However, it is still a good idea to have a reliable inventory of all your business data, including the euphemistically termed Big Data and dark data.

If you have such an inventory, you will know:

What you have, where it is, where it came from, what it is used in, what qualitative or quantitative value it may have, and how it relates to other data (including metadata) and the business.

What needs to be kept, and for how long, and what can be safely discarded, and when.

The risks associated with the retention or loss of that data.

If you don’t have such a catalogue and have never done a data inventory then a full data inventory and audit seems to be your new best friend.
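For the practically minded, here is a minimal sketch of what one entry in such an inventory might look like, and how it could answer the "what can be safely discarded, and when" question. Everything in it is an assumption made for illustration: the `InventoryEntry` class, its field names, the retention rule and the sample data are hypothetical, not a standard or a product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class InventoryEntry:
    """One row of a hypothetical data inventory (all field names are illustrative)."""
    name: str            # what you have
    location: str        # where it is
    source: str          # where it came from
    used_in: list        # what it is used in
    retain_days: int     # how long it needs to be kept
    created: date        # when the data was captured
    risk_note: str = ""  # risk associated with keeping or losing it

    def can_discard(self, today: date) -> bool:
        """True once the retention period has passed and nothing uses the data."""
        expired = today > self.created + timedelta(days=self.retain_days)
        return expired and not self.used_in


# Made-up sample entries.
inventory = [
    InventoryEntry("2009 web logs", "tape archive", "web servers", [], 365, date(2009, 12, 31)),
    InventoryEntry("order backups", "backup server", "order processing",
                   ["DW initial load"], 3650, date(2012, 1, 1), "needed to rebuild history"),
]

discardable = [e.name for e in inventory if e.can_discard(date(2015, 3, 10))]
print(discardable)
```

A real inventory would also have to record value and relationships to other data, which this sketch leaves out; the point is only that the questions above become answerable once the entries exist.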

What does it mean?

Simply stated, you may have dark data that has value, or it may be a simple collection of worthless digital nostalgia. But if you don’t know what you have, it may pay to find out what’s there, and if necessary, to let it go.

There is no point in hoarding unneeded and unwanted rubbish data. That is simply not good data management.

Finally a word on all the fuss surrounding dark data.

Failure to monetise when there is value to be obtained from dark data is one thing; claiming that value can invariably be obtained whilst actually not knowing what the data is, or how it could be monetised, is just adding to the mountain of data related ‘nonsense and hype’ doing the rounds these days. Please consider not adding to that mountain.

That’s all folks

British Rail, the national UK rail Company, used to be notorious for the number of delays and cancellations to services, and their reasons for failing to meet their obligations became stranger and stranger.

In winter, it would snow and there would be problems. And people would ask ‘how come you couldn’t deal with the snow this year, we’ve had snow for centuries?’ And back came the answer ‘Yes, Sir, but this year it was the wrong type of snow’. In autumn (the fall), it was ‘the wrong type of leaves’ and ‘the wrong type of rain’, and in summer, the ‘wrong type of sunshine’, and so on and so forth.

I hope this will not be the excuse from the Big Data and dark data pundits and punters when the much-vaunted and ‘almost’ guaranteed monetisation isn’t frequently realised.

‘Of course Big Data gives you big dollar benefits, it was just littered with the wrong type of data’ or ‘you just weren’t trying hard enough’.

Many thanks for reading.

Consider this: Big Data is not Data Warehousing

06 Friday Mar 2015

Posted by Martyn Jones in Big Data, Consider this, Data Warehousing, Good Strat, hadoop, hdfs, Martyn Jones

Tags

Big Data, enterprise data warehousing, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


Hold this thought: To paraphrase the great Bob Hoffman, just when you think that if the Big Data babblers were to generate one more ounce of bull**** the entire f****** solar system would explode, what do they do? Exceed expectations.

I am a mild mannered person, but if there is one thing that irks me, it is when I hear variations on the theme of “Data Warehousing is Big Data”, “Big data is in many ways an evolution of data warehousing” and “with Big Data you no longer need a Data Warehouse”.

Big Data is not Data Warehousing, it is not the evolution of Data Warehousing and it is not a sensible and coherent alternative to Data Warehousing. No matter what certain vendors will put in their marketing brochures or stick up their noses.

In spite of all of the high-visibility screw-ups that have carried the name of Data Warehousing, even when they were not Data Warehouse projects at all, the definition, strategy, benefits and success stories of data warehousing are known, they are in the public domain and they are tangible.

Data Warehousing is a practical, rational and coherent way of providing information needed for strategic and tactical option-formulation and decision-making.

Data Warehousing is a strategy driven, business oriented and technology based business process.

We stock Data Warehouses with data that, in one way or another, comes from internal and optional external sources, and from structured and optional unstructured data. The process of getting data from a data source to the target Data Warehouse involves extraction, scrubbing, transformation and loading, ETL for short.
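The four ETL steps can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular ETL tool works: the sample extract, the `customer` table, and the function names `extract`, `scrub`, `transform` and `load` are all invented here, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Made-up source extract standing in for an operational system export.
SOURCE = """customer_id,name,country
1, Alice ,uk
2,Bob,UK
,Nobody,fr
"""


def extract(text):
    """Extraction: read rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrubbing: drop rows without a key and trim stray whitespace."""
    return [{k: v.strip() for k, v in r.items()} for r in rows if r["customer_id"].strip()]


def transform(rows):
    """Transformation: convert types and standardise the country code."""
    return [(int(r["customer_id"]), r["name"], r["country"].upper()) for r in rows]


def load(rows, conn):
    """Loading: write the prepared rows to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer (customer_id INTEGER, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(scrub(extract(SOURCE))), conn)
loaded = conn.execute(
    "SELECT customer_id, name, country FROM customer ORDER BY customer_id"
).fetchall()
print(loaded)
```

The row with no customer key never reaches the target, and the two spellings of the country code arrive as one.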

Data Warehousing’s defining characteristics are:

Subject Oriented: Operational databases, such as order processing and payroll databases and ERP databases, are organized around business processes or functional areas. These databases grew out of the applications they served. Thus, the data was relative to the order processing application or the payroll application. Data on a particular subject, such as products or employees, was maintained separately (and usually inconsistently) in a number of different databases. In contrast, a data warehouse is organized around subjects. This subject orientation presents the data in a much easier-to-understand format for end users and non-IT business analysts.

Integrated: Integration of data within a warehouse is accomplished by making the data consistent in format, naming and other aspects. Operational databases, for historic reasons, often have major inconsistencies in data representation. For example, a set of operational databases may represent “male” and “female” by using codes such as “m” and “f”, by “1” and “2”, or by “b” and “g”. Often, the inconsistencies are more complex and subtle. In a Data Warehouse, on the other hand, data is always maintained in a consistent fashion.

Time Variant: Data warehouses are time variant in the sense that they maintain both historical and (nearly) current data. Operational databases, in contrast, contain only the most current, up-to-date data values. Furthermore, they generally maintain this information for no more than a year (and often much less). In contrast, data warehouses contain data that is generally loaded from the operational databases daily, weekly, or monthly, which is then typically maintained for a period of 3 to 10 years. This is a major difference between the two types of environments.

Historical information is of high importance to decision makers, who often want to understand trends and relationships between data. For example, the product manager for a Liquefied Natural Gas soda drink may want to see the relationship between coupon promotions and sales. This is information that is almost impossible – and certainly in most cases not cost effective – to determine with an operational database.

Non-Volatile: Non-volatility means that after the data warehouse is loaded there are no changes, inserts, or deletes performed against the informational database. The Data Warehouse is, of course, first loaded with cleaned, integrated and transformed data that originated in the operational databases.
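As a rough illustration of the Integrated, Time Variant and Non-Volatile characteristics, consider the sketch below. It is a toy under loud assumptions: the system names in `GENDER_CODES`, the `Warehouse` class and its methods are hypothetical, and a plain Python list stands in for the warehouse tables.

```python
from datetime import date

# Hypothetical code tables: each operational system has its own convention.
GENDER_CODES = {
    "payroll": {"m": "male", "f": "female"},
    "orders": {"1": "male", "2": "female"},
    "loyalty": {"b": "male", "g": "female"},
}


def integrate(system, code):
    """Integrated: map a system-specific code to one consistent representation."""
    return GENDER_CODES[system][code.lower()]


class Warehouse:
    """Toy store: rows are appended with a load date and offer no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, load_date, system, customer_id, gender_code):
        # Time variant: every row carries the date of the load it arrived in.
        self._rows.append((load_date, customer_id, integrate(system, gender_code)))

    def history(self, customer_id):
        # Non-volatile: callers can read the history but cannot change it.
        return [row for row in self._rows if row[1] == customer_id]


dw = Warehouse()
dw.load(date(2015, 1, 31), "payroll", 42, "M")
dw.load(date(2015, 2, 28), "orders", 42, "1")
print(dw.history(42))
```

Both loads describe the same customer in different source conventions, yet the history shows one consistent value across two points in time.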

We build Data Warehouses iteratively, a piece or two at a time, and each iteration is primarily a result of business requirements, and not technological considerations.

Each iteration of a Data Warehouse is well bound and understood – small enough to be deliverable in a short iteration, and large enough to be significant.

Conversely, Big Data is characterised as being about:

Massive volumes: so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it, and

High variety: not only structured data, but also the whole range of digital data, and

High velocity: the speed at which data is generated, transmitted and received.

These are known as the three Vs of Big Data, and they are subject to significant and debilitating contradictions, even amongst the gurus of Big Data (as I have commented elsewhere: Contradictions of Big Data).

From time to time, Big Data pundits slam Data Warehousing for not being able to cope with the Big Data type hacking that they are apparently used to carrying out, but this is a mistake of those who fail to recognise a false Data Warehouse when they see one.

So let’s call these false flag Data Warehouse projects something else, such as Data Doghouses.

“Data Doghouse, meet Pig Data.”

Failed or failing Data Doghouses fail for the same reasons that Big Data projects will frequently fail. Both will almost invariably fail to deliver artefacts on time and to expectations; there will be failures to deliver value or even simply to return a break even in costs versus benefits; and of course, there will be failures to deliver any recognisable insight.

Failure happens in Data Doghousing (and quite possibly in Big Data as well) because there is a lack of coherent and cohesive arguments for embarking on such endeavours in the first place; a lack of real business drivers; and, a lack of sense and sensibility.

There is also a willing tendency to ignore the advice of people who warn against joining in the Big Data hubris. Why do so many ignore the ulterior motives of interested parties who are solely engaged in riding on the faddish Big Data bandwagon to maximise the revenue they can milk off punters? Why do we entertain pundits and charlatans who ‘big up’ Big Data whilst simultaneously cultivating an ignorance of data architecture, data management and business realities?

Some people say that the main difference between Big Data and Data Warehousing is that Big Data is technology, and Data Warehousing is architecture.

Now, whilst I totally respect the views of the father of Data Warehousing himself, I also think that he was being far too kind to the Big Data technology camp. However, of course, that is Bill’s choice.

Let me put it this way, if Oracle gave me the code for Oracle 3, I could add 256 bit support, parallel processing and give it an interface makeover, and it would be 1000 times better than any Big Data technology currently in the market (and that version of Oracle is from about 1983).

Therefore, Data Warehousing has no serious competing paragon. Data Warehousing is a real architecture, it has real process methodologies, it is tried and proven, it has success stories that are no secrets, and these stories include details of data, applications and the names of the companies and people involved, and we can point at tangible benefits realised. It’s clear, it’s simple and it’s transparent.

Just like Big Data, right?

Well, no.

See what I mean?

Therefore, the next time someone says to you that Big Data will replace Data Warehousing or that Data Warehousing is Big Data, or any variations on that sort of ‘stupidity’ theme, you can now tell them to take a hike, in the confidence that you are on the side of reason.

Many thanks for reading.

More perspectives on Big Data

Aligning Big Data: http://www.linkedin.com/pulse/aligning-big-data-martyn-jones

Big Data and the Analytics Data Store: http://www.linkedin.com/pulse/big-data-analytics-store-martyn-jones

A Modern Manager’s Guide to Big Data: http://www.linkedin.com/pulse/managers-guide-big-data-context-martyn-jones

Core Statistics coexisting with Data Warehousing

Accommodating Big Data

And a big thank you to Bill Inmon (the father of Data Warehousing and of DW 2.0)

Contradictions of Big Data – Short

01 Sunday Mar 2015

Posted by Martyn Jones in Big Data, Consider this, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones

Tags

Big Data, data management, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


Please note: This is an edited version of a previous piece with a similar name, but focusing solely on the three main Vs of Big Data.

What we’ve been told

We’ve been told that business Big Data is the greatest thing since sliced bread, and that its major characteristics are:

  • massive volumes – so great are they that mainstream relational products and technologies such as Oracle, DB2 and Teradata just can’t hack it, and
  • high variety – not only structured data, but also the whole range of digital data, and
  • high velocity – the speed at which data is generated, transmitted and received

Which is a simple and straightforward means of classification. Big Data is about massive volumes, high variety and high velocity. Right?

It’s not about big

I have never bought into the idea that more data is necessarily better data, or that it provides better focus or leads to increased insight. In fact, I have been quite vocal with my contrarian opinion, but now this view is getting some additional support, and from some surprising corners.

In a recent blog piece on IBM’s Big Data and Analytics Hub (Big data: Think Smarter, not bigger), Bernard Marr wrote that “the truth is, it isn’t how big your data is, it’s what you do with it that matters!”

Over at Fierce Big Data it was Pam Baker who stated that “the term big data is unfortunate because it’s really not about the size of the data”. (Big data is not about petabytes, but complex computing).

Elsewhere, SAS echoed similar sentiments on their web site: “The real issue is not that you are acquiring large amounts of data. It’s what you do with the data that counts.”

Well, apparently Big Data isn’t about “massive volumes” of data.

Strike 1!

It’s not about variety

It is claimed that 20% of digital data is structured, a claim based on the problematic suggestion that structured data is uniquely relational.

It is also said that unstructured data includes CSV files and XML data, and this makes up far more than the 20% of the data generated. But this definition is wrong.

If anything, CSV data is structured, and XML data is highly structured, and it’s typically regular ASCII data. So it does not add variety, even though it is not structured in the ways that someone might expect, especially if that someone lacks the required knowledge and experience. Simply stated, CSV data is structured; it’s just that it lacks rich metadata, but that doesn’t make it unstructured.
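A small sketch makes the point. The sample data below is made up, and the only tools used are the `csv` and `xml.etree.ElementTree` modules from the Python standard library; the same two sales records are read once from CSV and once from XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample data: the same two records in two "unstructured" formats.
csv_text = "product,units\ntea,12\ncoffee,7\n"
xml_text = "<sales><sale product='tea' units='12'/><sale product='coffee' units='7'/></sales>"

# The CSV header row supplies the field names.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# The XML attributes supply the field names.
xml_rows = [dict(el.attrib) for el in ET.fromstring(xml_text).iter("sale")]

print(csv_rows == xml_rows)
```

Both formats yield the same named fields and the same records with no guesswork, which is hard to square with calling either of them unstructured.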

“But”, I hear you say “what about all the non-textual data such as multi-media, and what about the masses of unstructured textual data?”

Take it from me, most businesses will not be basing their business strategies on the analysis of a glut of selfies, juvenile twittering, home videos of cute kittens, or the complete works of William Shakespeare. Almost all business analysis (whether done by a professional statistician or a data scientist) will continue to be carried out using structured data obtained primarily from internal operational systems and external structured data providers.

Variety, Sir? No problem.

Strike two!

It’s not even about velocity

So, if we accept that Big Data isn’t really about the massive data volumes or high data variety then that leaves us with velocity. Because if it isn’t about record breaking VLDB or significant data variety, then for most commercial businesses the management of data velocity becomes either less of an issue or just is no issue.

Even in some extreme circumstances, one can explore the suggestion that data sampling can remove issues with data volume as well as velocity.
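One way to explore that suggestion is reservoir sampling, which keeps a fixed-size uniform random sample from a stream of unknown length. To be clear, this technique is picked here purely for illustration; the function name `reservoir_sample`, the `seed` parameter and the simulated feed are assumptions, and whether a sample is good enough depends entirely on the analysis in question.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace an existing item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Simulated fast feed: a generator, so the full data set is never held in memory.
readings = (x * 0.5 for x in range(100_000))
sample = reservoir_sample(readings, k=100, seed=1)
print(len(sample))
```

However fast or large the feed, the memory needed stays at `k` items, which is the sense in which sampling can take the sting out of both volume and velocity.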

However, the fact that some software vendors and IT service suppliers set up this ‘straw man’ velocity argument and then knock it down with the ‘amazing powers’ of their products and services is quite another matter.

So, is it really about velocity?

Strike three!

So what is it really about?

Big Data is a dopey term, applied necessarily ambiguously to a surfeit of tenuously connected vagaries, and its time has come and gone. Let’s dump the Big Data moniker, and the 3 Vs along with it, and embrace the fact that data is data and there will always be more of it.

So, let’s consider ‘all data’ and principally for its time and place utility.

If there is something that you are not sure about or have questions with then please leave a comment below or email me.

Thanks very much for reading.

Consider this: Big Data and the Pot of Tea

17 Tuesday Feb 2015

Posted by Martyn Jones in Big Data, Consider this, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones, Strategy

Tags

Analytics, Big Data, data management, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


To begin at the beginning

Hold this thought: Big Data is King.

Is there just nothing that Big Data isn’t capable of fixing? From terrorism, world hunger, Ebola, HIV, fraud, money laundering and hiring the ‘right’ people through to winning the lottery, curing hangovers, arranging entrapment and finding the love of your life. Big Data is King. Continue reading →

A brief introduction to Knowledge Management

14 Saturday Feb 2015

Posted by Martyn Jones in Analytics, Big Data, Consider this, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones, statistics

Tags

Big Data, Consider this, data management, information management, knowledge management


A helpful slideset that is used to explain the purposes, positions and roles of Knowledge Management.

A brief introduction to Knowledge Management from Martyn Richard Jones

Enjoy! Please tell me what you think about this slide deck. Many thanks for viewing. 

Big Data, a promised land where the Big Bucks grow

12 Thursday Feb 2015

Posted by Martyn Jones in Big Data, Consider this, Good Strat, Information Management, Martyn Jones

Tags

Analytics, Big Data, Good Strat, Martyn Jones, statistics


Consider this. Many people come up to me in the street, and, apropos of nothing, they ask me how they can make money from Big Data.

Normally I would send such people to see a specialist – no, not a guru, but a sort of health specialist, but because this has happened to me so many times now, I eventually decided to put pen to paper, push the envelope, open up the kimono, and to record my advice for posterity and the great grandchildren.

So, here are my top seven tips for cashing in quick on the new big thing on the block.

1 – A business opportunity for faith

Like every new religion, trend or fad, Big Data has its own founding myths, theology and liturgy, and there is money to be made in it; loadsa lovely jubbly money. By predicating and evangelising Big Data you will be welcomed with open arms into the Big Data faith, and will receive all the attendant benefits that will miraculously and mysteriously fall upon you and your devout friends. Go on, I dare you. Be a Big Data guru, a shepherd to a flock of sheep, and enjoy the wealth, health and happiness that most surely will come your way. You too can look cool in red Prada slippers, a flattering and flowing gown and matching accessories.

2 – Acquire it, multiply it, weigh it, mark it up and sell it on

Simply stated, this is about acquiring other people’s data, by sacred means or profane, marking it up and then selling it on. The value you add is that you act as a trusted conduit, a conduit for good. You may care to enrich the data, swop the order of data, replicate and embellish data, make stuff up, etc. which all serves to ‘add value’ to the data. You may even consider adding nuggets of value to the data, just for kicks and giggles. My best friend’s favourite is injecting the good old ‘diaper and beer’ and ‘friends and family’ clichés into every Big Data collection, as it never fails to thrill, please and delight.

3 – Anything can be anything

The good thing about making money from Big Data is that it doesn’t need to be anything to do with Big Data. Make a 20GB Enterprise Data Warehouse? Call it a Big Data success. Sell 20 boxes of dodgy doughnuts down the alternative market? Proclaim a Big Data triumph. Sell your digital porn stash to your best mate? Point to the incredible invisible hand of the Big Data market at work. See what I’m doing there. Anything can be anything, and you too can cash in on that opportunity, big time.

4 – Big Data Patronage

Tense, nervous headaches? Do you like making up stories about Big Data, or for that matter anything else? Are you a natural born fibber but are strapped for cash? Then worry no longer. If you get a Big Data patron you will be sorted for ‘life’; get two and you’ll be sorted for the afterlife as well. With a Big Data patron you can get the most tenuous, crappiest and superficial of pieces published, promoted and vaunted – globally. Can’t make it up yourself? Then outsource and offshore it; after all, just get the keywords right for SEO ranking and the gullible will flock to you in droves. The down side of this profession is that you will be targeted for writing half-truths, quarter-truths and downright lies, and you will be pilloried as a purveyor of rank hyperbole. But don’t worry, take heart and never lose the faith, you will be in good company. As one Big Data guru was wont to say, “If you repeat a lie often enough, people will believe it, and you will even come to believe it yourself.” Amen, brother!

5 – Big Data Certification

By 2016 there will be global demand for 30 billion Big Data professionals. Are you prepared to cash in on that inevitability? No? Then consider this.

One of my best friends makes his living as a completely phony Big Data Scientist. For two hundred bucks he can make you a Data Scientist or a Big Data guru. Some guys give you an education but this guy gives you immediate access to high paying jobs, sex and a life in the city. Moreover, for an extra 250 bucks you can also become a certified Big Data Trainer, which will allow you to do unto others what has been done unto you.

6 – Creative Technology Reuse

Big Data has heralded in the biggest innovations known in the history of computing, and arguably in the entire history of humankind. One of those new inventions has been the now widely acclaimed and revolutionary ‘flat file data base’ (FFDB), and this has been accompanied with developments in low level operating system primitives that allow for the processing of these collections and hierarchies of FFDBs. So, if one has a mind to do so, one can get some real business leverage off of these new tendencies by borrowing 21st century technology found in old operating system hacks from the sixties and seventies and eighties and nineties and… Well, the point is that in order to get serious funding it is no longer good enough to have a half page business plan, it is also necessary to eke out ‘stuff’ that works within the new paradigms of Big Data and Big Data Analytics. For my next venture I will be looking for serious funding for my ‘Arbitrary Dawdle Down Data Street’ (AD3S) Big Data Analytics platform, a platform designed to support virtual 1k bit processing and the massively parallel provision of global regular expression search and match (S&M), concatenation and listing, and cooperative data-driven and streamed data extraction and reporting. I’m hoping to attract the attention of governments, the EU, the Manic Street Preachers, the UN, China, Vladimir Putin, the DOD, HP, Oracle, Gartner, Lana Del Rey, Deloitte and IBM. So, this is going to be absolutely massive. Word!

7 – Big Data Brokerage

According to leading management consultants and industry watchers Gartner, McKinsey and Deloitte, data needs to be managed and accounted for like any other asset, such as money. To arrive at a similar viewpoint requires a massive leap of faith, but it is a conversion that might drive dividends. One avenue to be explored in eking out value from the apparently massively valuable Big Data lakes, silos and pools is through the operation of a Big Data Brokerage. A Big Data Brokerage is a business whose main responsibility is to be an intermediary that puts Big Data buyers and Big Data sellers together in order to facilitate a transaction. Big Data Brokerage companies are compensated via commission after the Big Data transaction has been successfully completed. They may also charge introductory fees. Just imagine the wealth of business opportunities in that. You could become the Goldman Sachs of data.

That’s it folks!

I hope you enjoyed this piece and would be pleased to hear your views on this and other subjects.

Whilst I understand the attraction and even the need of creating a new and significant growth industry, I would also advise a degree of restraint, and whilst I see that “Big Data” (the consideration of the potential value of All Data) has its allure, I also think that some good sense and informed caution should also prevail.

Thank you so much for reading.

Martyn Richard Jones

How to position Big Data

12 Thursday Feb 2015

Posted by Martyn Jones in Big Data, Consider this, Good Strat, Martyn Jones

Tags

Big Data, Good Strat, Martyn Jones


To begin at the beginning

Fueled by the new fashions on the block, principally Big Data, the Internet of Things, and to a lesser extent Cloud computing, there’s a debate quietly taking place over what statistics is and is not, and where it fits in the whole brave new world of data architecture and management. For this piece I would like to put aspects of this discussion into context, by asking what ‘Core Statistics’ means in the context of the DW 3.0 Information Supply Framework.

Core Statistics on the DW 3.0 Landscape

The following diagram illustrates the overall DW 3.0 framework:

There are three main concepts in this diagram: Data Sources; Core Data Warehousing; and, Core Statistics.

Data Sources: All current sources, varieties, velocities and volumes of data available.

Core Data Warehousing: All required content, including data, information and outcomes derived from statistical analysis.

Core Statistics: This is the body of statistical competence, and the data used by that competence. A key data component of Core Statistics is the Analytics Data Store, which is designed to support the requirements of statisticians.

The focus of this piece is on Core Statistics. It briefly looks at the aspect of demand driven data provisioning for statistical analysis and what ‘statistics’ means in the context of the DW 3.0 framework.

Demand Driven Data Provisioning

The DW 3.0 Information Supply Framework isn’t primarily about statistics; it’s about data supply. However, the provision of adequate, appropriate and timely demand-driven data to statisticians for statistical analysis is very much an integral part of the DW 3.0 philosophy, framework and architecture.

Within DW 3.0 there are a number of key activities and artifacts that support the effective functioning of all associated processes. Here are some examples:

All Data Investigation: An activity centre that carries out research into potential new sources of data and analyses the effectiveness of existing sources of data and its usage. It is also responsible for identifying markets for data owned by the organization.

All Data Brokerage: An activity that focuses on all aspects of matching data demand to data supply, including negotiating supply, service levels and quality agreements with data suppliers and data users. It also deals with contractual and technical arrangements to supply data to corporate subsidiaries and external data customers.

All Data Quality: Many of the requirements for clean and useable data, regardless of data volumes, variety and velocity, have been addressed by methods, tools and techniques developed over the last four decades. Data migration, data conversion, data integration, and data warehousing have all brought about advances in the field of data quality. The All Data Quality function focuses on providing quality in all aspects of information supply, including data quality, data suitability, quality and appropriateness of data structures, and data use.

All Data Catalogue: The creation and maintenance of a catalogue of internal and external sources of data, its provenance, quality, format, etc. It is compiled based on explicit demand and implicit anticipation of demand, and is the result of an active scanning of the ‘data markets’, ‘potential new sources’ of data and existing and emerging data suppliers.

All Data Inventory: This is a subset of the All Data Catalogue. It identifies, describes and quantifies the data in terms of a full range of metadata elements, including provenance, quality, and transformation rules. It encompasses business, management and technical metadata; usage data; and, qualitative and quantitative contribution data.

Of course there are many more activities and artifacts involved in the overall DW 3.0 framework.
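
As a rough illustration of how the All Data Catalogue and the All Data Inventory relate, here is a minimal Python sketch. The class name `CatalogueEntry`, its fields, the 0-to-1 quality scale and the `in_use` flag that decides inventory membership are all assumptions made for the example; the DW 3.0 framework itself says only that the inventory is a subset of the catalogue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogueEntry:
    """One known data source, internal or external (hypothetical fields)."""
    name: str
    provenance: str        # where the data comes from
    data_format: str       # e.g. "csv" or "json"
    quality_score: float   # assumed 0.0-1.0 scale, not defined by the framework
    in_use: bool = False   # assumption: marks data the organisation actually holds
    transformation_rules: List[str] = field(default_factory=list)


def inventory(catalogue: List[CatalogueEntry]) -> List[CatalogueEntry]:
    """Return the inventory: the subset of catalogue entries flagged as in use."""
    return [entry for entry in catalogue if entry.in_use]


# Illustrative catalogue with one held source and one merely known source.
catalogue = [
    CatalogueEntry(
        name="general_ledger",
        provenance="internal finance system",
        data_format="csv",
        quality_score=0.9,
        in_use=True,
        transformation_rules=["convert amounts to EUR"],
    ),
    CatalogueEntry(
        name="weather_feed",
        provenance="external supplier",
        data_format="json",
        quality_score=0.6,
    ),
]

inventory_names = [entry.name for entry in inventory(catalogue)]
print(inventory_names)
```

In this sketch the catalogue lists everything that is known about, whether or not it is held, while the inventory is derived from it rather than maintained separately, which keeps the two from drifting apart.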

Yes, but is it all statistics?

Statistics, it is said, is the study of the collection, organization, analysis, interpretation and presentation of data. It deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments; learning from data, and of measuring, controlling, and communicating uncertainty; and it provides the navigation essential for controlling the course of scientific and societal advances[i]. It is also about applying statistical thinking and methods to a wide variety of scientific, social, and business endeavors in such areas as astronomy, biology, education, economics, engineering, genetics, marketing, medicine, psychology, public health and sports, among many others.

Core Statistics supports micro and macro oriented statistical data, and metadata for syntactical projection (representation-orientation); semantic projection (content-orientation); and, pragmatic projection (purpose-orientation).
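
To make the distinction concrete, here is a small Python sketch, offered under stated assumptions and not as part of the DW 3.0 specification. It treats micro data as one record per observed unit and macro data as aggregates computed from those records, and attaches one metadata note per projection. The `DatasetMetadata` class, the field values and the sample sales figures are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class DatasetMetadata:
    """One note per projection named in the text (example wording is assumed)."""
    syntactic: str   # representation: how the values are encoded
    semantic: str    # content: what the values mean
    pragmatic: str   # purpose: why the data is held


# Micro data: one record per observed unit (made-up example values).
micro = [
    {"region": "north", "sales": 120.0},
    {"region": "north", "sales": 80.0},
    {"region": "south", "sales": 200.0},
]


def to_macro(records: List[dict]) -> Dict[str, float]:
    """Aggregate micro records into macro data: mean sales per region."""
    grouped: Dict[str, List[float]] = {}
    for record in records:
        grouped.setdefault(record["region"], []).append(record["sales"])
    return {region: mean(values) for region, values in grouped.items()}


macro = to_macro(micro)

metadata = DatasetMetadata(
    syntactic="sales stored as a float, region as a lower-case string",
    semantic="sales is the value of one transaction in a region",
    pragmatic="collected to compare regional sales performance",
)

print(macro)
```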

The Core Statistics approach provides a full range of data artifacts, logistics and controls to meet an ever-growing and varied demand for data to support the statistician, including the areas of data mining and predictive analytics. Moreover, and this is going to be tough for some people to accept, the focus of Core Statistics is on professional statistical analysis of all relevant data of all varieties, volumes and velocities, and not, for example, on the fanciful and unsubstantiated data requirements of amateur ‘analysts’ and ‘scientists’ dedicated to finding causation-free correlations and interesting shapes in clouds.

That’s all folks

This has been a brief look at the role of DW 3.0 in supplying data to statisticians.

One key aspect of the Core Statistics element of the DW 3.0 framework is that it renders irrelevant the hyperbolic claims that statisticians are not equipped to deal with data variety, volumes and velocity.

Even with the advent of Big Data, alchemy is still alchemy, and data analysis is still about statistics.

If you have any questions about this aspect of the framework then please feel free to contact me, or to leave a comment below.

Many thanks for reading.

Catalogue under: #bigdata #technology big data, predictive analytics

[i] Davidian, M. and Louis, T. A., doi:10.1126/science.1218685

Big Data Will Save the World

12 Thursday Feb 2015

Posted by Martyn Jones in Big Data, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones

≈ Leave a comment

Tags

Big Data, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


Good morning fellow consumers; here’s a pop quiz question: What does Big Data have in common with Robitussin? Think about it, take your time.

Okay, time’s up!

Robitussin is a legal pharmaceutical product commonly associated with coughs, colds and flu combinations. Continue reading →

The Faustian side of IT – Part 1

12 Thursday Feb 2015

Posted by Martyn Jones in Good Strat, goodstartegy, Martyn Jones

≈ Leave a comment

Tags

Big Data, Good Strat, Good Strategy, Martyn Jones, Martyn Richard Jones


It’s no wonder that truth is stranger than fiction. Fiction has to make sense.
Mark Twain

It’s Friday morning in London’s trendy Canary Wharf, and I have been asked to facilitate a local meeting of the Digital Violence and Dogma Victims Group, the self-help recovery chain for those who have fallen foul of the pernicious and debilitating effects of IT dogma, organisational autism and insider thuggery and blackmail.

There are twelve of us in the old church hall. We sit in a circle, to facilitate communication. After a more formal welcome and brief introduction the floor is opened up for people to talk about whatever they want to talk about. There is silence. This is normal. There are a few new faces.

“Pantxo!” I look across at Pantxo; he is staring out the window at the falling rain. He can usually talk the legs off a giraffe, but today he is having none of it. Sensing that things are not going too well, I enter into my routine of floridly and inanely relating well-worn anecdotes from the distant annals of IT history.

As I am entering my tenth lap of the track of tedium, one of the new members picks up enough courage to chime in, first nervously and then with the increasing confidence of someone who knows exactly what they are talking about and precisely what they are going to say.

“Hello. My name is Crème”. A woman in a blue adidas tracksuit looks around the room.

“Yes. My name is Crème Brûlée; you may well have heard of it from twitter, the tabloids and the TV… oh, and the novel Absolute Beginners… I used to be the CIO of a major household name.”

She pauses and looks into the middle distance, searching for the truth, tip toeing around the pain.

“This is a bit embarrassing – awkward maybe would be a better word – but what I want to unburden upon you all today is the story of how I outsourced my Data Warehouse, my Business Intelligence, my Big Data, my MDM, my CRM, my family and my life”.

She takes a deep breath and continues; making a point of looking at each of her fellow members in turn as she does so.

“About five summers ago, I was feeling a bit lost, which was unusual for me, a strange and novel experience, so I decided that I really needed to do something to turbo-charge my career prospects and to get things moving faster in my part of the organisation. I wanted to excel, and I wanted to be seen doing so, by the right people, and recognised as such.”

“In the spring of that year I had been to a management conference with some of our senior IT management team, some of whom are also here today. Okay, I won’t single out any one of you, because you know who you are.”

“As part of the week-long conference we were wined and dined, stroked and cajoled, flattered and sweet talked by a whole entourage of sales execs from the technology and service providers. They were telling us that the future was in outsourcing and offshoring as much as we could, yes even Big Data and Data Warehousing and Analytics, and they were bewitching us with stories of future successes, of IT paradise and professional nirvana. We in turn wanted to believe, needed to believe, desired to believe. All of this was reinforced by the so called independent industry analysts who insisted, in their agnostic way, that we should seize the moment, with courage, determination and illusion.”

“When I got back to the ranch my mind became occupied with other things, but I didn’t entirely forget the compelling messages that I had brought away with me from the conference.”

“Nothing happened for a couple of months until, one day and out of the blue, things came to a head.”

“We had recently acquired a media news and entertainments business – Media Macaroni International, and we were planning on integrating their general ledger into the corporate IT landscape. One morning I received a call from the CIO of the newly acquired company, inviting me to their site for a meet and greet event.”

“So I moseyed on down to Tinsel Town and got a briefing from not only the CIO, but the full board of directors of Media Macaroni, the ‘hasta la pasta’ of Big Data Analytics ad-hoc performance alignment.”

“To cut a long story short, they basically put me on the spot. Either I integrate the entire Data Warehousing, MIS, Big Data, Analytics and MDM across the expanded corporate body in 9 to 15 months, or we would have serious problems of convergence and market credibility. The message couldn’t have been clearer. Either I got our act together and made this acquisition work, or what looked like a humongous hot potato could land in my lap anytime soon. It was a career changing risk that I needed to address.”

“I told the directors there and then that the mission was going to be incredibly difficult to fulfil. However, the mood quickly changed.”

“Their CIO looks across the table and tells me that he can help me out of my hole. My hole? What the freak! You see, we have employed a service company that does most of our IT work for us, and according to us at Media Macaroni they are simply the bee’s knees, the best thing since chopped liver on rye, the biz.”

“So ‘who are these guys’, I ask. And after a brief hiatus that seemed to last forever, back came the ominous response: The Taffia Connection.”

To be continued…

Many thanks for reading.

Channel: #IT #BigData

As always, please share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps even send me a LinkedIn invite. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.


File under: Good Strat, Good Strategy, Martyn Richard Jones, Martyn Jones, Cambriano Energy, Iniciativa Consulting, Iniciativa para Data Warehouse, Tiki Taka Pro
