Martyn Rhisiart Jones
Bonn, Germany, 2014

Many people come up to me in the street and ask me what big-data is all about. I have experienced this numerous times before. I am sure it might just happen to you as well. I know this sort of thing, I read the big-data tea leaves. Nothing gets past me.
The first time a complete stranger approached me in public, he greeted me. He then asked: “Hello, will you tell me what this big-data lark is all about then?” I was lost for words, and you just ask my Aunt Dolly, she can vouch for that, no problem. Later that day, I read a book. It was my dad’s book, with lots of pages and words. I then decided to adopt a strategy for explaining big-data.
In the spirit of springtime and goodwill to all men and women, I have created this blog piece. I hope it will enlighten, help, and entertain.
My first question is this: what is big-data?
Big data can be characterised by the twelve V words – yes, twelve, not four or three, but twelve. That, in my book, is more than enough. It can bring the average big-data John or Jane up to speed. These are people one meets on the street who naturally wish to be informed of such matters.
In layperson’s terms, this represents a series of landmarks and pointers in the analytics space. These are used to frame and guide the didactic aspects of big-data.
The twelve fundamental V-word characteristics of the big-data canon are:
- Vagueness.
- Volume.
- Variety.
- Virility.
- Velocity.
- Vendibility.
- Vaticination.
- Voracity.
- Vanity.
- Vintage.
- Vulgarity.
- Virtuousness.
What do those characteristic tenets mean? Let’s take a look at them one by one.
Vagueness: Addressing this question is perhaps the trickiest. It presents a vast panorama. This concept is incredibly complex yet easily graspable. But let me state this, and let there be no mistake about it. The question is as cunning as a cunning fox, and the answers, even more so. At this point, what makes big-data vague is also what makes big-data specific, explicit and certain. To ‘come to an understanding’ of big-data, it is necessary to approach the concept of knowing the unknowable. This requires embracing a dialectical method. So belief is an essential element – belief and a lot of data, that is.
Volume: If there ever was a time to ‘pump up the volume’, we have it here with big-data.
Big, voluminous, gorgeously rotund and infinite. Big data is called big-data because there is a lovely, roly-poly, likeable, never-ending load of it. Its volumes can be measured in zettabytes, which, you can be assured, is a helluva lot of data.
Variety: As they might say down my way, “variety is the spice of life, innit?” This spice is what makes the subject of big-data so exclusive and so appealing.
Because before big-data, there was no variety in anything, at all. We lived in a bland world, bereft of detail, nuance and diversity. Nothing could be measured, analysed or explained, because we lacked big-data. We were ignorant and stupid. We couldn’t see the sense of putting the diapers next to the beer. Offering three for the price of two did not occur to us. We also didn’t understand giving a 50% discount on the second of two identical items.
Fortunately, today, this is no longer the case. If we choose otherwise, we can avoid it. Thanks to big data, we have a veritable sensorial explosion. No longer is IT just a couple of symbols scribbled in crayon on someone’s school notebook.
Virility: Move over smart data; the new kid on the block is big-data.
Who would have imagined? Fourteen begets in the Bible, and how many precipitations in big-data? Bazillions, I bet.
Big data creates itself, in and of itself. The more big-data you have, the more big-data gets generated. It’s like a self-fulfilling prophecy in 360 degrees, high-definition, poly-faceted and all-encompassing knowing. The sort of thing that governments would pay an arm and a leg to get their mitts on.
Virginia Woolf was both a great English stand-up philosopher and a basketball coach. To paraphrase her, “It was the stupidity of big-data punditry that impressed me. The dopes, having made those convenient bullshit lines of machine learning, speed along them unquestioning.”
I rest my case.
Velocity: Velocity is of the essence. Speed kills the competition, so adopt the mantra, more velocity, less haste.
We demand that the service is ‘velocious’, that is, quickly delicious. ‘Everything’ must be ‘now’, or it’s too late.
It means that we need to be able to handle big-data at velocity – at the speed of need.
Charles Babbage once stated that work ‘itself light, requires increased velocity to economize time.’ He mentioned this more than once.
But remember, we are dealing with mega-velocity here, so don’t drink and drive the big-data steamship, Star-ship or Mustang.
Vendibility: If you can sell it, and sell it as big-data, then it ‘is’ big-data. If you can’t, then it’s not. The saleability of big-data proves its existence.
So, what are the vendible aspects of big-data?
Let’s leave that easy question for another day. But for now, I can confidently state that it is used to mobilise armies of commentators, industry analysts, and publicists. It also mobilises punters, writers, bloggers, gurus, and futurologists. Moreover, it involves conference organisers, conference speakers, educators, and customer relationship managers. Finally, it includes salespeople, marketers, and admen.
Vaticination: Edmund Burke is down on record as stating that “you can never plan the future by the past.” Burke was an intelligent person in many areas. However, he wasn’t a whiz when it came to big data or his unstructured Python code.
Some people in the world are certain that big-data gives visionary and predictive powers. These powers were once thought obtainable only through ritual sacrifice, magic potions, and casting spells and runes. Others are highly critical of the idle understatement implicit in this belief.
For many, big-data will make the Oracle of Delphi look like a mere local call-centre for perturbed Athenians.
This implicit ambiguity and obscurity are why the power of vaticination plays a vital role in the world of big data.
Voracity: As George Bernard Shaw might have stated: “Man is the only animal which esteems itself rich.” This is based on its big-data quantity. Its value is also measured by the veracity of this data.
This is based on the quasi-rationalist argument that big-data is significant, and that it has an omnipresent and insatiable self-fulfilling desire.
Big data requires dedicated hardware. This is the case even if it involves consumer hardware. The consumer hardware might be combined in a magnificent and miraculous mesh of magic. This is good for business.
Big data can be characterised by voracity, but this comes hand in hand with the ventripotent IT industry.
Veracity: The eminence of the data being captured for big-data handling can vary significantly. The quality of the data can significantly impact the accuracy of analysis. A lack of data quality can also influence results.
Before big-data arrived on the scene, we knew nothing about data quality or data verification. We were like simple troglodytes, destined to encounter real data intelligence whether we wanted it or not. This is why ETL and data cleansing tools were powerless. They could not effectively quality check and verify data. This made it impossible to ensure that any erroneous or anomalous data was rejected or flagged.
Fortunately these days, we have sophisticated tools at our disposal. Tools such as grep and awk empower us. We can ensure nothing ‘dodgy’ gets into the analytical mix.
To paraphrase Robert Louis Stevenson, “Truth in big-data and reason, not truth to the conjecture, is the true veracity.”
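For the troglodytes among us, here is roughly what rejecting or flagging ‘dodgy’ data looks like once you graduate from grep and awk. It is only a minimal sketch in Python. The field names (`customer_id`, `age`, `email`), the rules, and the function names are all hypothetical, invented purely for illustration, and no real tool is claimed to work this way.

```python
import re

# Hypothetical rules: the field names and thresholds are illustrative assumptions.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def check_record(record):
    """Return a list of problems found in one record (empty if it looks clean)."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        problems.append("implausible age")
    email = str(record.get("email") or "")
    if not EMAIL_PATTERN.fullmatch(email):
        problems.append("malformed email")
    return problems


def split_records(records):
    """Separate clean records from flagged ones, keeping the reasons for each flag."""
    accepted, flagged = [], []
    for record in records:
        problems = check_record(record)
        if problems:
            flagged.append((record, problems))
        else:
            accepted.append(record)
    return accepted, flagged


records = [
    {"customer_id": "c1", "age": 34, "email": "jane@example.com"},
    {"customer_id": "", "age": 250, "email": "nope"},
]
accepted, flagged = split_records(records)
```

Flagged records keep their list of reasons, so whoever has to clean up afterwards can see why each one was kept out of the analytical mix.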
Vanity: To fully grasp big-data’s underlying meaning, we must understand pride. Distinguishing it from conceit is crucial. In my opinion, this understanding is essential. We must also comprehend the difference between satisfaction and narcissism. Max Counsell claimed that “Vanity is the flatterer of the soul.” Goethe characterised vanity as being “a desire for personal glory.” After an incident with an anarchist, likely a big-data anarchist, Blackadder remarked to Baldrick that “The criminal’s vanity always makes them make one tiny but fatal mistake.” Theirs was to have their entire conspiracy printed and published in plain manuscript.
Vintage: When it comes to data, vintage (data not wine) is the big V. It is something most people tend to find complicated and confusing. But the bottom line is that it’s all quite unpretentious. A data’s vintage lets you know the year the data items were selected. The best vintage data are the data that we have available.
To paraphrase Carl Jung, “Big data are born in a given event and in a specific place. They have the qualities of the year and season in which they are created, much like years of vintage records. Big data analytics do not lay claim to anything more.”
Vulgarity: Vulgarity is the data’s state of being vulgar.
Vulgarity is big data’s profane to data warehousing’s sacred. It is big data’s explicit to data warehousing’s nuance. It is big data’s “try something dirty” to data warehousing’s “this is what you asked for.”
Have you heard about good taste, simplicity, beauty, manners, politeness and refinement? You have? Well, forget about all of that bullshit.
Big data lacks sophistication and taste. It’s unrefined, coarse and rude. Big data can say what it wants to say and with absolute impunity. Big data can embarrass, shame and wallow in its base crudities. This is what marks big-data out from the rest of the data pack. Big data is gigantic, thuggish and uncouth. It’s discourteous, rude and uncivil. It is insolent, cheeky and brazen. And it’s all yours.
This is why some priggish and puritanical data scientists have such a hard time with big-data. But they are wrong. It’s not dirty data. It is vulgar data.
And to paraphrase that great stand-up philosopher and man-about-town E. M. Forster, “It is the vice of a vulgar mind to be thrilled by big-data.”
Virtuousness: Moral and ethical principles are absolutely essential in the entire life-cycle of data. They hold particular importance for data about people. This ensures we are doing the right thing right. Virtuousness defines big-data in several different ways. By its absence. By its slight yet irritating presence. Or by the embarrassing reminder of its modern-day irrelevance to the usage of data in many parts of the world. Think of virtuousness as being one of big data’s anti-patterns.
The rap
So that ends the brief rundown of the twelve defining characteristics of big-data.
To summarise: that which has passed before necessarily divulges both the upside and downside characteristics of big-data. I have reached out to you for this reason. I did not open up the kimono or push the plain brown envelope. I related the twelve V-words, and in no uncertain terms. In doing so, I chose to disclose the undisclosed. I exhibited the absence of essential essence. In doing this, I have opened up the entire field. Every discipline, profession, science, and art is subject to examination and questioning. They are also open to ridicule, especially ridicule. Welsh ridicule. The worst possible kind of ridicule.
I hope, above all, that this risk-fraught disclosure of the twelve holier-than-thou characteristics of big-data will pique your interest. Discover the amazing, fabulous and marvelous world of data and beyond.
Comments in February 2026
The Twelve Vs: Big Data’s Confession Booth. In the grand carnival of big data, every glittering promise has always carried its shadow. The algorithms were supposed to predict our futures, optimise our economies, and cure our inefficiencies. However, they have amplified bias, drowned signal in noise, eroded privacy, and occasionally just plain failed spectacularly. History doesn’t lie. What has come before shows the seductive upside of treating data as the new oil. It also reveals the brutal downside.
That’s why I’m here, not to flash corporate secrets from behind the curtain. The phrase “opening up the kimono” comes from old Silicon Valley. I am also not here to slip you some illicit dossier in a plain brown envelope. Instead, I’m laying out the twelve V-words in plain sight, no varnish, no euphemism.
By naming them, I am intentionally exposing what the industry prefers to keep undisclosed. There is no single essential core that justifies the trillion-dollar frenzy. I am flinging open the doors of the entire field. Its science, art, profession, and pretensions are subject to scrutiny. They are open to interrogation and, yes, especially ridicule.
And not just any ridicule. Welsh ridicule. The kind that arrives wrapped in dry wit and ancient grudges. It also comes with a linguistic precision sharp enough to draw blood without raising the voice. This is the sort of mockery that has survived centuries of being called outsiders, foreigners, and worse. It turns that very marginalization into a weapon of devastating understatement. This ridicule doesn’t shout. It simply states the obvious until the emperor realizes he’s naked. Then, it keeps stating it.
These twelve Vs aren’t the usual marketing gloss (Volume! Velocity! Variety! Value!). They form a more complete, more unflinching taxonomy. This taxonomy admits the circus is unvirtuous. The promises are overinflated. The emperors are underdressed. They force the conversation beyond cheerleading into something closer to honesty.
So here we stand: the data deluge laid bare, its grandeur and its grift on full display. Laugh if you must. Ridicule if you dare. But look closely. Because once the twelve Vs are spoken aloud, there’s no un-saying them. The field can never quite pretend the same innocence again.
(And if the Welsh contingent is listening: diolch yn fawr. Your brand of scorn remains, as ever, peerless.)