Nonsense on parade

Martyn Rhisiart Jones
Madrid, Tuesday 13th January 2026
This GigaOm report is a classic vendor-sponsored benchmark: Fivetran picked the competitors and shaped the test scope, and GigaOm (as explicitly disclosed) executed it “as-is” with “compatible configurations subject to judgment.” It is marketing material presented as independent research, and its structural weaknesses make the headline 77–95% cost-savings claim highly misleading for most real organisations.
1. It only measures ingestion compute, not true TCO.
The report repeatedly calls itself a “TCO report,” but it explicitly excludes:
- Fivetran’s own pricing (Monthly Active Rows, which is the same in both arms, so “fair,” but conveniently invisible)
- Most transformation / ELT compute (they avoided transformations entirely for “simplicity”)
- Query/analytics workload costs (beyond the tiny gold-layer merge)
- Governance, cataloguing, security, and permission management (which the appendix admits is “a total mess” with Iceberg + Lake Formation)
- Engineering time (the quoted customer says managing a lake without Fivetran would need 5–6 engineers, yet the report doesn’t quantify the value of Fivetran absorbing that)
- Egress, concurrency scaling, or cold-start penalties
They literally only compare incremental sync write cost + a small gold MERGE cost, extrapolated to 4 syncs/day × 365.
That’s the ingestion delta cost, not the total cost of ownership. Calling it TCO is marketing spin.
Real-world picture: many organisations spend 10–40% of their warehouse bill on ingestion, and the remaining 60–90% on queries, whether BI, ML training, or analysts hammering Snowflake/Databricks. Moving ingestion to a lake does not touch the dominant cost driver.
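To put rough numbers on this, here is a minimal back-of-envelope sketch in Python. Every figure in it is hypothetical (the report publishes its own measurements); it simply reproduces the report's extrapolation method and then places the result inside a bill where queries dominate:

```python
# Back-of-envelope sketch. All dollar figures are invented, not from the report.

# The report's methodology: per-sync ingestion cost, extrapolated to a year.
cost_per_sync = 1.50             # $ per incremental sync (hypothetical)
syncs_per_day = 4
annual_ingestion = cost_per_sync * syncs_per_day * 365
print(f"Extrapolated annual ingestion cost: ${annual_ingestion:,.0f}")

# Now place that line item inside a bill where queries dominate.
annual_bill = 100_000            # $ total warehouse spend (hypothetical)
ingestion_share = 0.25           # ingestion is 10-40% of spend; take 25%
ingestion_savings = 0.95         # the headline claim, applied to ingestion only

overall_savings = ingestion_share * ingestion_savings
print(f"Saving on the total bill: {overall_savings:.0%}")   # ~24%, not 95%
print(f"Dollars saved per year: ${annual_bill * overall_savings:,.0f}")
```

Even granting the headline 95% on ingestion, the total bill moves by about a quarter, because the other three quarters of the spend was never in scope.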
2. The test design is engineered for the lake to “win” on paper.
Look at what they actually tested:
- Direct-to-warehouse = Fivetran ➡️ warehouse native load ➡️ warehouse runs expensive MERGE on its own compute
- Lake path = Fivetran ➡️ lake (Iceberg/Delta), with Fivetran absorbing the ingest compute (they literally say “we absorb the cost of ingest compute when writing to a data lake”)
Of course the lake path is cheaper: Fivetran is eating the write cost as part of its product pricing, and admits as much in its own blog posts: “Fivetran covers the costs of ingestion into the data lake, greatly reducing your TCO.” It’s like comparing:
- “Buy a Ferrari and pay for gas yourself”
- “Buy a Ferrari, but we give you free gas for the first 10,000 miles”
The second looks cheaper… until you look at the car price (Fivetran MAR).
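A toy cost model makes the trick explicit. All numbers below are invented, not taken from the report; the point is the structure: the vendor's absorbed compute and the MAR fee both sit outside the comparison.

```python
# Toy model of the two test arms. All dollar figures are invented.

fivetran_mar_fee = 2_000   # $/month MAR pricing: identical in both arms,
                           # so excluded from the comparison and the headline.

# Arm 1: direct-to-warehouse. The customer pays native load + MERGE compute.
warehouse_load = 300       # $/month warehouse compute for ingest writes
warehouse_merge = 500      # $/month warehouse compute for MERGEs
arm_warehouse = warehouse_load + warehouse_merge

# Arm 2: managed lake. Fivetran "absorbs" the ingest write compute, so the
# customer's visible bill is just the small gold-layer MERGE.
absorbed_by_vendor = 300   # still real compute, just not on the invoice
gold_merge = 80            # $/month
arm_lake = gold_merge

savings = 1 - arm_lake / arm_warehouse
print(f"'Savings' on the visible slice: {savings:.0%}")        # ~90%
print(f"Absorbed by the vendor: ${absorbed_by_vendor}/month")
print(f"Invisible in both columns: ${fivetran_mar_fee}/month MAR fee")
```

Restructure who pays for the gas and the percentage can be whatever you want it to be.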
The test also uses tiny warehouse sizes: an XS Snowflake warehouse (1 credit/h, roughly $2–3/h), a 2X-Small Databricks warehouse, and 8-RPU Redshift Serverless. These are undersized for the real 3 TB TPC-DS dataset and frequent merges, which artificially inflates per-sync warehouse costs.
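A quick hypothetical illustration of why this matters: on an undersized warehouse, a heavy merge spills and queues, so runtime grows faster than the hourly rate shrinks, and the “cheap” size ends up costing more per sync. The rates and timings below are invented for illustration, not measured:

```python
# Hypothetical per-sync cost on an undersized vs. a right-sized warehouse.
# Rates and durations are invented; the shape of the effect is the point.

sizes = {
    "XS (undersized, spilling)": (3.0, 50),   # ($/hour, minutes per sync)
    "M  (right-sized)":          (12.0, 8),
}

for name, (hourly_rate, minutes) in sizes.items():
    per_sync_cost = hourly_rate * minutes / 60
    print(f"{name}: ${per_sync_cost:.2f} per sync")
# XS: $2.50 per sync vs M: $1.60 per sync.
# The small warehouse is cheaper per hour and more expensive per sync.
```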
3. Performance is downplayed, but matters more than they admit
- 8–10% slower incremental syncs → sounds minor
- But that’s just sync duration, not end-to-end freshness
- In real pipelines, downstream Spark/Dremio/Trino jobs on Iceberg can be significantly slower than native warehouse tables for many workloads (lack of automatic clustering, materialised views, etc.)
- The “gold merge” cost covers only the warehouse MERGE after lake staging; keep everything in the lake and you instead pay query compute on Databricks/Snowflake external tables, which is usually more expensive per TB scanned than native tables.
4. The “customer quotes” are cherry-picked anecdotes.
The $100k/year Snowflake savings quote is nice, but it comes from one unnamed company, with no before/after numbers, and it likely includes heavy Fivetran usage. Another quote admits Iceberg permissions are “a total mess”, a huge real-world pain point the report glosses over.
If you already pay for Fivetran (or plan to), routing through Fivetran Managed Data Lake can save meaningful money, especially if your biggest pain is ingestion compute on Snowflake/Databricks/Redshift and you can tolerate slightly slower syncs and querying through external tables. That can save 20–50% of ingestion costs in many cases, not the headline 95%. But 95% overall TCO savings? Laughable. That’s only possible if your warehouse is used almost exclusively for ingestion, which almost no serious analytics org does.
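Run the headline backwards and the absurdity is plain. If overall savings = ingestion share × ingestion savings, then a 95% overall saving requires ingestion to be at least 95% of your entire bill:

```python
# Simple algebra: overall_savings = ingestion_share * ingestion_savings,
# so: ingestion_share = overall_savings / ingestion_savings.

headline_overall = 0.95
for ingestion_savings in (1.00, 0.95, 0.80):
    required_share = headline_overall / ingestion_savings
    print(f"With {ingestion_savings:.0%} ingestion savings, ingestion must be "
          f"{required_share:.0%} of the total bill")
# 95%, 100%, 119% - the last is impossible, and the first two describe
# a warehouse that does almost nothing but ingest.
```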
This is standard vendor-sponsored benchmark behaviour: a narrow scope in which the vendor absorbs key costs, a headline grab with big percentages, and the caveats buried in the disclaimer (“TCO is only one criterion… changes over time…”).
Modern data lakes and open formats are great, and Fivetran’s managed service looks solid for automating Iceberg/Delta landing. But this report is not independent proof that data lakes cost less than warehouses; it’s a sales deck with tables. Use it as a signal to test the Fivetran lake path yourself on a subset of tables, not as gospel. Real TCO decisions need your query patterns, concurrency, transformation volume, and team velocity factored in.
Many thanks for reading.