The Hidden Tax on Your Customer Data

November 27, 2025

When Adobe announced the end-of-life for Data Workbench, I got calls from three different clients within 48 hours. Not because they were surprised—the writing had been on the wall ever since Adobe purchased Omniture—but because they were panicked. These are companies in the Telecom, Insurance and Financial Services industries.

"Where do we go now?" one VP of Data Strategy asked. "Everything is cloud-only. We literally cannot put patient data in someone else's infrastructure."

He wasn't wrong, and he wasn't alone. At that point in my career I didn't yet understand the broader strategy beyond Adobe's suite of products, so there was little comfort I could offer. Those clients kept using Data Workbench until end-of-life support expired. Some stayed with Adobe and became flagship Real-Time CDP customers; others faded out of my inbox.


The Quiet Erosion

Somewhere in the last decade, "cloud-first" became "cloud-only." And nobody noticed what we gave up in the trade.

I've watched this pattern play out dozens of times, to a greater or lesser extent, across the SaaS products I've designed and implemented:

Year 1: Sign the contract. Pricing looks reasonable. Implementation goes smoothly. Everyone's optimistic.

Year 2: Data volumes grow. More sources connected. New use cases emerge. The bill climbs, but it's manageable.

Year 3: That "reasonable" $75K annual contract is now $300K. Data egress fees alone are six figures. Finance is asking questions. And you're so deeply integrated that migration would cost more than staying.

This isn't a bug. It's the model working exactly as designed.
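
To put a number on that jump: going from $75K in Year 1 to $300K in Year 3 spans two renewal cycles, which works out to the bill doubling every year. A minimal sketch of that arithmetic, using only the figures above (the function name is mine, purely for illustration):

```python
def implied_annual_growth(start_cost: float, end_cost: float, years: int) -> float:
    """Compound annual growth rate that takes start_cost to end_cost in `years` steps."""
    if start_cost <= 0 or years <= 0:
        raise ValueError("start_cost and years must be positive")
    return (end_cost / start_cost) ** (1 / years) - 1


# Year 1 -> Year 3 is two renewal cycles.
rate = implied_annual_growth(75_000, 300_000, years=2)
print(f"Implied annual growth: {rate:.0%}")  # prints "Implied annual growth: 100%"
```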


The Costs Nobody Mentions

License fees are the tip of the iceberg.

Data egress is the silent killer. Every time you pull your own data out for analysis, reporting, or integration, you pay. At scale, this can become your largest line item.

Compliance overhead multiplies. Every security review, every SOC 2 audit, every HIPAA assessment now includes your cloud vendors. That's not just money—it's calendar time and internal resources you can't get back.

Schema rigidity is hard to quantify, but it's very real. When your CDP forces you into its data model, you're not just inconvenienced; you're losing insights that don't fit its predetermined structure. The same goes for running out of custom variables in your analytics solution: you either pay for more or repurpose an existing variable that may still carry contaminated data from its previous use.

Talent dependency grows silently. Your team becomes experts in one vendor's way of doing things, not in analytics fundamentals.


The Guardrails You Didn't Sign Up For

Here's what the glossy sales deck doesn't mention: every cloud platform comes with guardrails. Some are reasonable. Many are not.

Identity limits. Your CDP caps the number of identity namespaces you can define. You've got cookie IDs, device IDs, CRM IDs, loyalty IDs, email hashes, phone hashes—and suddenly you're negotiating with your account team for permission to add another identifier type. Your business complexity shouldn't require a contract amendment.

Event quotas. You're paying per event, but there's also a ceiling. Exceed your contracted volume and you're either throttled, paying overage penalties, or scrambling for an emergency contract renegotiation mid-quarter.

Retention walls. Raw event data? Gone after 30 days. Or 90, if you're lucky. Want to run an analysis against last year's behavioral data? Too bad—you should have thought of that when you designed your summary tables. Or you're exporting to a data warehouse and paying egress on your own data.

Processing windows. "Real-time" often means "within the hour." Batch processing jobs run overnight. You're building dashboards that show yesterday's story, not today's.

Schema lockdown. Need to add a field? Open a support ticket. Want to restructure your event taxonomy? That's a professional services engagement. The platform owns your data model, not you.

Destination restrictions. You can activate to *their* partner ecosystem. Want to send data somewhere they haven't pre-integrated? Build it yourself—and hope their API rate limits don't throttle your use case.

These guardrails exist because multi-tenant cloud infrastructure requires constraints. Fair enough. But somewhere along the way, we accepted these constraints as normal instead of recognizing them as trade-offs.

They're not requirements. They're choices. And there are other choices available.
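
As one illustration of the identity-limits point above: in an identity graph, a namespace can be nothing more than a string label, so adding another identifier type is a data change rather than a structural one. Here is a minimal single-machine sketch using union-find. It does deterministic matching only, the class and method names are hypothetical, and it is not any vendor's API or a description of how a production CDP works.

```python
class IdentityGraph:
    """Union-find over (namespace, value) identifiers.

    Namespaces are arbitrary strings, so no fixed list of
    identifier types is baked into the structure.
    """

    def __init__(self) -> None:
        self._parent: dict[tuple[str, str], tuple[str, str]] = {}

    def _find(self, node: tuple[str, str]) -> tuple[str, str]:
        # Register unseen identifiers as their own root.
        self._parent.setdefault(node, node)
        # Walk up to the root, shortening the path as we go (path halving).
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def link(self, a: tuple[str, str], b: tuple[str, str]) -> None:
        """Record that two identifiers were observed together."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a: tuple[str, str], b: tuple[str, str]) -> bool:
        return self._find(a) == self._find(b)


graph = IdentityGraph()
graph.link(("cookie_id", "c-123"), ("crm_id", "A-9"))
graph.link(("crm_id", "A-9"), ("loyalty_id", "L-77"))
graph.link(("email_hash", "e1"), ("phone_hash", "p1"))

print(graph.same_person(("cookie_id", "c-123"), ("loyalty_id", "L-77")))
print(graph.same_person(("cookie_id", "c-123"), ("email_hash", "e1")))
```

The cookie and loyalty IDs end up in the same profile because both were linked to the same CRM ID; the email and phone hashes form a separate profile. What a cap on namespaces buys a multi-tenant platform is operational predictability, which is exactly the kind of trade-off described above.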


Who Actually Needs Another Way?

Certainly, not everyone. But the list is longer than the industry or the folks on Measure Slack would readily admit.

Healthcare organizations dealing with PHI. Financial services firms handling trading data or behavioral data from their digital platforms. Government contractors with controlled or classified information. Any company with strict data residency requirements. And, increasingly, organizations that have simply done the math and realized that at their scale, owning infrastructure is cheaper than renting it in the long run, especially if their business is on a slow but steady growth trajectory.

For these organizations, the problem isn't that on-premise analytics doesn't exist. It's that the existing legacy solutions are struggling to keep up, and there are no modern, purpose-built solutions designed specifically for on-premise deployment.

If you want on-prem today, your options are legacy tools approaching end-of-life, open-source frameworks requiring significant engineering, or enterprise platforms that technically support on-prem but clearly prioritize cloud.

None of these give you what cloud CDPs have normalized: real-time collection, flexible schemas, cross-channel identity, and visual analytics—all in one platform.


A Better Way Forward

What if you didn't have to choose between modern capabilities and infrastructure control?

The technology exists. The architectural patterns are well-understood. Horizontal scaling, schema-on-read, probabilistic identity matching, real-time stream processing—none of this requires a multi-tenant cloud to function. These are engineering problems, not business model requirements.
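
Of the patterns listed above, schema-on-read is the easiest to show in a few lines: events are stored raw, and a schema is applied only when someone queries them, so two teams can read the same events through different models. A minimal sketch, assuming JSON-lines storage. The event fields, the schema format, and the function name are all invented for illustration and are not taken from any product.

```python
import json
from typing import Any, Callable

# Raw events are kept exactly as collected; nothing is enforced on write.
RAW_EVENTS = [
    '{"ts": "2025-11-01T10:00:00Z", "user": "u1", "page": "/pricing"}',
    '{"ts": "2025-11-01T10:05:00Z", "user": "u2", "plan": "pro", "amount": "49.00"}',
]

# A schema maps an output field to (source key, caster, default value).
Schema = dict[str, tuple[str, Callable[[Any], Any], Any]]


def read_with_schema(raw_lines: list[str], schema: Schema) -> list[dict[str, Any]]:
    """Project raw JSON events through a schema at read time."""
    rows = []
    for line in raw_lines:
        event = json.loads(line)
        row = {}
        for field, (key, cast, default) in schema.items():
            # Missing keys fall back to the default instead of failing the read.
            row[field] = cast(event[key]) if key in event else default
        rows.append(row)
    return rows


# Two different views over the same stored events.
page_views: Schema = {"user_id": ("user", str, None), "page": ("page", str, None)}
revenue: Schema = {"user_id": ("user", str, None), "amount": ("amount", float, 0.0)}

print(read_with_schema(RAW_EVENTS, page_views))
print(read_with_schema(RAW_EVENTS, revenue))
```

Adding a field later means adding an entry to a schema, not rewriting what is already stored. The cost is that validation and type errors surface at query time rather than at ingestion.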

The reason on-premise solutions feel dated isn't a technical limitation. It's an investment priority. When venture capital flows toward recurring revenue and cloud-first architectures, that's what gets built. On-premise becomes a legacy checkbox, not a first-class citizen.

But for the organizations I've worked with—the ones managing PHI, the ones with data sovereignty mandates, the ones tired of unpredictable invoices—there's a different value equation. One where:

- You define the schema. Your data model reflects your business, not someone else's assumptions about what fields you need.
- You control the identities. Add as many identifier types as your business requires. No caps. No negotiations.
- You own the retention. Keep raw events forever if you want. Run analyses against data from five years ago. Your storage, your rules.
- You set the processing cadence. Real-time means real-time. Watch the numbers move as events flow through.
- You choose the destinations. Activate to anywhere. No partner ecosystem restrictions. No API rate limits you didn't set yourself.

This isn't theoretical. These aren't features on a roadmap. This is what I've been building.


Something Different

I've spent the last year building what I wished existed for my clients 10 years ago.

It takes a different approach:

User-defined schemas instead of forcing your data into someone else's model. You know your business. Build the data structures that reflect it.

Modular architecture where each component scales independently. Need more processing? Add nodes. Need more collection capacity? Add sensors. No rearchitecting required.

One-time licensing instead of per-event pricing. Your costs don't balloon because your business is growing. If you need more processing power, you can either expand your infrastructure on your own schedule or increase your server license count.
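
Whether that trade pays off depends on your numbers, so it is worth running them. The sketch below compares cumulative spend under a growing subscription against a one-time license plus yearly infrastructure cost. The subscription figures reuse the $75K-doubling example from earlier in this post; the license fee and infrastructure cost are made-up placeholders, not actual pricing for anything.

```python
from typing import Optional


def breakeven_year(
    saas_year1: float,
    saas_growth: float,
    license_fee: float,
    infra_per_year: float,
    horizon: int = 10,
) -> Optional[int]:
    """First year in which cumulative SaaS spend exceeds cumulative
    self-hosted spend, or None if that never happens within `horizon` years."""
    saas_total = 0.0
    owned_total = license_fee  # paid once, up front
    saas_bill = saas_year1
    for year in range(1, horizon + 1):
        saas_total += saas_bill
        owned_total += infra_per_year
        if saas_total > owned_total:
            return year
        saas_bill *= 1 + saas_growth  # next year's subscription bill
    return None


# Hypothetical inputs: $250K license and $60K/year infrastructure are placeholders.
fast_growth = breakeven_year(75_000, saas_growth=1.0, license_fee=250_000, infra_per_year=60_000)
flat_usage = breakeven_year(75_000, saas_growth=0.0, license_fee=250_000, infra_per_year=60_000)
print(fast_growth)  # 3
print(flat_usage)   # None
```

With the bill doubling yearly, the subscription overtakes ownership in Year 3 under these placeholder costs. With flat usage it never does within ten years, which is the same point made below about small teams: if your volume isn't growing, renting can stay the cheaper option.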

Zero data egress because your data never leaves your infrastructure in the first place.

Here's what might surprise you: this isn't an anti-cloud position. If your organization is already invested in AWS, Azure, or GCP—and you're comfortable managing your own infrastructure there—nothing stops you from deploying this on cloud VMs. The architecture is infrastructure-agnostic. It runs on bare metal, virtualized data centers, private cloud, public cloud, hybrid configurations or a spare Mac Mini sitting on your desk. The distinction isn't where the servers live. It's who controls them. When you deploy on your own cloud account, you own the data residency decisions, you control the scaling economics, and you're not paying a vendor margin on top of your compute costs. That's a fundamentally different proposition than handing your customer data to a multi-tenant SaaS platform.


Is This For You?

Maybe? Maybe not...

If you're a small marketing team doing basic web analytics, a cloud solution is probably simpler and more cost-effective. I always tell my customers that there's a tool for every job, and I'm not here to create complexity or to solve a problem you don't have. If Google Analytics works for your needs, I'd love to help you get more out of it!

But if you've felt the pain points I've described—if you've watched your analytics costs spiral, struggled with compliance requirements, or wished you could just define your own data model—then maybe we should talk.

I'm building this thing. It doesn't have a public name yet. But if you want to know when it does, I'm keeping a list.

No pitch deck. No sales call. Just your email, and I'll let you know when there's something to see. Or, if you'd rather talk about this or about another pain point in your current martech stack, drop a comment below.

