The AEP Field Change That Could Have Broken Everything — And the Scanner I Built to Prevent It

A field that should be a number is a string. Not in dev — dev is fine. In *production*. And it's been like that for months. Worse, there's no way to even ask AEP what depends on it. No dependency graph. No "find all references." No impact analysis. Just vibes and a prayer.
So naturally, I built one.
How We Got Here
The field — loyalty.tierPoints — was modeled as a number in our dev sandbox. Makes sense; points are numbers. In production, though, it was a string. Maybe it was a manual schema build. Maybe someone copied the field group and fat-fingered the type. Doesn't matter how — what matters is that the data was ingesting *just fine* because it matched the schema as defined. AEP wasn't silently coercing anything. The data *was* strings. That was the whole problem — the schema said string, the source sent strings, ingestion succeeded, and nobody questioned it.

Nobody noticed for months because:
1. Data ingestion was succeeding — the data matched the (wrong) schema perfectly
2. Segment builder doesn't flag when you compare a string field to a numeric literal
3. PQL expressions "worked" in dev (where the type was correct) and returned garbage in prod
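That third point is easy to reproduce, because PQL's string comparison is lexicographic, the same as Python's:

```python
# Lexicographic comparison: "5" sorts after "1", so the string "500"
# is "greater than" the string "1000" regardless of magnitude.
print("500" > "1000")              # True  — the comparison the segment actually ran
print(int("500") > int("1000"))    # False — the comparison everyone assumed
```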
"500" > "1000" evaluates to true in string comparison because "5" comes after "1" lexicographically. So our "High Value Members" segment was qualifying people with 500 points while excluding people with 1,000+. Backwards. For months.null in prod. You can't SUM strings. Traced it back to the schema, found the mismatch, and then had one of those "oh no" moments where you realize the fix might be worse than the bug.Why You Can't Just Fix It
Changing a field type in AEP isn't a migration — it's a nuke. You can't alter an existing field's type. Profile-enabled XDM schemas are append-only by design. So the "fix" is actually:
1. Create a *new* field with the correct type
2. Update every single thing that references the old field to point to the new one
3. Backfill historical data
4. Deprecate the old field
5. Try not to miss anything and blow up prod
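For what it's worth, step 1 is the only mechanically easy part: the Schema Registry accepts JSON Patch, so adding a replacement field to your tenant field group is a single `add` operation. A hedged sketch (the path nesting and the `tierPointsNum` name here are illustrative, not the real field group):

```json
[
  {
    "op": "add",
    "path": "/definitions/loyalty/properties/_tenant/properties/loyalty/properties/tierPointsNum",
    "value": { "title": "Tier Points (numeric)", "type": "number" }
  }
]
```

Everything after that, steps 2 through 5, is where you need the dependency map.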
Step 2 is where I started sweating. "Every single thing" includes segment definitions, computed attributes, dataset mappings, and any other schema built from the same field group.
And AEP's UI gives you... nothing. No search. No cross-reference. You're expected to manually click through every segment, every computed attribute, every dataset, and visually confirm whether it references your field.
For an enterprise platform where a single schema might back hundreds of segments across multiple sandboxes, that's not a plan. That's a liability.
Discovering What the APIs Could Do
Here's where it got interesting. I was initially resigned to the manual audit — open each segment, Ctrl+F the PQL, take notes in a spreadsheet. Then I started poking around the AEP API documentation, mostly out of curiosity, and realized the data I needed was *all there*. Adobe just never built a UI for cross-referencing it.
The first surprise was the Segmentation API (GET /data/core/ups/segment/definitions). I was expecting it to return segment metadata — name, status, maybe an ID. But it returns the full PQL expression. The actual query logic. Every segment, fully readable, fully searchable. I ran a quick curl just to see:

```json
{
  "id": "seg-123",
  "name": "High Value Loyalty Members",
  "expression": {
    "type": "PQL",
    "value": "loyalty.tierPoints > 5000 AND loyalty.status = 'active'"
  }
}
```

There it is. The field path, right in the PQL string. If I could pull every segment and regex across the expressions, I'd have my dependency map.
The Catalog API (GET /data/foundation/catalog/datasets) — same deal. Every dataset comes back with its schemaRef, so you can trace which datasets are bound to which schema. If I know the schema that contains my field, I know every dataset that could hold that data.

The Computed Attributes API (GET /data/core/ups/config/computedAttributes) — returns the expression definitions. Same PQL-style field references, same searchability.

The Schema Registry (GET /data/foundation/schemaregistry/tenant/schemas) — this one's the most thorough but also the slowest. You can list all tenant schemas, then fetch each one's full definition with Accept: application/vnd.adobe.xed-full+json; version=1. The full JSON tree contains every field path in the schema. String-match against it and you know exactly which schemas contain your field.

Four APIs. All documented. All returning exactly what I needed. Adobe had built the ingredients and never assembled the dish.
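To make that concrete, the Schema Registry call can be sketched as a plain GET with the right Accept header. This is a sketch, not the scanner's actual code: the header names are the standard Adobe I/O gateway set, and the token and IDs are placeholders you'd pull from your own developer project.

```python
BASE = "https://platform.adobe.io"

def schema_request(schema_id: str, token: str, api_key: str, org_id: str, sandbox: str):
    """Build the URL and headers for fetching a schema's full definition.
    schema_id is the URL-encoded $id (or meta:altId) of the tenant schema."""
    url = f"{BASE}/data/foundation/schemaregistry/tenant/schemas/{schema_id}"
    headers = {
        "Authorization": f"Bearer {token}",
        "x-api-key": api_key,        # client ID from your Adobe developer project
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox,
        # This Accept header is what expands the response into the full
        # JSON tree containing every field path:
        "Accept": "application/vnd.adobe.xed-full+json; version=1",
    }
    return url, headers

# Usage (hypothetical values):
#   url, headers = schema_request("abc123", token, api_key, org_id, "prod")
#   resp = requests.get(url, headers=headers)
```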
Building the Scanner
Once I knew the APIs had the data, the scanner was straightforward. Python, FastAPI, about 300 lines of actual logic.
The trickiest part was keeping false positives in check — person.age shouldn't match person.ageGroup:

```python
import re

def _field_matches(text: str, field_path: str) -> bool:
    # The lookahead rejects hits followed by another word char or a dot,
    # so a shorter field path never matches inside a longer one.
    pattern = re.escape(field_path) + r'(?![.\w])'
    return bool(re.search(pattern, text, re.IGNORECASE))
```

Word-boundary matching. Simple, but it eliminated every false positive I threw at it.
The segment scanner pulls every definition, follows _page.next links until exhausted, and runs the match against each PQL expression. For each hit, it returns the segment name, lifecycle status, evaluation type, and a snippet of the PQL so you can see *how* the field is used without having to open the UI.

Same approach for computed attributes — pull them all, match the expressions. Datasets get filtered by schema ID if you have one, or by name/description search as a fallback.
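The pagination loop is the only slightly fiddly piece. A minimal sketch, with the HTTP fetch injected as a callable so the logic is testable without a live sandbox; the `segments`/`_page` response shape follows the description above and is an assumption, not verified against every API version:

```python
def iter_segments(fetch):
    """Yield every segment definition, following _page.next until exhausted.
    `fetch` maps a URL path to a parsed JSON page (e.g. a thin wrapper
    around requests.get(...).json())."""
    url = "/data/core/ups/segment/definitions"
    while url:
        page = fetch(url)
        yield from page.get("segments", [])
        url = page.get("_page", {}).get("next")  # missing/None ends the loop
```

In the real scanner, each yielded definition's PQL expression is then run through the word-boundary matcher.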
One endpoint gives you the full picture:
```
GET /field-usage/scan?field_path=loyalty.tierPoints&sandbox=prod
```

```json
{
  "field_path": "loyalty.tierPoints",
  "sandbox": "prod",
  "segments": [
    {
      "id": "seg-123",
      "name": "High Value Loyalty Members",
      "status": "ACTIVE",
      "evaluation_type": "batch",
      "pql_snippet": "loyalty.tierPoints > 5000 AND ..."
    }
  ],
  "computed_attributes": [
    {
      "id": "ca-789",
      "name": "lifetimeTierPoints",
      "status": "ACTIVE",
      "expression_snippet": "SUM(loyalty.tierPoints, 'P365D')"
    }
  ],
  "datasets": [
    {
      "id": "ds-456",
      "name": "Loyalty Events",
      "schema_id": "https://ns.adobe.com/tenant/schemas/abc123"
    }
  ],
  "summary": {
    "segments": 1,
    "computed_attributes": 1,
    "datasets": 1,
    "total_references": 3
  }
}
```

Three references. Not zero ("it's probably fine"). Not unknown ("let's just hope"). Three, with names, IDs, and the exact expressions. Now you can update each one, verify it, and move on.
There are also individual endpoints per resource type — /field-usage/segments, /field-usage/computed, /field-usage/datasets, /field-usage/schemas. Faster than a full scan when you just want to spot-check.

What I Learned
The biggest one: nothing in AEP type-checks your PQL against the schema. tierPoints > 5000 where tierPoints is a string just... does string comparison. Your segment qualifies the wrong profiles and nothing tells you. This is, in my opinion, a genuine platform defect — but it's been this way long enough that I don't expect it to change.

The Takeaway
None of this required anything exotic: the whole scanner is ordinary Python on top of the requests library. The hardest part is getting the auth token sorted — the actual scanning is just paginated GETs with regex matching.

But here's the thing that still gets me: I shouldn't have had to build this. Every enterprise data platform — Snowflake, BigQuery, Databricks — has some form of lineage or impact analysis. AEP is a platform where a single field change can silently corrupt audience qualifications that drive millions of dollars in ad spend, and there's no built-in way to check what depends on what.
Until Adobe fixes that, at least now there's a scanner.
*If you've hit the same "what depends on this field?" wall in AEP, I'd genuinely love to hear how you dealt with it. And if you've found other gaps where the APIs expose data that the UI doesn't surface — that's usually where the best tooling opportunities are hiding. Drop it in the comments below.*