
The AEP Field Change That Could Have Broken Everything — And the Scanner I Built to Prevent It

April 11, 2026
Let me set the scene. You're staring at an XDM schema in Adobe Experience Platform. A field that should be a number is a string. Not in dev — dev is fine. In *production*. And it's been like that for months.

You know you need to fix it. You also know that AEP gives you exactly zero tools to answer the only question that matters: "What breaks if I touch this?"

No dependency graph. No "find all references." No impact analysis. Just vibes and a prayer.

So naturally, I built one.

How We Got Here

Here's the scenario. A field — let's call it loyalty.tierPoints — was modeled as a number in our dev sandbox. Alright; makes sense. Points are numbers.

Somewhere along the way, the same field got created as a string in production. Maybe it was a manual schema build. Maybe someone copied the field group and fat-fingered the type. Doesn't matter how — what matters is that the data was ingesting *just fine* because it matched the schema as defined. AEP wasn't silently coercing anything. The data *was* strings. That was the whole problem — the schema said string, the source sent strings, ingestion succeeded, and nobody questioned it.

Nobody noticed for months because:

1. Data ingestion was succeeding — the data matched the (wrong) schema perfectly

2. Segment builder doesn't flag when you compare a string field to a numeric literal

3. PQL expressions "worked" in dev (where the type was correct) and returned garbage in prod

That last one is the killer. "500" > "1000" evaluates to true in string comparison because "5" comes after "1" lexicographically. So our "High Value Members" segment was qualifying people with 500 points while excluding people with 1,000+. Backwards. For months.
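If the inversion sounds implausible, it's trivial to reproduce. A minimal Python sketch of the same lexicographic trap:

```python
# String comparison is character by character, and "5" sorts after "1",
# so "500" compares as greater than "1000" when both are strings.
print("500" > "1000")            # True  -- the backwards segment logic
print(500 > 1000)                # False -- what the PQL was meant to do
print(int("500") > int("1000"))  # False -- correct once the type is fixed
```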

We caught it because a computed attribute that summed tier points started returning null in prod. You can't SUM strings. Traced it back to the schema, found the mismatch, and then had one of those "oh no" moments where you realize the fix might be worse than the bug.

Why You Can't Just Fix It

Changing a field type in AEP isn't a migration — it's a nuke. You can't alter an existing field's type. Profile-enabled XDM schemas are append-only by design. So the "fix" is actually:

1. Create a *new* field with the correct type

2. Update every single thing that references the old field to point to the new one

3. Backfill historical data

4. Deprecate the old field

5. Try not to miss anything and blow up prod

Step 2 is where I started sweating. "Every single thing" includes:

- Segment definitions — PQL expressions filtering on the field
- Computed attributes — aggregations reading from it
- Datasets — anything ingesting into this schema
- Dataflows — source connectors mapping to this field
- Destinations — activations exporting it to downstream systems
- CJA data views — derived fields and dimensions built on top
- Launch/Web SDK — data elements mapping to the XDM path

And AEP's UI gives you... nothing. No search. No cross-reference. You're expected to manually click through every segment, every computed attribute, every dataset, and visually confirm whether it references your field.

For an enterprise platform where a single schema might back hundreds of segments across multiple sandboxes, that's not a plan. That's a liability.

Discovering What the APIs Could Do

Here's where it got interesting. I was initially resigned to the manual audit — open each segment, Ctrl+F the PQL, take notes in a spreadsheet. Then I started poking around the AEP API documentation, mostly out of curiosity, and realized the data I needed was *all there*. Adobe just never built a UI for cross-referencing it.

The breakthrough was the **Segmentation API** (/data/core/ups/segment/definitions). I was expecting it to return segment metadata — name, status, maybe an ID. But it returns the full PQL expression. The actual query logic. Every segment, fully readable, fully searchable.

I pulled one down with curl just to see:

```json
{
  "id": "seg-123",
  "name": "High Value Loyalty Members",
  "expression": {
    "type": "PQL",
    "value": "loyalty.tierPoints > 5000 AND loyalty.status = 'active'"
  }
}
```

There it is. The field path, right in the PQL string. If I could pull every segment and regex across the expressions, I'd have my dependency map.

Then I checked the **Catalog API** (/data/foundation/catalog/datasets) — same deal. Every dataset comes back with its schemaRef, so you can trace which datasets are bound to which schema. If I know the schema that contains my field, I know every dataset that could hold that data.
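That schema-to-dataset tracing is one filter. A sketch, assuming the Catalog response has been flattened into a list of dataset dicts (the real API keys the response by dataset ID; datasets_for_schema is my name, not Adobe's):

```python
def datasets_for_schema(datasets: list[dict], schema_id: str) -> list[dict]:
    """Keep only the datasets whose schemaRef binds them to schema_id.

    Assumes the Catalog API response (a dict keyed by dataset ID) has
    already been flattened into a list with the IDs folded in.
    """
    return [
        ds for ds in datasets
        if ds.get("schemaRef", {}).get("id") == schema_id
    ]
```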

**Computed Attributes API** (/data/core/ups/config/computedAttributes) — returns the expression definitions. Same PQL-style field references, same searchability.

**Schema Registry** (/data/foundation/schemaregistry/tenant/schemas) — this one's the most thorough but also the slowest. You can list all tenant schemas, then fetch each one's full definition with Accept: application/vnd.adobe.xed-full+json; version=1. The full JSON tree contains every field path in the schema. String match against it and you know exactly which schemas contain your field.
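Once you have a schema's resolved JSON (the xed-full representation inlines the field groups), collecting every dotted field path is a short recursive walk. This is a simplified sketch, and field_paths is my helper name: real XDM trees also contain array and map types that it doesn't descend into.

```python
def field_paths(schema_node: dict, prefix: str = "") -> list[str]:
    """Collect dotted field paths (e.g. 'loyalty.tierPoints') from a
    resolved XDM schema's properties tree. Simplified: array- and
    map-typed fields are not descended into."""
    paths = []
    for name, spec in schema_node.get("properties", {}).items():
        path = f"{prefix}.{name}" if prefix else name
        paths.append(path)
        if spec.get("type") == "object":
            paths.extend(field_paths(spec, path))
    return paths
```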

Four APIs. All documented. All returning exactly what I needed. Adobe had built the ingredients and never assembled the dish.

Building the Scanner

Once I knew the APIs had the data, the scanner was straightforward. Python, FastAPI, about 300 lines of actual logic.

The core matching function needed to be smarter than a raw in check — person.age shouldn't match person.ageGroup:

```python
def _field_matches(text: str, field_path: str) -> bool:
    pattern = re.escape(field_path) + r'(?![.\w])'
    return bool(re.search(pattern, text, re.IGNORECASE))
```

Word-boundary matching. Simple, but it eliminated every false positive I threw at it.

The segment scanner paginates through every definition (AEP caps at 100 per request), follows the _page.next links until exhausted, and runs the match against each PQL expression. For each hit, it returns the segment name, lifecycle status, evaluation type, and a snippet of the PQL so you can see *how* the field is used without having to open the UI.
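The pagination loop can be sketched with the HTTP call injected as a callable, so the paging logic itself needs no network access. Names here are mine, and the _page.next shape follows the description above rather than a confirmed API contract:

```python
def iter_segments(fetch_page, url="/data/core/ups/segment/definitions?limit=100"):
    """Yield every segment definition, following _page.next until
    exhausted. fetch_page is any callable url -> parsed JSON dict,
    e.g. a thin wrapper over requests.get with the auth headers."""
    while url:
        page = fetch_page(url)
        yield from page.get("segments", [])
        url = page.get("_page", {}).get("next")
```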

Same approach for computed attributes — pull them all, match the expressions. Datasets get filtered by schema ID if you have one, or by name/description search as a fallback.

One endpoint gives you the full picture:

```
GET /field-usage/scan?field_path=loyalty.tierPoints&sandbox=prod
```

```json
{
  "field_path": "loyalty.tierPoints",
  "sandbox": "prod",
  "segments": [
    {
      "id": "seg-123",
      "name": "High Value Loyalty Members",
      "status": "ACTIVE",
      "evaluation_type": "batch",
      "pql_snippet": "loyalty.tierPoints > 5000 AND ..."
    }
  ],
  "computed_attributes": [
    {
      "id": "ca-789",
      "name": "lifetimeTierPoints",
      "status": "ACTIVE",
      "expression_snippet": "SUM(loyalty.tierPoints, 'P365D')"
    }
  ],
  "datasets": [
    {
      "id": "ds-456",
      "name": "Loyalty Events",
      "schema_id": "https://ns.adobe.com/tenant/schemas/abc123"
    }
  ],
  "summary": {
    "segments": 1,
    "computed_attributes": 1,
    "datasets": 1,
    "total_references": 3
  }
}
```

Three references. Not zero ("it's probably fine"). Not unknown ("let's just hope"). Three, with names, IDs, and the exact expressions. Now you can update each one, verify it, and move on.

There are also focused endpoints if you only need to check one asset type — /field-usage/segments, /field-usage/computed, /field-usage/datasets, /field-usage/schemas. Faster than a full scan when you just want to spot-check.

What I Learned

**Type mismatches between sandboxes are a ticking time bomb.** AEP doesn't enforce type consistency across sandboxes. Sandbox promotion tools don't catch it either. If you have dev/stage/prod sandboxes, go audit your critical fields right now. I'm serious. Open both schemas side by side and compare the types on any field that's used in a computation or comparison. You might not like what you find.
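That side-by-side audit is scriptable too. Assuming you've extracted a {field_path: type} map from each sandbox's schema JSON, the diff is one comprehension (type_mismatches is a hypothetical helper, not part of the scanner):

```python
def type_mismatches(dev: dict, prod: dict) -> list[tuple[str, str, str]]:
    """Report (field_path, dev_type, prod_type) for every field that
    exists in both sandboxes with disagreeing types."""
    return sorted(
        (path, dev[path], prod[path])
        for path in dev.keys() & prod.keys()
        if dev[path] != prod[path]
    )
```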

**PQL silently does the wrong thing with wrong types.** There's no warning, no error, no log entry. tierPoints > 5000 where tierPoints is a string just... does string comparison. Your segment qualifies the wrong profiles and nothing tells you. This is, in my opinion, a genuine platform defect — but it's been this way long enough that I don't expect it to change.

**The AEP APIs are better than the UI.** That's not a compliment to the APIs — it's an indictment of the UI. The Segmentation API returns full PQL. The Schema Registry returns full definitions. The Catalog API returns schema bindings. All the data is there for cross-referencing. Adobe just hasn't connected the dots in the product.

**This should be a native feature.** I submitted a feature request on the Experience League Ideas community: built-in field-level impact analysis. Before any schema change, you should be able to right-click a field in the schema editor and see "Used by: 3 segments, 1 computed attribute, 5 datasets." The APIs already support it. Someone just needs to build the UI. Adobe, if you're reading this — please. This would save every AEP customer hours of anxiety per schema change.

The Takeaway

The scanner is about 300 lines of Python. It needs an Adobe I/O service account with AEP access and the requests library. The hardest part is getting the auth token sorted — the actual scanning is just paginated GETs with regex matching.

But here's the thing that still gets me: I shouldn't have had to build this. Every enterprise data platform — Snowflake, BigQuery, Databricks — has some form of lineage or impact analysis. AEP is a platform where a single field change can silently corrupt audience qualifications that drive millions of dollars in ad spend, and there's no built-in way to check what depends on what.

Until Adobe fixes that, at least now there's a scanner.


*If you've hit the same "what depends on this field?" wall in AEP, I'd genuinely love to hear how you dealt with it. And if you've found other gaps where the APIs expose data that the UI doesn't surface — that's usually where the best tooling opportunities are hiding. Drop it in the comments below.*

