RetailNorm isn't a formula. It's a versioned, continuously calibrated measurement infrastructure that normalizes attribution across retail media networks — deterministically, auditably, and under strict governance.
When a media planner runs campaigns across Amazon Ads, Walmart Connect, and Criteo simultaneously, they get three separate performance reports. Each report claims a different ROAS. But those numbers cannot be compared side by side — because they were measured differently.
The differences aren't minor rounding errors. They're structural. Each platform uses a different combination of attribution window (how long after a click a purchase still counts as a conversion), attribution model (which touchpoint gets the credit), and conversion basis (clicks only, or clicks plus views). These dimensions interact multiplicatively — and they shift over time as platforms quietly update their methodology.
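To make the three dimensions concrete, here is a minimal sketch of per-platform measurement profiles as structured data. The Amazon and Walmart values echo figures cited elsewhere in this article (Amazon's 14-day last-click default, Walmart's 30-day view-inclusive window); the exact Criteo window and all field names are assumptions for illustration.

```python
# A minimal sketch, not RetailNorm's actual profile schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class MeasurementProfile:
    window_days: int   # attribution window
    model: str         # which touchpoint gets the credit
    basis: str         # "click" or "click+view"

# Hypothetical profiles, loosely based on figures cited in this article.
PROFILES = {
    "amazon_ads":      MeasurementProfile(14, "last-click",  "click"),
    "walmart_connect": MeasurementProfile(30, "last-click",  "click+view"),
    "criteo":          MeasurementProfile(7,  "first-click", "click"),
}

def comparable(a: str, b: str) -> bool:
    """Raw ROAS figures are only comparable if every dimension matches."""
    return PROFILES[a] == PROFILES[b]

print(comparable("amazon_ads", "walmart_connect"))  # False: window and basis differ
```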
Take a concrete example. Walmart reports 6.8×, and on the surface it looks like the clear winner. But its 30-day window captures conversions that happened weeks after the ad — many of which would have happened organically. And it includes view-through conversions that the other platforms don't count at all.
The question isn't "which platform reports the highest ROAS." It's "which platform actually drives the most revenue per dollar, when measured on the same terms?"
When measured on the same terms, the ranking changes. Walmart's advantage was partly a measurement artifact. Criteo was systematically penalized by its shorter window and first-click model. The real performance picture only becomes visible after correction.
RetailNorm's normalization operates across multiple correction dimensions. Each dimension targets a specific structural difference in how a platform measures conversions. The corrections are applied as independent, composable layers — each versioned, each auditable, each calibrated to the specific platform and product vertical.
The entire correction engine runs server-side behind an authenticated API. No correction logic, calibration parameters, or proprietary factors are ever transmitted to the browser. The client application sends raw CSV data to the engine, receives normalized results, and renders the output — but never has access to the correction methodology itself. This architectural isolation protects proprietary research while ensuring clients benefit from continuously updated calibration.
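As an illustration of that contract, here is a minimal client-side sketch. The endpoint URL, field names, and response shape are all assumptions; only the division of labor (client sends raw CSV, server returns normalized results) comes from the description above.

```python
# A hypothetical client call; the real API surface isn't published here.
import requests

with open("walmart_connect_export.csv", "rb") as f:
    resp = requests.post(
        "https://api.retailnorm.example/v1/normalize",   # assumed endpoint
        headers={"Authorization": "Bearer <token>"},
        files={"report": f},
        data={"platform": "walmart_connect", "vertical": "grocery"},
    )
resp.raise_for_status()
normalized = resp.json()
# The response carries normalized figures plus version and confidence
# metadata, never the correction parameters themselves.
print(normalized["normalized_roas"], normalized["confidence_interval"])
```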
The architecture is deliberately not a single equation. It's a correction framework with interdependent calibration profiles that behave differently depending on the platform, the product category, the reporting period, and the engine version in use. The interactions between layers — how a window correction compounds with a model conversion in a specific vertical — are where the complexity lives, and where the accuracy comes from.
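As a rough mental model, the composable layers can be pictured as a function pipeline. The numbering below (L1 window, L2 model conversion, L3 vertical calibration) is an assumption chosen to match the layer labels used just below, and the layers are reduced to scalar multipliers purely for readability; as the paragraph above stresses, the real interactions are richer than simple multiplication.

```python
# A sketch of layered composition, with illustrative factors only.
from typing import Callable

CorrectionLayer = Callable[[float], float]

def window_correction(factor: float) -> CorrectionLayer:     # L1 (assumed label)
    return lambda roas: roas * factor

def model_conversion(factor: float) -> CorrectionLayer:      # L2 (assumed label)
    return lambda roas: roas * factor

def vertical_calibration(factor: float) -> CorrectionLayer:  # L3 (assumed label)
    return lambda roas: roas * factor

def compose(*layers: CorrectionLayer) -> CorrectionLayer:
    """Apply correction layers in order; each is independently versioned."""
    def pipeline(roas: float) -> float:
        for layer in layers:
            roas = layer(roas)
        return roas
    return pipeline

normalize = compose(window_correction(0.78),
                    model_conversion(0.95),
                    vertical_calibration(1.04))
print(round(normalize(6.8), 2))  # 6.8x raw -> 5.24x on the common baseline
```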
Layers L2 and L3 contain the core correction logic — proprietary calibration profiles built from aggregated campaign performance data, published attribution research, and platform methodology documentation. These profiles are the product of sustained research, not one-time calculation. They encode vertical-level behavioral patterns that can't be derived from a single client's data alone.
The distinction matters. A static multiplier might get you within range. A continuously calibrated, vertical-aware, version-controlled correction system gets you to a number you can defend in a client review — and reproduce six months later when someone asks how you arrived at it.
A naive approach to normalization would treat each correction as a fixed number — a static multiplier applied uniformly. That works as a rough approximation. It doesn't survive scrutiny from a technical buyer, and it doesn't hold accuracy across verticals or over time.
RetailNorm's calibration framework operates on three axes:
**Platform attribution profiles.** Each retail media network has a unique attribution methodology profile that encodes its specific window length, attribution model, conversion basis, and view-through inclusion rules. These profiles are built from platform documentation, API behavior analysis, and empirical validation against known benchmarks.
Profiles are not interchangeable. The structural correction for a 30-day last-click platform differs fundamentally from a 7-day first-click platform — not just in magnitude, but in the mathematical behavior of the correction itself.
**Vertical calibration.** Attribution behavior varies significantly by product category. A $5 impulse purchase and a $500 electronics product have fundamentally different conversion patterns — different decision timelines, different path lengths, different sensitivity to touchpoint position. A window correction that's accurate for grocery will systematically over- or under-correct for consumer electronics.
The engine maintains vertical-specific behavioral models calibrated from aggregated, anonymized campaign data across multiple agencies and categories. These models capture longitudinal conversion patterns that are invisible in any single client's data.
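A toy model shows why the same window correction can't serve every vertical. The sketch below assumes a simple exponential conversion-lag curve with hypothetical per-vertical median lags; RetailNorm's actual behavioral models are proprietary and certainly richer than this.

```python
# Illustrative only: why a 30-day -> 14-day window correction is vertical-specific.

# Hypothetical median conversion lags, in days.
MEDIAN_LAG_DAYS = {"grocery": 1.5, "consumer_electronics": 9.0}

def share_converted_by(day: float, vertical: str) -> float:
    """Fraction of ad-influenced conversions completed within `day` days,
    under an assumed exponential-decay lag model."""
    half_life = MEDIAN_LAG_DAYS[vertical]
    return 1 - 0.5 ** (day / half_life)

def window_correction(from_days: int, to_days: int, vertical: str) -> float:
    """Factor restating a `from_days`-window figure on a `to_days` basis."""
    return share_converted_by(to_days, vertical) / share_converted_by(from_days, vertical)

for vertical in MEDIAN_LAG_DAYS:
    print(vertical, round(window_correction(30, 14, vertical), 3))
# grocery ~0.998: nearly all conversions land inside 14 days anyway.
# consumer_electronics ~0.733: the long decision tail gets trimmed hard.
```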
**Temporal calibration.** Platforms change their attribution methodology over time — sometimes announced, sometimes not. Correction profiles that were accurate six months ago may introduce systematic error today. The engine tracks these changes and maintains temporal calibration, ensuring corrections applied to Q1 data use Q1-era profiles, not current ones.
This is the dimension most internal solutions fail to maintain. Getting the initial correction right is achievable. Keeping it right across methodology changes, seasonal effects, and platform-specific reporting updates requires sustained operational investment.
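Mechanically, temporal calibration amounts to resolving the profile version that was in effect on the report date. A minimal sketch, with hypothetical dates and version identifiers:

```python
# A sketch of profile-version resolution by effective date.
from bisect import bisect_right
from datetime import date

# (effective_from, profile_version) pairs, kept sorted by date.
PROFILE_HISTORY = {
    "walmart_connect": [
        (date(2024, 1, 1),  "wc-profile-v7"),
        (date(2024, 6, 15), "wc-profile-v8"),  # e.g. after a detected drift event
    ],
}

def profile_for(platform: str, report_date: date) -> str:
    """Return the calibration profile version in effect on `report_date`."""
    history = PROFILE_HISTORY[platform]
    idx = bisect_right([effective for effective, _ in history], report_date) - 1
    if idx < 0:
        raise ValueError(f"no {platform} profile covers {report_date}")
    return history[idx][1]

print(profile_for("walmart_connect", date(2024, 3, 10)))  # wc-profile-v7, not v8
```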
Retail media networks routinely adjust their attribution methodology. Sometimes with announcements. Sometimes with subtle changes to API output behavior, default window settings, or view-through inclusion rules that only surface when you're actively monitoring for them.
This creates a compounding problem for anyone relying on static correction factors. An accurate correction from six months ago may introduce systematic error today — not because the math was wrong, but because the platform moved under it.
| Drift type | Example | Impact if undetected |
|---|---|---|
| Window adjustment | Platform changes default window from 30 to 14 days | Correction over-corrects, applying a window reduction to data already measured on the shorter window |
| View-through rule change | Platform begins including post-view conversions in click reports | Normalized ROAS inflated due to uncorrected view-through inclusion |
| Model redefinition | Platform shifts from strict last-click to weighted multi-touch | Model conversion layer applies the correction in the wrong direction |
| Reporting format change | Column names, date formats, or metric definitions change silently | Parsing errors or misidentified fields produce incorrect normalization |
RetailNorm operates a continuous monitoring layer across all supported platforms. We track attribution documentation changes, API behavior changes, reporting format changes, and anomalous statistical patterns in normalized output that could indicate undocumented methodology shifts.
When drift is detected, the affected calibration profiles are flagged, reviewed, and — if necessary — recalibrated. Reports generated during the drift window can be retroactively re-normalized once the updated profile is available, maintaining data consistency across time.
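One of the simpler drift signals can be sketched directly: a rolling statistical check on normalized output, where a sudden level shift suggests the platform moved underneath the profile. The test, thresholds, and window sizes below are illustrative assumptions, not RetailNorm's production monitoring.

```python
# A toy drift signal on normalized ROAS output.
from statistics import mean, stdev

def drift_flag(history: list[float], recent: list[float],
               z_threshold: float = 3.0) -> bool:
    """Flag when recent normalized output drifts away from its own history,
    which can indicate an undocumented methodology change."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return mean(recent) != mu
    return abs(mean(recent) - mu) / sigma > z_threshold

history = [4.1, 4.3, 4.0, 4.2, 4.4, 4.1, 4.3]
recent = [5.6, 5.8, 5.7]  # sudden level shift, e.g. after a silent window change
print(drift_flag(history, recent))  # True: route the profile to recalibration review
```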
No correction system produces exact figures. The behavioral models that inform vertical calibration carry inherent estimation ranges. Platform methodology documentation can be ambiguous. Conversion patterns vary between individual campaigns within the same vertical.
Rather than pretending our normalized figures are precise to the decimal, we propagate uncertainty through every correction layer and produce confidence intervals alongside every normalized ROAS figure. This quantification is structurally separated from the correction logic — it observes the correction process, it doesn't influence it.
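The shape of that propagation can be sketched under a simplifying assumption: independent relative uncertainties per layer, combined in quadrature. The per-layer percentages below are invented for illustration; the real quantification layer is proprietary.

```python
# A sketch of uncertainty propagation to a confidence interval.
import math

def confidence_interval(normalized_roas: float,
                        layer_rel_errors: list[float],
                        z: float = 1.96) -> tuple[float, float]:
    """95% interval from per-layer relative uncertainties (0.03 = 3%),
    assuming independence between correction layers."""
    total_rel = math.sqrt(sum(e * e for e in layer_rel_errors))
    delta = z * total_rel * normalized_roas
    return (normalized_roas - delta, normalized_roas + delta)

# Hypothetical: window, model-conversion, and vertical layers contribute
# 4%, 3%, and 5% relative uncertainty respectively.
low, high = confidence_interval(4.2, [0.04, 0.03, 0.05])
print(f"4.2x, 95% CI [{low:.2f}, {high:.2f}]")  # roughly [3.62, 4.78]
```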
**Why this changes decision-making.** A normalized ROAS of 4.2× with high confidence is more actionable than 4.8× with medium confidence. When two platforms show similar normalized performance but different confidence levels, that asymmetry should influence allocation decisions — and it does, automatically, in the budget optimization layer.
This is also the mechanism that prevents over-reliance on corrections for platforms where we have less calibration depth. New platforms start with wider confidence intervals that narrow as calibration data accumulates. The system is honest about what it knows and what it's estimating.
The normalization engine is version-controlled at multiple levels: the overall engine version, individual platform calibration profiles, vertical behavioral models, and the correction layer interaction parameters. When a report is generated, it is locked to the specific versions in use at that moment.
This means three things for agencies:

- Reproducibility: rerunning a report against its locked versions yields the same normalized figures, months or years later.
- Auditability: every figure can be traced to the engine version, platform profile version, and vertical model version that produced it.
- Consistency over time: when a profile is recalibrated after drift, historical reports can be re-normalized under the new version, with the change documented in the version changelog.
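Concretely, the version lock described above can be pictured as metadata pinned to every generated report. Field names and version strings here are hypothetical; the guarantee, not the schema, is the point.

```python
# A sketch of per-report version pinning (hypothetical schema).
report_metadata = {
    "report_id": "rpt-0042",
    "engine_version": "2.3.1",
    "platform_profiles": {
        "amazon_ads": "az-profile-v12",
        "walmart_connect": "wc-profile-v8",
        "criteo": "cr-profile-v9",
    },
    "vertical_model": "grocery-v5",
    "generated_at": "2024-06-13T09:41:00Z",
}
# Re-running the report against these pinned versions reproduces the same
# figures; re-normalizing under newer profiles is an explicit, logged action.
```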
The traditional workflow ends at reporting. The planner presents ROAS numbers, the client asks where to increase budget, and the answer is usually based on intuition — or whichever platform reported the highest raw number.
With normalized data and confidence metadata, allocation becomes mathematical. Given a fixed total budget, a set of normalized performance metrics with confidence intervals, and vertical-specific diminishing return curves, the engine recommends the distribution that maximizes expected total return — risk-adjusted for correction uncertainty.
The optimizer estimates marginal returns per platform. An additional $1,000 on a platform approaching saturation yields less than the same $1,000 on an underinvested one. Diminishing return curves are derived from normalized historical data — meaning they reflect actual performance patterns, not raw platform-inflated metrics.
Confidence scores feed directly into the optimizer. A platform with high normalized ROAS but low confidence receives a risk-adjusted allocation — the system accounts for the possibility that the correction is less precise, preventing over-commitment to uncertain metrics.
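The allocation logic can be sketched as greedy marginal-return optimization. The saturation curve, platform figures, and the confidence multiplier used as a risk adjustment below are all illustrative assumptions, not the engine's actual curves:

```python
# A toy risk-adjusted allocator over normalized, confidence-weighted metrics.
import heapq

# (normalized_roas, confidence in [0, 1]); numbers are hypothetical.
PLATFORMS = {
    "amazon_ads": (4.8, 0.95),
    "walmart_connect": (4.5, 0.90),
    "criteo": (4.6, 0.70),  # strong number, thinner calibration depth
}
STEP = 1_000  # allocate in $1k increments

def marginal_return(platform: str, spent: int) -> float:
    roas, confidence = PLATFORMS[platform]
    saturation = 1.0 / (1.0 + spent / 50_000)  # assumed diminishing-return curve
    return roas * confidence * saturation       # risk-adjusted marginal ROAS

def allocate(total_budget: int) -> dict[str, int]:
    """Greedily give each $1k step to the platform with the best
    current risk-adjusted marginal return."""
    spend = {p: 0 for p in PLATFORMS}
    heap = [(-marginal_return(p, 0), p) for p in PLATFORMS]  # max-heap via negation
    heapq.heapify(heap)
    for _ in range(total_budget // STEP):
        _, p = heapq.heappop(heap)
        spend[p] += STEP
        heapq.heappush(heap, (-marginal_return(p, spend[p]), p))
    return spend

print(allocate(100_000))  # criteo's low confidence costs it budget share
```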
The output is a concrete, defensible recommendation backed by normalized data, confidence intervals, marginal return modeling, and engine version documentation. The kind of recommendation that survives a CFO's review because you can trace exactly where the number came from.
The correction engine is deterministic mathematics. No machine learning model influences the normalized figures. No neural network decides correction weights. No generative model hallucinates output that looks like a ROAS number.
AI enters the pipeline at a single, well-defined point: after normalization is complete. It reads normalized data. It generates human-readable narratives for client reports. It surfaces insights from patterns in the normalized output. It assists with budget recommendation presentation.
The AI layer has zero write access to the correction engine. It cannot modify calibration profiles, adjustment parameters, or confidence scores. It operates in a sandboxed environment that treats the normalized figures as immutable inputs.
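One way to picture the read-only boundary: the narrative layer receives normalized figures as an immutable view and can only read from it. The mechanism below (Python's `MappingProxyType`) is an illustrative stand-in for whatever enforcement the sandbox actually uses.

```python
# A sketch of the immutable-input boundary, not the actual sandbox mechanism.
from types import MappingProxyType

# Normalized figures arrive frozen; values are hypothetical.
normalized_figures = MappingProxyType({
    "amazon_ads": 4.8,
    "walmart_connect": 4.5,
    "criteo": 4.6,
})

def generate_narrative(figures) -> str:
    """The narrative layer reads figures; any write attempt raises TypeError."""
    best = max(figures, key=figures.get)
    return f"{best} leads on normalized ROAS at {figures[best]}x."

print(generate_narrative(normalized_figures))
# normalized_figures["criteo"] = 9.9  # TypeError: mappingproxy is read-only
```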
We've spoken to dozens of agencies and holding companies that have attempted internal normalization solutions. The pattern is consistent: a smart analyst builds something in Excel or Python that works well for a few months, then degrades as the underlying conditions change. The reasons are structural, not talent-related.
The core issue isn't getting the first set of corrections right — that's a well-defined problem. The issue is the sustained operational cost of keeping corrections accurate across:
- Platform methodology changes: detecting them, understanding their impact, recalibrating profiles.
- Vertical expansion: each new product category requires its own behavioral calibration, not a generic default.
- New network support: each retail media network has a unique attribution architecture that requires dedicated research and validation.
- Temporal consistency: maintaining version-controlled profiles so historical reports remain valid and comparable.
This maintenance cost scales linearly with the number of platforms, verticals, and clients — and it's ongoing. RetailNorm absorbs this cost centrally and distributes the benefit across all customers. For an individual agency, the math doesn't work: the operational investment to maintain accurate normalization across 4+ platforms and multiple verticals exceeds the cost of the tool by an order of magnitude.
**How are calibration profiles validated?**
Each profile goes through a multi-stage validation process: initial calibration against published platform methodology and attribution research, empirical validation against known benchmarks and controlled test data, and ongoing monitoring for drift. Profiles only reach production status after meeting documented accuracy thresholds. The methodology is proprietary, but the governance process is rigorous and documented.
**Why 14-day last-click as the baseline?**
It captures the majority of ad-influenced conversions while filtering most organic noise. It's Amazon's default — meaning the world's largest RMN requires minimal structural correction, reducing overall system error. It's also the most commonly referenced standard in retail media RFPs, making it the natural lingua franca for cross-platform comparison.
**Can I use a different baseline?**
Not currently. We're deliberately opinionated about this. If every agency picks their own standard, you recreate the fragmentation problem at the agency level. One baseline, one truth, comparable output across every client and every report. That constraint is a feature, not a limitation.
**What platforms are supported?**
Currently in production: Amazon Ads, Walmart Connect, Criteo, Tesco / dunnhumby, Instacart Ads, and Carrefour / Unlimitail. Additional networks are in active calibration. Each new platform requires dedicated research, profile building, empirical validation, and accuracy testing — a process measured in weeks, not days. We don't launch a platform profile until it meets the same accuracy standards as our existing ones.
**What happens when a platform changes its attribution model?**
We maintain a continuous monitoring layer that tracks platform documentation, API behavior, and statistical anomalies in normalized output. When drift is detected, the affected profiles enter recalibration. Historical data can be retroactively re-normalized using the corrected profiles, and the version changelog documents exactly what changed and when.
**Is the correction logic exposed to the browser?**
No. The entire normalization engine runs server-side behind an authenticated API. The browser sends CSV data and receives normalized results — but never has access to correction parameters, calibration profiles, or proprietary factors. This is a deliberate architectural decision: it protects the accumulated research investment while ensuring clients always run against the latest calibrated version of the engine.
**Is the normalization methodology published?**
The architectural approach, correction dimensions, and governance framework are described here openly. The specific calibration profiles — the behavioral models, correction parameters, vertical calibration data, and their interactions — are proprietary. They represent sustained research investment and accumulated operational knowledge that cannot be derived from first principles or platform documentation alone.
**How is this different from building normalization internally?**
The initial build is achievable — a capable analyst can estimate reasonable correction factors for a few platforms. The challenge is maintaining accuracy across platform methodology changes, new verticals, new networks, and temporal drift. That ongoing operation — monitoring, recalibrating, validating, version-controlling — scales linearly with complexity and is what internal solutions consistently fail to sustain. RetailNorm centralizes that operational cost and distributes the benefit.
**Can I trace how a specific number was calculated?**
Yes. Every normalized figure is linked to the engine version, platform calibration profile version, and vertical model version used at the time of generation. You can identify which correction layers were applied, their version identifiers, and the confidence interval produced by the uncertainty quantification layer. This is designed for the level of scrutiny you'd expect in financial reporting.
Upload a CSV from any supported platform and get a normalized report — free, no registration required.