The ADF Pipeline
The rest of this catalog measures the data. This one makes it.
The ADF Pipeline is the AI Data Factory’s standard methodology for producing multilingual annotation and reference sets — the labeled, validated data that AI systems learn from and that the benchmarks in this catalog measure against. A benchmark is only as honest as the reference set behind it, and the Pipeline is how that reference set is built. This page places it among the frameworks; the full account of how the Pipeline runs, station by station, lives on its primary page.
An AI Data Factory methodology · Updated June 2026
Made · Validated · Reference-ready
In the catalog
Everything here measures the data. This is how the data is made.
Most of the Frameworks & Benchmarks catalog acts on data that already exists. The indices and scores measure it: the Pashto-Dari Parity Index, the Sovereign Speech Index, and the rest report where a system stands against a reference. The audits and standards validate it. But all of that depends on something prior: a reference set worth measuring against, and training data worth learning from. The ADF Pipeline is what produces them. It is not a measure or a check; it is the production methodology that makes the multilingual annotation and reference sets the rest of the catalog takes as given.
That is the Pipeline’s place here, and why it sits among instruments that otherwise judge rather than build. A benchmark that measures against a careless reference set measures nothing; a model trained on unvalidated data learns the wrong thing confidently. The Pipeline’s reference sets are produced through validation — built and checked, not assumed — which is precisely what allows a benchmark to trust them and a system to rely on them. The ground truth the rest of the catalog stands on is made here.
The complete account — how the Pipeline runs, station by station, and how annotation and reference sets are produced and validated — is on its primary page. → The ADF Pipeline
The shape of it
What everything else stands on.
The Pipeline makes the reference sets the rest of the catalog measures and validates against — the ground truth everything else stands on.

The catalog
Where the Pipeline sits among the instruments.
The catalog runs in three movements — the data is made, then measured, then validated — and the Pipeline is where it begins.
Measure
Instruments that produce a finding about where something stands.
The Pashto-Dari Parity Index
The Sovereign Speech Index
The Court Interpreter Readiness Score
The Resettlement Integration Index
Validate
Instruments that produce a judgment about whether something is sound.
The Cultural Hallucination Audit
The Section 1557 Conformance Map
The Diaspora Trial Inclusion Standard
The Five-Gate Validation Protocol
The distinction
What the Pipeline is — and is not.
Not a measurement
It produces no score about where a system stands. That is the work of the indices.
Not a check
It does not validate a finished deliverable. That is the work of the audits and standards.
A production methodology
It makes the multilingual annotation and reference sets the rest of the catalog depends on.
The Pipeline’s reference sets are produced through validation — built and human-checked, not assumed — which is precisely what lets a benchmark trust them and a system rely on them. The full account is on its primary page.
The data is made here. The full method is one click away.
The ADF Pipeline belongs to the AI Data Factory, and its complete page covers everything this one does not — the stations, the production of annotation and reference sets, and the validation that makes them trustworthy. For work that depends on data built right, the conversation starts here.