Atlas Data Platform

The data sat in a shared folder. For most industries that is a compliance problem. For this one, it is a person’s safety.

Atlas is the firm’s secure institutional data environment: the governed home for the linguistic, cultural, and engagement data that the Human Intelligence Collective, the AI Data Factory, and the Cultural Compliance Bureau produce. It is built on a compliance-capable Microsoft foundation, with consent, provenance, and jurisdiction enforced as infrastructure rather than entrusted to policy.

What travels with the record
RECORD · transcript-0427 · Dari → EN
Provenance
Origin, validation history, current version
Consent
Scope and lawful basis, governed by Gate 4
Residency
Jurisdiction specified to the client
Population risk
A gate over what enters and what leaves
Enforced by the environment, not the folder.
Why an environment

Most data work stores the data and loses the obligations.

The default tooling treats data as files: a shared drive, a folder, a handful of spreadsheets emailed between people who mean well. The data is there, and on a good day it is even backed up. What is not there is everything that was supposed to travel with it. The consent that made a recording lawful sits in a different system, if it was kept at all. The provenance that tells you which dialect band a transcript represents, who validated it, and against which guideline version, was never attached. The residency rules a European client is bound by are honored by accident or not at all. And access belongs to whoever has the link.

For most industries this is a compliance exposure — real, but bounded by fines and remediation. For this firm’s work it is something else, because the data concerns people with genuine security considerations: contributors with family still in Afghanistan, populations a hostile actor would be glad to enumerate. Here, separated consent and unbounded access are not a paperwork failure. They are a safety failure, and the person who pays for it is not the institution.

So the firm does not store this data in folders. It holds it in a governed environment where the obligations are part of the infrastructure: provenance and consent bound to the record, access bounded by role, residency specified by jurisdiction, and a population-risk gate standing over what enters and what leaves. Storage is the easy part. Atlas exists for the rest.

24
Languages and dialect bands the environment is structured to hold
0
Client data on the public website — a presentation layer, by design
0
Data resident inside Afghanistan — no in-country footprint, deliberately
5
Validation gates standing over what enters and leaves the environment
The doctrine
Storage is not stewardship.

Anything can store a file. Stewardship is the obligations the file carries — consent, provenance, residency, population risk — enforced by the environment, not entrusted to good intentions.

The substrate

The firm’s institutional memory, and the obligations attached to each part of it.

The coverage map and linguistic assets
The governed glossaries, the reference sets, the orthography decisions, the dialect-band records — the lexical and linguistic spine the whole firm draws on.
The obligation
Provenance and versioning, so every asset knows where it came from and which version is current.
Engagement and validation records
The Five-Gate files, the Bureau’s attestations and sign-offs, the conformance maps, and the Sign-Off Mark’s registry.
The obligation
Auditability and retention, so a Mark can be opened and a finding reproduced years later.
Contributor and consent records
The records that make the human and speech work lawful and ethical — consent, scope, compensation, and the identity protections the Collective’s people are owed.
The obligation
Consent binding and identity protection, governed by Gate 4.
Client data, under engagement
The material a client entrusts to the firm for a specific scope of work.
The obligation
Confidentiality and residency by jurisdiction — and, where the work involves protected health information, a Business Associate Agreement before any of it lands.
Four kinds of data, one environment — and not one of them stored without the obligations that came with it.
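The pairing of each kind of data with its obligation can be sketched as an admission check. This is a minimal illustration, not Atlas's actual schema: the class names, field names, the `REQUIRED` mapping, and the sample origin value are all hypothetical, chosen only to mirror the four kinds of data and obligations listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    LINGUISTIC_ASSET = "linguistic_asset"
    ENGAGEMENT_RECORD = "engagement_record"
    CONTRIBUTOR_RECORD = "contributor_record"
    CLIENT_DATA = "client_data"


@dataclass(frozen=True)
class Provenance:
    origin: str
    version: str
    validated_by: tuple[str, ...] = ()


@dataclass(frozen=True)
class Consent:
    scope: str
    lawful_basis: str


@dataclass(frozen=True)
class Record:
    record_id: str
    kind: Kind
    provenance: Optional[Provenance] = None
    consent: Optional[Consent] = None
    residency: Optional[str] = None        # jurisdiction label, e.g. "EU"
    retention_years: Optional[int] = None


# Hypothetical mapping: which fields each kind of data must carry to be admitted.
REQUIRED = {
    Kind.LINGUISTIC_ASSET: ("provenance",),
    Kind.ENGAGEMENT_RECORD: ("provenance", "retention_years"),
    Kind.CONTRIBUTOR_RECORD: ("provenance", "consent"),
    Kind.CLIENT_DATA: ("provenance", "residency"),
}


def missing_obligations(record: Record) -> list[str]:
    """Return the names of required fields that the record does not carry."""
    return [name for name in REQUIRED[record.kind] if getattr(record, name) is None]


def admit(record: Record) -> Record:
    """Refuse any record that arrives without the obligations its kind requires."""
    missing = missing_obligations(record)
    if missing:
        raise ValueError(f"{record.record_id}: missing {', '.join(missing)}")
    return record
```

Under these assumptions, a contributor record with provenance but no consent is rejected at the door rather than stored and fixed later, which is the point of binding the obligation to the record.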
The controls

Six control domains, built on a compliance-capable foundation.

01
Provenance and consent, bound to the record
The keystone: an asset does not exist in Atlas without its origin, its validation history, and — where people are involved — its consent and scope traveling with it. The obligation is not a separate file; it is part of the record.
Bound to the record
02
Identity and bounded access
Role-based, least-privilege access with multi-factor authentication, implemented on enterprise identity tooling such as Microsoft Entra ID. Access is granted, scoped, and logged; it is never “whoever has the link.”
Role-based
03
Encryption, in transit and at rest
Industry-standard encryption applied to data in transit and at rest. This is the baseline, stated as design intent rather than as a claim about a specific certification.
In transit & at rest
04
Segregation and isolation
Client data held in isolation, scoped to its engagement; environments separated so that one client’s material is never commingled with another’s.
Per-engagement
05
Audit, logging, and monitoring
A record of who accessed what, and when — so the environment can answer the question every auditor eventually asks, and so anomalies are visible rather than assumed away.
Logged & retained
06
Classification, governance, and retention
Information protection, data-loss-prevention, and retention controls — drawn from a compliance-capable platform set such as Microsoft Purview — so sensitive material is labeled, governed, and disposed of on a schedule rather than accumulating forever.
Labeled & governed
Atlas is built on the firm’s Microsoft 365 environment. The platform’s security and compliance capabilities are used in addition to the firm’s own controls, not in place of them. Platform-level certifications and authorizations are held by Microsoft as the service provider, not by the firm; the firm’s own formal certifications are stated only where held, and confirmed per engagement.
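Two of the control domains above, bounded access and audit logging, can be illustrated together. The sketch below is an assumption-laden toy, not the firm's implementation and not a Microsoft Entra ID or Purview API: the role table, the `User` and `AuditLog` types, and the `authorize` function are hypothetical names. It shows the shape of the rule: access requires a verified second factor, a role that permits the action, and membership in the engagement, and every decision is logged whether or not it was allowed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission table; in practice roles would live in the
# identity provider rather than in application code.
ROLE_PERMISSIONS = {
    "validator": {"read"},
    "steward": {"read", "write"},
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    engagements: frozenset   # engagements this user is scoped to
    mfa_verified: bool


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user, action, engagement, allowed):
        """Append one entry per access decision, allowed or denied."""
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user.name,
            "action": action,
            "engagement": engagement,
            "allowed": allowed,
        })


def authorize(user, action, engagement, log):
    """Least privilege: deny unless MFA, role, and engagement scope all agree."""
    permitted = any(action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles)
    allowed = user.mfa_verified and engagement in user.engagements and permitted
    log.record(user, action, engagement, allowed)
    return allowed
```

Logging denials as well as grants is a deliberate choice in this sketch: it is what makes anomalies visible rather than assumed away.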
Where it lives, and who can reach it

For data about people at risk, jurisdiction is a security control.

Encryption protects data from interception. It does nothing about the question that matters most for this population: where the data resides, and who can compel its disclosure. That is a question of structure, and the firm answered it structurally.

No in-country footprint
The firm holds no data resident inside Afghanistan and maintains no presence there. This is deliberate, so that the most sensitive records cannot be seized or compelled in-country. The de-risking is in the architecture, not in a promise.
Jurisdiction and residency by design
The firm operates from the United States, with European offices planned; data residency and processing terms are specified to the client’s jurisdiction, and European engagements are governed under GDPR and UK GDPR.
A BAA before any data
Where an engagement involves protected health information, a Business Associate Agreement is executed before any client data enters the environment. No exceptions, no “we will paper it later.”
The public site holds nothing
This website is a presentation layer. No client data, contributor data, or engagement records touch the frontend — the work lives in the governed environment, never on the marketing surface.
A compliance-capable foundation
The Microsoft 365 environment beneath Atlas brings an established security and compliance capability set, on which the firm layers the controls and governance its own work requires.
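Some of the rules above are checkable before any data lands: the no-in-country-footprint rule, residency specified to the client's jurisdiction, and a BAA before any protected health information. A minimal sketch of such a pre-ingest check follows. The `Engagement` type, the region codes, and `intake_errors` are hypothetical names used for illustration, and the check covers only the mechanical part of the posture; it says nothing about whose law governs the data.

```python
from dataclasses import dataclass

# Regions this sketch refuses outright; "AF" stands in for the
# no-in-country-footprint rule. The code itself is an assumed label.
BLOCKED_REGIONS = frozenset({"AF"})


@dataclass(frozen=True)
class Engagement:
    engagement_id: str
    allowed_regions: frozenset   # residency terms agreed with the client
    involves_phi: bool = False
    baa_executed: bool = False


def intake_errors(engagement, storage_region):
    """Return every reason the data may not land; an empty list means proceed."""
    errors = []
    if storage_region in BLOCKED_REGIONS:
        errors.append("storage region is blocked")
    if storage_region not in engagement.allowed_regions:
        errors.append("storage region is outside the agreed residency terms")
    if engagement.involves_phi and not engagement.baa_executed:
        errors.append("PHI engagement has no executed BAA")
    return errors
```

Returning all failures at once, rather than stopping at the first, is a design choice in this sketch so that an intake reviewer sees the full list of what must be resolved before ingest.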
The honest limit

Jurisdiction is a legal control, not a technical one. The absence of a compellable in-country entity removes the local legal nexus an Afghan compulsion order would require — it is not a substitute for encryption, access control, and key management, and it does not displace the firm’s obligations under the laws of the jurisdictions in which it does operate. Residency tells you where data sits; jurisdiction tells you whose law governs it — and the firm treats jurisdiction, not storage location alone, as the controlling factor.

Operating reach
United States
Operating today; residency and processing specified to the engagement.
France, Germany, Italy & other European states
Clients served under GDPR, and under UK GDPR where it applies; London and Berlin offices planned.
Arab states with major Afghan populations
Served from the United States, under the same governed environment.
In practice

A governed home for the work, not a shared drive with good intentions.

For the institution entrusting work to the firm, Atlas changes the answer to the diligence questions that decide vendor selection. Where will our data live, and under whose jurisdiction? Answered specifically, to your terms, not waved away. Who can access it? Answered with role-based, logged access rather than a shared link. How do we know an asset’s origin and consent? Answered by provenance that travels with the record. And what about the population this concerns? Answered by a structure that keeps the most sensitive data out of the country it came from, off the public site, and behind a population-risk gate. The data your vendor-risk team worries about most is the data this environment was built to hold.

A governed home, not a folder
Obligations enforced by the environment — consent, provenance, residency — rather than entrusted to whoever has access.
Audit-ready provenance
Every asset’s origin, validation history, and consent on the record — producible when your auditor asks.
Jurisdiction you can specify
Residency and processing terms aligned to where you operate, under GDPR and UK GDPR for European engagements.
De-risked by structure
No in-country footprint, no data on the public site, and a Business Associate Agreement before any health data lands — the safety posture built in, not bolted on.
6
Control domains governing the environment
4
Kinds of data, one governed home
3
Operating layers served — Collective, Factory, Bureau
41+
Trust Center documents behind the posture
Accountable stewardship

Governance is people, not just settings.

The obligations Atlas enforces are owed by named individuals who answer for them. The firm’s stewardship of this data is human before it is technical — these are the people accountable for what the environment holds, and for what it will never do with it.

Wasil Peroz
Lead, Atlas Data Platform
Owns the environment’s architecture and the control domains that govern it.
Maryam Safi
Principal, Cultural Compliance Bureau
Stewards consent and the population-risk gate over what enters and leaves.
Shukria Sakhi
Lead, Privacy & Data Protection
Holds residency and GDPR / UK GDPR commitments to the client’s jurisdiction.
The door

Give your complexity a governed home.

For institutions whose Afghan-language data is too sensitive for a shared drive and too consequential for a vendor who cannot say where it lives. Briefings are conducted under NDA, in Washington, D.C. or virtually.
