
Ethicality · Certification Methodology · AM-03

Technical Reproduction Protocol

How assessors independently reproduce key model evaluations.

1. Reproduction targets

Assessors reproduce five classes of evaluation: bias and fairness metrics across protected attributes; robustness to distribution shift and adversarial inputs; calibration and uncertainty estimates; safety evaluations relevant to the deployment; and environmental accounting (training and serving emissions).
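
As an illustration of what reproducing two of these targets can involve, the sketch below computes a demographic parity gap and an expected calibration error in plain Python. The choice of these two metrics, the function names, and the equal-width binning are assumptions made for illustration; the methodology does not prescribe specific metrics or formulas.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.

    predictions: iterable of 0/1 model decisions
    groups: iterable of group labels (one protected attribute), same length
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred == 1:
            positives[group] += 1
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |accuracy - mean confidence| over equal-width bins.

    confidences: predicted probabilities in [0, 1]
    correct: booleans, True where the prediction was right
    """
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for items in bins.values():
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical toy data, not taken from any real assessment.
gap = demographic_parity_gap(
    [1, 1, 0, 0, 1, 0, 0, 0],
    ["a", "a", "a", "a", "b", "b", "b", "b"],
)
ece = expected_calibration_error([0.9, 0.9, 0.6, 0.6], [True, True, True, False])
```

In this toy data, group "a" receives a positive decision half the time and group "b" a quarter of the time, so the gap is 0.25. Other fairness definitions (equalised odds, for example) would need ground-truth labels as well.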

2. Data access

Assessors obtain read-only access to evaluation datasets or to a representative holdout. Where access is constrained, the applicant provides a witnessed re-run on assessor-supplied prompts.

3. Environment

Assessors run evaluations in an isolated environment that mirrors the applicant's production stack down to component versions. Any discrepancy between the applicant's reported metrics and the assessors' reproduced values that exceeds the published tolerance is recorded as a finding.
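
The tolerance check described above can be sketched as follows. This is a minimal illustration that assumes tolerances are absolute and defined per metric; the `Finding` record, the function name, and all metric names and values are hypothetical and are not part of the published methodology.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    metric: str
    reported: float
    reproduced: float
    deviation: float
    tolerance: float


def compare_metrics(reported, reproduced, tolerances):
    """Return (findings, missing).

    findings: one Finding per metric whose absolute deviation exceeds
              its tolerance (assumption: tolerances are absolute).
    missing:  metrics the applicant reported that could not be compared
              because no reproduced value or tolerance exists.
    """
    findings = []
    missing = []
    for metric, reported_value in reported.items():
        if metric not in reproduced or metric not in tolerances:
            missing.append(metric)
            continue
        deviation = abs(reported_value - reproduced[metric])
        if deviation > tolerances[metric]:
            findings.append(
                Finding(
                    metric=metric,
                    reported=reported_value,
                    reproduced=reproduced[metric],
                    deviation=deviation,
                    tolerance=tolerances[metric],
                )
            )
    return findings, missing


# Hypothetical values for illustration only.
reported = {"accuracy": 0.91, "ece": 0.040, "robust_accuracy": 0.80}
reproduced = {"accuracy": 0.88, "ece": 0.045}
tolerances = {"accuracy": 0.02, "ece": 0.01, "robust_accuracy": 0.03}

findings, missing = compare_metrics(reported, reproduced, tolerances)
```

Here the accuracy gap of about 0.03 exceeds its 0.02 tolerance and becomes a finding, the calibration gap of 0.005 does not, and `robust_accuracy` is listed as missing because it was never reproduced. Whether an unreproduced metric should itself count as a finding is a policy question this sketch leaves open.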

4. Emissions accounting

Assessors verify energy and water draw estimates against utility records and platform telemetry where available; assumptions and methodology are published in the determination file.
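
One way such a cross-check might look in practice is sketched below: a telemetry-based energy figure is scaled to facility level, converted to emissions, and compared with the utility record. The use of a power usage effectiveness (PUE) factor, a single grid carbon-intensity factor, and every number shown are assumptions for illustration; the determination file, not this sketch, states the actual assumptions used in an assessment.

```python
def facility_energy_kwh(it_energy_kwh, pue):
    """Scale IT-equipment energy (from platform telemetry) to facility level.

    Assumption: overhead such as cooling is captured by a single PUE factor.
    """
    return it_energy_kwh * pue


def emissions_kg_co2e(energy_kwh, grid_intensity_kg_per_kwh):
    """Convert energy to emissions using one grid carbon-intensity factor."""
    return energy_kwh * grid_intensity_kg_per_kwh


def relative_gap(estimate, record):
    """Relative difference between an estimate and the utility record."""
    return abs(estimate - record) / record


# Hypothetical figures for illustration only.
it_energy = 1000.0          # kWh, from platform telemetry
pue = 1.2                   # assumed facility overhead factor
grid_intensity = 0.4        # kg CO2e per kWh, assumed
utility_record_kwh = 1250.0 # kWh, from the utility bill

estimated_kwh = facility_energy_kwh(it_energy, pue)
estimated_emissions = emissions_kg_co2e(estimated_kwh, grid_intensity)
gap = relative_gap(estimated_kwh, utility_record_kwh)
```

With these figures the estimate is about 1200 kWh and 480 kg CO2e, roughly 4% below the utility record. A water estimate could be checked the same way by substituting a water-use factor for the carbon-intensity factor.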

5. Adversarial testing

Assessors execute at least one red-team campaign aligned to the system's intended use and a second campaign aligned to plausible misuse.

Methodology suite v1.0 · controlled copy for certification use.