run.veric.dev
◆ Design partner previewCompliance Vault — sample data, real contracts.
Schedule a DP conversation →

Layout, contracts, and data shapes are real. The data shown is illustrative. Real customer data flows live with our first design partner.

Incident replay — counterfactual rendering.

T9 · audit-ledger · counterfactual

New York Times v. OpenAI & Microsoft

The New York Times filed a federal copyright suit alleging GPT-4 and Copilot were trained on millions of Times articles without a license, and that GPT-4 emits Times reporting near-verbatim in response to common prompts. The complaint pleads billions of dollars in statutory and actual damages.

Verdict shape that would have refuted

Tier T6+T8
Flow contract
flow(license_class: copyrighted_news) ∉ flow(model_artifact); for_each(training_record) ∃ license_attestation
Fixture that exercises this contract
/examples-ai/11-copyrighted-text-in-training-summary/manifest.json

The 'frontier-llm-v4-training-corpus' fixture demonstrates the AI-Act Art. 53(1)(d) summary completeness contract — a publisher reaches training-corpus that is not declared in the published summary.

Regulatory anchor
EU AI Act Art. 53(1)(d); 17 USC §501; DMCA §1202
Date the vault would have flagged
2022 — at training-pipeline-compile time, before GPT-4's December 2022 ingestion run

What broke instead

The training pipeline ingested URL-list-expanded web content without a per-source license tag flowing through to the model card. There was no compile-time check that 'license_class: copyrighted_news' could not reach 'model_artifact'. The case will turn on years of discovery to reconstruct what was actually in the corpus.

Public outcome · Unliquidated billions in statutory damages exposure; license-negotiation leverage permanently lost. A compile-time refutation would have named the offending shard and the missing license tag — auditable in a single regulator question.

Cross-references