run.veric.dev
Substrate explainer · P7 is honestly stubbed
The Veric thesis

veric is a substrate, not a vertical.

AI-provenance is the first vertical we lit up — the EU AI Act forced the schedule. But the underlying machinery is a six-layer attribute-grammar compiler tower whose top tier is a typed information-flow analyzer. Hand it a typed pipeline IR and a tag glossary and it returns either a proof that no forbidden flow exists or a counterexample trace witnessing one. Same compiler, same proof contract, same counterexample shape — different tag glossary, different deliverable template. That is what the rest of this page unpacks.

A vertical, on this picture, is a pair: a T-set (the tags the regime cares about — eu-personal-data, copyrighted-text, gdpr-erased, marketing-sink, journal-entry) and a D-set (the artifacts the regime requires — Annex IV pack, DSAR trace, Article 53 summary, SOX walkthrough). The substrate beneath is constant. We have lit up one vertical. We could light up three more tomorrow with no kernel work.

P1 — P7 · the substrate

Seven primitives. Six implemented, one honestly stubbed.

The substrate is what is constant across every vertical. It is the part of the system that does not care what tags you load or what artifact you ship. Each primitive points at the ag-tower silver layer that implements it; the silver tower is the public moat — github.com/garrick0/ag-tower.

  1. P1 · Lineage-graph builder

    Implemented

    silver-l8-v2 + dbt/Croissant adapters

    Walks any typed pipeline IR — dbt SQL, Croissant 1.1 manifests, fine-tune scripts — into a uniform Node tree. Every column, every embedding, every checkpoint is a node; every transform is an edge. The same Node IR the AG framework decorates everywhere else in the tower.

  2. P2 · Tag propagator

    Implemented

    silver-l8-v2 (tag attributes)

    Attribute-grammar equations that flow tags forward along lineage edges. Tags are declared at spec level — pii, pci, eu-personal-data, copyrighted, journal-entry, anything — and the propagator does not need to know what they mean. It only needs to know they exist.

  3. P3 · Sink-reachability solver

    Implemented

    silver-l8-v2 (T6 information-flow)

    The T6 information-flow tier. A forall-style traversal asking: for every path p in the lineage, for every tag t at the source of p, does the sink at the end of p accept t? The answer is either a proof or a path-witness counterexample. Same equation under any tag set.

  4. P4 · Refutation-witness emitter

    Implemented

    silver-l8-v2 + silver-l9-v2 origins

    When the solver finds a forbidden flow, it returns the offending path verbatim. Origin annotations from silver-l9-v2 carry source-location data through every rewrite, so the witness lands as a clickable trace: file, line, transform, sink. The witness is the artifact regulators replay; it is not a debugging aid bolted on.

  5. P5 · Signed-export pipeline

    Implemented

    silver-l11-v2 IO + Sigstore/Rekor

    Every certificate, every Annex IV pack, every DSAR trace exits the tower through a single signed-export channel. Sigstore identity, Rekor transparency log, in-toto attestations. The receiving auditor replays the signature chain without contacting us — chain-of-custody is a property of the artifact, not of our infrastructure.

  6. P6 · Replay-ledger

    Implemented

    silver-l9-v2 origins + silver-l11-v2 IO

    Every compile run is content-addressed and logged. Given any historical certificate, the ledger reconstructs the exact spec, exact tags, exact transform chain that produced it. Incident-replay is a query against the ledger; auditor read-only export is a slice of it. Determinism, not snapshots.

  7. P7 · Erasure-completeness verifier

    Stubbed · substrate gap

    silver-p7-erasure-v2 (StubProver)

    The one substrate gap. AI-provenance demands a stronger claim than reachability: not just "no forbidden flow exists today" but "no residual reference to a deleted subject survives anywhere in the model artifact." That is a forall-forall proof — every path, every checkpoint, every embedding. We ship a stub today. The vault renders the certificate shape a real proof would carry; the proof itself lands when the substrate primitive does.

    See the honest stub banner →
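
To make P1 through P4 concrete, here is a minimal Python sketch of the idea, not the ag-tower implementation. It assumes tags propagate unchanged along every edge and that each sink declares the set of tags it accepts. The `Lineage` class, `find_violation`, and the node names are all hypothetical, chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Lineage:
    """Hypothetical lineage graph: nodes are names, edges are transforms."""
    edges: dict[str, list[str]] = field(default_factory=dict)        # node -> downstream nodes
    source_tags: dict[str, set[str]] = field(default_factory=dict)   # tags declared on sources
    sink_accepts: dict[str, set[str]] = field(default_factory=dict)  # tags each sink may receive

    def add_edge(self, src: str, dst: str) -> None:
        self.edges.setdefault(src, []).append(dst)
        self.edges.setdefault(dst, [])


def find_violation(g: Lineage):
    """Return None if every sink accepts every tag that reaches it,
    otherwise a (tag, path) witness for one forbidden flow."""
    for source in sorted(g.source_tags):
        tags = g.source_tags[source]
        # Breadth-first walk from the tagged source, remembering predecessors
        # so the offending path can be reconstructed.
        parent = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node in g.sink_accepts:
                rejected = tags - g.sink_accepts[node]
                if rejected:
                    path = []
                    cur = node
                    while cur is not None:
                        path.append(cur)
                        cur = parent[cur]
                    return min(rejected), path[::-1]
            for nxt in g.edges.get(node, []):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
    return None


g = Lineage()
g.add_edge("raw.users.email", "stg.users.email_hash")
g.add_edge("stg.users.email_hash", "mart.campaign_audience")
g.add_edge("raw.users.country", "mart.region_report")
g.source_tags["raw.users.email"] = {"eu-personal-data"}
g.source_tags["raw.users.country"] = set()
g.sink_accepts["mart.campaign_audience"] = set()  # a marketing sink that accepts no tags
g.sink_accepts["mart.region_report"] = {"eu-personal-data"}

witness = find_violation(g)
print(witness)  # a (tag, path) pair naming the forbidden flow
```

The function never inspects what a tag means; it only compares sets. That is the property the primitives above rely on: loading a different tag glossary changes the inputs, not the traversal.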

A vertical = T-set + D-set

A vertical is just a tag glossary plus a deliverable template.

The substrate has no opinion about what counts as a violation. A vertical names the tags whose flows are forbidden and the artifact the regime expects in return. The AI-provenance vertical happens to be the one we built first; below is its tag set and its nine deliverables. Swap either column and you have a different vertical.
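
As a rough illustration of that pairing, a vertical can be modeled as plain data. The sketch below is an assumption about shape only: the `Vertical` class and its field names are hypothetical, and the tag and deliverable names are abbreviated samples taken from this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vertical:
    """Hypothetical container: a tag glossary plus a deliverable template."""
    name: str
    t_set: frozenset[str]   # tags whose flows the regime cares about
    d_set: tuple[str, ...]  # artifacts the regime expects in return


AI_PROVENANCE = Vertical(
    name="ai-provenance",
    t_set=frozenset({"eu-personal-data", "copyrighted-text", "gdpr-erased"}),
    d_set=("Annex IV technical-doc pack", "DSAR chain-of-custody trace"),
)

# Swapping both columns yields a different vertical; nothing else changes.
SOX_TRACE = replace(
    AI_PROVENANCE,
    name="financial-compliance",
    t_set=frozenset({"journal-entry", "related-party"}),
    d_set=("period-close certificate",),
)

print(SOX_TRACE.name, sorted(SOX_TRACE.t_set))
```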

T-set · what tags this vertical names

The AI-provenance tag glossary.

Ten primary tags, each carrying a regulatory anchor and a propagation rule. Declared at spec level; consumed by P2. The full ladder of tiers (T0–T9) and the tag overlay live on the tier-glossary page.

  • eu-personal-data
  • copyrighted-text
  • licensed-cc0
  • gdpr-erased
  • synthetic
  • model-output
  • training-input
  • red-team-curated
  • dsar-residual
  • embedding-shape
Open the tier glossary →

D-set · what deliverables it ships

Nine artifacts. Three already standing up.

  • D1 · Annex IV technical-doc pack

    AI Act Article 11 + Annex IV: model purpose, datasets, training, validation, monitoring. Compiled from the certificate ledger; signed; replayable.

    Sample registry →
  • D2 · Article 53(1)(d) training-data summary

    GPAI obligation: a sufficiently detailed summary of training-data sources. Auto-rendered from the lineage graph; refreshed every CI run.

  • D3 · DSAR chain-of-custody trace

    GDPR Article 15: every embedding, checkpoint, and fine-tune corpus that ever held data derived from a given subject. The trace IS the response.

  • D4 · Erasure-completeness certificate

    GDPR Article 17: proof or counterexample for every deletion request. Refutation names the residual reachability path. P7 stub today.

    See the stub banner →
  • D5 · Model-card appendix

    Hugging Face model card extended with provenance fields: tag set seen, refutation history, signed checkpoints. Rides existing model-card YAML.

  • D6 · Provenance PR comment

    GitHub PR-time diff: new tags reaching new sinks since last green run. The engineer-loop deliverable; same shape as a SARIF report.

  • D7 · Sigstore-signed manifest

    Every Annex IV pack, every DSAR trace, every certificate carries a Sigstore identity + Rekor transparency-log entry. Verification offline.

  • D8 · Incident-replay bundle

    Given a misbehaving model, replay the training lineage backward to the responsible source. Every step from the ledger; every step signed.

  • D9 · Auditor read-only export

    Notified Body / external auditor portal. Frozen, scoped, signed. No customer credentials shared; replay happens in the auditor's browser.
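
The incident-replay and auditor-export deliverables both lean on the replay ledger being content-addressed. A minimal sketch of what content addressing means here, assuming a run is identified by a SHA-256 hash over a canonical JSON encoding of its spec, tags, and transform chain. The `run_id` function and the record fields are hypothetical, not the ledger's actual schema.

```python
import hashlib
import json


def run_id(spec: str, tags: list[str], transforms: list[str]) -> str:
    """Derive a deterministic identifier from the inputs of a compile run."""
    record = {
        "spec": spec,
        "tags": sorted(tags),      # a tag set has no order, so normalize it
        "transforms": transforms,  # a transform chain is ordered, so keep it
    }
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = run_id("spec-v1", ["eu-personal-data", "gdpr-erased"], ["load", "embed"])
b = run_id("spec-v1", ["gdpr-erased", "eu-personal-data"], ["load", "embed"])
c = run_id("spec-v1", ["eu-personal-data", "gdpr-erased"], ["embed", "load"])

print(a == b)  # same inputs, same identifier
print(a == c)  # reordered transform chain, different identifier
```

Because the identifier is a function of the inputs alone, replaying a historical run becomes a lookup rather than a restore from a snapshot.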

What is queued behind the wedge

Three more verticals we could light up tomorrow.

None of these need a kernel change. Each is a T-set authored by a domain expert plus a D-set template authored by a regulator-adjacent customer. The point of this section is honesty about scope, not promise about timeline — there is no product surface for any of them today, and there will not be one until a design partner with a real artifact deadline shows up.

  • Data-engineering correctness

    The PLG wedge already lit up at run.veric.dev — same substrate, practitioner buyer.

    T-set sample
    PII · PCI · marketing-sink · cardinality · range · coercion · schema-shape
    D-set sample
    PR-time SARIF certificate · dbt-test bridge · cardinality witness · join-cardinality drift
    Regulatory anchor
    Internal SOC 2 / SOX-data-flow / vendor-data-handling
  • Supply-chain SBOM attestation

    in-toto SLSA attestations are already a P5-shaped problem; SBOM tags ride P2 verbatim.

    T-set sample
    vendor-source · provenance-attested · build-reproducible · transitive-cve · license-class
    D-set sample
    SLSA Level 3 attestation · CycloneDX SBOM · vendor-vulnerability replay · build-witness export
    Regulatory anchor
    EO 14028 · NIST SSDF · Cyber Resilience Act
  • Financial-compliance trace

    SOX general IT controls are an information-flow problem dressed in audit clothing.

    T-set sample
    journal-entry · approver-segregation · period-close-window · materiality-class · related-party
    D-set sample
    Workiva-shippable controls walkthrough · segregation-of-duties witness · period-close certificate
    Regulatory anchor
    SOX 404(b) · PCAOB AS 2201 · IFRS audit trail

Reading this and thinking about a HIPAA, Part-11, or CSRD tag-set you would write yourself? That is precisely the point. The substrate is permissive about what counts as a regulated flow; we are permissive about which vertical lights up next.