run.veric.dev

Monthly revenue silently wrapped past $2.1B — 2022

Cost: one month of executive-dashboard revenue figures reported as a small negative number; emergency restatement; downstream commission calculations re-run · Time-to-detect: ~36 hours from monthly close to first executive question · Root cause class: T1 (type conflict) compounded by T7 (range / interval bound)

What happened

A late-stage payments platform ingested transaction data from a legacy ledger system into its analytics warehouse. The ingestion pipeline declared the amount_cents column as INT32 — a choice that long predated the company's growth and had never been revisited. For per-transaction values, INT32's ~$21M ceiling was comfortable. For monthly aggregates, it wasn't.

When monthly revenue first crossed the equivalent of $2.147B in amount_cents (INT32 max = 2,147,483,647), the warehouse's monthly aggregation SUM(amount_cents) silently overflowed. The downstream warehouse — Postgres in the analytics layer, not the source system — interpreted the overflowed INT32 as a wrapped negative number, and the monthly executive dashboard rendered the company's revenue as roughly negative $40M.

The CFO noticed within two business days. A short-term fix cast the column to BIGINT at the aggregation step; a longer-term fix migrated the column type at the source. Commission and revenue-share calculations that had run against the bad number were re-issued, and one quarterly KPI report had to be republished.

The pattern

A type's representable range was lower than the value the pipeline could legitimately produce, and the overflow was silent rather than loud. The column type was correct for the per-row use case and wrong for the aggregated use case; nothing in the pipeline tracked the value-range under aggregation. Schema tests passed because the type was unchanged and every individual value fit; row-level data quality tests passed because each transaction was within the per-row bound.

Any pipeline where a numeric column's declared type is sufficient for individual row values but insufficient for downstream aggregations has this exposure: USD revenue in INT32, event counts in INT32 for high-volume products, byte-size columns in 32-bit types when files grow past 2GB, millisecond timestamps in 32-bit columns.

How veric would catch it

veric's T1 tier checks declared types against operations, and the T7 tier propagates a value-range annotation along the model graph that includes range expansion under aggregation. Given transactions.amount_cents : int32, range(0, 21_474_836), the verifier infers that SUM(amount_cents) GROUP BY month projects to a range bounded by range(0, monthly_txn_count) × max(amount_cents). When that projected range exceeds int32's representable range, it flags: "aggregation SUM(amount_cents) projects range (0, ~9.2e12) for monthly grain; result column declared int32 with range (-2.1e9, 2.1e9) — T1/T7 range VIOLATED, widen the result type or declare a tighter range bound."

The same primitive catches the family: any aggregation whose projected range exceeds its declared type's capacity.

Try it: open the example below and watch the verdict change as you toggle the offending pattern on and off.

See also

Sources

  • Anonymised; pattern reflects the well-documented INT32 overflow class. Public references include the YouTube "Gangnam Style" 2,147,483,647 view-count overflow (Google, 2014) and the recurring discussion of monetary-aggregation overflow in payments-systems engineering literature (e.g. Stripe's published guidance on storing money values).
Reproduce in playgroundT1 · Type in the glossary

Opens the playground pre-loaded with a model that exhibits this pattern. Toggle the offending lines to watch the verdict change.