JPMorgan London Whale VaR spreadsheet error — Q1 2012
Cost: ~$6.2B trading loss in the Synthetic Credit Portfolio; $920M in regulatory fines; senior CIO leadership exits · Time-to-detect: ~3 months from new VaR model deployment (Jan 2012) to first internal escalation (mid-Apr 2012); first public disclosure of the loss May 10, 2012 · Root cause class: T9 (equivalence: the replacement model was not equivalent to its predecessor on the input distribution it actually saw)
What happened
Beginning in late 2011, JPMorgan's Chief Investment Office (CIO) accumulated a position in synthetic credit indices, well over $100B in notional, that came to be known as the "London Whale" trade. In January 2012, the CIO replaced the Value-at-Risk (VaR) model that governed the position's risk-capital allocation. According to JPMorgan's January 2013 Task Force Report, whose findings the Senate Permanent Subcommittee on Investigations' March 2013 report also recounts, the new model ran through a chain of Excel spreadsheets in which values were copied and pasted between worksheets by hand. One calculation, after subtracting the old rate from the new rate, divided by the sum of the two rates where the modeler had intended their average. Because the sum is twice the average, the error silently muted the computed volatility by about a factor of two and lowered the reported VaR for the portfolio.
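The arithmetic of the divisor error can be checked in a few lines. This is a minimal sketch: the function names and the two sample rates are illustrative assumptions, not values or formulas taken from the actual spreadsheets.

```python
from fractions import Fraction

def change_intended(old_rate, new_rate):
    """Rate change relative to the AVERAGE of the two rates (the intended divisor)."""
    return (new_rate - old_rate) / ((old_rate + new_rate) / 2)

def change_as_implemented(old_rate, new_rate):
    """Rate change relative to the SUM of the two rates (the erroneous divisor)."""
    return (new_rate - old_rate) / (old_rate + new_rate)

# Hypothetical rates, chosen only for illustration.
old_rate, new_rate = Fraction(100), Fraction(110)

intended = change_intended(old_rate, new_rate)
implemented = change_as_implemented(old_rate, new_rate)

# The sum is always twice the average, so the erroneous figure is exactly half.
ratio = implemented / intended
print(intended, implemented, ratio)  # prints: 2/21 1/21 1/2
```

The factor of one half does not depend on the rates chosen, which is why the error scales every non-zero change by the same amount instead of showing up as an obvious outlier.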
The effect: the Synthetic Credit Portfolio's reported VaR fell from $132M to $66M overnight, allowing the position to grow without breaching the desk's risk limit. Through Q1 2012, the position grew to a notional of roughly $157B. When market conditions turned in March–April 2012, the trade lost ~$6.2B before being unwound. The loss triggered Senate hearings, $920M in regulatory fines across the OCC, SEC, FCA, and Federal Reserve, and a deferred-prosecution agreement.
The new VaR model was approved on the basis of a backtest against the old model's outputs over a sample window. On that window the two models produced numerically close answers, and that agreement served as the equivalence claim that justified the swap. The window did not include the regimes in which the spreadsheet's divisor bug actually mattered.
The pattern
A "new" computation was deployed under the implicit claim of equivalence to the "old" computation it replaced: a refactor, in software terms. The equivalence was validated empirically on a sample window, not statically against the structure of the computation. The two computations were not in fact equivalent: the new one materially understated risk on a non-trivial fraction of the input distribution. The discrepancy stayed hidden because the implementation (a chain of Excel cells with hand-edited formulas) carried no machine-checkable specification of what either computation was supposed to be.
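A toy illustration of why agreement on a sample window is weak evidence of equivalence. The functions `metric_v1` and `metric_v2` are hypothetical stand-ins, not the actual VaR models, and the windows and tolerance are invented for the example. On a quiet window the two stay within an absolute tolerance of each other; once rates move, the same factor-of-two gap becomes large.

```python
def metric_v1(old_rate, new_rate):
    """Stand-in for the predecessor: change relative to the average of the rates."""
    return abs(new_rate - old_rate) / ((old_rate + new_rate) / 2)

def metric_v2(old_rate, new_rate):
    """Stand-in for the replacement: change relative to the sum of the rates."""
    return abs(new_rate - old_rate) / (old_rate + new_rate)

def max_divergence(rows):
    """Largest absolute difference between v1 and v2 over a set of input rows."""
    return max(abs(metric_v1(o, n) - metric_v2(o, n)) for o, n in rows)

# Hypothetical backtest window: rates barely move, so both metrics are near zero.
quiet_window = [(100.0, 100.0), (100.0, 100.01), (100.01, 100.0)]
# Hypothetical stressed window: rates move materially.
stressed_window = [(100.0, 120.0), (120.0, 90.0)]

tolerance = 1e-3  # assumed acceptance threshold for "numerically close"
passes_backtest = max_divergence(quiet_window) <= tolerance
passes_stress = max_divergence(stressed_window) <= tolerance

print("quiet window agrees:", passes_backtest)
print("stressed window agrees:", passes_stress)
```

The backtest verdict here depends entirely on which rows the window contains; the relationship between the two functions is the same in both windows.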
Any data system in which a model is replaced under the claim that "it computes the same thing more efficiently", without a static proof of input-output equivalence, carries this exposure. Examples include dbt model rewrites for performance, tool-emitted query-optimizer rewrites, hand-tuned SQL replacing a generated baseline, and a "v2" of a metric that is supposed to match v1 on legacy inputs.
How veric would catch it
veric's T9 tier proves equivalence between two SQL/dbt models on every input that conforms to the declared schema — not on a sample window. In a PR replacing var_model_v1 with var_model_v2, the verifier would run the equivalence checker over the structural pair and report: "models var_model_v1 and var_model_v2 are NOT equivalent on the declared schema; counterexample input rows produce divergence at column total_var (v1: 132.4, v2: 66.2). T9 EQUIVALENCE VIOLATED — refactor not safe to deploy as a drop-in replacement."
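The idea of checking every schema-conforming input and reporting a counterexample can be sketched with a brute-force search over a tiny finite domain. This is not veric's implementation or API, which this page does not describe: `SCHEMA`, `find_counterexample`, and the bodies of `var_model_v1` and `var_model_v2` are all hypothetical stand-ins, and exhaustive enumeration only works for toy schemas like this one.

```python
from fractions import Fraction
from itertools import product

# Hypothetical declared schema: each column maps to the finite set of values it may take.
SCHEMA = {
    "old_rate": [Fraction(n) for n in range(1, 6)],
    "new_rate": [Fraction(n) for n in range(1, 6)],
}

def var_model_v1(row):
    """Stand-in predecessor: change relative to the average of the two rates."""
    return abs(row["new_rate"] - row["old_rate"]) / ((row["old_rate"] + row["new_rate"]) / 2)

def var_model_v2(row):
    """Stand-in replacement: change relative to the sum of the two rates."""
    return abs(row["new_rate"] - row["old_rate"]) / (row["old_rate"] + row["new_rate"])

def find_counterexample(f, g, schema):
    """Return (row, f(row), g(row)) for the first row where f and g differ, else None."""
    columns = list(schema)
    for values in product(*(schema[c] for c in columns)):
        row = dict(zip(columns, values))
        left, right = f(row), g(row)
        if left != right:
            return row, left, right
    return None

result = find_counterexample(var_model_v1, var_model_v2, SCHEMA)
if result is None:
    print("equivalent on the declared schema")
else:
    row, v1_value, v2_value = result
    print("NOT equivalent; counterexample:", row, "v1:", v1_value, "v2:", v2_value)
```

Unlike a backtest, the verdict does not depend on which rows happen to be sampled: any row with unequal rates is a counterexample, and the first one found is returned as a replayable input.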
Honest scope: veric does not validate the correctness of either VaR formula against a financial-mathematics specification; that is the job of the model-validation function. veric validates that the replacement computes the same answer as the predecessor on every input the warehouse can deliver. In this incident, that single check would have surfaced the divisor swap on the day the new model was committed, regardless of which rows the backtest window happened to contain, with a concrete counterexample regulators could have replayed.
See also
- /explore — the property — “v2 computes the same thing as v1 on every input” is the kind of universally-quantified statement testing cannot make.
- /explore — the certificate — an equivalence-proof artifact is exactly what gets baked into a SARIF a regulator can replay.
- Adjacent incidents: Knight Capital 2012, Citibank/Revlon 2020, marketplace refund double-count 2023.
Sources
- US Senate Permanent Subcommittee on Investigations, "JPMorgan Chase Whale Trades: A Case History of Derivatives Risks and Abuses" (Mar 15, 2013): https://www.hsgac.senate.gov/wp-content/uploads/imo/media/doc/REPORT%20-%20JPMorgan%20Chase%20Whale%20Trades%20(4-12-13).pdf
- OCC Consent Order, JPMorgan Chase Bank N.A. (Jan 14, 2013): https://www.occ.gov/static/enforcement-actions/ea2013-002.pdf
- JPMorgan internal "Task Force Report" (Jan 16, 2013): https://www.jpmorganchase.com/content/dam/jpmc/jpmorgan-chase-and-co/investor-relations/documents/task-force-report.pdf
- Wikipedia, "2012 JPMorgan Chase trading loss" (consolidated timeline): https://en.wikipedia.org/wiki/2012_JPMorgan_Chase_trading_loss