RaoscaffResearch
Prediction Series · Lock · Issue P-51

LMArena #1 at year-end — OpenAI, Anthropic, or Google, locked at 84%.

84% probability that the #1 model on LMArena's overall text leaderboard on Dec 31 2026 is from OpenAI, Anthropic, or Google. As of Jun 21 2026, an Anthropic model holds #1 and the top-10 is nearly all from those three labs. Resolved via a Wayback Machine snapshot dated Dec 31 2026.

Type · Prediction Lock · calibrated binary
Locked · 2026-06-21 · LMArena #1 = Anthropic model at lock date
Resolves · 2026-12-31 · LMArena overall text leaderboard via Wayback Machine snapshot
Scored · binary: #1 model made by OpenAI, Anthropic, or Google yes/no
LMArena overall text leaderboard #1 (Jun 21 2026)
Anthropic model
top-10 dominated by OpenAI / Anthropic / Google · xAI / Meta climbing

LMArena (arena.ai) overall text leaderboard as of Jun 21 2026: #1 is an Anthropic model; top-10 is nearly all OpenAI, Anthropic, and Google. Probability trimmed from 86% to 84% to account for six months in which rising xAI and Meta models could reach #1. Resolution: Wayback Machine archived snapshot of arena.ai overall text leaderboard dated 2026-12-31. Disclosure: these RAOSCAFF prediction briefs are AI-assisted; Anthropic is listed among the eligible makers in this call.

— 1 · The Locked Call

LMArena #1 overall text model from OpenAI, Anthropic, or Google at Dec 31 2026 — P = 0.84.

We lock a binary claim: The #1 ranked model on LMArena's overall text leaderboard (arena.ai), as captured in a Wayback Machine snapshot dated December 31 2026, is made by OpenAI, Anthropic, or Google. Confidence 84%. Transparency disclosure: these RAOSCAFF prediction briefs are AI-assisted; Anthropic is one of the three eligible makers named in the YES criterion.

— 2 · Why it's nail-able

Anthropic holds #1 today; the top-10 is nearly all from the three labs; a Wayback Machine snapshot is the clean resolution source.

As of Jun 21 2026, an Anthropic model holds the LMArena overall text #1 position, with the top-10 nearly entirely from OpenAI, Anthropic, and Google — a dominant structural position. Confidence is 84% rather than higher to account for six months of leaderboard churn and the rising trajectory of xAI (Grok) and Meta (Llama) models. Resolves against a Wayback Machine (web.archive.org) snapshot of the LMArena overall text leaderboard (arena.ai) explicitly dated December 31 2026; if no Dec 31 snapshot exists, the closest captured snapshot on or before Dec 31 2026 is used.
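
The fallback rule above can be sketched as a small selection function. This is a minimal sketch under stated assumptions, not RAOSCAFF's actual tooling: the names `parse_capture` and `pick_resolution_snapshot` and the sample timestamps are hypothetical, the cutoff is assumed to be end of day UTC, and picking the latest capture when several exist on the same day is also an assumption. The only parts taken from the brief are the rule itself (use the Dec 31 2026 capture if one exists, otherwise the closest capture on or before that date) and the 14-digit `YYYYMMDDhhmmss` timestamp form that Wayback Machine capture URLs carry.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

# Resolution cutoff from the brief: Dec 31 2026 (end of day, UTC assumed).
CUTOFF = datetime(2026, 12, 31, 23, 59, 59, tzinfo=timezone.utc)


def parse_capture(ts: str) -> datetime:
    """Parse a 14-digit Wayback-style timestamp (YYYYMMDDhhmmss) as UTC."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def pick_resolution_snapshot(
    captures: Iterable[str], cutoff: datetime = CUTOFF
) -> Optional[str]:
    """Return the latest capture at or before the cutoff, or None if none exists.

    A Dec 31 2026 capture is chosen whenever one exists, because it is the
    latest eligible capture; otherwise the closest earlier capture is used.
    Captures after the cutoff are ignored.
    """
    eligible = [ts for ts in captures if parse_capture(ts) <= cutoff]
    return max(eligible, key=parse_capture) if eligible else None


# Hypothetical capture list: no Dec 31 capture, one capture after the cutoff.
sample = ["20261228101500", "20261230235900", "20270102080000"]
print(pick_resolution_snapshot(sample))  # 20261230235900
```

With the hypothetical list above, the Jan 2 2027 capture is discarded and the Dec 30 capture is selected as the closest on-or-before snapshot.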

Locked on 2026-06-21 — scored against Wayback Machine snapshot of LMArena overall text leaderboard dated Dec 31 2026.

RAOSCAFF locks P-51 on 2026-06-21. Disclosure: AI-assisted brief; Anthropic is an eligible maker. Resolution via Wayback Machine snapshot; if no Dec 31 capture exists, the closest on-or-before capture applies.

Locked
2026-06-21 (commit timestamp on origin/main)
Resolves
2026-12-31 — LMArena overall text leaderboard (arena.ai) via Wayback Machine snapshot dated 2026-12-31
Source
web.archive.org snapshot of arena.ai overall text leaderboard; #1 model maker field
Scored by
Binary: YES if #1 model is from OpenAI, Anthropic, or Google; NO if from any other maker
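
The scoring rule in the list above reduces to a membership test on the maker of the #1 model. The sketch below is illustrative only: the function names and the lowercase maker strings are assumptions (the leaderboard's real maker field may label organizations differently), and the Brier score is shown as one common way to grade a probabilistic binary call, not as a scoring method stated by this brief.

```python
# The three makers named in the YES criterion (normalized spelling is assumed).
ELIGIBLE_MAKERS = {"openai", "anthropic", "google"}

# P(YES) locked on 2026-06-21.
LOCKED_PROBABILITY = 0.84


def resolve(top_maker: str) -> bool:
    """Return True (YES) if the #1 model's maker is one of the eligible labs."""
    return top_maker.strip().lower() in ELIGIBLE_MAKERS


def brier_score(p_yes: float, outcome: bool) -> float:
    """Squared error between the forecast and the 0/1 outcome; lower is better."""
    return (p_yes - (1.0 if outcome else 0.0)) ** 2
```

Under this sketch, a YES resolution at P = 0.84 would give a Brier score of about 0.0256, and a NO resolution about 0.7056.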

If no Dec 31 2026 snapshot exists, the closest available on-or-before Dec 31 2026 Wayback Machine capture is used. Disclosure: these briefs are AI-assisted; Anthropic is listed among eligible makers.