How the ranking works

Full explanation of Rating, Skill, Form, per-domain, SP and CWP.

TL;DR

Rating

Your headline number — blends career results with recent form. 1500 = Open RX baseline. Higher is better.

Per-WOD scoring

Each WOD compares your result (time, reps, weight) to the field at that competition level. Above the median → rating gains.

Per-domain

Every rating splits into Cardio / Strength / Gymnastics based on the movements each WOD emphasises and their typical volume.

Local & global on one scale

Local event results are calibrated against athletes who also compete in the Open — a win at Fittest in Tartu maps to the same scale as a QF.

Form

Same calculation but last 12 months only. Arrow shows whether you’re trending up or down vs your career average.

Score coverage

97% of performances include exact margin data (reps, time, weight). The rest fall back to rank-based ordering.

What the numbers mean

A single ranking of Estonian CrossFit athletes across all competitions and divisions — the CrossFit Open, Quarterfinals, Semifinals, Games, and local Estonian comps (Fittest in Tartu, Tallinn Throwdown, Kõu Hybrid Storm, and more). The main columns:

  • Rating — the default sort. Blends career Skill with recent Form, weighted by how much recent evidence exists. Athletes who haven’t competed lately get pulled toward 1500 (the Open RX baseline). Active athletes with strong recent results rate close to their Form estimate.
  • Skill — career performance rating. Every WOD produces a per-WOD rating based on where you finished relative to the field at that competition level. Career Skill is the decay-weighted average across all your WODs. 1500 = typical Open RX finisher.
  • Cardio / Strength / Gymnastics — the same rating math applied to each domain separately. Every WOD contributes to the three domains in proportion to the movements it contains and their typical volume. A specialist can beat a generalist of equal overall rating at a WOD that favours their strong domain.
  • SP — Season Points for the current calendar year. Sums implied-rating contributions across every event this season. Resets each January. Rewards both quality and volume within a year.
  • Points (CWP) — accumulated volume. Sums prestige × percentile × time-decay across every event. Rewards both quality and participation.

Form is the same formula as Skill but restricted to the last 12 months. On the table it shows as an arrow: up means currently performing above career average; down means below. Team events are excluded from Form — individual contribution in a team is unobservable.

How each WOD is rated

The model is a Bayesian sports rating adapted from disc golf (PDGA) and chess (Elo/Glicko-2) conventions. Each WOD has a scratch rating — the expected rating of a median athlete at that competition level — and a spread. Your per-WOD rating is roughly:

per_wod_rating ≈ scratch + z × spread

z is your score margin, computed from the actual times and reps against the WOD’s field distribution, not from your finish position. A 3-second gap in a 10-minute sprint counts for more than the same gap in a 30-minute chipper, because the model scales by how spread out the field’s scores actually are. Finish position is displayed for context and feeds the rating only in the percentile fallback used when score data is unusable.
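As a rough illustration of the formula above, here is a minimal sketch. It assumes the field's centre is its median score and its scale is the population standard deviation; the real model may use different estimators. The function name, the `spread` value of 100 and the sample fields are all hypothetical.

```python
from statistics import median, pstdev

def per_wod_rating(score, field_scores, scratch, spread, lower_is_better=True):
    """Sketch of per_wod_rating ≈ scratch + z × spread (estimators are assumptions)."""
    centre = median(field_scores)
    scale = pstdev(field_scores)
    if scale == 0:
        # Everyone posted the same score, so there is no margin information.
        return float(scratch)
    z = (score - centre) / scale
    if lower_is_better:
        # For timed WODs a smaller score is a better result.
        z = -z
    return scratch + z * spread

# Hypothetical fields: times in seconds.
sprint_field = [600, 603, 606, 609, 612]         # tightly packed 10-minute sprint
chipper_field = [1700, 1750, 1800, 1850, 1900]   # widely spread 30-minute chipper

# Both athletes beat their field's median by exactly 3 seconds.
sprint_rating = per_wod_rating(603, sprint_field, scratch=1500, spread=100)
chipper_rating = per_wod_rating(1797, chipper_field, scratch=1500, spread=100)
```

Under these assumptions the same 3-second margin is worth far more in the tight sprint field than in the spread-out chipper field.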

Scratch is dynamic. For any WOD with enough rated participants, scratch is the observed median rating of that specific event’s field. A nominally “Scaled” division with strong athletes gets a higher scratch than a nominally “Intermediate” division with weak ones. Division labels are treated as advisory, not authoritative: the measured field quality is what calibrates the WOD.
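A minimal sketch of that rule follows. The text does not say what counts as "enough" rated participants, so the threshold of 8 is a hypothetical value, as are the function name and the nominal fallback.

```python
from statistics import median

MIN_RATED = 8  # hypothetical threshold; the actual value is not stated

def event_scratch(field_ratings, nominal_scratch, min_rated=MIN_RATED):
    """Use the field's median rating when enough participants are rated.

    field_ratings: current ratings of the participants, None for unrated athletes.
    nominal_scratch: fallback scratch implied by the division label.
    """
    rated = [r for r in field_ratings if r is not None]
    if len(rated) >= min_rated:
        return median(rated)
    return nominal_scratch

# A "Scaled" division full of strong athletes calibrates above its label.
strong_scaled = event_scratch([1580, 1600, 1610, 1590, 1620, 1605, 1595, 1600], 1300)
```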

When score data is unavailable or the WOD’s scoring is ambiguous (mixed formats within a division, missing time cap, etc.), the model falls back to a finish-percentile calculation. This is less precise than score-margin but preserves correct ordering.
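One way such a fallback could work is sketched below. It assumes the finish percentile is converted to a z value through the inverse normal CDF, which is an illustrative choice and not confirmed by the text; the function name and the mid-rank percentile convention are likewise assumptions.

```python
from statistics import NormalDist

def percentile_fallback_rating(rank, field_size, scratch, spread):
    """Sketch: rating from finish position only (rank 1 = winner)."""
    # Mid-rank percentile keeps the value strictly between 0 and 1,
    # so the inverse CDF is always defined.
    percentile = 1 - (rank - 0.5) / field_size
    z = NormalDist().inv_cdf(percentile)
    return scratch + z * spread

# Hypothetical field of 5 with scratch 1500 and spread 100.
fallback_ratings = [percentile_fallback_rating(r, 5, 1500, 100) for r in range(1, 6)]
```

The ordering of athletes is preserved, but every field of the same size yields the same set of ratings regardless of how large the actual gaps were.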

Per-domain ratings (Cardio / Strength / Gymnastics)

Every WOD is a mix of movements. Each movement has a domain profile — running is pure cardio, a deadlift is pure strength, a muscle-up is pure gymnastics, a wall-ball sits between cardio and strength, etc. A WOD’s overall profile is the weighted mix of its movements, with the weights based on how many reps a typical programming of that movement contains.

Your rating from each WOD contributes to your three per-domain ratings in proportion to the WOD’s profile. A workout that’s 60% cardio, 30% strength, 10% gymnastics gives 60% of the evidence to cardio, and so on.
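The profile calculation can be sketched as follows. The movement profiles and typical-volume weights in the two tables are hypothetical placeholders chosen to reproduce the 60/30/10 example; the curated values are not given in the text.

```python
# Hypothetical (cardio, strength, gymnastics) shares per movement.
MOVEMENT_PROFILES = {
    "run": (1.0, 0.0, 0.0),
    "deadlift": (0.0, 1.0, 0.0),
    "muscle-up": (0.0, 0.0, 1.0),
    "wall-ball": (0.5, 0.5, 0.0),
}

# Hypothetical typical-volume weights per movement.
TYPICAL_VOLUME = {"run": 60, "deadlift": 30, "muscle-up": 10, "wall-ball": 50}

def wod_profile(movements):
    """Volume-weighted mix of the movements' domain profiles, normalised to sum to 1."""
    totals = [0.0, 0.0, 0.0]
    for movement in movements:
        weight = TYPICAL_VOLUME[movement]
        for i, share in enumerate(MOVEMENT_PROFILES[movement]):
            totals[i] += weight * share
    grand_total = sum(totals)
    return tuple(t / grand_total for t in totals)

cardio, strength, gymnastics = wod_profile(["run", "deadlift", "muscle-up"])
```

With these placeholder numbers the workout comes out as 60% cardio, 30% strength and 10% gymnastics, so its rating would feed the three domains in those proportions.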

The per-domain values start centred on your overall Rating and shift based on domain-specific evidence — so if you consistently outperform your general rating on strength-heavy WODs, your Strength rating rises above your overall while Cardio holds steady.

Movement domain weights and typical rep counts are curated by ensemble review across multiple sources. Not every WOD has a full movement extraction available (older or unstructured descriptions); for those, the rating still contributes to overall Skill but not to per-domain evidence.

Who appears in the ranking

The public ranking shows athletes who are active in the last 3 years and have competed in at least one Estonian local competition in that window. The Open alone doesn’t qualify, because many athletes register without completing the workouts. A local throwdown is the signal that someone is an active Estonian competitor.
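The inclusion rule can be sketched as a simple filter. The record layout (`date`, `is_local`) and the 365-day year are assumptions made for illustration.

```python
from datetime import date, timedelta

def is_listed(results, today, window_years=3):
    """Sketch: listed if at least one local competition falls inside the window.

    results: list of dicts with a 'date' (datetime.date) and an 'is_local' flag.
    """
    cutoff = today - timedelta(days=365 * window_years)
    return any(r["is_local"] and r["date"] >= cutoff for r in results)

today = date(2025, 6, 1)
open_only = [{"date": date(2025, 3, 1), "is_local": False}]
local_recent = open_only + [{"date": date(2023, 9, 1), "is_local": True}]
local_stale = open_only + [{"date": date(2020, 1, 1), "is_local": True}]
```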

Athletes with sparse data (below a minimum evidence threshold) are flagged as Provisional (marked P). They appear in the list; the homepage filter lets you hide them for a tighter view.

Time decay

Every event is weighted by exponential decay with a 2-year half-life, cut to zero at 5 years. There is no short hard cutoff: Estonian athletes typically have 1–2 events per year, so a one-season window would erase rankings every off-season. Windowed or decayed weighting is standard practice in peer sports-rating systems (OWGR uses 2 years, PDGA uses 12 months).

Event age    Weight
0 years      100%
1 year       ~71%
2 years      50%
4 years      25%
5+ years     0%
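The decay schedule can be reproduced with a few lines of code. The weight function follows the stated 2-year half-life and 5-year cutoff; the `decayed_average` helper is a sketch of how a decay-weighted average such as career Skill might be formed, and its input layout is an assumption.

```python
def decay_weight(age_years, half_life=2.0, cutoff=5.0):
    """Exponential decay with a 2-year half-life, zero from 5 years onward."""
    if age_years >= cutoff:
        return 0.0
    return 0.5 ** (age_years / half_life)

def decayed_average(results):
    """Decay-weighted mean of (rating, age_years) pairs; None if nothing counts."""
    weights = [decay_weight(age) for _, age in results]
    total = sum(weights)
    if total == 0:
        return None
    return sum(rating * w for (rating, _), w in zip(results, weights)) / total

# Hypothetical career: a strong recent result and a weaker one from 4 years ago.
skill = decayed_average([(1600, 0.0), (1400, 4.0)])
```

Here the recent result carries four times the weight of the old one, so the average lands at 1560 instead of the unweighted 1500.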

Season Points (SP)

SP is an accumulation metric for the current calendar year (Jan 1 – Dec 31). It uses the same implied rating as Skill, but instead of a career average it sums contributions across every event this season:

event_contribution = max(0, avg_implied_rating_this_event − baseline)
SP = Σ event_contributions

Only results above the local Scaled scratch line add points — scoring below that at every event contributes zero. Competing in more events at higher levels compounds: a QF athlete who also does five local comps will outscore someone who only does the Open.

Unlike CWP, SP has no time decay or stage prestige multiplier — every event this year counts equally. It resets to zero each January, so it reflects current-season activity rather than career history.
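The SP formula translates directly into code. The baseline of 1300 used below is a hypothetical stand-in for the local Scaled scratch line, whose actual value is not given here.

```python
def season_points(event_avg_ratings, baseline):
    """SP = sum over this season's events of max(0, avg_implied_rating - baseline)."""
    return sum(max(0.0, rating - baseline) for rating in event_avg_ratings)

BASELINE = 1300  # hypothetical value for the local Scaled scratch line

# Three events this season; the middle one falls below the baseline and adds nothing.
sp = season_points([1500, 1250, 1400], BASELINE)
```

Because below-baseline events contribute zero and never subtract, entering an extra event can only keep SP the same or raise it.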

Points (CWP)

CWP mirrors CrossFit’s official Worldwide Ranking stage maxima (Open 1k / QF 2k / SF 4k / Games 10k). Per event: stage_max × percentile × time_decay. Your CWP is the mean of your best 8 event scores, with a minimum divisor of 5 — so one lucky result doesn’t inflate a sparse athlete, and competing more doesn’t pad results once you have 8 events.
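A minimal sketch of the CWP calculation, using the stage maxima quoted above. The stage keys and the tuple layout are assumptions, and the stage maximum for local competitions is not stated, so local events are left out of the table.

```python
# Stage maxima as quoted in the text; the keys are hypothetical labels.
STAGE_MAX = {"open": 1000, "qf": 2000, "sf": 4000, "games": 10000}

def cwp(events, best_n=8, min_divisor=5):
    """Mean of the best `best_n` event scores, dividing by at least `min_divisor`.

    events: list of (stage, percentile, time_decay) tuples,
            with percentile and time_decay in the range 0..1.
    """
    scores = sorted(
        (STAGE_MAX[stage] * percentile * decay for stage, percentile, decay in events),
        reverse=True,
    )[:best_n]
    return sum(scores) / max(len(scores), min_divisor)

# One strong Open result alone is divided by 5, not by 1.
single_result = cwp([("open", 0.9, 1.0)])
# Ten perfect Open results: only the best 8 count, so the mean stays at 1000.
many_results = cwp([("open", 1.0, 1.0)] * 10)
```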

Data quality & recovery

The ranking is only as good as the data. Every event we ingest goes through a completeness gate — for every athlete who finished the competition, we verify their score is present on every WOD they attempted. Events that fail this gate are flagged for manual review before their scores contribute to ratings.
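The completeness gate could look roughly like the sketch below. The data layout (a set of finishers, a mapping of attempted WODs, and a score lookup keyed by athlete and WOD) is assumed for illustration and is not the actual ingestion schema.

```python
def completeness_gaps(finishers, attempted, scores):
    """Return sorted (athlete, wod) pairs where a finisher's score is missing.

    finishers: set of athlete ids who finished the competition.
    attempted: dict mapping athlete id -> set of WOD ids they attempted.
    scores:    dict mapping (athlete id, WOD id) -> recorded score, or None.
    """
    return sorted(
        (athlete, wod)
        for athlete in finishers
        for wod in attempted.get(athlete, ())
        if scores.get((athlete, wod)) is None
    )

def passes_completeness_gate(finishers, attempted, scores):
    """An event passes only when no finisher has a missing score."""
    return not completeness_gaps(finishers, attempted, scores)

# Hypothetical event: athlete "a" is missing a score on "w2".
finishers = {"a", "b"}
attempted = {"a": {"w1", "w2"}, "b": {"w1"}}
scores = {("a", "w1"): "5:30", ("b", "w1"): "6:10"}
gaps = completeness_gaps(finishers, attempted, scores)
```

Returning the list of gaps, not just a pass/fail flag, gives a manual reviewer the exact rows to check.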

Historical data from before this gate existed has been audited row-by-row: hundreds of thousands of scores that were previously filtered out due to formatting inconsistencies (mixed-metric events, capped-time formats, non-English descriptions, etc.) have been reclassified and reprocessed where the underlying data was valid.

We also detect and handle divisions where finishers and non-finishers submit different score types (e.g. finish times vs rep counts at the time cap). These get percentile-based ratings instead of score-margin, which preserves correct ordering when raw score types can’t be reconciled.
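A sketch of how that detection might be expressed, assuming each submitted score carries a type label such as "time" or "reps". The labels and the function name are hypothetical.

```python
def division_scoring_mode(score_types):
    """Choose the rating method for one division of one WOD.

    score_types: iterable of per-athlete score type labels (None when missing).
    Returns "score_margin" when every present score shares one type,
    otherwise "percentile".
    """
    kinds = {t for t in score_types if t is not None}
    return "score_margin" if len(kinds) == 1 else "percentile"

# Finishers logged times, capped athletes logged reps: margins are not comparable.
mixed = division_scoring_mode(["time", "time", "reps", "reps"])
uniform = division_scoring_mode(["time", "time", None, "time"])
```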

Known limitations

  • Movement extraction coverage. Most WODs have their movements successfully extracted from their descriptions; the rest (empty descriptions, PDF-linked scoresheets, or non-English narrative) still contribute to overall Skill but not to per-domain evidence. See /stats for the current number.
  • Regressive vs walk-forward accuracy. When we test the model against past events, some of that accuracy comes from the fact that those same events are in the training data. A walk-forward test (predicting future events from past-only data) is a stricter measure and gives lower but more honest numbers. Both are reported internally.
  • Body-size normalisation is limited. Weight and height data comes from athletes’ own opt-in on their public CF profile, which caps coverage well below 100%. Ratings therefore don’t currently normalise for bodyweight the way a strict lifting rating would.
  • Team ratings are separate. Ratings from team events feed a different framework because individual contribution inside a team score is unobservable. The main Rating column is individual-only.
  • Time decay excludes inactive athletes. No results in 5 years = zero contribution. A future Hall of Fame view may preserve historical recognition.

Source data

All inputs are public competition leaderboards from three platforms:

  • CrossFit Games (games.crossfit.com) — Open, Quarterfinals, Semifinals, Games. Each athlete links to their CF Games profile at games.crossfit.com/athlete/{id}.
  • Circle21 (circle21.events) — Estonian and Baltic local competitions: Fittest in Tartu, Kõu Hybrid Storm, Viljandi Throwdown, and others.
  • Competition Corner (competitioncorner.net) — additional local events including Tallinn Throwdown and Nordic Masters League.

See privacy & data for how the data is sourced and how to request removal. Full data coverage →