Evaluating Riders: Log Rank

Evaluating rider performance in professional cycling is a hard problem. While more advanced statistics like climbing times, segment times, survival with leading group, and others are available for certain races and certain riders, for most races and certainly for anything historical we’re left with something like this PCS result table: finishing rank in race, maybe UCI points, PCS points, and time gaps.

So any rider performance statistic has to be based on one of those three data-points: time gaps, points, or finishing rank. Each has its place.

Time gaps are a very poor way to evaluate success in a bunch sprint where 100 riders might finish on the same time, but they can be a good way to evaluate success on a mountain stage with an uphill finish.

PCS points have become a widely used evaluative method. They recognize that success in cycling can be achieved in a wide array of competitions (GC, race wins, jersey competitions) and use dozens of different scales for races of different quality, but fundamentally those point scales are opinions on the value of different results relative to each other.

Finally, finishing rank is often used to count victories, podiums, or top 10 finishes across the season, but it is plagued by vastly different difficulty levels to achieve certain results (how good is 3rd in a World Tour race relative to 1st in a .1 race?). Ranks are also notoriously difficult to average; Wout Van Aert’s transcendent 2021 Tour de France yielded an average rank of 25th despite 3 stage wins, which placed him just behind Enric Mas, who finished 6th on GC with nary a stage podium finish.

In recent months, I’ve developed my own tweaks to use finishing rank as an evaluative method, producing a stat I’m calling Log Rank. The handful of keys to make it work are:

1. All finishing ranks in a race are transformed by taking the natural logarithm. This produces a value system where the difference between finishing 1st and 5th is large, while the difference between finishing 50th and 100th is not as large. The red dots below show equal gaps between results, so 1st and 3rd are separated by about as much as 3rd and 7th/8th. Likewise, 1st and 7th/8th are separated by about as much as 7th/8th and 55th. I think this is a fairly intuitive appraisal of the value of different finishing positions.

2. Using these transformed ranks, taking averages is much easier. For example, Wout Van Aert’s final week of the Tour de France, where he finished 25th, 40th, 36th, 43rd, 1st, 1st (average 24th), is transformed into 3.2, 3.7, 3.6, 3.8, 0, 0 (average log rank of 2.4), which can be re-transformed back into an average rank of 11th (by taking e^x where x = average log rank). Basically, this says we care way more about Van Aert’s two victories than the fact he finished outside the top 20 in those other races. In fact, he could have finished 50th in those four stages (new raw average of 34th), and his re-transformed log rank would only move to about 13.6, between 13th and 14th.

3. The difficulty of different races is found by an objective system which looks at how difficult it is to achieve certain results in races of different levels. For example, in recent seasons it has been roughly as difficult to achieve a 10th place in a U23 2.2 level race as a 27th place in a World Tour race. Using a host of these types of comparisons, I’ve created a Strength of Peloton rating system to judge all levels of races against each other based on the difficulty of achieving certain levels of results. All that needs to be said here is that results are adjusted based on what type of races they were achieved in. For example, Ethan Hayter and Tadej Pogacar achieved very similar raw finishing ranks in 2021, but Pogacar did so against the 4th toughest pelotons and Hayter only around the 600th toughest.
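Steps 1 and 2 above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the final-week finishes quoted in step 2; the function names are my own for this example, and the Strength of Peloton adjustment from step 3 is not included.

```python
import math

def average_log_rank(ranks):
    """Mean of the natural logs of the finishing ranks (1st -> 0.0)."""
    return sum(math.log(r) for r in ranks) / len(ranks)

def back_transformed_rank(ranks):
    """e^(average log rank), which is the geometric mean of the ranks."""
    return math.exp(average_log_rank(ranks))

# Van Aert's final-week finishes quoted in step 2
final_week = [25, 40, 36, 43, 1, 1]

raw_mean = sum(final_week) / len(final_week)
avg_log = average_log_rank(final_week)
back = back_transformed_rank(final_week)

print(f"raw average rank:      {raw_mean:.1f}")  # 24.3
print(f"average log rank:      {avg_log:.1f}")   # 2.4
print(f"back-transformed rank: {back:.1f}")      # 10.8
```

Because the back-transformed value is a geometric mean, a couple of wins pull it down far more than a few anonymous finishes push it up, which is the behaviour described in step 2.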

2021 Log Rank Rankings

Applying those three steps yields the following top 10 for all 2021 results, just averaging all race results (ignoring time trials):

Rider                   Average Log Rank
Wout Van Aert           4.3
Tadej Pogacar           4.9
Mathieu Van Der Poel    5.0
Primoz Roglic           6.9
Sam Bennett             8.3
Sonny Colbrelli         9.2
Ethan Hayter            10.0
Jasper Philipsen        10.2
David Gaudu             10.4
Julian Alaphilippe      11.6

Building on Log Rank

The next challenge was to build on this basic Log Rank by adding parcours-level effects such as climbing difficulty and whether the race ended in a bunch sprint. For example, Enric Mas raced 66 times on the road in non-time trials in 2021. If we’re judging how good a rider he is, we probably don’t care about where he finished in the flatter stages which littered the Tour de France and Vuelta a Espana. However, we care a lot about how he performed in the tougher climbing stages of those races and others.

To find the impact of climbing difficulty and a bunch sprint finish, I set up a mixed effects model which can be run over results from a given period of time (e.g., July 2019 to June 2021 to predict performance going into the 2021 Tour de France). The model was specified using three random effects for individual riders, attempting to find a) their general ability to achieve a good finishing rank in races, b) the impact of climbing difficulty on their finishes, and c) the impact of the race ending in a bunch sprint on their finishes.

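library(lme4)  # provides lmer()
# `results` is a placeholder name for the data frame of race results
lmer(log_rnk ~ (1 + climb_difficulty | rider) +
 (0 + bunch_sprint | rider), data = results)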

Using this model, we would expect a sprinter like Sam Bennett, who struggles in the hills and mountains but generally ranks highly in terms of finishing rank, to have a smaller individual coefficient (indicating that he generally achieves high finishes), a larger climbing difficulty coefficient (indicating that as races get tougher in terms of climbing his finishing rank gets larger/worse), and a negative bunch sprint coefficient (indicating that he finishes with better ranks when the race ends in a bunch sprint rather than a smaller group).

The model results for July 2019 to June 2021 show Bennett with about the 50th best general ability to finish highly (a above), the 20th worst impact of climbing difficulty (b above), and the 2nd best bunch sprint impact (c above). Overall, he would be expected to finish with an average rank of 3.7 in a flat, bunch sprint race, 2nd best in the world behind Wout Van Aert (3.1) and just ahead of Caleb Ewan (3.8).
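As a rough sketch of how three per-rider coefficients like these could turn into a predicted rank for a given type of race: add them up on the log scale, then take e^x. The coefficient values below are made up for illustration and are not fitted values from the model. The `RiderEffects` and `predicted_rank` names, and the scale used for climbing difficulty, are also assumptions for this example. The intercept is assumed to already include the model's overall intercept.

```python
import math
from dataclasses import dataclass

@dataclass
class RiderEffects:
    intercept: float     # (a) general level, on the log-rank scale
    climb_slope: float   # (b) change in log rank per unit of climbing difficulty
    sprint_shift: float  # (c) change in log rank when the race ends in a bunch sprint

def predicted_rank(effects, climb_difficulty, bunch_sprint):
    """Combine the effects on the log scale, then back-transform with e^x."""
    log_rank = (effects.intercept
                + effects.climb_slope * climb_difficulty
                + effects.sprint_shift * (1 if bunch_sprint else 0))
    return math.exp(log_rank)

# Hypothetical sprinter: made-up numbers, not output from the fitted model.
sprinter = RiderEffects(intercept=2.5, climb_slope=0.15, sprint_shift=-1.0)

flat_sprint = predicted_rank(sprinter, climb_difficulty=0.0, bunch_sprint=True)
mountain = predicted_rank(sprinter, climb_difficulty=10.0, bunch_sprint=False)

print(f"flat bunch sprint: {flat_sprint:.1f}")
print(f"mountain stage:    {mountain:.1f}")
```

With a positive climbing slope and a negative bunch sprint shift, the predicted rank is best on a flat sprint day and gets worse as the climbing difficulty goes up, which matches the pattern described for a sprinter.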

We can similarly look at hilly races not ending in bunch sprints (the prototypical classics race), where Mathieu Van Der Poel had the best prediction at that time at 6.8, essentially tied with Wout Van Aert and ahead of Roglic, Pogacar, Van Avermaet, and Alaphilippe.

The top predictions in high mountain races were unsurprisingly the three main recent grand tour winners: Pogacar, Roglic, and Bernal. They were followed by Mikel Landa and Adam Yates.
