Survival Probability (2020 TDF)

In recent years the Tour de France has added a live tracking feature to its online/second-screen coverage of the race. This telemetry data shows the position of every rider on the course (absent any errors, malfunctions, or bike changes) throughout the race, along with their speed, the road conditions, and the wind conditions.

So far this has largely been exploited only as a social media activation tool for NTT (e.g., on Twitter @letourdata). But knowing the position and speed of every rider is obviously powerful information. For example, who was pulling in the lead group to try to extend the gap on stage 7 of this year’s Tour de France? How large was the group at the bottom of each final climb? How much time did Zakarin lose to the leaders on the descents of stage 8? Which Jumbo-Visma domestique drove the pace the hardest on the climbs?

Leveraging this data, I’ve analyzed ten of the hilly or mountainous stages of this year’s Tour de France to look at the probability of staying with the front group over the course of each stage. I define the front group as the group containing Primoz Roglic, as he was in yellow for the lion’s share of these stages. I’ve decided to ignore riders who spent the stage in the breakaway, but anyone who attacked away from Roglic (e.g., Pogacar on stage 8) counts as surviving as well.
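To make that definition concrete, here is a minimal Python sketch of how a survival share per rider type could be computed at each checkpoint. It is not the code behind the charts. The `Observation` record, its field names, and the 10-second `max_gap_s` threshold for counting as "with the group" are all assumptions made for illustration, and the sample rows are invented placeholders rather than real telemetry. Dropping breakaway riders from the denominator is also just one reading of "ignore".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One rider at one checkpoint. All field names are hypothetical."""
    rider: str
    rider_type: str
    km_to_go: float
    gap_to_reference_s: float  # seconds behind the reference rider; negative = ahead
    in_breakaway: bool = False


def survival_by_type(observations, max_gap_s=10.0):
    """Return {km_to_go: {rider_type: share of riders with or ahead of the reference}}.

    max_gap_s is an assumed tolerance for "in the same group"; the post does
    not state which threshold was used.
    """
    # counts[km][rider_type] = [survivors, riders counted]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for obs in observations:
        if obs.in_breakaway:
            continue  # breakaway riders are ignored entirely (assumed: not in the denominator)
        cell = counts[obs.km_to_go][obs.rider_type]
        cell[1] += 1
        # Riders ahead of the reference (negative gap) count as surviving too.
        if obs.gap_to_reference_s <= max_gap_s:
            cell[0] += 1
    return {
        km: {rtype: survived / total for rtype, (survived, total) in by_type.items()}
        for km, by_type in counts.items()
    }


# Invented placeholder rows, not real race data.
sample = [
    Observation("climber_1", "Climber", 11.0, 0.0),
    Observation("climber_2", "Climber", 11.0, -15.0),   # attacked away from the reference
    Observation("climber_3", "Climber", 11.0, 95.0),    # distanced
    Observation("sprinter_1", "Sprinter", 11.0, 600.0),
    Observation("domestique_1", "Domestique", 11.0, -120.0, in_breakaway=True),
]
result = survival_by_type(sample)
print(result)
```

Because the share is recomputed independently at every checkpoint, a rider who is dropped on one climb and rejoins on the descent counts as surviving again, which matches the way the peloton regrouped between climbs on some stages.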

Survival Probability for Notable Stages

Survival probability with Roglic by rider type (Stage 8 2020 TDF)

Stage 8 was a short 141 km stage with three major climbs: Col de Mente at 82 km from the finish, Port de Bales at 37 km from the finish, and Col de Peyresourde at 11 km from the finish. Col de Mente did little to shake up the peloton, and almost all riders had come back together by the bottom of the Port de Bales, a 12 km HC climb. That was where the major selection on the stage came: by the end of the climb, fewer than 20% of the riders in every rider type except Climbers were still with the GC group. About 60% of Climbers survived Port de Bales with the GC group.

The selection among climbers came largely on the Peyresourde: about 30% of climbers were still with Roglic at the end of that climb, and nine riders from the GC group finished on the same time.

Survival probability with Roglic by rider type (Stage 4 2020 TDF)

Compare that with Stage 4, which was not particularly selective before the final climb: a handful of category 3 and 4 climbs led up to the 1st category climb to Orcieres-Merlette. At the end of the final warm-up climb, with 20 km to go, at least 50% of domestiques and sprint train riders were still there, along with upwards of 75% of puncheurs, mountain helpers, and climbers. The non-climbers were distanced quickly on the final climb, but it wasn’t until the final few kilometers of the stage that the selection was made among climbers. Even then, over 60% of them came to the line with Roglic, with 16 riders finishing on the same time.

Survival probability with Roglic by rider type (Stage 17 2020 TDF)

Stage 17 had two HC climbs: the Col de la Madeleine, whose summit came with 64 km left, and the Col de la Loze, where the race finished. The major selection came very early, on the Madeleine, where only half of the climbers were left in the front group with 5 km still to climb. Riders were then steadily distanced on the Col de la Loze until the leaders came over the line with massive time gaps. The first six riders came in alone, and the top 20 riders finished in 17 different groups.

Most Selective Climbs

The four most selective climbs for the Climber rider type were Col de la Loze on Stage 17 (50% of climbers left at the beginning of the climb vs 12% at the end), Montee du plateau des Glieres on Stage 18 (72% at the beginning and 22% at the end), Col de Peyresourde on Stage 8 (54% at the beginning and 27% at the end), and Col de la Madeleine on Stage 17 (100% at the beginning and 50% at the end).
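One way to compare these climbs is the share of climbers present at the foot who were still there at the top, i.e. the end percentage divided by the start percentage. The short Python sketch below applies that ratio to the four pairs of figures quoted above. The `retention` helper is my own naming, and treating this ratio as the measure of selectivity is an assumption, since the ranking method is not spelled out here.

```python
# Percent of Climber-type riders with the GC group at the beginning and end
# of each climb, taken from the figures quoted in the text.
climbs = {
    "Col de la Loze (Stage 17)": (50, 12),
    "Montee du plateau des Glieres (Stage 18)": (72, 22),
    "Col de Peyresourde (Stage 8)": (54, 27),
    "Col de la Madeleine (Stage 17)": (100, 50),
}


def retention(start_pct, end_pct):
    """Share of the riders present at the start of a climb who are left at its end."""
    return end_pct / start_pct


# Lower retention means a more selective climb (assumed definition).
ranked = sorted(climbs.items(), key=lambda item: retention(*item[1]))

for name, (start_pct, end_pct) in ranked:
    print(f"{name}: {retention(start_pct, end_pct):.0%} of climbers at the foot reached the top in the group")
```

Under this measure the Peyresourde and the Madeleine tie, each halving the group of climbers, while the Loze kept roughly a quarter and the Glieres roughly 30%. Ranking by the drop in percentage points instead would give a different order, so the choice of measure matters.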

For the full peloton, Col de la Loze (Stage 17) was the most selective overall, cutting the peloton down to less than a sixth of its size before the climb. The Madeleine (Stage 17), Port de Bales (Stage 8 penultimate climb), and Glieres (Stage 18) were the next most selective – each reducing the peloton to a fifth of its prior size.

With enough data it would be interesting to tease out the factors that matter most in making a climb selective. Is it the length, the gradient, a combination of the two, or the position in the stage? Based on this limited sample of 50+ climbs, the two most important factors are the length in kilometers (long climbs are more selective) and the overall difficulty in terms of vertical gain (gradient * length). The difference in length and vertical gain between a typical HC climb like the Col de la Madeleine and a 1st category climb like the one to Orcieres-Merlette is about five times more important than the difference in position between a climb 100 km from the finish and a summit finish. However, that is a very weak claim with only a dozen stages’ worth of data.

Other Survival Probabilities

Stage 6 2020 TDF
Stage 9 2020 TDF
Stage 12 2020 TDF
Stage 13 2020 TDF
Stage 14 2020 TDF
Stage 15 2020 TDF
Stage 16 2020 TDF
Stage 18 2020 TDF
