Relative Power Output by Stage Characteristics

In my last post I dug into the Tour de France power data shared by @Velofacts, adding to his analysis by breaking down each rider’s power output relative to his own baseline. In other words, instead of judging the power output of Thomas De Gendt against other riders who have different skill sets, judge it relative to his own level. Some of the key findings were: 1) the flat sprint stages saw significantly lower relative power output than the big mountain days, 2) tough days where the peloton pushed hard, like stage 7, saw power output comparable to the mountain days, and 3) riders saw peak power output when they were in the break.

This post will take that analysis further and determine what stage characteristics lead to high or low relative power outputs across pro races. To do that, I’ve collected nearly 10,000 individual stages linked to specific races for pro riders across the World Tour and continental tours for 2019 and 2020. This data-set includes 292 unique riders, with 98% of the data coming from riders with at least 10 races (the minimum for inclusion in the modelling below). The average normalized power for this sample was 278 watts (4.09 watts/kg), with the 10th to 90th percentile range running from 232 to 321 watts (3.49 to 4.67 watts/kg).

Model Creation

I separated 2019 from 2020, with 2019 acting as the training set and 2020 as the test set, so we can judge how predictive the model is on data it has not seen. I built two models: a simple linear regression with easy-to-interpret effects, and a gradient-boosted tree model (xgboost), which should in theory perform better at the cost of interpretability.

I linked the race level power files to my existing data-set of stage results, which includes variables like whether a race was a time trial, one day race, and/or grand tour, the climbing difficulty of the stage, whether the stage ended with an uphill finish, and the class of race (World Tour or lower levels), as well as each rider’s finishing position.

To build the linear model, I included:

one_day_race, time_trial, length of stage (km), natural log of finish position, climb_difficulty of stage, and rider_DNF (did not finish race). I also included an interaction between log finish position and climb difficulty with the idea that there is probably a larger difference in power output by finish position on tougher stages.

The model was built to predict the relative power output on the stage calculated in the form of: eg, 300 watts on stage / 285 watts on average = 1.053 relative power output.
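As a quick illustration of that target variable, here is a minimal Python sketch. It assumes the rider’s baseline is a plain mean of normalized power over his stages in the sample; the function name and the three-stage history are made up for the example, chosen so the baseline is the 285 watts used above.

```python
from statistics import mean

def relative_power(stage_watts, rider_stage_watts):
    """Stage normalized power divided by the rider's own average normalized power."""
    # Assumption: the baseline is a plain mean over the rider's stages.
    baseline = mean(rider_stage_watts)
    return stage_watts / baseline

# Made-up history for one rider that averages 285 watts, as in the example above.
history = [270, 285, 300]
ratio = relative_power(300, history)
print(round(ratio, 3))
```

A value above 1 means the stage was harder than the rider’s norm; below 1, easier.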

The linear model achieved an in-sample R^2 of 0.25 with a standard error of 0.10. Obviously predicting power output is a high variance task. Five of the seven variables were significant at the p < 0.01 level; rider DNF was not significant, and the finish position/climb difficulty interaction was significant at the p < 0.05 level.

The coefficients were:

Variable                                           Coefficient   SE
Intercept                                          1.096         0.01
Natural log of finish position (1)                 -0.014        0.002
climb_difficulty (2)                               0.007         0.001
time_trial                                         0.138         0.009
one_day_race                                       0.055         0.003
length in km                                       -0.0005       <0.001
Natural log of finish position * climb_difficulty  -0.0005       <0.001
Rider DNF                                          -0.009        0.01

R^2 = 0.25, SE = 0.10

(1): Actually natural log of rank + 1 to allow for interaction term as LN(1) = 0

(2): Climbing difficulty is judged on a scale starting at 0 where the toughest mountain stages are around 30. Classic races like Flanders and Strade Bianche come in at 4-5, hillier races like Liege-Bastogne-Liege and Fleche Wallonne at 8-10, grand tour mountain stages typically start at 12 and up.

Practical Impacts

A rider finishing 1st on a mountain stage (climb difficulty = 15) will be estimated to have 9.4% higher power output than the same rider finishing 150th on that mountain stage. On a flat stage, the 1st place rider will have about 6.3% higher power output than the same rider finishing 150th.
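To make the arithmetic behind those gaps concrete, here is a rough Python sketch that plugs the rounded coefficients from the table into the linear model. It assumes footnote (1) means ln(rank + 1); the function names and the 180 km default stage length are mine, not part of the original model code. With rounded coefficients the 1st vs 150th gap comes out at roughly 0.06 on a flat stage and 0.09 at climb difficulty 15, in the same range as the figures quoted above; the small mismatch is presumably down to coefficient rounding and to whether the gap is expressed as a difference or a ratio.

```python
import math

# Rounded coefficients copied from the table above.
COEF = {
    "intercept": 1.096,
    "log_finish": -0.014,
    "climb_difficulty": 0.007,
    "time_trial": 0.138,
    "one_day_race": 0.055,
    "length_km": -0.0005,
    "log_finish_x_climb": -0.0005,
    "rider_dnf": -0.009,
}

def predict_relative_power(finish_pos, climb_difficulty, length_km,
                           time_trial=False, one_day_race=False, rider_dnf=False):
    """Predicted relative power output from the linear model."""
    # Assumption: footnote (1) is read as ln(rank + 1).
    log_finish = math.log(finish_pos + 1)
    return (COEF["intercept"]
            + COEF["log_finish"] * log_finish
            + COEF["climb_difficulty"] * climb_difficulty
            + COEF["time_trial"] * time_trial
            + COEF["one_day_race"] * one_day_race
            + COEF["length_km"] * length_km
            + COEF["log_finish_x_climb"] * log_finish * climb_difficulty
            + COEF["rider_dnf"] * rider_dnf)

def finish_gap(climb_difficulty, length_km=180.0):
    """Predicted relative power of the winner minus that of the 150th finisher."""
    first = predict_relative_power(1, climb_difficulty, length_km)
    last = predict_relative_power(150, climb_difficulty, length_km)
    return first - last

flat_gap = finish_gap(0)
mountain_gap = finish_gap(15)
print(round(flat_gap, 3), round(mountain_gap, 3))
```

The stage length cancels out of the gap, so the 180 km default only matters for the absolute predictions.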

One day races are raced with 5.5% higher relative power – which matches the findings of van Erp and Sanders that one day races are ridden at a higher intensity than stage races.

Time trials obviously have much higher normalized power as they are much shorter races: in this case, 14% higher relative power. Relatedly, stage length plays a small role, with shorter stages producing greater power output. As time trials are shorter, much of this impact comes from time trials, but shorter stages like Stage 20 of the 2019 Tour de France also have much higher normalized power than longer stages.

Testing

Testing this model on the 2020 data shows similar out-of-sample fit – R^2 of 0.23 and a standard error of 0.10. Again, predicting relative power at the stage level is a high variance endeavor!

The highest predicted power output in the test set (>2750 races in 2020) was Thomas De Gendt’s stage 20 time trial in the 2020 Tour de France, which was predicted at 121.9% of his average normalized power. The La Planche des Belles Filles time trial had almost all of the elements of a high power output stage: short, a time trial, with a lot of climbing. De Gendt finished 20th, so our predicted power for the higher finishing riders would have been even higher. De Gendt actually recorded 135% of his average normalized power!

The highest road race prediction was Pierre Latour at Mont Ventoux Challenge – a one day race with two significant climbs where Latour finished 4th. The prediction was 118.5% of Latour’s average normalized power, but he only produced 105%. The residuals for ten riders with power files from that race showed it as the 3rd largest negative difference between predicted and actual power – indicating the race required much less power than predicted by the model.

The flip-side of that was Stage 7 of the Tour de France where Bora attempted to make the race extremely difficult to shed Peter Sagan’s sprint rivals. Later the stage exploded in the crosswinds. Overall, it ranked as the 6th largest positive difference between predicted and actual power. Parcours and race type play a significant role in power output in a race, but how the race is ridden is a huge factor.

Gradient Boosted Model

Gradient boosted models build hundreds or thousands of small decision trees in sequence, each one fit to the errors left by the trees before it, to learn which variables are most significant and derive predictions. (A random forest, by contrast, averages many independent trees.) In this case, I used the same training and testing data and the same variables with the xgboost package in R.

Optimizing for root mean square error gave me an error of 0.088 on the training data and 0.099 on the testing data. Out of sample, the R^2 was 0.24 – not much improved on the linear model. Based on that lack of significant improvement from the boosted model, it makes sense to rely on the more easily interpreted linear model.
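For readers who want to reproduce the two fit metrics, here is a small Python sketch of root mean square error and out-of-sample R^2. It assumes R^2 is computed as 1 minus the ratio of residual to total sum of squares on the test set, which may not be exactly how the figures above were produced; the four data points are invented purely to exercise the functions.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(actual, predicted):
    """1 - SS_res / SS_tot, using the mean of the observed values as the baseline."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Invented relative power values for four stages, for illustration only.
actual = [1.05, 0.92, 1.10, 0.98]
predicted = [1.02, 0.95, 1.04, 1.00]
print(round(rmse(actual, predicted), 3), round(r_squared(actual, predicted), 2))
```

A model that always predicts the test-set mean scores an R^2 of 0 under this definition, which is a useful sanity check for numbers like 0.23 and 0.24.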

Predictions

To end, here are the top 10 over-estimated and under-estimated stages by the model for 2020. As mentioned, the Mont Ventoux Challenge was one of the most over-estimated in power output alongside four of the flatter Tour de France stages, the pan flat Milano-Torino race, a flat stage from Tirreno-Adriatico, a Tour of Portugal time trial, and – surprisingly – Milano-Sanremo.

On the under-estimated side, there are a handful of minor French and Spanish stage races from February along with the World Championship road race, a BinckBank Tour stage where Mathieu van der Poel won from 60km out, and the aforementioned stage 7 of the Tour de France.

Intensity in Tour de France 2020

Twitter user Velofacts does great work compiling and sharing power data from pro riders on Strava. He’s collected stage level normalized power data from about 60 Tour de France riders and what looks to be nearly 1000 different stages. He’s analyzed the data by calculating watts/kg which shows FDJ domestique Sebastien Reichenbach as the rider who has generated the most watts/kg at just under 4.0 average over the race. He has also calculated stage level averages which shows the peak power output stage was stage 9 (306 watts) and that mountain stages were generally raced at 275-300 watts (with Bora’s demolition of the peloton in stage 7 and ensuing splits in crosswinds coming in as the 5th toughest stage).

Related to my recent post on measuring intensity using rating of perceived exertion and power data, we can utilize this data to calculate a rider’s relative power output across the race. Van Erp and Sanders found that power output in grand tours is, on average, no higher than in even lower level one day races; this can be explained by riders pacing themselves across the race, mixing big efforts with lesser ones. Is this obvious in the data?

Relative power outputs in 2020 Tour de France

Of Velofacts’s 60-some riders, I’ve chosen 33 of the most interesting who made an impact on the Tour and calculated their relative power output (each stage divided by their race average). I’ve also classified them roughly into three groups to show what role they played in the race: guys like Dries Devenyns, Roger Kluge, and Tim Declercq were ‘Workers’; Sepp Kuss, Harold Tejada, and Reichenbach were ‘Climbers’; and Simon Geschke, Quentin Pacher, and Carlos Verona were ‘Breakaway’.

Relative power output averages by rider type in 2020 Tour de France (data from @Velofacts)

You can see the whole peloton got a break on stages 3, 5, 11, and 21 with all of the groups having much lower average power outputs. Climbers posted their peak relative power output in Stage 9 at 16% higher than their Tour average. Breakaway men peaked in Stages 9 and 16. Workers peaked in relative power on the difficult mountain days between stages 16-18, but also had to generate equal effort in Stage 7. Interestingly, the workers had less variable power outputs overall with a standard deviation of 7% vs 9-10% for the other two groups of riders.

Selected riders with relative power output by stage (data from @Velofacts)

Some interesting items above:

  1. The days riders were in breakaways are obvious – especially the ones in the first half of the race on easier days. Pacher’s stage 4 breakaway was 16% higher than his average and Ladagnous’s in stage 11 was equal to his race average, but about 15% higher than the average for other riders!
  2. Similarly, we have three of the top 5 on Stage 16 in this data-set – stage winner Kämna was +19%, Geschke +21%, and Reichenbach +19%. Those are three of the top 6 relative efforts in the entire race.
  3. The upper range in terms of efforts looks like about +20%. Harold Tejada and Sepp Kuss both hit those figures in Stage 9, while Geschke and Pacher did in Stage 16. The standard deviation across all 619 stages I analyzed is about 10%.
  4. The three riders who varied the most between stages were Neilson Powless, Simon Geschke, and Sepp Kuss. Powless and Geschke were both involved in several large breakaways, racking up the 3rd and 10th most kilometers in breakaways according to ProCyclingStats. Kuss was Roglic’s top climbing domestique and as such had four efforts 10% or more above his average as well as three of around 20% below his average.
  5. Tim Declercq’s monster effort at the front of the peloton on Stage 10 is obvious. That was Declercq’s peak effort in the race. He spent about 67% of the race in front of the peloton – by far the highest total of the day.

Rating of Perceived Exertion (RPE)

Continuing my exploration of the recent pro cycling analytics papers, today I’m going to dig into three related papers on measuring intensity to monitor fatigue. The goal is to apply these findings to build an intensity metric that can be applied globally to see which riders have experienced higher or lower intensities at a given point in the season.

I will examine:

The datasets here are nine riders from a single cycling team within the 2016 Giro (paper C), twelve riders from a single cycling team within the 2016 Giro and/or 2016 Vuelta (paper A), and twenty riders (presumably from the same team) in a range of World Tour and lower (HC, Level 1) level races (paper B). Paper A also included a baseline training data-set from two weeks prior to each race. The authors gathered power output, heart rate, and perceived exertion data from each race and calculated intensity metrics.

Rating of perceived exertion (RPE) is of particular interest as it provides a data point which is not publicly available in the way that power data (for example from Strava) or riding speed is for many pros. For those unfamiliar, RPE is simply the athlete’s assessment of the difficulty of their workout on a scale of 1-10 where 10 is the most difficult.

The RPE was obtained 30 min after the exercise bout based on the question: “How hard was your workout?”

pg 2 Sanders and Heijboer (2017)

Intensity in Grand Tours

Paper A analyzes the intensity metrics in four groups: a baseline two weeks prior to a grand tour and then week 1, week 2, and week 3 of grand tours. They find the intensity as measured by RPE increases from 3.5 in baseline training to 6.0 in week 1, 7.0 in week 2, and 7.4 in week 3 – where week 3 is significantly different from week 1 (and all three weeks from the baseline). Power output – both mean watts and normalized power in watts – differed significantly in weeks 2 and 3 from week 1.

This matches what we typically see in grand tours where the first week is easier than subsequent weeks. Eg, of eight stages in the 2016 Giro classified as mountain stages by ProCyclingStats only one was in week 1 (stages 1-7), while three were in week 2 (stages 8-14) and four in week 3 (stages 15-21). This was similar to the 2016 Vuelta, where all seven mountain stages came in the final two weeks.

Paper C digs into the differences between different stage types using the same type of data-set just from the 2016 Giro. They divide stages into four types: flat, semi-mountainous, mountainous, and time trials which seem – based on sample size – to largely correspond to the aforementioned PCS categorization. A mountain stage had to either have 35km+ of total climbing and/or a 10km+ finishing climb, while a flat stage could not have more than 13km of climbing and could not end uphill.

RPE by stage types showed flat stages easier at 5.8, semi-mountainous/hilly at 6.5, mountain stages at 7.8, and time trials at 6.8. The gap between mountain and flat stages was significant. Power output also increased significantly between each of the three road stages with mountain > semi-mountainous > flat.

So we have some basic findings:

  • Baseline training leading into a grand tour (and presumably in taper mode) is about a 3.5 on RPE
  • Flat stages – as are typical in the first weeks of grand tours – rate around 6.0 in RPE
  • Hillier stages rate around 6.5
  • Time trials will be rated around 6.5-7.0 – presumably higher for those riding them with intent to compete for podium/in team time trials
  • Mountain stages will rate closer to 8.0

Influence of Category

This is the most interesting paper of the bunch as it leverages the vast array of races a World Tour team will enter throughout the year to attempt to tease out intensity differences by category. Pro cycling is organized with the highest level being the World Tour, made up of the most elite ~35-40 events, with the grand tours and the five one-day monuments at the top of the heap. Below that are three additional levels of .HC, .1, and .2 races. A World Tour team will typically compete only in the first two levels in a season, with maybe a quarter to half the teams in a given race at the .HC level being World Tour teams and a lower percentage being World Tour teams in Level 1 races.

A note, the RPE values in this study are collected on a different 6-20 scale from the 1-10 scale used in the other two papers.

The authors show two sets of results utilizing RPE: one focusing on one-day races, comparing monuments (the five most prestigious one day races) with three other levels of one-day races (World Tour, HC, Level 1), and another for grand tours compared to three other levels of stage races (World Tour, HC, Level 1).

They find an RPE of approximately 18 for the monuments, vs 17 for World Tour races and 16 for HC/Level 1 races. The monuments differ significantly from each of the three lower levels and World Tour also differs significantly from Level 1 races. Monuments tend to be much longer races (268 km on average vs 219 km average for World Tour and <200 km average for HC/Level 1 races) which can explain the differences in intensity.

We do not see a similar stratification for stage races. Of stage races, the grand tours actually average the lowest RPE (14.5), and they stand out as significantly lower in terms of max/mean heart rate (power output is not significantly different in a high or low sense). This likely has to do with team strategy, which I can’t explain better than the authors:

When comparing single-day races with multi-day races, it is clear that for all the race categories the single-day races are higher in volume, load and intensity compared to the multi-day races. Race regulations are an important contributor to this. Volume and load are higher competing in single-day races because race regulations allow longer races within all the single-day race categories compared to the multi-day race categories.

Furthermore, the higher intensities within the single-day races could be caused by differences in race tactics between the single-day and multi-day races. In a single-day race, a cycling team has one goal and that is to finish as high as possible and thus the whole team (race leader and domestiques) will work without any necessity to hold back for other days to come. Within a multi-day race, a team has different goals per stage and this will depend on their overall goal. For example, when a team brings a sprinter as a team leader to a multi-day race, on the flat stages the support riders will likely have to work on the front of the peloton which will result in an increased exercise intensity and load whilst the support riders for a climber will have a higher exercise load on the climbing stages when working for their leader.

Overall race length (i.e. number of stages) can be a cause for the slightly higher intensity measures (absolute and relative PO, IF) in the 2.1 race category compared to higher level multi-day stage races. On average, the lower category races are shorter and some have only two-race days. The more days a multi-day race consists of, the more riders will most likely aim to spread their energy over multiple days (and aim to minimise energy expenditure on days where it’s possible).

pg 11-12 van Erp and Sanders (2019)

Some more findings:

  • One day races see significant stratification between monuments > other World Tour races > other one day races where monuments are about 8% higher RPE than World Tour and 12% higher than HC/Level 1.
  • Stage races overall do not see this stratification – likely because of strategic pacing the riders implement over the length of the race. I presume stages ridden most competitively will be similar to one day races, while those ridden less competitively will be lower than average. Overall, the level for stage races is about 8% below the HC/Level 1 one-day races.

Implementing an intensity measure globally

These papers provide a solid foundation for a global RPE metric. Difficulty of the profile can increase the RPE by about 33% between a typical flat stage and a typical mountain stage in a grand tour. In addition, monuments and World Tour one-day races will be raced with between 4% and 12% higher intensity than HC/Level 1 one-day races. Stage races on average will not be raced differently by level, however certain stages will be ridden with higher intensity than others such that the combined average is approximately 8% below a HC/Level 1 one-day race.

There’s a big missing piece here: how do we estimate which stages in a stage race were ridden with higher vs lower intensity?

Different riders will ride differently depending on their team orders/role; eg, a domestique for a team focused on their sprinter like Quick Step will surely have higher RPE on a flat stage where he is responsible for bringing back the breakaway or leading out a sprinter than on a mountain stage where he isn’t protecting a GC leader. If riders could be classified as primarily flat or primarily climbing riders this would be an easier determination to make.

We could also use finishing position and/or presence in a breakaway to estimate higher RPE than normal. Eg, Michal Kwiatkowski’s stage 18 victory in the Tour de France came in a long breakaway over four high mountains where he was in a small group for most of the stage. He certainly had a higher RPE than in stage 17 when he finished 130th in the gruppetto on a similar high mountain stage.

2020 Road World Championships

This idea of recent rider-specific intensity is particularly relevant this week. The Road World Championships are being held with the men’s road race on Sunday – just a week after the Tour de France. While the startlist isn’t completely final, a large majority of the contenders per the betting odds competed at the Tour – meaning 21 days of racing in the last 30 days as of this Sunday. Typically worlds are held two weeks following the Vuelta a Espana (and two months after the Tour) which means the amount of racing in many of the contenders’ legs will be higher than normal.

Top contenders like Jakob Fuglsang, Thomas Pidcock, and Diego Ulissi did not ride the Tour. Fuglsang rode several one-day classics – high RPE events – in August followed by a week-long stage race in mid-September. Pidcock rode the U23 Giro d’Italia at the turn of August into September. Ulissi has ridden two stage races for a combined ten races worth of effort in the past month – most spent riding as one of the leaders. In each case, these riders have roughly half as many race days in their legs as co-favorites Wout van Aert and Julian Alaphilippe.

Tom Dumoulin in Grand Tours (van Erp et al 2019)

Some of the best pro cycling research in the last few years has come out of Team Sunweb thanks to sports scientists Teun van Erp and Dajo Sanders. They have had access to Sunweb’s power files for several seasons of racing and have written several detailed analyses of the differences between men’s and women’s racing, the relationships between different training load measures across different stage types, the influence of race category and results on intensity, and several others.

The most interesting work was done by van Erp in collaboration with three other researchers, and was published in November 2019 in Medicine and Science in Sports and Exercise as Load, Intensity, and Performance Characteristics in Multiple Grand Tours. The pdf can be accessed at that link.

Their work analyzes four grand tour performances by Tom Dumoulin where he was the GC leader for Sunweb/Giant-Alpecin in the 2015 Vuelta, 2017 Giro, 2018 Giro, and 2018 TDF. As the paper notes, Dumoulin finished 6th, 1st, 2nd, and 2nd in those tours and won at least a stage in each. Not included in the analysis were three other grand tours in the time period where he DNF’d as the focus was on his performance while contending for GC throughout.

Their data set was Dumoulin’s power data on the finishing climbs throughout the Tours. In this case, they had 33 climbs ranging from short efforts like Mur de Bretagne in Stage 6 2018 TDF to longer efforts like Mount Etna in the Giro. They supplemented the power files with information about the climbs (gradient, distance) and stage conditions (temperature, altitude climbed prior to final climb).

The main findings were that the three different grand tours had broadly similar requirements to win in terms of load and intensity characteristics. The power requirements over 33 final climbs in those tours averaged 5.9 watts/kg +/- 0.6. And those power outputs were impacted significantly by the duration of the climb and the amount of climbing prior to the climb on the stage.

Ross Tucker and others in less formal analyses have shown before that watts/kg in the high 5s/low 6s are required to contend for grand tours in the mountains, so it is great to see that replicated with actual power data from a World Tour team. Ross quotes the work of twitter climb timing expert Ammattipyoraily in a 2015 article here showing estimated watts/kg for Tour winners using Michele Ferrari’s equation; he showed Armstrong averaging 5.92 watts/kg or higher in his last six Tour wins, with more modern winners like Contador, Nibali, Wiggins, and Froome in the 5.87-6.07 watts/kg range.

Mike Puchowicz has also posted graphs of relative power output for the top contenders in 2013 and 2014 TDFs here. You can dig into Ross’s archives on the Tour analysis here and read Mike’s work at Veloclinic.

Returning to the paper, the most interesting part of this work is when they analyze a range of variables and how they impact the power output on a climb. Obviously I would love to see this with more than one rider, but their results fit the smell test and give us some coefficients. Their three factors influencing power output on final climbs are:

  1. duration of the climb (length is negatively associated with power output)
  2. gradient of the climb (steepness is positively associated with power output)
  3. total elevation gain (TEG) before mountain (a lot of preliminary climbing is negatively associated with power output)

The log duration in minutes of climb is such that a 15 minute effort is ridden at 0.8 watts/kg higher than a 45 minute effort. The gradient (in %) is such that a 5% climb is ridden at 0.6 watts/kg lower than a 10% climb. And the total elevation gain is such that a climb at the end of a comparatively flat stage (TEG of 1000 meters) is ridden at about 0.45 watts/kg higher than a climb at the end of a comparatively mountainous stage (TEG of 3000 meters).
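Those three comparisons can be turned into rough per-unit slopes. The Python sketch below backs the slopes out of the numbers quoted in this post; they are not the paper’s published coefficients. It assumes the three effects are additive, with duration entering as a natural log; the names are my own.

```python
import math

# Slopes implied by the three comparisons in this post (not the paper's own coefficients).
PER_LOG_MINUTE = -0.8 / math.log(45 / 15)  # W/kg per unit of ln(climb duration in minutes)
PER_GRADIENT_PCT = 0.6 / (10 - 5)          # W/kg per percentage point of gradient
PER_TEG_METER = -0.45 / (3000 - 1000)      # W/kg per meter climbed before the final climb

def wkg_difference(climb_a, climb_b):
    """Expected W/kg on climb_a minus climb_b; each is (minutes, gradient_pct, teg_m)."""
    minutes_a, gradient_a, teg_a = climb_a
    minutes_b, gradient_b, teg_b = climb_b
    return (PER_LOG_MINUTE * (math.log(minutes_a) - math.log(minutes_b))
            + PER_GRADIENT_PCT * (gradient_a - gradient_b)
            + PER_TEG_METER * (teg_a - teg_b))

# A 15 minute, 10% wall after 1000 m of climbing vs a 45 minute, 5% climb after 3000 m.
wall = (15, 10, 1000)
alpine = (45, 5, 3000)
print(round(wkg_difference(wall, alpine), 2))
```

Stacking all three effects at their extremes like this is the most favorable comparison possible, so most real pairs of climbs will differ by much less.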

This certainly fits with what you would assume; a short, steep climb at the end of a flatter stage (think the typical wall finish in a Vuelta) would be ridden at a high watts/kg, while a long, grinding alpine climb at the end of a tougher climbing day would be ridden at lower watts/kg.

They also show Dumoulin’s maximum power profile over each race, which shows 20 minute efforts of 6.0-6.2 watts/kg and 60 minute efforts of 5.1-5.6 watts/kg across the four races (page 11). They also plot each of the 33 finishing climbs with the climb duration (X axis) and watts/kg (Y axis) to show his relative power output (page 12). He likely peaked in these four races in stage 14 of the 2018 Giro on Mount Zoncolan where he was bang-on 6.0 watts/kg for 40 minutes in finishing 5th on the stage.

General Classification Model

Winning the GC in a stage race should be considered as distinct from stage-by-stage success. Twelve of the last 30 TDF winners have done so with zero or one stage wins and eleven of the last 30 TDF winners have done so without winning a non-time trial stage.

To measure a rider’s GC ability, I collected results from the major stage races for the last 30 years. Riders were awarded points based on finishing positions with five different (arbitrary) scales for groups of races (A: Tour de France, B: Vuelta/Giro, C: Switzerland/Dauphine/Paris-Nice, D: races like Tour of Catalonia, Tirreno-Adriatico, Tour of Basque Country, etc, E: all other races of significance to predicting a grand tour GC). That final point is critical; stage races without apparent ability to transfer to winning a grand tour GC (Tour Down Under, Four Days of Dunkirk, old Dubai Tour) were not considered.

Which races are most strongly correlated with Tour de France results?

I matched all rider results (for their careers) in GC races with all of their Tour de France results (for their careers). Eg, Chris Froome’s 2018 Giro victory is matched with each of his Tour de France entries. After filtering for riders with 15+ GC races and at least one GC victory, I grouped by race (Giro, Vuelta, etc) and ran a Spearman correlation of GC results in that race with GC results in the TDF. Spearman correlations measure how strongly one ranking is associated with another ranking.
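For anyone unfamiliar with the calculation, a Spearman correlation is just a Pearson correlation computed on ranks. Here is a dependency-free Python sketch; tied values share their average rank, and the finishing positions at the bottom are invented for illustration, not taken from my data-set.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented GC finishing positions for five riders in one race and in the TDF.
race_positions = [1, 4, 2, 10, 7]
tdf_positions = [2, 5, 1, 12, 6]
print(round(spearman(race_positions, tdf_positions), 2))
```

Because only the ordering matters, finishing 12th vs 20th in a weak field counts the same as any other one-place swap in the ranking.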

Most strongly associated with TDF results:

Figure: Spearman correlation of GC results in each race with GC results in the TDF

Results in the Tour de France (in other years) are most predictive of results in other TDF years. Closely following are the other two grand tours and the two most important Tour de France warm-ups (Dauphine/Switzerland). The strong correlation of the Tour of California is surprising, but that was dominated by American riders like Floyd Landis and Levi Leipheimer in the early years and two TDF winners have also won (Egan Bernal and Bradley Wiggins).

The other major week-long races follow at various degrees of moderate to weak correlation. It’s clear that winning a grand tour is something distinct from winning a week long or five day long race. The non-existent correlation of the Tour de l’Avenir (the U23 Tour de France) is interesting; Egan Bernal’s 2019 Tour de France victory was the only one won by a Tour de l’Avenir winner since it became a U25 only race in 1992.

Calculating GC Rankings

The points scales were designed with A races awarding 25 points to winner, B races 20 points, C races 12 points, D races 10 points, and E races 8 points. The value of points decays over time with a weight equal to 1 / (days_since + 730) and a five year window (meaning results since the 2014 Tour de France would be considered when predicting 2019 Tour de France).

I looked at numerous calculation methods (average points per race, total points, best performance, ignoring weighting, etc), but settled on counting just the top five results for each rider. Eg, for Egan Bernal entering the 2019 Tour de France his top five results were #1 in 2019 Switzerland, #1 in 2018 California, #1 in 2019 Paris-Nice, #3 in 2019 Catalonia, and #2 in 2018 Romandie. Bernal ranked 3rd best behind Geraint Thomas and Vincenzo Nibali and ahead of Jakob Fuglsang and Nairo Quintana going into last year’s Tour.
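A rough Python sketch of how the decay weight and the top-five rule fit together is below. The weight 1 / (days_since + 730) and the five year window come from the description above; multiplying points by the weight, picking the top five by weighted value, and summing them are my assumptions for the sketch, and the example results are invented.

```python
WINDOW_DAYS = 5 * 365  # five year window, approximated in days

def result_weight(days_since):
    """Time-decay weight from the text: 1 / (days_since + 730)."""
    return 1 / (days_since + 730)

def gc_rating(results, top_n=5):
    """Sum of a rider's best weighted results.

    results: list of (points, days_since) tuples.
    Assumption: points are multiplied by the weight, then the top_n are summed.
    """
    weighted = [points * result_weight(days_since)
                for points, days_since in results
                if 0 <= days_since <= WINDOW_DAYS]
    return sum(sorted(weighted, reverse=True)[:top_n])

# Invented results: (points on the race's scale, days before the target race).
example_results = [(25, 0), (20, 365), (12, 100), (10, 400), (8, 50), (8, 2000), (5, 30)]
print(gc_rating(example_results))
```

One consequence of the 730 in the denominator is that a result from exactly two years ago counts half as much as the same result from today.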

Tour de France Competitiveness

Last year’s Tour was one of the most wide-open with the fourth lowest point total for the #1 ranked rider since 1992 (only 2007, 2006, and 1999 were lower). The previous year’s Tour was the peak in this regard with Chris Froome coming off four wins in five years + holding the other two grand tour titles. Froome eclipsed Miguel Indurain in 1993, 1994, and 1995 and Alberto Contador in 2011.

The most competitive Tour in terms of the average points for the top 15 riders was 2016. All of the top 10 riders in GC points at the time of that Tour started the race (though this did not turn out to be a particularly competitive race with both Nibali and Contador performing poorly).

Figure: GC points of the top five riders on each TDF startlist

Above is a graph of the top five riders on each TDF startlist in terms of their GC points ranking. The peaks of Indurain, Armstrong, Contador, and Froome are visible, as well as the weaker transition periods that accompanied both Armstrong and Indurain’s departures from the sport.

Should the 2020 Tour de France actually run as scheduled, we would be in store for another very competitive race with no clear favorite and Froome, Bernal, Primoz Roglic, and Thomas all within 10 GC points of each other – similar to the competitive situation last year.

Predicting Tour de France success

Predicting podium success even for the best rider entering each year has not been a slam dunk. Since 1992 only 18 of 28 ‘best GC riders’ have finished on the podium – though the list of failures has been mostly among the weaker ‘best GC riders’ during those transition periods – including Alex Zulle in 1997 and 1998, Damiano Cunego in 2006, and Alexandre Vinokourov in 2007.

A logistic regression model predicting podium success using 1) the natural log of the GC rank entering the race, 2) the best GC performance for each rider, and 3) whether the rider had ridden the Giro d’Italia showed each as a significant predictor.

The coefficients showed a rider who entered with #1 ranking, had won the previous year TDF, and had ridden the Giro (the situation Chris Froome found himself in in 2018) would be expected to finish on the podium about 61% of the time. If not riding the Giro, that number would be 75%.

A rider in Egan Bernal’s position going into 2019 (#3 ranking, best finish having just won the Tour of Switzerland, and not having ridden the Giro) is predicted for about a 32% chance at the podium.
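
To make the structure of that model concrete, here is a minimal sketch of how a prediction is computed from the three inputs. The function name, the dictionary keys, and especially the coefficient values are placeholders I made up so the example runs; they are not the fitted coefficients behind the 61%, 75%, or 32% figures above.

```python
import math

def podium_probability(gc_rank, best_gc_score, rode_giro, coefs):
    """Logistic prediction from log(GC rank), best GC performance, and a Giro flag."""
    z = (coefs["intercept"]
         + coefs["log_rank"] * math.log(gc_rank)
         + coefs["best_gc"] * best_gc_score
         + coefs["rode_giro"] * (1 if rode_giro else 0))
    # The logistic (sigmoid) function maps the linear score to a probability.
    return 1.0 / (1.0 + math.exp(-z))

# Placeholder coefficients for illustration only (assumed, NOT the fitted model).
example_coefs = {"intercept": 0.0, "log_rank": -1.0, "best_gc": 0.0, "rode_giro": -0.5}

p_no_giro = podium_probability(1, 0.0, False, example_coefs)
p_giro = podium_probability(1, 0.0, True, example_coefs)
p_rank3 = podium_probability(3, 0.0, False, example_coefs)
```

With a negative coefficient on the log of the rank and on the Giro flag, the predicted probability falls as the entering rank gets worse and falls again for riders who raced the Giro, which is the direction of the effects described above.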

The limitations of this model are obvious:

  1. It doesn’t take team orders into account. Roberto Heras ranked 3rd, 2nd, and 4th in GC ranking in the 2001-2003 years riding in support of Lance Armstrong while never sniffing the podium. Also, riders who enter aiming for stages after big efforts in the Giro (eg, Simon Yates and Vincenzo Nibali in 2019) will be overrated.
  2. Abandons and DNFs are considered equally alongside finishes outside the top 3. Who knows what Chris Froome’s form would have been if not knocked out in 2014, but all the model sees is the #1 rider in the world defending his 2013 victory as not finishing on the podium.
  3. Young riders are probably underrated; Ullrich entered the 1996 Tour de France with no GC success in his career (though he already had a 3rd place in the World Championship Time Trial at age 20), but finished 2nd easily to team leader Bjarne Riis.

Comparing with Archived Odds

Sports Odds History has pre-race odds available dating back to 2009 from Westgate sportsbook.

The top three favorites (ML odds):

2019: Thomas (+225), Bernal (+550), Fuglsang (+550)

2018: Froome (+150), Porte (+400), Quintana (+800)… Thomas was +1400

2017: Froome (+125), Porte (+175), Quintana (+600)

2016: Froome (+110), Quintana (+200), Contador (+450)

2015: Froome (+175), Quintana (+225), Contador (+350)

2014: Froome (-111), Contador (+150), Nibali (+900)

2013: Froome (-154), Contador (+250), Joaquim Rodriguez (+1800)

2012: Wiggins (+110), Evans (+200), Menchov (+1600)

2011: Contador (-167), A. Schleck (+210), Evans (+2000)

2010: Contador (-200), A. Schleck (+700), Armstrong (+800)

2009: Contador (+100), Armstrong (+250), Leipheimer (+400)
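
For readers less used to moneyline odds, a short sketch of the standard conversion to implied probability follows. The function name is my own, and the result includes the bookmaker's margin, so the probabilities across a full field will sum to more than 100%.

```python
def implied_probability(moneyline):
    """Convert American moneyline odds to the implied win probability."""
    if moneyline < 0:
        # Favorite: stake |odds| to win 100.
        return -moneyline / (-moneyline + 100)
    # Underdog (or even money): stake 100 to win `odds`.
    return 100 / (moneyline + 100)

thomas_2019 = implied_probability(225)    # Thomas at +225
froome_2014 = implied_probability(-111)   # Froome at -111
contador_2009 = implied_probability(100)  # Contador at +100
```

On this conversion Thomas at +225 comes out near 31%, Froome at -111 near 53%, and Contador at +100 at exactly 50%.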

The podium model matches the bookmaker favorite every year except 2015 (Contador #1, Froome #2, Nibali #3, Quintana #4) and 2016 (Nibali #1, Froome #2, Contador #3, Quintana #4).

Used as a diagnostic tool to judge which riders should have been considered favorites in each race, this GC model has value. As a predictive method looking ahead, certainly less so than evaluating the current bookmaker prices ahead of each race.

Climbing Recap thru Pyrenees

Before the GC battle recommences on Thursday with three straight Alpine stages, let’s recap the climbing battles so far through four stages (really three with effort for the GC contenders). Below is a chart of the top climbers so far in terms of time lost to Thibaut Pinot in each of the four mountain stages. Remarkably, Pinot has lost time to just one GC contender on one stage: when Geraint Thomas took two seconds in stage 6.

top-climbers-thrust15

The likes of Adam Yates, Fabio Aru, Bardet, Quintana, and Dan Martin can’t even fit on the chart. Those five on average lost 37 seconds to Pinot on Planche de Belles Filles and each has lost at least 2.5 minutes per stage to Pinot in stage 14 and 15. The top 11 on this graph averaged losses of 18 seconds to Pinot on Planche de Belles Filles and only Uran (stage 15) and Porte (stage 14) have lost more than two minutes to Pinot on stages 14 and 15.

So far, Landa has been next best. In stage 14 he was part of the most select group on the Tourmalet until Pinot broke away with less than 500m to the line. In stage 15, he broke away on the penultimate climb and only Pinot could catch him on the final climb. And in stage 6 he launched an attack which kept him away for a few kilometers and then wasn’t dropped at any point by the attacks in the final kilometer on that stage. He has a competitive Giro in his legs which may hamper him in the final week, but has shown the ability to attack from the pack multiple times.

Both Buchmann (3rd best) and Bernal (4th best) had great results in their tune-up races in June. Buchmann finished even with Jakob Fuglsang ahead of all other GC contenders in the only true mountain stage, while Bernal dominated in Switzerland where he was easily the top climber in three mountain stages.

This presentation strips out the impacts of the time trials, the crosswinds, and the punchy breakaways Alaphilippe launched in stages 3 and 8. It shows Alaphilippe, Thomas, and Kruijswijk have been almost dead-even across three mountain contests. Each has looked shaky once: Kruijswijk lost around 30 seconds in stage 6, Thomas in stage 14, and Alaphilippe in stage 15.

Projecting losses

These plots below show the time lost to the leader (Pinot both days) based on an estimate of how far from the finish a rider was dropped from contact with the leader (from GPS data and broadcast). The point is to measure time losses per kilometer; this isn’t designed to predict – for example – Romain Bardet’s losses when being dropped on the Col du Soulor in stage 14.

st-14-dropped

st-15-dropped

In stage 14 up the Tourmalet, the GC men lost about 24 seconds per kilometer to Pinot + final selection (Landa, Bernal, Alaphilippe, Buchmann, Kruijswijk). Riders began to be dropped almost immediately with Yates and Martin going before 10km left, Quintana and Aru around 10km left, and more when FDJ blew up the group with a bit over 5km to go. The Tourmalet is at least 6-7% gradient for every kilometer in the last nine, so time losses were close to linear, with Valverde having the largest deviation (the regression line predicts about 90 seconds; he lost a bit over 50 seconds).

Stage 15 was both a shorter climb and less steep in the finishing kilometers. The GC men lost about 14 seconds per kilometer to Pinot. The linear fit (and forcing the intercept to 0) doesn’t fit this data nearly as well as for stage 14. This is probably because Prat d’Albis is very steep in the middle and less so in the closing kilometers (eg, the stretch between 9km and 5km to go where Pinot attacked averages over 9% vs 5% in the final two kilometers). So Bernal and Buchmann lost only ~5 seconds/KM vs more than double that for those jettisoned around 6-8km to go.
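
The seconds-per-kilometer figures come from a linear fit with the intercept forced to 0. A minimal sketch of that calculation is below; the function name and the toy data points are made up for illustration and are not the actual stage 14 or 15 observations.

```python
def slope_through_origin(km_dropped, seconds_lost):
    """Least-squares slope for y = b * x with the intercept fixed at 0."""
    numerator = sum(x * y for x, y in zip(km_dropped, seconds_lost))
    denominator = sum(x * x for x in km_dropped)
    return numerator / denominator

# Toy data (assumed): riders dropped 1, 2, 5 and 10 km from the line.
toy_km = [1, 2, 5, 10]
toy_seconds = [24, 48, 120, 240]

seconds_per_km = slope_through_origin(toy_km, toy_seconds)
```

Forcing the intercept to 0 encodes the idea that a rider who is never dropped loses no time. It also means a climb with uneven gradients, like Prat d’Albis, will fit worse than one with a steady gradient.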

prat-albis

Projecting forward – and splitting the difference between these two rates of losses – maybe we can figure on roughly 20 seconds/KM lost on mountaintop finishes. However, stage 18 up and down the Galibier finishes with 20km of downhill which may allow Alaphilippe to claw back some time lost if he’s dropped earlier. Stage 19 ends with an odd category 1 climb with steep ramps to start, a gentler middle, a steep end, and then a flat-ish few kilometers to the finish. Stage 20 is the only true summit finish, but even that climb has numerous flatter sections. Alaphilippe’s road to winning in Paris may only require him to survive until the final kilometers of these climbs if none of the five chasers has a big attack in them.

TDF Stage 14 Recap

As discussed earlier this week, this was a short mountain stage with just one big climb before the final climb up the Tourmalet. This was the first experience of the high mountains as well, reaching over 500m higher than the next highest point and racing the last 14km over 1500m.

By my tracking, Movistar took control of the peloton with about 5km left on the Col du Soulor and immediately shredded it down to an elite group of about 30-35 riders by the summit. This got rid of Romain Bardet and Adam Yates (who eventually rejoined on the downhill, but was dropped immediately on the Tourmalet and wasn’t a factor). INEOS lost Kwiatkowski and two others during this acceleration as well which would end up being a factor later on. You can see the estimated W/kg on the Soulor below, which was ridden at a high tempo (final summit climbs are around 6 W/kg or higher, when the GC group was dogging it on stage 12 those were ridden closer to 5 W/kg).

By the time they reformed down in the valley, it was a group of 50 riders which came to the Tourmalet including all the major favorites except Bardet. In the first 8km with Movistar still setting the tempo this group was shelled down to about 25 which included dropping Dan Martin and the Yates group. But in a major own goal, Movistar’s pace soon dropped Quintana who wouldn’t get back. INEOS took over pace-making next during which time Barguil attacked.

stg14-left

Riders left with GC group by kilometers left. Dashed lines show start of Tourmalet (19km) and summit of Soulor (57 km)

FDJ went to the front with about 7km left and they distanced INEOS’s last two helpers, Enric Mas, and Richie Porte among others. FDJ’s last man, Gaudu, rode a high tempo here which dropped a few more riders, but cracked shortly leaving Jumbo and Mikel Landa riding tempo for the next 2.5 km. This cracked Fuglsang and Uran.

There’s a flatter section of about 6% around 2km left in the climb, which immediately transitions into ramps of 12% for 500m. Thomas was dropped here when he couldn’t raise his pace to match the rest of the yellow jersey group. The final GC group of Pinot, Alaphilippe, Kruijswijk, Landa, Buchmann, and Bernal survived here and contested the last kilometer with Pinot breaking away for the stage win.

Below is a graph of time losses to Pinot based on what KM a rider was dropped from contact with the yellow jersey (eg, Thomas around 1.2 km, Richie Porte around 5.5-6km, etc). The regression line using just GC contenders (not domestiques like Gaudu/de Plus) shows each kilometer you survived today was worth about 24 seconds.

st-14-dropped

Going forward

The main takeaways from today:

1). Alaphilippe survived the high mountains. Tomorrow brings three sharp category 1 climbs, but no riding over 1500m. This is a similarly difficult stage to stage 6 when he survived and attacked in the last KM. Depending on the tempo it’s raced at, the danger zone for him may be the final kilometers of the penultimate climb where it averages 12%.

We now have two stages of him surviving with the top climbers, as well as a dominant TT and two big attacks on intermediate stages. He’s been incredibly strong, but how much remains for the last week in the thin air of the Alps? He took 30 seconds on Thomas today; at this point he could lose that sum in each of the final four mountain stages and wear yellow in Paris.

2). Pinot hasn’t missed a chance to attack the entire race. Stage 6 and 14 he rode away in the final kilometer, stage 8 he escaped, and his TT was elite – distancing almost all the specialists and racing about a minute faster than you would expect by his normal time-trialing form. He hasn’t been put in trouble in the mountains in any race since March as well.

It will take another effort similar to today’s to get back the time he lost to Thomas in the crosswinds. But so far, Pinot has been about half a minute faster in stages 3, 8, 13, and 14 when he wasn’t reliant on his team, just losing time to Thomas in the TTT (12 seconds) and stage 10 crosswinds (100 seconds).

3). INEOS looks to have taken a step back. We don’t have data on how many riders they could keep in the final group from past Tours, but in stage 6 they had just one man left with Thomas/Bernal with 5km to go and today they had lost everyone by that point. Both Wout Poels (stage 6) and Michal Kwiatkowski (stage 14) have been dropped on the penultimate climb when Movistar raised the pace.

As we showed in the climbing preview, Bernal has – statistically – been the best climber coming into the TDF. While few of the others on top of that list (Martin, Quintana, Bardet) have shown anything this year, Bernal has met expectations and coped way better with the high altitudes today which we’ll see in the final three Alpine stages. There was a lot of discussion about using him to attack and draw out the other contenders, but clearly Thomas didn’t have enough today to have survived any attacks.

Major questions

1). Can INEOS put Alaphilippe in trouble either tomorrow or in the Alps? Their pace-making – for the short time they were in front – wasn’t responsible for putting many riders into trouble today. Either Movistar dropped riders on the penultimate climb or the lower stretches of the Tourmalet, or FDJ/Jumbo dropped them later on the Tourmalet. Other teams will surely try to put Alaphilippe in trouble, but Movistar’s GC race is all but over and it’s yet to be seen that FDJ or Jumbo can control the race for more than a few kilometers at the end of a stage.

Thomas’s attacking so far across 2018 and 2019 has been limited to the last kilometer of stages, so if Alaphilippe can survive, is Thomas even able to attack him to regain time?

2). How do high mountains impact Alaphilippe? Assuming he can survive the punchier stage 15 in the Pyrenees, that leaves stages 18-20 in the Alps where there are seven massive climbs with substantial climbing above 2000m. His team was reduced to just him and Enric Mas with 15km left and Mas was gone with 5km left to race. If he gets dropped, it will be tough for him to rejoin.

3rd Weekend TDF Preview

Beginning with Thursday’s mountain stage comes a stretch of four critical stages – three in the high mountains and one individual time trial. The first eleven stages included four with significant exchanges of time between GC contenders: the team time trial saw Jumbo Visma and FDJ hang with Team INEOS while Movistar and AG2R lost big chunks of time, stage 6 was the first mountain stage where Geraint Thomas took time on every rival, stage 8 saw Pinot and Alaphilippe escape to take time back, and stage 10 saw a select group including Thomas, Bernal, Nairo Quintana, and Adam Yates take over 1.5 minutes on Pinot and Jakob Fuglsang due to a split in the crosswinds.

The net is that among serious GC contenders, Geraint Thomas already leads everyone but Bernal and Steven Kruijswijk by at least 30 seconds – ballooning to over two minutes on Romain Bardet and Fuglsang and three minutes on Mikel Landa. After today, Geraint Thomas has risen to over 55% likely to win the TDF according to market odds on Pinnacle Sports. Not only is he leading the GC contenders now, but he’s shown moments of serious strength like the end of stage 6 and his recovery on stage 8 that point to him being in similar form to last year.

Stage 13 ITT

We’ll discuss the time trial in stage 13 before getting into the mountains which will be featured in stages 12-14-15. The time trial is just 27km over rolling terrain; the total vertical gain and length are very close to grand tour averages. This should produce very similar results to the average time trials, and in the last nine TDF individual time trials we’ve seen the winner take about 3 seconds per KM on the rider in 10th position on the stage.

Based on the data, it looks like a two man race between Rohan Dennis and Wout van Aert. Dennis has the track record and is half a second faster than anyone else in the peloton with a significant ITT track record. He’s surely been targeting this stage from the start. However, van Aert has been completely dominant in two ITTs this year – winning similar length stages in both the Belgian national championships and the Dauphine by half a minute. Unlike Dennis, you can count his time trials raced on one hand since the start of 2018.

Most years we have a couple GC favorites like Froome or Contador who excel in the time trials. That may be the case with Geraint Thomas this year as he’s performed well in TT stages in the past (including 3rd last year in the final TDF time trial), but so far this year he’s been about 1.5 seconds/KM slower than someone like Dennis. The only other relevant GC man for the stage win is Steven Kruijswijk who has been slightly faster than Thomas in TTs.

Besides Kruijswijk, the next best GC rivals to Thomas are Bernal and Fuglsang, who rate about 0.8 seconds/KM worse than Thomas/Kruijswijk. This is another advantage for Thomas; based on his lead already and the more comfortably projected gains in the ITT, he can ride extremely defensively in the mountains and wait to use his late kick to take time in the closing kilometer like in stage 6 this year and stages 11/12 last year.

GC contenders versus Geraint Thomas in projected TT performance over 27km

| Rider | Seconds/KM on 10th place | Projected time vs Thomas |
|---|---|---|
| Steven Kruijswijk | +0.1 | +3 seconds gained |
| Geraint Thomas | -0.1 | |
| Egan Bernal | -0.8 | -19 seconds lost |
| Jakob Fuglsang | -0.8 | -19 seconds lost |
| Julian Alaphilippe | -1.0 | -24 seconds lost |
| Adam Yates | -1.1 | -27 seconds lost |
| Thibaut Pinot | -1.2 | -30 seconds lost |
| Dan Martin | -1.5 | -37 seconds lost |
| Nairo Quintana | -1.7 | -42 seconds lost |
| Mikel Landa | -2.6 | -68 seconds lost |
| Enric Mas | -2.7 | -71 seconds lost |
| Romain Bardet | -2.8 | -74 seconds lost |
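
The projection is simple arithmetic: the difference in seconds/KM rates multiplied by the 27km course length. A minimal sketch is below; the function and constant names are my own. Because the rates shown are rounded to one decimal, this reproduces the listed gaps only to within a few seconds (Kruijswijk, for instance, comes out nearer +5 than +3).

```python
TT_LENGTH_KM = 27  # stage 13 distance

def projected_gap_vs_thomas(rider_rate, thomas_rate=-0.1, length_km=TT_LENGTH_KM):
    """Projected seconds gained (+) or lost (-) to Thomas over the course.

    Rates are seconds/KM relative to the 10th place rider on a stage.
    """
    return (rider_rate - thomas_rate) * length_km

bernal_gap = projected_gap_vs_thomas(-0.8)  # about -19 seconds
pinot_gap = projected_gap_vs_thomas(-1.2)   # about -30 seconds
```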

Things start to look a bit dire – given the existing time gaps – as you move down this list. Pinot has probably looked strongest among Thomas’s rivals so far – he’s just been unlucky to get caught out on stage 10 and drop 1.5 minutes. But even he is projected to lose half a minute more to Thomas. We can also see Kruijswijk’s path to the podium; gain 30-60 seconds over the other contenders here and hold serve in the mountains.

The Pyrenees

The race finishes with three mountain stages in the Pyrenees and three more in the Alps. The challenges will be completely different though. The Pyrenees will feature a succession of shorter, but steeper climbs which don’t reach truly asphyxiating elevations. The Alps follow with longer climbs which repeatedly reach over 2000m.

Stage 12 is a long stage at 210km (climbing difficulty of 23) with the climbing coming towards the end before a long downhill to the finish. Given the average TDF mountain stage is about 169km with a climbing difficulty of 44, this one is much longer, but with less climbing.

tdf1219

In terms of length, gradient, and height, the Peyresourde is the toughest mountain climbed so far in the Tour and the second climb is a good shout for second toughest so far. Surely given their position in the race, INEOS would be happy to set a high tempo across both climbs and neutralize any attacks. The question is whether they can.

In stage 6 to La Planche de Belles Filles, one of their two mountain domestiques, Wout Poels, absolutely cracked before the final climb which left their two leaders with just one helper (Kwiatkowski) for the final climb. As a result, Team Movistar set the tempo for much of the lower levels of the climb. Kwiatkowski fell away with just under 3km left to the summit and Thibaut Pinot’s team (FDJ) took over and ratcheted the pace upwards. This immediately shelled a number of riders and FDJ really controlled the final group until Alaphilippe rode away in the final KM.

The big story for stage 12 is how much control INEOS is prepared to exert over the final group or whether a team like Movistar or FDJ will be able to take command.

After the time trial, stage 14 is a much shorter one at 118km with a summit finish up the Tourmalet. In the last decade, the Tourmalet has featured nine times, but primarily as the appetizer earlier in the stage and just once as a summit finish. This will be the first and only taste of the truly high mountains in the Pyrenees as the final 14km will be raced above 1500m.

tdf1419

Stage 15 will continue the trend of shorter, but steep climbs in the Pyrenees with three climbs of at least 7% including the summit finish at Prat d’Albis. That will make seven climbs of around 7% or higher in the Pyrenees, with only one summit higher than 2000m. As we’ll see when we come to the Alpine stages, there will be six (and nearly seven) climbs reaching 2000m, and of the 210km ridden in the race at elevations over 1500m, 191km will be in the final three Alpine stages.

tdf1519

Maillot Jaune

After his raid in stage 8, Julian Alaphilippe holds the yellow jersey by over a minute on Geraint Thomas. Given his ability to stick with the front group over a number of 1st/2nd category climbs (and even attack on the final climb) in stage 6, it’s not unreasonable that Alaphilippe rides with the final group over the second climb in stage 12. If he can manage to not get dropped on the lower slopes, his descending skills should allow him to stay in yellow.

As we’ve laid out above, Alaphilippe is one of the better time trialists among the GC men. He’s projected to lose just 24 seconds to Geraint Thomas in stage 13. Again, it’s not unreasonable to see him in yellow after stage 13.

Further into the mountains in stages 14 and 15 will be tougher. He’s about 75th best in the world at climbing based on the metric introduced last week (though his ability is much more uncertain given his typical strategy is not to attempt to remain with the GC contenders, but rather to target a handful of stages to go for stage wins from the breakaway). Pure anecdote, but in his two stage wins in 2018 and in stage 6 this year, the elevations reached by the race were much lower than the average TDF mountain stage. Considering that, it seems much more likely his yellow jersey bid ends on the slopes of the Tourmalet in stage 14.

First TDF Mountain Stage

Stage 6

The first mountain stage of the Tour is upon us; the peloton will cross six categorized climbs totaling 35km leading up to La Planche de Belles Filles. In the Vosges region of France there aren’t the 2500m summit finishes like in the Alps or Pyrenees, but the Stage 6 summit finish will bring the difficulty in terms of gradient; the 7km climb has nearly 9% gradient and is actually steeper than that because of two flats near the finish. This is the fifth steepest of the categorized climbs in the race and at 7km it’s double the length of the four steeper climbs.

This means all that climbing will put serious hurt into the GC contenders, and then make them tackle a brutal finishing climb. The climb to La Planche de Belles Filles has been climbed three times recently; in 2012 there was little climbing ahead of time and a select group including Froome and Wiggins from Sky survived to the end; in 2014 – like Stage 6 this year – the peloton crossed six categorized climbs before the finish and Nibali rode away to gain slightly on his rivals; in 2017, there was little climbing beforehand and Fabio Aru beat a group of fewer than ten climbers.

For tomorrow, the market – such that it is – makes Bernal about 20% to take the stage win, with Pinot and Adam Yates as the next favorites. Dan Martin, Thomas, Valverde, Fuglsang, Landa, and Bardet are considered next likeliest to win.

Summit finishes

We see fewer breakaways stay away to win on summit finishes than mountain stages which end with a downhill or flat. In grand tours, 42% of mountain stages ending in downhills/flats see the winner gain 3 minutes or more on the eventual race GC winner vs just 20% of summit finishes. In the three stages all ending in summit finishes at La Planche de Belles Filles, the GC group has contained the stage winner every time. It’s unlikely a breakaway survives to the end tomorrow.

Climbing Performance

In the last post breaking down how grand tour winners gain their advantage, we found that about 50% of the gains in the last 19 grand tours came on climbing stages and 15 of those 19 winners have been the top climber in that race. This year’s TDF is one of the toughest climbing Tours ever with five true summit finishes and the largest total KOM climb difficulty of any grand tour since 2013. Already, the two favorites – Bernal and Thomas – have created small time gaps on every other contender besides Steven Kruijswijk, so those trailing will need to ride aggressively to try to create time gaps.

In this post discussing measuring climbing performance, we proposed a method to evaluate climbing based on the time gaps compared to the 10th place finisher on a stage. We then adjust for the difficulty of the climbing on that stage (with the idea that tougher climbing stages give the opportunity for more time gains). You can measure that resulting Climb Gains metric over multiple years by taking either the median value or the average of the top 75 percentile values.
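
A rough sketch of that calculation follows. The post doesn't spell out the exact difficulty adjustment here, so dividing the time gap by the stage's climbing difficulty is my assumption for illustration, as are the function names; only the median summary is shown.

```python
from statistics import median

def climb_gain(rider_time_s, tenth_place_time_s, climb_difficulty):
    """Seconds gained on the 10th place finisher, scaled by stage difficulty.

    Dividing by difficulty is an assumed stand-in for the adjustment.
    """
    return (tenth_place_time_s - rider_time_s) / climb_difficulty

def long_term_rating(stage_gains):
    """Summarize a rider's Climb Gains across many stages by the median."""
    return median(stage_gains)

# Toy example (made-up numbers): three mountain stages for one rider.
gains = [
    climb_gain(3600, 3630, 30),  # 30s ahead of 10th place
    climb_gain(4000, 3985, 30),  # 15s behind 10th place
    climb_gain(5000, 5060, 30),  # 60s ahead of 10th place
]
rating = long_term_rating(gains)
```

Using the median rather than the mean keeps one disastrous day, or one day spent deliberately sitting up, from dominating a rider's rating.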

Entering this TDF, these are the top climbing performers based on median performance in mountain stages (considering long-term performance back to the 2016 TDF). I’ve ignored Froome, Dumoulin, and Miguel Angel Lopez who are missing this TDF; Lopez would rank just after Bernal, Froome after Martin, and Dumoulin after Valverde.

1. Egan Bernal
2. Dan Martin
3. Romain Bardet
4. Nairo Quintana
5. Rigoberto Uran
6. Mikel Landa
7. Alejandro Valverde
8. Adam Yates
9. Thibaut Pinot
10. Steven Kruijswijk

Thomas ranks 12th in this year’s TDF and Fuglsang is 14th.

As a sense-check, Froome has entered the seven grand tours he’s raced since 2015 ranked 1st in long-term performance five times and 2nd twice. The leader in climbing during that grand tour has ranked 1st in long-term climbing performance five of thirteen times, with a low of 22nd and median of 3rd.

These ratings have been tested on stage level results and also on grand tour climbing performance. Below is a table showing the model probability for a rider who enters ranked N within climbers in that grand tour finishing as the #1 climber and one of the top 3 climbers.

| Climbing Rank Entering Tour de France | Probability of Ranking as #1 Climber | Probability of Ranking as Top 3 Climber |
|---|---|---|
| 1st | 28% | 59% |
| 2nd | 15% | 40% |
| 3rd | 10% | 29% |
| 5th | 5% | 16% |
| 10th | 2% | 6% |
| 20th | <1% | 2% |
| 50th | <1% | <1% |

The main blindspot for this model is factoring in short-term form for riders who have completely transformed themselves before the race. Eg, Geraint Thomas was not riding like a GC level contender until the Dauphine in 2018; he had ridden 9 straight mountain stages from March 2017 to April 2018 without a single GC level performance. However, he was 4/4 in the Dauphine and would have rated 22nd best in the climbing model going into the TDF.

This year, Fuglsang and Adam Yates are the favorites who have improved the most relative to their long-term performance. Including short-term performance in models improves the performance and comes out as a significant predictor, but it is much noisier and subject to the concerns floated at the end of this post.

INEOS / Sky

For the better part of a decade, we’ve seen climbing stages in the TDF largely controlled by Team Sky who have super-domestiques like Wout Poels and Michal Kwiatkowski to consistently ride at a high tempo on the climbs. In this year’s Tour, I’ll be tracking how long Sky can keep support riders in the GC group, how tempo changes when the likes of Poels and Kwiatkowski drop away, and most importantly who can stick around through the relentless pace-making.

Yellow Jersey

Alaphilippe has some live probability of maintaining the yellow jersey past Stage 6. We’ve seen him win two stages of the Tour last year in the high mountains so there’s at least a chance he can stick with the GC group up to the high reaches of the finishing climb. But he’s ridden 17 stages in his career with similar climbing difficulty as tomorrow (that also end with a summit finish); on only two occasions has he managed to stick with the GC group (in back to back days in the 2016 Dauphine).

With a strong climber like Steven Kruijswijk just 25 seconds back and Bernal/Thomas within a minute, the odds are long. It’s also possible Alaphilippe doesn’t even try to keep it; Stage 8 and 9 are both perfect for him to grab another win, and that’s much more likely if he conserves his energy and loses 5-10 minutes on Stage 6.

Working on the assumption Alaphilippe won’t finish with or near the GC group, Kruijswijk’s path is simple: stick with the GC group of Bernal/Thomas. Besides Bernal/Thomas who can get it simply if Kruijswijk can’t keep their pace, Michael Woods, Enric Mas, and Wilco Kelderman climb well enough, are close enough in the GC, and – crucially – are unlikely to be chased down by INEOS if they do try a move in the closing KMs (unlike Pinot or Uran).

Winning a Grand Tour

Grand tours can be won in numerous ways. Chris Froome won the 2018 Giro on the back of a stunning raid in Stage 19 – picking up over three minutes on the best placed rider ahead of him in the GC. He won the 2013 TDF using five big efforts – two ITTs, two summit finishes, and a team time trial. His teammate Geraint Thomas steadily accumulated time on Tom Dumoulin – never gaining more on a stage than the 53 seconds when Dumoulin crashed on Stage 6. Dumoulin himself relied on gaining over four minutes in two ITTs in 2017 Giro to best Nairo Quintana by just 31 seconds. Point is, there’s many ways to skin a cat to win these things.

Looking at the 19 grand tours won since 2013 (ignoring time bonuses and team time trials – just individual gains on the road), the time gained by the winner on 2nd place has ranged between 30 seconds (Horner over Nibali in 2013 Vuelta) and 457 seconds (Nibali over Peraud in the 2014 TDF) with an average of 125 seconds – just over two minutes.

Mountains

15 of 19 grand tour winners have beaten their 2nd place rivals in total time in the mountains, with Quintana having the two best runner-up finishes in the 2017 Giro (220 seconds gained) and 2015 TDF (46 seconds gained). The median gain by the GC winner on 2nd place is 33 seconds total in the mountains (65 seconds is the average).

No GC winner has ranked worse than 7th best in the mountains; that was Dumoulin’s 2017 Giro which even saw him win a mountain stage. 15 of the 19 winners have ranked #1 in the mountains; take care of that element and you have a great chance of winning the race.

Time trials

In this year’s TDF, there’s only a 27km time trial to separate the pack. The median gain by the GC winner on 2nd is 26 seconds on ITTs (45 seconds is the average). Twelve of 19 grand tour winners have beaten 2nd place when looking just at ITTs; Froome was the best runner-up gaining over two minutes in the 2016 Vuelta on Quintana, while Quintana has both of the lowlights losing over four minutes on two occasions (Bardet’s 2016 TDF effort also saw him lose 3.5 minutes).

Froome’s 2015 TDF win was the worst ranked against the clock – but considering there was just the 14km opening TT he lost little time to anyone that mattered and he was in yellow by the third stage.

Primoz Roglic has the best TT effort by a non-winner; he gained almost 2.5 minutes on Carapaz against the clock, but was just 4th best in the mountains.

Other stages

Besides time trials and climbing stages, there’s plenty of other chances to gain or lose time. Eg, last year’s Stage 6 finish at Mur de Bretagne in the TDF saw Thomas gain nearly a minute after Dumoulin crashed. In his 2014 TDF win, Nibali took over three minutes on Peraud on the cobbled Stage 5 (with Froome DNF). In this year’s Giro, Richard Carapaz announced himself with a big win uphill on Stage 4 and then picked up some more time on several rivals on the more intermediate Stage 15.

These gains/losses follow more of a normal distribution centered on a median of 0 seconds exchanged, ranging from a maximum of 202 seconds gained by the winner to a minimum of -55 seconds (55 seconds lost).

Performance of GC winner in grand tours (vs 2nd place)

distro-gains

| Stage Type | Winners gaining on 2nd (of 19) | Ranked #1 in race (of 19) | Average gain on 2nd | Median gain on 2nd |
|---|---|---|---|---|
| Mountains | 15 of 19 | 15 of 19 | 65 seconds | 33 seconds |
| Time Trials | 12 of 19 | 5 of 19 | 45 seconds | 26 seconds |
| Other | 8 of 19 | 1 of 19 | 15 seconds | 0 seconds |

The only grand tour winner to rank #1 in gains on the ‘Other’ stages was Vincenzo Nibali in 2014; he escaped on Stage 2 into Sheffield to take yellow, finished 3rd with big time gains on the cobbles in Stage 5, and finished 3rd again in the punchy finish of Stage 8. Before Stage 10, the first mountain stage, Nibali led every realistic GC contender by at least 1.5 minutes.

Looking ahead

Already this year, Thomas and Bernal have gained a little time on every rival except Kruijswijk (ranging from just 8 seconds over Uran and 12 seconds over Pinot to 45 seconds over Quintana/Landa/Valverde and nearly a minute over Bardet).

Stages 3 and 5 both have tricky finishes with sharp climbs which could see a contender get into trouble either by being caught in a crash or being dropped by a breakaway. Note again the time gains by Nibali in 2014 and Thomas in 2018 on these other flat stages with punchy finishes.

All that leads up to Stage 6 which crosses four legitimate 2nd to 1st category climbs before a summit finish at La Planche de Belles Filles. There’s a brutal final KM to the finish there where the GC hierarchy should start to shake out.