TDF Stage 14 Recap

As discussed earlier this week, this was a short mountain stage with just one big climb before the final climb up the Tourmalet. This was the first experience of the high mountains as well, reaching over 500m higher than the next highest point and racing the last 14km over 1500m.

By my tracking, Movistar took control of the peloton with about 5km left on the Col du Soulor and immediately shredded it down to an elite group of about 30-35 riders by the summit. This got rid of Romain Bardet and Adam Yates (who eventually rejoined on the downhill, but was dropped immediately on the Tourmalet and wasn’t a factor). INEOS lost Kwiatkowski and two others during this acceleration as well which would end up being a factor later on. You can see the estimated W/kg on the Soulor below, which was ridden at a high tempo (final summit climbs are around 6 W/kg or higher, when the GC group was dogging it on stage 12 those were ridden closer to 5 W/kg).

By the time they reformed down in the valley, it was a group of 50 riders which came to the Tourmalet including all the major favorites except Bardet. In the first 8km with Movistar still setting the tempo this group was shelled down to about 25 which included dropping Dan Martin and the Yates group. But in a major own goal, Movistar’s pace soon dropped Quintana who wouldn’t get back. INEOS took over pace-making next during which time Barguil attacked.

stg14-left

Riders left with GC group by kilometers left. Dashed lines show start of Tourmalet (19km) and summit of Soulor (57 km)

FDJ went to the front with about 7km left and they distanced INEOS’s last two helpers, Enric Mas, and Richie Porte among others. FDJ’s last man, Gaudu, rode a high tempo here which dropped a few more riders, but he cracked shortly afterwards, leaving Jumbo and Mikel Landa riding tempo for the next 2.5 km. This cracked Fuglsang and Uran.

There’s a flatter section of about 6% around 2km left in the climb, which immediately transitions into ramps of 12% for 500m. Thomas was dropped here when he couldn’t raise his pace to match the rest of the yellow jersey group. The final GC group of Pinot, Alaphilippe, Kruijswijk, Landa, Buchmann, and Bernal survived here and contested the last kilometer with Pinot breaking away for the stage win.

Below is a graph of time losses to Pinot based on the kilometer at which a rider lost contact with the yellow jersey (eg, Thomas around 1.2 km, Richie Porte around 5.5-6km, etc). The regression line using just GC contenders (not domestiques like Gaudu/De Plus) shows each kilometer you survived today was worth about 24 seconds.

st-14-dropped
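As a rough illustration of that regression slope, the sketch below turns the difference in where two riders were dropped into an expected time gap. Only the 24 seconds/KM slope comes from the fit described above. The function name is hypothetical, the intercept is sidestepped by working with differences between riders, and the 5.75km figure for Porte is simply the midpoint of the 5.5-6km range.

```python
# Slope of the regression line quoted above: seconds lost per kilometer
# a rider spent off the back of the yellow jersey group.
SECONDS_PER_KM = 24.0


def expected_gap(km_left_when_dropped_a: float, km_left_when_dropped_b: float) -> float:
    """Expected finishing gap in seconds between rider A and rider B.

    Positive means rider A is expected to finish behind rider B.
    Using the difference means the (unquoted) intercept cancels out.
    """
    return SECONDS_PER_KM * (km_left_when_dropped_a - km_left_when_dropped_b)


# Porte (dropped with roughly 5.75km left) versus Thomas (roughly 1.2km left).
porte_vs_thomas = expected_gap(5.75, 1.2)
print(f"Porte expected to lose about {porte_vs_thomas:.0f}s more than Thomas")
```

This is only a straight-line reading of the graph; individual riders will sit above or below it.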

Going forward

The main takeaways from today:

1). Alaphilippe survived the high mountains. Tomorrow brings three sharp category 1 climbs, but no riding over 1500m. This is a similarly difficult stage to stage 6 when he survived and attacked in the last KM. Depending on the tempo it’s raced at, the danger zone for him may be the final kilometers of the penultimate climb where it averages 12%.

We now have two stages of him surviving with the top climbers, as well as a dominant TT and two big attacks on intermediate stages. He’s been incredibly strong, but how much remains for the last week in the thin air of the Alps? He took 30 seconds on Thomas today; at this point he could lose that sum in each of the final four mountain stages and wear yellow in Paris.

2). Pinot hasn’t missed a chance to attack the entire race. On stages 6 and 14 he rode away in the final kilometer, on stage 8 he escaped, and his TT was elite – distancing almost all the specialists and racing about a minute faster than you would expect from his normal time-trialing form. He also hasn’t been put in trouble in the mountains in any race since March.

It will take another effort similar to today’s to get back the time he lost to Thomas in the crosswinds. But so far, Pinot has been about half a minute faster in stages 3, 8, 13, and 14 when he wasn’t reliant on his team, just losing time to Thomas in the TTT (12 seconds) and stage 10 crosswinds (100 seconds).

3). INEOS looks to have taken a step back. We don’t have data on how many riders they could keep in the final group from past Tours, but in stage 6 they had just one man left with Thomas/Bernal with 5km to go and today they had lost everyone by that point. Both Wout Poels (stage 6) and Michal Kwiatkowski (stage 14) have been dropped on the penultimate climb when Movistar raised the pace.

As we showed in the climbing preview, Bernal has – statistically – been the best climber coming into the TDF. While few of the others at the top of that list (Martin, Quintana, Bardet) have shown anything this year, Bernal has met expectations and coped way better with the high altitude today, which we’ll see again in the final three Alpine stages. There was a lot of discussion about using him to attack and draw out the other contenders, but clearly Thomas didn’t have enough today to have survived any attacks.

Major questions

1). Can INEOS put Alaphilippe in trouble either tomorrow or in the Alps? Their pace-making – for the short time they were in front – wasn’t responsible for putting many riders into trouble today. Either Movistar dropped riders on the penultimate climb or the lower stretches of the Tourmalet, or FDJ/Jumbo dropped them later on the Tourmalet. Other teams will surely try to put Alaphilippe in trouble, but Movistar’s GC race is surely over and it’s yet to be seen that FDJ or Jumbo can control the race for more than a few kilometers at the end of a stage.

Thomas’s attacking so far across 2018 and 2019 has been limited to the last kilometer of stages, so if Alaphilippe can survive, is Thomas even able to attack him to regain time?

2). How do high mountains impact Alaphilippe? Assuming he can survive the punchier stage 15 in the Pyrenees, that leaves stages 18-20 in the Alps where there are seven massive climbs with substantial climbing above 2000m. His team was reduced to just him and Enric Mas with 15km left and Mas was gone with 5km left to race. If he gets dropped, it will be tough for him to rejoin.

3rd Weekend TDF Preview

Beginning with Thursday’s mountain stage comes a stretch of four critical stages – three in the high mountains and one individual time trial. The first eleven stages included four with significant exchanges of time between GC contenders: the team time trial saw Jumbo Visma and FDJ hang with Team INEOS while Movistar and AG2R lost big chunks of time, stage 6 was the first mountain stage where Geraint Thomas took time on every rival, stage 8 saw Pinot and Alaphilippe escape to take time back, and stage 10 saw a select group including Thomas, Bernal, Nairo Quintana, and Adam Yates take over 1.5 minutes on Pinot and Jakob Fuglsang due to a split in the crosswinds.

The net is that among serious GC contenders, Geraint Thomas already leads everyone but Bernal and Steven Kruijswijk by at least 30 seconds – ballooning to over two minutes on Romain Bardet and Fuglsang and three minutes on Mikel Landa. After today, Geraint Thomas has risen to over 55% likely to win the TDF according to market odds on Pinnacle Sports. Not only is he leading the GC contenders now, but he’s shown moments of serious strength like the end of stage 6 and his recovery on stage 8 that point to him being in similar form to last year.

Stage 13 ITT

We’ll discuss the time trial in stage 13 before getting into the mountains which will be featured in stages 12, 14, and 15. The time trial is just 27km over rolling terrain; the total vertical gain and length are very close to grand tour averages. This should produce very similar results to the average time trial, and in the last nine TDF individual time trials we’ve seen the winner take about 3 seconds per KM on the rider in 10th position on the stage.

Based on the data, it looks like a two-man race between Rohan Dennis and Wout van Aert. Dennis has the track record and is half a second per KM faster than anyone else in the peloton with a significant ITT track record. He’s surely been targeting this stage from the start. However, van Aert has been completely dominant in two ITTs this year – winning similar length stages in both the Belgian national championships and the Dauphine by half a minute. Unlike Dennis, you can count his time trials raced on one hand since the start of 2018.

Most years we have a couple GC favorites like Froome or Contador who excel in the time trials. That may be the case with Geraint Thomas this year as he’s performed well in TT stages in the past (including 3rd last year in the final TDF time trial), but so far this year he’s been about 1.5 seconds/KM slower than someone like Dennis. The only other relevant GC man for the stage win is Steven Kruijswijk who has been slightly faster than Thomas in TTs.

Besides Kruijswijk, the next best GC rivals to Thomas are Bernal and Fuglsang, who rate about 0.8 seconds/KM worse than Thomas/Kruijswijk. This is another advantage for Thomas; given his existing lead and his comfortable projected gains in the ITT, he can ride extremely defensively in the mountains and wait to use his late kick to take time in the closing kilometer like in stage 6 this year and stages 11/12 last year.

GC contenders versus Geraint Thomas in projected TT performance over 27km

Rider | Seconds/KM on 10th place | Projected time vs Thomas
Steven Kruijswijk | +0.1 | +3 seconds gained
Geraint Thomas | -0.1 | (baseline)
Egan Bernal | -0.8 | -19 seconds lost
Jakob Fuglsang | -0.8 | -19 seconds lost
Julian Alaphilippe | -1.0 | -24 seconds lost
Adam Yates | -1.1 | -27 seconds lost
Thibaut Pinot | -1.2 | -30 seconds lost
Dan Martin | -1.5 | -37 seconds lost
Nairo Quintana | -1.7 | -42 seconds lost
Mikel Landa | -2.6 | -68 seconds lost
Enric Mas | -2.7 | -71 seconds lost
Romain Bardet | -2.8 | -74 seconds lost
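For anyone wanting to check the projected column, here is a minimal sketch of the arithmetic, assuming the projection is just the difference in seconds/KM rates multiplied by the 27km stage length. The rates are copied from the table for a handful of riders and the function name is hypothetical. Because the table shows rates rounded to one decimal, the recomputed values can differ from the table by a couple of seconds.

```python
# Seconds/KM relative to 10th place, copied from the table above (rounded values).
RATES = {
    "Steven Kruijswijk": 0.1,
    "Geraint Thomas": -0.1,
    "Egan Bernal": -0.8,
    "Thibaut Pinot": -1.2,
    "Romain Bardet": -2.8,
}
STAGE_KM = 27  # length of the stage 13 time trial


def projected_vs_thomas(rider: str) -> float:
    """Seconds gained (+) or lost (-) to Thomas over the stage distance."""
    return (RATES[rider] - RATES["Geraint Thomas"]) * STAGE_KM


for rider in RATES:
    print(f"{rider:20s} {projected_vs_thomas(rider):+6.1f} s")
```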

Things start to look a bit dire – given the existing time gaps – as you move down this list. Pinot has probably looked strongest among Thomas’s rivals so far – he’s just been unlucky to get caught out on stage 10 and drop 1.5 minutes. But even he is projected to lose half a minute more to Thomas. We can also see Kruijswijk’s path to the podium; gain 30-60 seconds over the other contenders here and hold serve in the mountains.

The Pyrenees

The race finishes with three mountain stages in the Pyrenees and three more in the Alps. The challenges will be completely different though. The Pyrenees will feature a succession of shorter, but steeper climbs which don’t reach truly asphyxiating elevations. The Alps follow with longer climbs which repeatedly reach over 2000m.

Stage 12 is a long stage at 210km (climbing difficulty of 23), with its climbing coming towards the end before a long downhill to the finish. Given the average TDF mountain stage is about 169km with a climbing difficulty of 44, this one is much longer, but with less climbing.

tdf1219

In terms of length, gradient, and height, the Peyresourde is the toughest mountain climbed so far in the Tour and the second climb is a good shout for second toughest so far. Surely given their position in the race, INEOS would be happy to set a high tempo across both climbs and neutralize any attacks. The question is whether they can.

In stage 6 to La Planche des Belles Filles, one of their two mountain domestiques, Wout Poels, absolutely cracked before the final climb which left their two leaders with just one helper (Kwiatkowski) for the final climb. As a result, Team Movistar set the tempo for much of the lower slopes of the climb. Kwiatkowski fell away with just under 3km left to the summit and Thibaut Pinot’s team (FDJ) took over and ratcheted the pace upwards. This immediately shelled a number of riders and FDJ really controlled the final group until Alaphilippe rode away in the final KM.

The big story for stage 12 is how much control INEOS is prepared to exert over the final group or whether a team like Movistar or FDJ will be able to take command.

After the time trial, stage 14 is a much shorter one at 118km with a summit finish up the Tourmalet. In the last decade, the Tourmalet has featured nine times, but primarily as the appetizer earlier in the stage and just once as a summit finish. This will be the first and only taste of the truly high mountains in the Pyrenees as the final 14km will be raced above 1500m.

tdf1419

Stage 15 will continue the trend of shorter, but steep climbs in the Pyrenees with three climbs of at least 7% including the summit finish at Prat d’Albis. That will make seven climbs of around 7% or higher in the Pyrenees, with only one summit higher than 2000m. As we’ll see when we come to the Alpine stages, there will be six (and nearly seven) climbs reaching 2000m, and of the 210km ridden in the race at elevations over 1500m, 191km will be in the final three Alpine stages.

tdf1519

Maillot Jaune

After his raid in stage 8, Julian Alaphilippe holds the yellow jersey by over a minute on Geraint Thomas. Given his ability to stick with the front group over a number of 1st/2nd category climbs (and even attack on the final climb) in stage 6, it’s not unreasonable that Alaphilippe rides with the final group over the second climb in stage 12. If he can manage to not get dropped on the lower slopes, his descending skills should allow him to stay in yellow.

As we’ve laid out above, Alaphilippe is one of the better time trialists among the GC men. He’s projected to lose just 24 seconds to Geraint Thomas in stage 13. Again, it’s not unreasonable to see him in yellow after stage 13.

Further into the mountains in stages 14 and 15 will be tougher. He’s about 75th best in the world at climbing based on the metric introduced last week (though his ability is much more uncertain given his typical strategy is not to attempt to remain with the GC contenders, but rather to target a handful of stages to go for stage wins from the breakaway). Pure anecdote, but in his two stage wins in 2018 and in stage 6 this year, the elevations reached by the race were much lower than the average TDF mountain stage. Considering that, it seems much more likely his yellow jersey bid ends on the slopes of the Tourmalet in stage 14.

First TDF Mountain Stage

Stage 6

The first mountain stage of the Tour is upon us; the peloton will cross six categorized climbs totaling 35km leading up to La Planche des Belles Filles. In the Vosges region of France there aren’t the 2500m summit finishes like in the Alps or Pyrenees, but the Stage 6 summit finish will bring the difficulty in terms of gradient; the 7km climb averages nearly 9% and its ramps are actually steeper than that because of two flats near the finish. This is the fifth steepest of the categorized climbs in the race and at 7km it’s double the length of the four steeper climbs.

This means all that climbing will put serious hurt into the GC contenders before making them tackle a brutal finishing climb. La Planche des Belles Filles has been climbed three times recently; in 2012 there was little climbing ahead of time and a select group including Froome and Wiggins from Sky survived to the end; in 2014 – like Stage 6 this year – the peloton crossed six categorized climbs before the finish and Nibali rode away to gain slightly on his rivals; in 2017, there was little climbing beforehand and Fabio Aru beat a group of fewer than ten climbers.

For tomorrow, the market – such that it is – makes Bernal about 20% to take the stage win, with Pinot and Adam Yates as the next favorites. Dan Martin, Thomas, Valverde, Fuglsang, Landa, and Bardet are considered next likeliest to win.

Summit finishes

We see fewer breakaways stay away to win on summit finishes than on mountain stages which end with a downhill or flat. In grand tours, 42% of mountain stages ending in downhills/flats see the winner gain 3 minutes or more on the eventual race GC winner vs just 20% of summit finishes. In the three stages ending in summit finishes at La Planche des Belles Filles, the GC group has contained the stage winner every time. It’s unlikely a breakaway survives to the end tomorrow.

Climbing Performance

In the last post breaking down how grand tour winners gain their advantage, we found that about 50% of the gains in the last 19 grand tours came on climbing stages and 15 of those 19 winners have been the top climber in that race. This year’s TDF is one of the toughest climbing Tours ever with five true summit finishes and the largest total KOM climb difficulty of any grand tour since 2013. Already, the two favorites –  Bernal and Thomas – have created small time gaps on every other contender besides Steven Kruijswijk, so those trailing will need to ride aggressively to try to create time gaps.

In this post discussing measuring climbing performance, we proposed a method to evaluate climbing based on the time gaps compared to the 10th place finisher on a stage. We then adjust for the difficulty of the climbing on that stage (with the idea that tougher climbing stages give the opportunity for more time gains). You can measure that resulting Climb Gains metric over multiple years by taking either the median value or the average of the top 75% of values.

Entering this TDF, these are the top climbing performers based on median performance in mountain stages (considering long-term performance back to the 2016 TDF). I’ve ignored Froome, Dumoulin, and Miguel Angel Lopez who are missing this TDF; Lopez would rank just after Bernal, Froome after Martin, and Dumoulin after Valverde.

1. Egan Bernal
2. Dan Martin
3. Romain Bardet
4. Nairo Quintana
5. Rigoberto Uran
6. Mikel Landa
7. Alejandro Valverde
8. Adam Yates
9. Thibaut Pinot
10. Steven Kruijswijk

Thomas ranks 12th in this year’s TDF and Fuglsang is 14th.

As a sense-check, Froome has entered the seven grand tours he’s raced since 2015 ranked 1st in long-term performance five times and 2nd twice. The leader in climbing during that grand tour has ranked 1st in long-term climbing performance five of thirteen times, with a low of 22nd and median of 3rd.

These ratings have been tested on stage level results and also on grand tour climbing performance. Below is a table showing the model probability for a rider who enters ranked N within climbers in that grand tour finishing as the #1 climber and one of the top 3 climbers.

Climbing rank entering Tour de France | Probability of ranking as #1 climber | Probability of ranking as top 3 climber
1st | 28% | 59%
2nd | 15% | 40%
3rd | 10% | 29%
5th | 5% | 16%
10th | 2% | 6%
20th | <1% | 2%
50th | <1% | <1%

The main blindspot for this model is factoring in short-term form for riders who have completely transformed themselves before the race. Eg, Geraint Thomas was not riding like a GC level contender until the Dauphine in 2018; he had ridden 9 straight mountain stages from March 2017 to April 2018 without a single GC level performance. However, he was 4/4 in the Dauphine and would have rated 22nd best in the climbing model going into the TDF.

This year, Fuglsang and Adam Yates are the favorites who have improved the most relative to their long-term performance. Including short-term performance in models improves the performance and comes out as a significant predictor, but it is much noisier and subject to the concerns floated at the end of this post.

INEOS / Sky

For the better part of a decade, we’ve seen climbing stages in the TDF largely controlled by Team Sky who have super-domestiques like Wout Poels and Michal Kwiatkowski to consistently ride at a high tempo on the climbs. In this year’s Tour, I’ll be tracking how long Sky can keep support riders in the GC group, how tempo changes when the likes of Poels and Kwiatkowski drop away, and most importantly who can stick around through the relentless pace-making.

Yellow Jersey

Alaphilippe has some live probability of maintaining the yellow jersey past Stage 6. We’ve seen him win two stages of the Tour last year in the high mountains so there’s at least a chance he can stick with the GC group up to the high reaches of the finishing climb. But he’s ridden 17 stages in his career with similar climbing difficulty as tomorrow (that also end with a summit finish); on only two occasions has he managed to stick with the GC group (in back to back days in the 2016 Dauphine).

With a strong climber like Steven Kruijswijk just 25 seconds back and Bernal/Thomas within a minute, the odds are long. It’s also possible Alaphilippe doesn’t even try to keep it; Stage 8 and 9 are both perfect for him to grab another win, and that’s much more likely if he conserves his energy and loses 5-10 minutes on Stage 6.

Working on the assumption Alaphilippe won’t finish with or near the GC group, Kruijswijk’s path is simple: stick with the GC group of Bernal/Thomas. Besides Bernal/Thomas who can get it simply if Kruijswijk can’t keep their pace, Michael Woods, Enric Mas, and Wilco Kelderman climb well enough, are close enough in the GC, and – crucially – are unlikely to be chased down by INEOS if they do try a move in the closing KMs (unlike Pinot or Uran).

Winning a Grand Tour

Grand tours can be won in numerous ways. Chris Froome won the 2018 Giro on the back of a stunning raid in Stage 19 – picking up over three minutes on the best placed rider ahead of him in the GC. He won the 2013 TDF using five big efforts – two ITTs, two summit finishes, and a team time trial. His teammate Geraint Thomas steadily accumulated time on Tom Dumoulin in the 2018 TDF – never gaining more on a stage than the 53 seconds when Dumoulin crashed on Stage 6. Dumoulin himself relied on gaining over four minutes in two ITTs in the 2017 Giro to best Nairo Quintana by just 31 seconds. Point is, there are many ways to skin a cat to win these things.

Looking at the 19 grand tours won since 2013 (ignoring time bonuses and team time trials – just individual gains on the road), the time gained by the winner on 2nd place has ranged between 30 seconds (Horner over Nibali in the 2013 Vuelta) and 457 seconds (Nibali over Peraud in the 2014 TDF) with an average of 125 seconds – just over two minutes.

Mountains

15 of 19 grand tour winners have beaten their 2nd place rivals in total time in the mountains with Quintana having the two best runners-up finishes in the 2017 Giro (220 seconds gained) and 2015 TDF (46 seconds gained). The median gain by the GC winner on 2nd place is 33 seconds total in the mountains (65 seconds is the average).

No GC winner has ranked worse than 7th best in the mountains; that was Dumoulin’s 2017 Giro which even saw him win a mountain stage. 15 of the 19 winners have ranked #1 in the mountains; take care of that element and you have a great chance of winning the race.

Time trials

In this year’s TDF, there’s only a 27km time trial to separate the pack. The median gain by the GC winner on 2nd is 26 seconds in ITTs (45 seconds is the average). Twelve of 19 grand tour winners have beaten 2nd place when looking just at ITTs; Froome was the best runner-up, gaining over two minutes on Quintana in the 2016 Vuelta, while Quintana has both of the lowlights, losing over four minutes on two occasions (Bardet’s 2016 TDF effort also saw him lose 3.5 minutes).

Froome’s 2015 TDF win was the worst ranked against the clock – but considering there was just the 14km opening TT he lost little time to anyone that mattered and he was in yellow by the third stage.

Primoz Roglic has the best TT effort by a non-winner; he gained almost 2.5 minutes on Carapaz against the clock, but was just 4th best in the mountains.

Other stages

Besides time trials and climbing stages, there’s plenty of other chances to gain or lose time. Eg, last year’s Stage 6 finish at Mur de Bretagne in the TDF saw Thomas gain nearly a minute after Dumoulin crashed. In his 2014 TDF win, Nibali took over three minutes on Peraud on the cobbled Stage 5 (with Froome DNF). In this year’s Giro, Richard Carapaz announced himself with a big win uphill on Stage 4 and then picked up some more time on several rivals on the more intermediate Stage 15.

These gains/losses follow more of a normal distribution centered on a median of 0 seconds exchanged, ranging from a maximum of 202 seconds gained by the winner to a minimum of -55 seconds (ie, 55 seconds lost).

Performance of GC winner in grand tours (vs 2nd place)

distro-gains

Stage type | Winners gaining on 2nd | Ranked #1 in race | Average gain on 2nd | Median gain on 2nd
Mountains | 15 of 19 | 15 of 19 | 65 seconds | 33 seconds
Time trials | 12 of 19 | 5 of 19 | 45 seconds | 26 seconds
Other | 8 of 19 | 1 of 19 | 15 seconds | 0 seconds

The only grand tour winner to rank #1 in gains on the ‘Other’ stages was Vincenzo Nibali in 2014; he escaped on Stage 2 into Sheffield to take yellow, finished third with big time gains on the cobbles in Stage 5, and finished 3rd again in the punchy finish of Stage 8. Before the first mountain Stage 10, Nibali led every realistic GC contender by at least 1.5 minutes.

Looking ahead

Already this year, Thomas and Bernal have gained a little time on every rival except Kruijswijk (ranging from just 8 seconds over Uran and 12 seconds over Pinot to 45 seconds over Quintana/Landa/Valverde and nearly a minute over Bardet).

Stages 3 and 5 both have tricky finishes with sharp climbs which could see a contender get into trouble either by being caught in a crash or being dropped by a breakaway. Note again the time gains by Nibali in 2014 and Thomas in 2018 on these other flat stages with punchy finishes.

All that leads up to Stage 6 which crosses four legitimate 2nd to 1st category climbs before a summit finish at La Planche des Belles Filles. There’s a brutal final KM to the finish there where the GC hierarchy should start to shake out.

Judging Climbing Performance

Judging climbing performance is all about how much separation you can get on your rivals. It’s a function of 1) avoiding getting dropped by the initial pace-making, 2) avoiding getting dropped by attacks/responses to attacks, and 3) being able to launch attacks of your own.

Some stages will – because of the tactical games being played or overall strategic situation – not lead to large time gaps. For example, in stages 11 and 12 of the 2018 Tour de France, Thomas, Froome, and Dumoulin were the strongest riders but with Dumoulin content to follow any attacks and Froome unwilling to attack Thomas’s lead, the three riders were separated by just 24 seconds after 284km and two HC summit finishes. There were 10 and 11 riders within a minute of Thomas on those stages.

However, some stages produce enormous gaps. Chris Froome’s raid in Stage 19 of the 2018 Giro yielded enormous gains – 3 minutes on 2nd place in the stage and over 8 minutes on 10th position.

What we do see is a linear relationship between the difficulty of the climbing on a stage and the time gained by the winner. The stage winner on the toughest mountain stages is expected to gain over 1 second/KM on 10th place compared to just 0.5 seconds/KM on 10th place on the easier mountain stages.

climb-difficulty-vs-separation

This is expected. Take Stage 5 in the 2017 Tour which finished at La Planche des Belles Filles with little climbing prior. Fabio Aru took 24 seconds on Bardet in 5th, 40 seconds on 10th, and 73 seconds on 20th. Divided by the 160km stage he took just 0.15 seconds/KM on 5th, 0.25 seconds/KM on 10th, and 0.46 seconds/KM on 20th.

Compare that stage to Stage 16 in the 2014 Giro which featured three HC-level climbs including Passo Stelvio. Quintana won narrowly over Hesjedal, but took 217 seconds on 5th, 288 seconds on 10th, and 688 seconds on 20th. Over the 140km stage, that’s 1.55 seconds/KM on 5th, 2.1 seconds/KM on 10th, and 4.9 seconds/KM on 20th.
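The per-KM figures for those two stages are just the time gap divided by the stage length. A small sketch of that arithmetic, using only the gaps and distances quoted above (the function and variable names are hypothetical):

```python
def seconds_per_km(gap_seconds: float, stage_km: float) -> float:
    """Time gained on a given finishing position, normalized by stage length."""
    return gap_seconds / stage_km


# (stage length in km, seconds gained by the winner on 5th / 10th / 20th place)
stages = {
    "2017 TDF Stage 5": (160, {"5th": 24, "10th": 40, "20th": 73}),
    "2014 Giro Stage 16": (140, {"5th": 217, "10th": 288, "20th": 688}),
}

for name, (km, gaps) in stages.items():
    rates = {place: round(seconds_per_km(gap, km), 2) for place, gap in gaps.items()}
    print(name, rates)
```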

Huge amounts of climbing don’t necessarily equal big time gaps (this goes back to tactics, race situation, etc), but the difficulty of the climbing on a stage is a crucial part of judging how decisive it will be.

Incorporating climbing difficulty

This post discusses the method for judging the difficulty of climbing on a stage. This model assigns a point value to each climb (0.5 for the weakest category 4 climbs and over 20 for the toughest HC climbs) based on the average gradient, length of climb, and summit elevation. We add those values together to generate the climbing difficulty for a stage (doubling summit finish climbs). This yields a spread between about 15 and 70 for the difficulty of climbing stages.
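To make the stage-level sum concrete, here is a minimal sketch, assuming each climb already has a point value from the climb model. The per-climb values in the example are made up for illustration and the function name is hypothetical; only the "add them up, double the summit finish" rule comes from the description above.

```python
from typing import Iterable, Tuple


def stage_climbing_difficulty(climbs: Iterable[Tuple[float, bool]]) -> float:
    """Sum per-climb point values, counting a summit-finish climb twice.

    Each climb is (point_value, is_summit_finish).
    """
    return sum(points * (2 if is_summit_finish else 1)
               for points, is_summit_finish in climbs)


# Hypothetical stage: a 2nd category-ish climb, a 1st category-ish climb,
# and an HC-ish summit finish. These point values are illustrative only.
example_stage = [(5.0, False), (10.0, False), (18.0, True)]
print(stage_climbing_difficulty(example_stage))  # 5 + 10 + 2 * 18
```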

In the post on judging performance in time trials, we developed a method to estimate performance which was simply time in seconds gained per KM on the 10th place man on that stage. We can develop a similar model to estimate climbing performance, only for this model we’ll use the difficulty of the climbing on that stage instead of its length.

Above we showed the impact of climbing difficulty in terms of seconds/KM, but in reality the length of the stage has no impact on how much time is likely to be gained; it’s solely a matter of the climbing difficulty for mountain stages.

climbs-vs-gains

length-vs-gains

The two shortest stages in the data-set are the 2018 TDF stage 17 and a shortened Tour de Suisse stage in 2016. The TDF stage saw a climbing difficulty of 60, while the Swiss stage saw a climbing difficulty of 15. The 60 difficulty stage saw roughly twice the gains of the 15 difficulty stage (121 vs 56 seconds).

Similarly, difficulty of climbing is what separates the gains on the two longest stages – Stage 4 of 2014 Tirreno-Adriatico (244km) and Stage 15 to Mont Ventoux in the 2013 TDF (242.5km). The climbing difficulty in T-A was 28 vs 45 to Mont Ventoux. The gains on 10th place were 17 seconds in T-A and 128 by Froome in the TDF.

The impact of 1 point tougher climbing is about 3 seconds of time gained by the winner on 10th place such that the winner would be expected to gain 45 seconds in a weak mountain stage and 220 seconds in the toughest mountain stage.

We use this method to adjust climbing performance, measuring actual gains against potential gains. What we’re really interested in is a linear model of the impact of climbing difficulty on the actual gains divided by 80.5 (the median gains on a mountain stage). This yields an adjustment factor.

lm((actual_gains / 80.5) ~ climbing_difficulty)

adjustment_factor = (climbing_difficulty * 0.0393) - 0.04

modeled_gain = (actual_gains) / adjustment_factor

For a 15 difficulty stage this adjustment factor would be 0.55 and for a 70 difficulty stage this adjustment factor would be 2.71. To find Climb Gains we divide the gains on 10th place by this adjustment factor such that a 15 difficulty stage where 60 seconds were gained would be modeled as a gain of 109 seconds and a 70 difficulty stage where 60 seconds were gained would be modeled as a gain of 22 seconds. The metric is seconds gained per point of climbing difficulty.
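The worked numbers above can be reproduced with a few lines. This is a sketch using the coefficients quoted in the formulas (0.0393, -0.04, and the 80.5 second median); the function names are hypothetical. The expected-gain helper assumes the fitted line can simply be scaled back up by 80.5, which lands close to the 45 and 220 second figures mentioned earlier.

```python
MEDIAN_GAIN = 80.5   # median seconds gained on 10th place in a mountain stage
SLOPE = 0.0393       # coefficient on climbing difficulty from the linear model
INTERCEPT = -0.04    # intercept from the linear model


def adjustment_factor(climbing_difficulty: float) -> float:
    """Expected gains on a stage relative to the median mountain stage."""
    return climbing_difficulty * SLOPE + INTERCEPT


def modeled_gain(actual_gains: float, climbing_difficulty: float) -> float:
    """Gains on 10th place, adjusted for how hard the stage's climbing was."""
    return actual_gains / adjustment_factor(climbing_difficulty)


def expected_winner_gain(climbing_difficulty: float) -> float:
    """Assumed inverse of the scaling: seconds the model expects on 10th place."""
    return MEDIAN_GAIN * adjustment_factor(climbing_difficulty)


for difficulty in (15, 70):
    print(
        difficulty,
        round(adjustment_factor(difficulty), 2),
        round(modeled_gain(60, difficulty)),
        round(expected_winner_gain(difficulty)),
    )
```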

Going back to Froome’s Stage 19 raid in the 2018 Giro, it rates out at 4.2 – one of the strongest stage wins of the year – while Thomas’s Stage 11/12 wins are more steady results – 0.3 and 0.5 – two of the weaker victories. However, those efforts are exactly what is needed to win grand tours. Since 2013, the median mountain stage by a GC winner has rated out at between 0.3 and 0.4. Put six of those together and you’ve probably won the race.

Further work

There’s still potential work to be done refining how to measure gains. Eg, measuring gains off of 10th place was chosen because it was close to the average finish position of the GC winner in grand tour mountain stages (8th), but it’s possible an ensemble metric considering gains on 3rd, 5th, 10th, 20th, etc could lead to better results.

Going along with that is factoring in the impact of breakaways. For example, in the 2018 Tour de France on Stage 16 the yellow jersey group with Thomas, Froome, Dumoulin, and the other contenders finished nearly 9 minutes back of the stage winner and 8 minutes back of 10th place because 20 riders stayed away in the break all day. This meant Geraint Thomas’s gains on 10th place in the six mountain stages were 0, 12, 53, 59, 74, and -454! In the same way, Julian Alaphilippe gained 78 and 203 seconds in the stages he won and lost 726, 1077, 1432, and 1698 seconds in the other mountain stages.

It’s possible to account for this by taking the median performance or top 75% of performances, but it’s not a perfect situation and one where there’s no obvious (programmable) solution which works consistently.
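To see how much the choice of summary matters, here is a small Python sketch using Thomas’s six gains quoted above. The helper name and the decision to round the "top 75%" count up are assumptions for illustration; the text does not say how that cut is made.

```python
import math
from statistics import mean, median


def top_share_mean(gains, share=0.75):
    """Mean of the best `share` of results (the count is rounded up)."""
    keep = math.ceil(len(gains) * share)
    best = sorted(gains, reverse=True)[:keep]
    return mean(best)


# Thomas's gains on 10th place in the six 2018 Tour mountain stages (from the text).
thomas_2018 = [0, 12, 53, 59, 74, -454]

print(round(mean(thomas_2018), 1))   # -42.7, dragged down by the breakaway stage
print(median(thomas_2018))           # 32.5
print(top_share_mean(thomas_2018))   # 39.6, using the best five of six
```

Both robust summaries ignore the breakaway stage here, but neither helps a rider like Alaphilippe, whose losses outnumber his gains.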

Motivation is another factor not captured here – in two different ways. First, GC contenders will often not race competitively when targeting grand tour wins. Eg, look at some of Armstrong’s early season results where he’s content to just get some racing in his legs or Froome’s results this year (where he finished 17-97-146-148 in the four mountain stages before the Dauphine). Again, we can account for that by taking just the top 75% of results, but it’s not a perfect situation.

At the same time, motivation/race strategy might mean a certain rider is content just to hold the wheel in certain stages (either in preparation for or recovery from a big effort) and won’t search out time gaps even if they theoretically had the power to do so.

Some of these items could be reasonably accounted for, others are unavoidable results of attempting to judge performance from stage results (a blunt tool when compared to the sharp edge available to teams with power meters and other data).

Best Climbers

The full list of top climbers will come out prior to Stage 6 to La Planche des Belles Filles.

Judging Climbing Stage Difficulty

It’s important to be able to judge the difficulty of the climbing on a stage – both to measure performance and to categorize each stage.

Utilizing length of climb, average gradient, and elevation at the summit, we can categorize the difficulty of climbs across Tours at a more granular and accurate level than the HC to 4th category system. More importantly, we can apply this model objectively across all races. This allows us to measure that a Category 1 climb in a small race is actually a weak Cat 2 climb when compared to grand tour climbs.

Our training data are the categorized climbs of the Grand Tours between 2013 and 2019 – as judged by the race organizers (this is based loosely off an objective system as well). Based on the common KOM points given out, we assign a difficulty value of 20 to HC, 10 to 1st category, 5 to 2nd category, 2 to 3rd category, and 1 to 4th category. That is our dependent variable.

I ran a number of regression models to obtain the model which explained the most variance. The best result was:

KOM_POINTS ~ gradient + length + summit + gradient:length

I then applied this model to all climbs across the full data-set of races.
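The model formula above has an intercept, three main effects, and one interaction term. A minimal Python sketch of how a single climb turns into predictors is below; the fitted coefficients are not published in this post, so `coefs` is a placeholder the caller must supply, and the example climb is hypothetical.

```python
# Target values assigned to each organizer category, as described in the text.
CATEGORY_POINTS = {"HC": 20, "1": 10, "2": 5, "3": 2, "4": 1}


def design_row(gradient_pct, length_km, summit_m):
    """Predictors for KOM_POINTS ~ gradient + length + summit + gradient:length."""
    return [1.0, gradient_pct, length_km, summit_m, gradient_pct * length_km]


def predict_kom_points(coefs, gradient_pct, length_km, summit_m):
    """Apply coefficients (intercept first) to one climb; `coefs` is user-supplied."""
    row = design_row(gradient_pct, length_km, summit_m)
    return sum(c * x for c, x in zip(coefs, row))


# Hypothetical climb: 10 km at 8% topping out at 1500 m.
print(design_row(8.0, 10.0, 1500.0))  # [1.0, 8.0, 10.0, 1500.0, 80.0]
```

The `gradient:length` term is what lets a long, steep climb score more than the sum of its length and gradient effects alone.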

The most difficult climb by this method is Colle delle Finestre – often seen in the Giro – which rated out as a 22.7 point climb (beyond category even among the beyond category climbs). Finestre is 18.4km at 9.1% topping out at 2175m. The median HC climb in Grand Tours in the data-set was 14.0km at 7.4% topping out at 1860m.

Other toughest climbs are Stelvio (20.0) and Mortirolo (19.5) from the Giro and Rettenbachferner (20.0) from Tour de Suisse.

The toughest in the Tour de France are Col de la Madeleine, Mont Ventoux, and Col du Portet at around 19.5 points. Angliru (17.5) is the toughest in the Vuelta.

Other notable climbs are Monte Zoncolan (18.0), Plateau de Beille (16.0), Col du Galibier (15.5), and Alpe d’Huez (15.0). Val Thorens – which could decide the GC in the final mountain stage of the 2019 Tour de France – is an 18.8 thanks to its 33.8km length and nearly 2400m summit.

To find the stage difficulty, add up the individual KOM_POINTS for each categorized climb. Summit climbs where the race ends within 2km of the summit of the climb are doubled to account for greater intensity on a final climb.
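The aggregation rule can be sketched in a few lines of Python. The input layout (pairs of KOM points and distance from summit to finish) and the example distances are assumptions for illustration.

```python
def stage_difficulty(climbs, finish_within_km=2.0):
    """Sum KOM points, doubling any climb whose summit is near the finish.

    `climbs` is a list of (kom_points, km_from_summit_to_finish) pairs.
    """
    total = 0.0
    for kom_points, km_to_finish in climbs:
        weight = 2.0 if km_to_finish <= finish_within_km else 1.0
        total += weight * kom_points
    return total


# Hypothetical stage: a Cat 1 and a Cat 2 mid-stage, then an HC summit finish,
# each at its nominal value (10, 5, 20).
print(stage_difficulty([(10.0, 60.0), (5.0, 30.0), (20.0, 0.0)]))  # 55.0
```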

By this method, the toughest stage since 2013 is Stage 11 of the 2015 Vuelta, which combined five climbs reaching over 1750m, including a summit finish. The toughest Tour de France stage was Stage 12 in 2018, which combined two massive HC climbs of Madeleine and Croix de Fer with a summit finish on Alpe d’Huez.

When judging stages on this objective level, it becomes obvious how much the grand tours stand out in difficulty. Of 52 stages/races judged at least a 45 in climbing difficulty, 40 came in the Giro/Tour/Vuelta. Eight others came in the TDF warm-ups (Dauphine & Tour de Suisse). The rest of the racing calendar provided just four stages with that level of climbing difficulty.

mtn-difficulty

In addition, those 52 stages came at a median position of Stage 16 in grand tours, meaning the riders have over 2500 km in their legs already.

Looking ahead

By this method, the toughest stage for climbing in the 2019 Tour de France will be Stage 20 with the summit finish at Val Thorens. This stage starts with the Cormet de Roselend, which at 13.9 on the KOM scale is the fourth toughest climb in the race. That’s followed by the strong Cat 2 Côte de Longefoy (6.9). The climb to Val Thorens (18.8) is nearly 34 km at over 5%, with about 6 km of slight downhills/flats.

tdf-climbs-stats

Above is a plot of length and gradient for Tour de France categorized climbs.

 

Measuring Performance by Discipline

There are broadly four domains in professional cycling: sprints, climbs, time trials, and all-around (the sort of ability which wins one day races, breakaways, and short uphill finishes). Measuring performance – with an eye to using it predictively – is necessarily different in each area.

  • For sprints, there are no time gaps so finish position is what matters.
  • For time trials, the clock is all that matters.
  • For climbing, generating time gaps is critical, but there needs to be an opportunity to do so.
  • For one day races or breakaways, it can be just as much about how often you make the break or key move as how often you finish it off with a win or podium place.

To measure each, I’ve developed metrics to answer the key question posed by each discipline:

SPRINTS: how high are you finishing?

CLIMBS: how much separation can you generate on your opponents?

TIME TRIALS: how fast can you go against the clock?

The necessary first step before measuring performance in each discipline is actually classifying races between disciplines (see this earlier post).

We’ll cover Sprints, Uphill Finishes, and Time Trials today, with analysis of performance in mountain stages to come later this week.

Sprints

For sprints, I’ve developed the metric Log Points. Riders get points for finishing in 10th place or better on a sprint stage based on a formula which heavily weights finishing in the highest positions.

Finish Position | Points Earned
1st | 1.00
2nd | 0.49
3rd | 0.31
4th | 0.21
5th | 0.15
6th | 0.11
7th | 0.08
8th | 0.05
9th | 0.03
10th | 0.01

This means riders who consistently finish in the top three will be ranked around 0.5. Riders who finish outside the top 5 will be ranked around 0.05. This reflects how wide the gulf is between the top sprinters and the rest – the top sprinter in each Grand Tour since 2013 has won 40% of the sprint stages they’ve contested.
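A small Python sketch of the averaging, using the points table above as a lookup. Two assumptions are baked in: finishes outside the top 10 score zero (implied by the text), and the average runs over every sprint stage a rider contests.

```python
# Points per finish position, copied from the table in the text.
LOG_POINTS = {1: 1.00, 2: 0.49, 3: 0.31, 4: 0.21, 5: 0.15,
              6: 0.11, 7: 0.08, 8: 0.05, 9: 0.03, 10: 0.01}


def average_log_points(finish_positions):
    """Mean Log Points over a rider's sprint stages (zero outside the top 10)."""
    if not finish_positions:
        raise ValueError("need at least one stage")
    points = [LOG_POINTS.get(pos, 0.0) for pos in finish_positions]
    return sum(points) / len(points)


print(round(average_log_points([1, 2, 3]), 2))         # 0.6
print(round(average_log_points([6, 7, 8, 9, 10]), 2))  # 0.06
```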

The ranking since the start of 2018 shows Groenewegen out in front followed by Sam Bennett, and a trio of Gaviria, Viviani, and Ackermann.

Rider | Average Log Points | Stages
Dylan Groenewegen (tdf) | 0.49 | 26
Sam Bennett | 0.44 | 43
Fernando Gaviria | 0.40 | 39
Elia Viviani (tdf) | 0.40 | 55
Pascal Ackermann | 0.40 | 25
Peter Sagan (tdf) | 0.37 | 42
Michael Matthews (tdf) | 0.35 | 18
Caleb Ewan (tdf) | 0.27 | 46
Jasper Philipsen (tdf) | 0.26 | 16
Arnaud Demare | 0.26 | 32

Yearly leaders since 2015 are:

  • 2015: Peter Sagan (0.45)
  • 2016: Peter Sagan (0.52)
  • 2017: Peter Sagan (0.50)
  • 2018: Dylan Groenewegen (0.52)
  • 2019: Sam Bennett (0.55)

By this method, Peter Sagan’s last 18 months have been a level below his 2015-17 performance (0.37 vs 0.49).

Looking at odds for the TDF Stage 1 field sprint, Pinnacle has Groenewegen ~35%, Viviani ~25%, Ewan ~20%, and Sagan ~15% to win.

Uphill Finishes

Stages with a sharp uphill finish are a unique breed of flat stage. Like stages designated as sprint stages, they tend to be long, fast, and mostly flat.

Sprint stages have an average climb difficulty of just 3.5 vs 4.4 for uphill finish stages (where 40+ is high mountains). The difference is sprint finishes have an average gradient of 0% in the final kilometer vs 5% for uphill finishes. These can range from selective, sharp closing climbs like Mur de Bretagne in last year’s Tour de France to milder rises like the stage immediately before it into Quimper.

Log Points is again the measure on these stages. Time gaps taken tend to be small – a median gain by the stage winner of 3 seconds on 10th and 9 seconds on 20th (versus 0 seconds on sprint stages) – so finish position is again the crucial metric.

Ranking the peloton since the start of the 2017 Tour de France sees Valverde, Alaphilippe, and Caleb Ewan as the far and away top three. Sagan comes in fifth and we see some other familiar faces like Daryl Impey and Greg van Avermaet.

Rider | Average Log Points | Stages
Alejandro Valverde | 0.46 | 6
Julian Alaphilippe | 0.45 | 9
Caleb Ewan | 0.44 | 9
Magnus Cort | 0.30 | 5
Peter Sagan | 0.29 | 12
Sonny Colbrelli | 0.27 | 7
Carlos Barbero | 0.24 | 10
Richie Porte | 0.22 | 9
Jasper Stuyven | 0.20 | 6
Daryl Impey | 0.20 | 10

It’s important to remember that the sample size for these types of races is much smaller than for sprint stages.

One missing man is excluded for that reason; Michael Matthews has just two of these types of races in the last two years, but dating back to 2013 he’s behind only Alaphilippe, Sagan, and Valverde on these types of stages.

Time Trials

To measure time trial performance, the obvious elements to consider are your time and the distance of the race. It’s typical for short prologues to be won by a handful of seconds (13 seconds separated 1st and 50th in the 4km 2019 Romandie prologue vs 98 seconds between 1st and 50th in the 17km Stage 5 ITT), while longer TTs create more separation.

Another crucial factor is that the time trial is relevant for only a handful of riders – GC competitors and TT specialists – meaning we care much less about how well the 50th placed rider performed than the 10th placed rider. In Grand Tours since 2013, the median rank for the eventual GC winner in time trial stages is 5.5 with a median gain of 0.4 seconds/KM on the 10th place finisher in the stage.

Our metric will be Relative Speed and will simply be the seconds gained on 10th place in the stage divided by the length of the course in kilometers.

For example, Fabio Aru entered the 2015 Giro Stage 14 ITT leading Contador by 19 seconds. The 59km course is the longest in recent Grand Tours meaning Contador had a lot of opportunity to pick up time on Aru. Contador gained 82 seconds on 10th place, while Aru lost 85 seconds – a difference of nearly three minutes which would give Contador his winning margin in the race. Contador gained 1.4 seconds/KM, while Aru lost 1.4 seconds/KM (a Relative Speed of -1.4).

The median Relative Speed for time trial winners in Grand Tours since 2013 is about 3.0 seconds/KM. In this year’s Tour de France Stage 13 ITT is 27km which translates into a projected gain of about 81 seconds for the winner over 10th place on the stage.
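The Relative Speed arithmetic is simple enough to check in a few lines of Python. The function names are illustrative; the inputs are the figures quoted above.

```python
def relative_speed(gain_on_10th_s, course_km):
    """Seconds gained on 10th place per kilometer (negative means time lost)."""
    return gain_on_10th_s / course_km


def projected_gain_s(rel_speed, course_km):
    """Expected gain on 10th place over a course of the given length."""
    return rel_speed * course_km


contador = relative_speed(82, 59)   # 2015 Giro Stage 14
aru = relative_speed(-85, 59)
print(round(contador, 1), round(aru, 1))  # 1.4 -1.4
print(projected_gain_s(3.0, 27))          # 81.0 seconds for a typical winner
```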

However, going by this method is not perfect as uphill TTs like Stage 1 in this year’s Giro provide a much larger opportunity to gain time than a flat TT. Eg, Roglic took 5 seconds/KM on 10th place over 8km (not all uphill) in that stage, while Wout van Aert took 2.7 seconds/KM on 10th place over 26km in the much flatter Dauphine TT this year.

We add an adjustment to relative_speed based on the total vertical meters gained divided by the length of the stage (eg, in that Stage 1 Giro TT the route moved uphill 203m over 8km so 0.026). The equation is:

adjustment_factor = 0.6 + (31.1 * (total_meters_gained / length_in_m))

adj_relative_speed = relative_speed / adjustment_factor 

For that Giro example, the adjustment_factor was 1.4 which yields an adj_relative_speed of 3.6 for Roglic – still very strong when compared with other TTs.
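Here is the same adjustment as a short Python sketch, using the two equations above. Plugging in the rounded figures from the text (203m over 8km) gives a climbing ratio of about 0.025, and the factor still rounds to 1.4; the function names are illustrative.

```python
def tt_adjustment_factor(total_meters_gained, length_in_m):
    """Adjustment for how much of the time trial goes uphill."""
    return 0.6 + 31.1 * (total_meters_gained / length_in_m)


def adj_relative_speed(relative_speed, total_meters_gained, length_in_m):
    """Relative Speed scaled down on hilly courses and up on flat ones."""
    return relative_speed / tt_adjustment_factor(total_meters_gained, length_in_m)


factor = tt_adjustment_factor(203, 8000)        # 2019 Giro Stage 1
roglic = adj_relative_speed(5.0, 203, 8000)
print(round(factor, 1), round(roglic, 1))       # 1.4 3.6
```

Note that a perfectly flat course gets a factor of 0.6, so flat time trial results are scaled up rather than left unchanged.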

Since the start of the 2017 Tour de France, Rohan Dennis and Tom Dumoulin are the stand-outs in the TT. Against the strongest opposition in grand tours and the world championships, Dumoulin has finished 5th, 2nd, 1st, 3rd, 1st, and 1st in the last 24 months.

Rider | Median Relative Speed | Stages
Rohan Dennis | 2.3 seconds / KM | 13
Tom Dumoulin | 1.9 seconds / KM | 10
Victor Campenaerts | 1.5 seconds / KM | 15
Stefan Kung | 1.3 seconds / KM | 15
Soren Kragh Andersen | 1.1 seconds / KM | 9
Michal Kwiatkowski | 1.0 seconds / KM | 13
Tony Martin | 0.9 seconds / KM | 14
Primoz Roglic | 0.8 seconds / KM | 14
Patrick Bevin | 0.7 seconds / KM | 12
Chris Froome | 0.6 seconds / KM | 9

None of the GC contenders on this list are in the 2019 TDF field, but Bernal (0.5 seconds / KM) and Thomas (0.2 seconds / KM) have both performed well in this discipline.

Equally important when talking about time trials and the TDF is how bad certain riders are in this discipline. It’s no secret Romain Bardet struggles; he’s lost about 4.3 seconds/KM to 10th place (a Relative Speed of -4.3) in ITTs in the last 24 months. Comparing him to Geraint Thomas – for example – sees him losing about 120 seconds over the 27km Stage 13 ITT this year.

Most other GC contenders are between -1.5 and -3.5 seconds / KM (Porte, Pinot, Yates, Martin, and Nibali are all in this zone). Jakob Fuglsang (-0.5 seconds / KM) is probably the best placed GC contender to tangle with Thomas/Bernal in this discipline; he should be able to keep his losses under 30 seconds.
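The head-to-head projections above follow from the difference in Relative Speed multiplied by the course length. A quick Python check, with an illustrative function name and the rider figures quoted in this section:

```python
def projected_gap_s(speed_a, speed_b, course_km):
    """Seconds rider A is expected to take on rider B over the course."""
    return (speed_a - speed_b) * course_km


COURSE_KM = 27  # 2019 Tour de France Stage 13 ITT

print(round(projected_gap_s(0.2, -4.3, COURSE_KM), 1))  # Thomas vs Bardet: 121.5
print(round(projected_gap_s(0.2, -0.5, COURSE_KM), 1))  # Thomas vs Fuglsang: 18.9
print(round(projected_gap_s(0.5, -0.5, COURSE_KM), 1))  # Bernal vs Fuglsang: 27.0
```

These are point estimates from medians over small samples, so they are best read as rough orders of magnitude.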

Assigning Stage Types

Inherent in any analysis of cycling is a division of races into categorical types – sprint stages, mountain stages, time trials, etc. In this post I’ll discuss several processes to categorize one day races, multi-day stage races, and grand tour stages into appropriate groups for analysis.

This classification problem requires a large amount of data to be gathered along with significant feature engineering. My first attempt to categorize stages came solely from race and stage results; stages are already assigned KOM points and sprint points, the actual timing results are strongly indicative of the type of stage (eg, sprint stages will have very little or no separation between the times of the top finishers while mountain stages will have significant separation throughout the peloton), and the speed and length of stages inform as well.

I first used Principal Component Analysis to find the crucial features which separated stages. The first PC was always the KOM difficulty of climbing and the second PC was always strongly influenced by the distance of the stage (separating out short mountain stages from long one day classic races). Other features like the standard deviation in time for the top 40 finishers and the speed of the stage were also strong indicators of stage type.
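Two of the results-based features mentioned above can be computed directly from a stage’s finishing times. The Python sketch below is one possible reading: it uses the population standard deviation and the winner’s time for the stage speed, both of which are assumptions since the text does not pin down either choice, and the sample times are made up.

```python
from statistics import pstdev


def results_features(distance_km, finish_times_s, top_n=40):
    """Winner's average speed and time spread among the top finishers."""
    times = sorted(finish_times_s)[:top_n]
    winner_hours = times[0] / 3600.0
    return {
        "avg_speed_kph": distance_km / winner_hours,
        "top_n_time_sd_s": pstdev(times),
    }


# Made-up finishing times: a bunch finish vs. a stage that splits the field.
bunch = results_features(180.0, [14400.0] * 40)
split = results_features(180.0, [14400.0 + 30.0 * i for i in range(40)])
print(bunch)
print(split)
```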

I applied these PCs to cluster the stages using K Means and experimented with between 3 and 5 clusters (ignoring time trials) with the hope that the data would sort itself into sprint/intermediate/flat with uphill finish/medium mountain/high mountain or some similar break-down.
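For readers unfamiliar with K Means, the sketch below is a bare-bones Python version of the standard algorithm (Lloyd’s iterations). It is not the code used for this analysis: it skips the PCA step, a library such as scikit-learn would be the usual choice in practice, and the two-feature stage data is invented purely to show the mechanics.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster stays put
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Invented (climbing difficulty, top-40 time spread) pairs on a common scale.
stages = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
centroids, labels = kmeans(stages, k=2)
print(labels)
```

With well separated groups like these the clusters are easy to recover; the difficulty described next comes from stages whose results-based features overlap.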

However, inaccurate classification abounds with this process; the three profiles below all award very similar KOM points. Stage 5 2016 is 216km, raced at 39 kph, with a standard deviation of 93 seconds between top 40 finishers. Stage 10 2016 is 197km, raced at 45 kph, with a standard deviation of 236 seconds between top 40 finishers. Stage 5 in 2017 is 161km, raced at 43 kph, with a standard deviation of 44 seconds. None of this information is that informative; each stage was classified the same as either ‘intermediate’ or ‘medium mountain’ depending on the number of clusters used.

tour-de-france-2016-stage-5-profile
Stage 5 in 2016

tour-de-france-2016-stage-10-profile
Stage 10 in 2016

tour-de-france-2017-stage-5-profile
Stage 5 in 2017

But, one is clearly a stage with a summit finish which separated GC contenders (Stage 5 in 2017), while one is clearly a stage the GC contenders won’t contest (Stage 10 in 2016). Just using results data to accurately parse the difference between mountain and intermediate or intermediate and sprint was not possible.

Instead I was forced to gather actual race route data which shows elevation and distance throughout each stage. Linking that data up with data showing categorized climbs gave me a much richer data-set for judging the difficulty of climbing (one for a future post) and for separating stage types.

Some of the helpful features for dividing stages were:

  1. gradient in the final 1 KM
  2. concentration of climbing difficulty (basically the difficulty of the toughest climb in a stage)
  3. total elevation change (the amount of uphill or downhill distance in a stage)
  4. number of categorized climbs
  5. overall climbing difficulty
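Two of the route-based features in the list above can be derived from a plain elevation profile. The Python sketch below assumes the route is a list of (distance in km, elevation in m) points and reads "total elevation change" as meters climbed and descended; both the data layout and that interpretation are assumptions, and the sample profile is made up.

```python
def route_features(profile):
    """Summarize a route given (distance_km, elevation_m) points in race order."""
    if len(profile) < 2 or profile[-1][0] - profile[0][0] < 1.0:
        raise ValueError("need at least 1 km of route data")

    climbed = descended = 0.0
    for (_, e0), (_, e1) in zip(profile, profile[1:]):
        if e1 >= e0:
            climbed += e1 - e0
        else:
            descended += e0 - e1

    finish_km, finish_elev = profile[-1]
    target_km = finish_km - 1.0
    elev_1km_out = finish_elev
    for (d0, e0), (d1, e1) in zip(profile, profile[1:]):
        if d1 > d0 and d0 <= target_km <= d1:
            # Linear interpolation between the two surrounding samples.
            elev_1km_out = e0 + (e1 - e0) * (target_km - d0) / (d1 - d0)
            break

    return {
        "final_km_gradient_pct": (finish_elev - elev_1km_out) / 1000.0 * 100.0,
        "meters_climbed": climbed,
        "meters_descended": descended,
    }


demo = [(0.0, 100.0), (98.0, 100.0), (100.0, 200.0)]  # made-up profile
print(route_features(demo))
```

The remaining features in the list (concentration of difficulty, number of climbs, overall difficulty) come from the categorized climb data rather than the raw profile.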

I achieved better classification results (as judged by my manual classification of historical Tour de France stages) using this method (along with the same PCA/K Means methods from earlier), but still wasn’t getting the precision necessary. Frustratingly, the Stage 5 2017 finish at La Planche des Belles Filles was being classified either as similar to Stage 6 2018 ending on Mur de Bretagne or as similar to the intermediate/hilly stages from earlier.

I attempted to supplement this process by including decision trees trained on previously tagged Tour de France stage data (about 12% of overall stages were tagged). This again provided additional precision, but not the quantum leap needed.

I’ll discuss how I bridged the gap and finally solved this problem using random forests (a model using many decision trees) in a future post.

Introduction

Professional cycling has seen an explosion of statistical content in recent years. We can follow the top professionals on Strava, track GPS data in the most important races, and find what riders have the best historical times on climbs like Alpe d’Huez or Mortirolo.

As a professional working in the sports analytics industry, what I see missing is careful analytical work to provide context to all of that data. The data revolution has already impacted other professional sports like baseball, basketball, and tennis – engaging the public with smarter conversations and more predictive content. This site is designed to fill that niche.

I’ll be producing data-driven content to answer questions like who is likely to win today’s Tour de France stage, how much does form carry over in sprinting vs climbing, what’s the toughest monument, or how much does Grand Tour form matter?

Ahead of the Tour de France I’ll be posting introductory pieces to ‘show my work’. During the Tour, more of the content will be focused around key stages and opportunistic stories that arise. Post-Tour, there will be time for deeper analysis that digs into some of these new data sources.