Schemes and collusions

All the countries do the same. There are schemes, there is collusion. And if France did not scheme, it wouldn’t win any medals.

This phrase, attributed to Philippe Candeloro, comes from Joy Goodwin’s book The Second Mark. The book is mainly dedicated to the scandal in the Pairs competition at the 2002 Olympic Games, a competition that, as we all know, changed figure skating forever. But Goodwin’s book is not only about that competition: it describes an environment in which illicit agreements and pressure seem to be the norm. Everyone knows everything, but it cannot be said, because whoever speaks is out, whether it is an athlete who would be penalized even more in the scores or a judge who would be suspended.

I’ve written a lot about national bias in recent months. It’s the easiest thing to identify, but it’s not the only problem. In 2002 there was a vote exchange: the French judge helped the Russian Pair, and the Russian judge was supposed to help the French Ice Dancers. As it turned out, both Fusar Poli/Margaglio and Bourne/Kraatz fell during the free dance, so Anissina/Peizerat wouldn’t have needed help to win anyway, but nobody could have known this in advance. The reigning world champions at that time were the Italians, not the French.

And this is not the only case, only the best known. In 1999 the Russian Sviatoslav Babenko and the Ukrainian Alfred Koritek were filmed exchanging signals to decide which marks to assign to the skaters during a World Championship. Obviously they were suspended, but unfortunately they were both reinstated. According to Jon Jackson, in 2000 the Russian Myra Oblasova allegedly made agreements to manipulate the result of the Junior Grand Prix Final, forcing Elena Fomina to collaborate with her. I doubt the ISU has ever investigated how much truth there is in Jackson’s words. In 2013 the Ukrainian Natalia Kruglova was suspended because she asked another judge to manipulate the result of a competition. In 2017, Belarusian Alexandre Gorojdanov and Russian Maira Abasova may have manipulated the result of a competition. In the latter case, we have a Belarusian and a Russian interested in the outcome for two Spanish ice dance couples. How do you discover agreements of this type?

These are all episodes in which two judges agree on a certain result, not cases of a single judge who acts autonomously. And when there is an agreement between two judges, it is more difficult to see that the result has been manipulated. It is difficult whether there is a conscious agreement or the judges spontaneously assign absurd marks, guided by national bias.

I explain the problem with an example. This, taken from SkatingScores, is Yuzuru Hanyu’s protocol in the short program of the 2016 Grand Prix Final.

I have only deleted the nationality of the judges and highlighted each judge’s total: in green when the sum of the marks awarded by that judge is more than 1.50 points higher than the final mark, in red when it is more than 1.50 points lower. In this case, with judges 8 and 9 deviating only slightly from the average, how did the other judges behave? Was it Judges 1, 4 and 7 who were too generous, or Judges 2, 3, 5 and 6 who were too strict?
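The highlighting rule can be written down as a small check. This is only a sketch of the rule as described here: the function name `highlight` and the sample scores are invented for illustration and are not taken from the protocol.

```python
def highlight(judge_total, final_score, threshold=1.50):
    """Return the colour used for one judge's total.

    Assumption: 'green' means more than `threshold` points above the
    final score, 'red' means more than `threshold` points below it.
    """
    diff = judge_total - final_score
    if diff > threshold:
        return "green"
    if diff < -threshold:
        return "red"
    return "none"


# Hypothetical totals, not real protocol data.
print(highlight(108.20, 106.00))  # more than 1.50 above the final score
print(highlight(104.00, 106.00))  # more than 1.50 below the final score
print(highlight(106.90, 106.00))  # within 1.50 of the final score
```

A check like this only says who is far from the panel average. It cannot say whether the average itself is right.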

The quadruple loop received marks from 0 to -2. I invite you to watch the program keeping in mind bullets and deductions.

The jump is preceded by a spread eagle, +1. It has good height and distance, +1. It’s on the music, +1. Three bullets, GOE +1. The only applicable deduction is the one for a weak landing, from -1 to -2. The final mark had to be between 0 and -1. Judges 6 and 8 (curiously, the latter is one of those whose marks are close to the final average) were too strict. The other marks are correct.

If we look at the marks for the combination, one +1, three +2 and five +3, we might think that the combination deserves a +2 or a +3, that one judge was a little strict, but that the others gave a correct mark. Again, watch the video.

The combination respects all bullets from 2 to 8. The combination deserves a +3, period. Judges 2, 3, 5 and 6 gave an incorrect mark. Checking also the marks given to the other skaters I noticed that, curiously, Judge 2 awarded Javier Fernandez’s combination +3. It fits, Fernandez’s combination also respects seven bullets, but Hanyu’s combination has a better flow, so how is it that Fernandez’s mark is higher?

With the triple Axel it is even worse: there is still a +1, and there are four +2s. But the jump respects all 8 bullets. I too see that the knee is a little bent, but if for another skater a jump landed like this is effortless, then it must also be for Hanyu. And even without bullet 7, there are still seven, so this triple Axel deserves a +3.

Doing the math, how were the three jumping elements evaluated by the judges?

Oh, curious. If we look not only at the marks but also how the elements were performed, we find that the two judges who were far above the average actually gave correct marks, while all the judges who were below, and even one of those who was close to the average, were too strict. This means that the average is lower than it should be because as many as five judges have assigned wrong marks. And how is it that such a thing can happen, even if the judges do not make illicit agreements between them? Perhaps, if I write the nationality of the judges, and indicate which skaters participated in the competition, the problem can become clearer.

One of the judges who was above average (a wrong average, because it was too low) is the Japanese judge. In reality the Japanese judge was slightly generous on the quadruple loop, where he could have assigned a -1, but according to the rules both a -1 and a 0 would be a correct evaluation for that jump. On the other hand, three of the judges who were strict with Hanyu had at least one compatriot in the competition as Hanyu’s direct rival: the Spanish judge Marta Olozagarre (the judge who gave a +3 to Fernandez’s combination), the American Lorrie Parker and the Canadian Beth Crane. In such a situation, even without agreements, it can be difficult to detect a biased mark; it needs a close look. A look at the protocols, at the rules and at the program. The marks of the Russian judge, who had no direct interest in the competition, were also low, but the Russian judge was strict with almost everyone. I made a table.

Let’s look first at lines 1-7. Column M is the Total Segment Score. In column N I have indicated the marks assigned by the Japanese judge to each skater. Column P indicates the marks of the Spanish judge, column R those of the US judge and so on. Column O indicates how far the Japanese judge’s marks differ from the TSS. Column Q indicates how far the Spanish judge’s marks differ… I did this kind of calculation for all the judges. In lines 11-17, I only kept the data from columns O, Q, S, U, W, Y, AA, and AE, so it’s easier to read the numbers. I highlighted in bold when a judge gave much higher marks than the TSS, in red when the marks were much lower. The black-bordered boxes indicate when the skater and the judge are compatriots. The Russian judge was strict with Hanyu, but he tended to be a strict judge, so he wasn’t particularly unfair to him. A similar explanation holds for the French judge: he was generous but, beyond the fact that his marks were correct, he was generous with most of the skaters, so he did not help Hanyu. In theory Hanyu was helped by the Japanese judge; in practice it was the other judges who lowered Hanyu’s marks to try to help their compatriots.

I only looked at the jumps, not at the spins and steps, or the components, but Let’s Go Crazy, both at the Grand Prix final and at the World Championship, despite an imprecise jumping element (here the landing of the quadruple loop, at the World Championship the combination) is an extraordinary program, and in my opinion assigning it a mark lower than 9.75 in any of the components and 10.00 in Interpretation, is a crime.

Even without compatriots, sometimes the marks are really absurd. I have already talked here about how underrated Hanyu’s short program at the World Team Trophy was. A program in which six of the seven elements deserved +5, the remaining one deserved a 0 (that time I wrote in Italian; for those interested only in the GOE of the 3A, the explanation is under the photos) and the components deserved at least 9.75. At least. And there is another absurd detail that I have just noticed.

Hanyu missed the landing of the triple Axel. This is an almost unimaginable mistake for him. If we look at the triple Axels he has executed in international senior competitions since the 2010-2011 season, once, at the 2013 Finlandia Trophy, he made a mistake and only performed a single Axel, and at the 2021 World Team Trophy he got a negative GOE; all the other 49 times he got a positive GOE. Ten times he got the maximum achievable GOE, whether it was +3 (nine times) or +5 (once). Let’s imagine the best possible scenario, a scenario that is anything but improbable: a perfect jump and 4.00 points more. Even so, for the Japanese judge Sakae Yamamoto and for the Italian judge Walter Toigo, the best short program remains that of Nathan Chen. Can I recommend an urgent visit to the ophthalmologist? For four judges Chen was the best not only technically but also in the components, and Toigo even managed to place Hanyu third, behind Jason Brown as well (but Brown deserved marks higher than Chen and lower than Hanyu; there are a lot of problems with these marks). I highlighted in orange the entries where Chen received higher marks than Hanyu, but even equal marks, or a 0.25-point advantage for Hanyu, would be a mistake.

These two short programs, skated by Hanyu at the 2016 Grand Prix Final, with the +3/-3 scoring system, and at the 2021 World Team Trophy, with the +5/-5 scoring system, are enough to tell us that every program, every element, should always be looked at carefully, because numbers alone can deceive. With this awareness, I wondered if it was possible to detect strong feelings between nations, overwhelming passions or imperishable hatreds. Each competition is different, there may be specific agreements, particular situations, different skaters, but… is it possible to find a pattern in large numbers? To understand this I did something a little crazy: I looked at the Olympic Games, the World Championships (only senior), the continental championships (European Championship and Four Continents Championship), the Grand Prix competitions (only senior, and I didn’t consider the 2020-2021 season because both the skaters and the judges were almost all from a single nation, so what statistics could I have obtained?), and the World Team Trophy, from the 2016-2017 season onwards. Only the men’s competitions; even so, the numbers are many. And I haven’t looked at all the skaters, only the most important ones from the most important nations. How did I decide who was important?

In the Team Event at the 2014 and 2018 Olympic Games, the first six nations were the same, with the first two places swapped: Russia, Canada, United States, Japan, Italy and China. To these countries I have added two that have participated in the World Team Trophy in these seasons: Italy and France. And then Spain, which is not a strong nation, but if you look at the Men’s competitions up to PyeongChang you cannot fail to consider Javier Fernandez. So much for the nations. And the skaters? I chose the ones who got on the podium at the Olympic Games, the World Championship or a continental championship. Going in alphabetical order by nation (ordered by the official international acronym, which is the one I use in my tables), Canada would not have any skater who meets these requirements. Patrick Chan has won three World Championships, an Olympic silver and several other medals, but they all date back to before the period in question. Since I wanted to include Canada, I did my checks on Chan, who up to PyeongChang was undoubtedly the most representative skater of his nation. The Olympic Games were Chan’s last competition. As I still wanted a Canadian, I also checked Keegan Messing’s results. Why him? Because he was the only Canadian who took part in a Grand Prix Final and because this year, with only one place available, Canada sent him to the World Championship. Right now the most representative Canadian skater is Messing, although for at least a couple of seasons Kevin Reynolds, and perhaps even Nam Nguyen, were more highly regarded.

For China there is no doubt: I looked at Boyang Jin, even if the Jin after PyeongChang is less strong than the Jin before. It’s a situation that happens with several skaters; not all of them are equally strong before and after. I cannot do anything about it. Well, maybe I could do the statistics in a different way, and maybe sooner or later I will, but for now I have done them like this. For Spain, I looked at Fernandez, even if after PyeongChang he only went to one competition and therefore the statistics are a bit thin. But the other Spaniards really cannot be considered if we are talking about strong skaters. For France I chose Kevin Aymoz. Now he is undoubtedly the strongest French skater, even if before the Olympic Games the best Frenchman was Chafik Besseghier. But no, I can’t include Besseghier in these checks: if I let him in, I have to make room for several other skaters stronger than him that I have neglected. So the French data is a bit thin too, in the opposite way to the Spanish data.

For Italy I looked at both Matteo Rizzo and Daniel Grassl. Actually, in the first season the strongest was Ivan Righini, but Righini did not compete much in the period I considered, and he was never really strong, so I rule him out without problems. Grassl has never won a medal in one of the major senior competitions, so in theory I should have ruled him out, but right now who is the strongest Italian skater? If we talk about the one I like best, it is undoubtedly Rizzo, but in terms of potential, which one can more easily compete for the medals that matter? So I looked at them both. For Japan I looked at Yuzuru Hanyu, Shoma Uno and Yuma Kagiyama. With the criteria that I have set myself, Keiji Tanaka remains out of my checks. He can compete without problems with some of the skaters I have considered, but I already had enough Japanese skaters.

There are even more Russians: Dmitri Aliev, Mikhail Kolyada, Maxim Kovtun and Alexander Samarin. Here among the excluded is a certain Sergei Voronov, who climbed the European podium before the period I considered and who, in the period I considered, participated in two Grand Prix Finals. But he was no longer one of the leading Russian skaters, and I already had enough Russian skaters. I did not do a specific check on Artur Danielian either, despite a European silver: Danielian was in only one competition among the ones I looked at, so he enters the statistics by nation, but specific statistics on him make no sense (as they don’t for Kagiyama, who took part in two competitions). For the United States I checked Nathan Chen, Jason Brown and Vincent Zhou.

How did I make my calculations? The next five screenshots come with explanations of how I did them; if you are not interested in the method you can move on. The results (partial, I still have to think about them) are in the last two tables.

This is just a tiny part of my file. In column A we see the competition (this is the short program; to avoid confusion I used italics for the free skate, but I have not taken screenshots of the free skate. I assure you that these data exist, but for the moment I am not publishing them, it would take too much time) and the names of the skaters. There is not always the same number of skaters. In each competition I looked at the data of whoever got on the podium, regardless of their identity (and therefore at Skate America, for example, there is Adam Rippon, while at the Rostelecom Cup there is Alexei Bychenko), plus all the skaters I have listed a little further up, regardless of their final ranking. Column B is the Total Segment Score. Column C, the nationality, was useful when I worked on the data, in order to easily group the skaters by nationality. And then there are the judges’ marks. Column D contains, for Skate America, the Canadian judge’s marks, for Skate Canada the Chinese judge’s marks, and so on. The next column, E, is the difference. I purposely made the screenshot in this way, to show you in the top line that the numbers you see are the results of operations for which I wrote the formula. I don’t do the calculations: I write the formulas, then the computer does the sums. Column E, for Skate America, indicates how much the marks of that judge differed from the TSS of the skater. Each column has its own formula, so we can easily see whether a particular judge has been strict with a specific skater or not.
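For readers who prefer code to spreadsheets, the "difference" columns can be sketched like this. It is a minimal illustration, assuming the difference is simply the judge’s total minus the TSS (so negative means stricter than the panel); the function name and all the numbers are invented for the example and are not taken from my file.

```python
def differences_from_tss(tss, judge_scores):
    """Map each judge to (judge's total - TSS), rounded to two decimals.

    Assumed sign convention: negative = stricter than the panel result.
    """
    return {judge: round(score - tss, 2) for judge, score in judge_scores.items()}


# Hypothetical numbers for one skater in one short program.
tss = 90.00
judge_scores = {"CAN": 88.50, "AUS": 86.75, "KAZ": 92.10}

print(differences_from_tss(tss, judge_scores))
```

One row of the spreadsheet corresponds to one call of this function: a skater, the TSS, and one difference per judge.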

As we saw earlier with Hanyu, we cannot tell that something is wrong if everyone assigns strange marks; we can only see how the judges compare with each other. My system is imperfect and must be used bearing in mind its limitations.

Line 7, average difference, tells us if a judge tends to be generous or strict with his marks. The Canadian judge on the whole is severe, 1.06 points below the average (but he was very generous with Boyang Jin and particularly severe with Maxim Kovtun), the Australian is very strict, 3.42 points below the average, the most generous is the Kazakh. Once I figured out if a judge was strict or not, I looked at how he behaved with individual skaters.
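The "average difference" line can be sketched in the same way. This is only an illustration under my sign convention (negative = stricter than the TSS); the function name and the four differences are hypothetical, not the real values of any judge.

```python
from statistics import mean


def average_difference(differences):
    """Average of one judge's (mark - TSS) differences over all checked skaters."""
    return round(mean(differences), 2)


# Hypothetical differences of one judge for four skaters in one segment.
one_judge = [-2.0, 1.5, -3.0, -0.5]

avg = average_difference(one_judge)
print(avg, "strict" if avg < 0 else "generous")
```

A negative average describes a judge who is strict with everyone, which is different from being strict with one particular skater.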

I have selected the skaters of each nation. This is a part of the file related to Canada; I have done the same job for all the countries. There were no Canadian skaters at Skate America, so my check starts with Skate Canada, and sometimes a few competitions are missing.

From the original file I kept, for each competition, the line with the name of the competition and the nationality of the judges (in the previous screenshot these are lines 8, 25, 43…), the lines for the skaters of the nation I was checking, and the line with the average difference, so lines 12, 30, 44… For Canada I have checked all the competitions of Chan and Messing only, but at Skate Canada Kevin Reynolds got on the podium. On that occasion he was an important skater, so although I did not make specific checks on him, for that competition his data entered the national average.

Ok, here I finally found the numbers that interest me. Column E contains exactly the same data that was present in the other screenshot, the difference between the mark assigned by that particular judge and the TSS for that skater, and an indication as to whether the judge is strict or not, and of how much. Now I’ve checked if the judge is strict with that skater. Overall, the US judge was 3.04 points below the average. With Chan he was under the average by 3.11 points, so with him he was strict by only 0.07 points. I used the same formula for all the marks.
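The same step as a tiny sketch, using the two figures quoted for the US judge and Chan. The function name is made up, and the sign convention (negative = stricter) is my assumption for the illustration.

```python
def relative_bias(diff_for_skater, judge_average_diff):
    """How much stricter (negative) or more generous (positive) a judge was
    with one skater, compared with that judge's own average difference."""
    return round(diff_for_skater - judge_average_diff, 2)


# US judge: 3.04 below the TSS on average, 3.11 below the TSS with Chan.
print(relative_bias(-3.11, -3.04))  # -0.07, i.e. strict with Chan by 0.07 points
```

Subtracting the judge’s own average removes the effect of a judge who is simply strict or generous with everybody.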

After that I deleted columns E, G, I, K, M, O, Q, S, U and lines 51, 54, 57… This way I have only the bias of each judge with each skater left.

At this point, I rearranged the data by dragging it around, in order to have a row for each competition and a column for the judges of each nation.

In each column, the rows up to row 16 contain the data I have published above. Line 18 contains the sum of all data between line 2 and line 16. Line 19 indicates the number of short programs judged by the judges of that country. Line 20 is the average.
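The sum, count and average lines can be sketched as follows. This assumes one dictionary per competition, containing only the judge nations that were actually on that panel; the function name and all the values are hypothetical and do not come from my tables.

```python
def column_summary(rows):
    """For each judge nation return (sum of biases, programs judged, average).

    `rows` is a list with one dict per competition, mapping a judge's nation
    to that judge's bias; nations not on a panel are simply absent.
    """
    totals, counts = {}, {}
    for row in rows:
        for nation, bias in row.items():
            totals[nation] = totals.get(nation, 0.0) + bias
            counts[nation] = counts.get(nation, 0) + 1
    return {n: (totals[n], counts[n], totals[n] / counts[n]) for n in totals}


# Hypothetical biases towards the skaters of one nation in three short programs.
rows = [
    {"CAN": 2.0, "FRA": -1.5},
    {"CAN": 1.0},
    {"FRA": -2.5, "ISR": -1.0},
]

for nation, (total, n, avg) in column_summary(rows).items():
    print(nation, total, n, avg)
```

Keeping the count next to the average matters, because an average built on one or two programs says very little.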

We can ignore the judges who judged Canadian skaters in only a few competitions. In the short programs of the two seasons 2016-2017 and 2017-2018 the Canadian judges helped their compatriots on average by 1.33 points (but they did not behave in the same way with everyone: among the data that I am not publishing now there are the averages for individual skaters. Chan received help of 2.17 points, Messing of only 0.10). The Chinese judges were fair with the Canadians, only 0.15 points above the average. On the other hand, the Israeli judges do not particularly like them: they remained 1.16 points below the average. Still better than the French, who remained 1.65 points below.

I did the same calculation for all the skaters and all the nations I have listed above, both for the short program and for the free skate, in all seasons. There is a lot of data and I haven’t really looked at it yet. The judges I checked are from 40 different nations, but in my selection there are only 23 nations. They are those whose judges have judged at least five programs, between short program and free skate (lines 3, 6, 9, 12…), for skaters from at least five different nations. The bias I indicate (lines 2, 5, 8…) is the sum of the short program bias and the free skate bias.
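The selection rule can be sketched as a filter. This follows one possible reading of the rule (a judge nation is kept if, for at least five skater nations, its judges judged at least five programs); the function name, the placeholder nation codes and the counts are all invented for the illustration.

```python
def select_judge_nations(counts, min_programs=5, min_nations=5):
    """Return the judge nations with enough data to be worth comparing.

    `counts` maps judge nation -> {skater nation: programs judged
    (short program + free skate)}.
    """
    selected = []
    for judge_nation, per_skater_nation in counts.items():
        well_covered = [n for n, c in per_skater_nation.items() if c >= min_programs]
        if len(well_covered) >= min_nations:
            selected.append(judge_nation)
    return sorted(selected)


# Hypothetical counts with placeholder nation codes.
counts = {
    "AAA": {"N1": 9, "N2": 7, "N3": 5, "N4": 6, "N5": 12},
    "BBB": {"N1": 9, "N2": 7, "N3": 4, "N4": 6, "N5": 12},  # one nation too thin
    "CCC": {"N1": 20, "N2": 15, "N3": 8, "N4": 6},          # only four nations
}

print(select_judge_nations(counts))
```

The thresholds are parameters, so it is easy to check how the picture changes with a stricter or looser cut.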

Some biases are very high, others very low, even for skaters who are not compatriots of the judges who judged them, or who are not direct rivals of their compatriots. But these strange biases are not always based on many competitions, and if the competitions are few, the statistics mean little. I have to look at them more calmly, to understand if there is something strange, including in all the data that I have not published because it would take so many screenshots (but if I were to find something interesting I would publish everything). What stands out is the very high bias, over many competitions, of the Kazakh judges in favor of Russian skaters. Their bias is higher than what Canadian, French, Italian, Japanese, Russian and US judges have for their compatriots. In practice, if a Kazakh judge judges a Russian skater, that skater is helped by a judge who does not risk being suspended for national bias, because they come from different federations, so the Russian skaters can be aided by two judges, not one. And if the strange marks come from more than one judge, the marks seem less strange and it’s more difficult to see the bias. Nice…

Until I find the time to take a good look at everything, I have obtained a further table from these data. I looked at how the judges of these eight nations behave with the skaters of the other seven. Given that everyone helps their own skaters a little, how much do they help them compared to skaters from other nations? The Chinese judges, the most biased ones, were particularly strict with Javier Fernandez, seen as a direct rival of Boyang Jin. The Spanish judges, on the other hand, were strict with the Chinese skater, since Jin was Fernandez’s rival, and with the Japanese skaters, Hanyu and Uno, while they were not much interested (relatively, the bias is still high) in the French, the American (Chen was a direct rival of Fernandez for a short time, the others never were) or the Italian skaters. So, divided into columns, this is the bias of the judges of the eight nations with the skaters of the other seven nations. Some skaters are helped a lot by the judges who are their compatriots, others not too much. And with this I stop for now.
