Note #150: The slings and arrows of outrageous statistics (2024.8.5)
Last night we watched Brady Ellison of the US literally (yes, literally) come within an arrow’s width of winning the gold in men’s individual archery against Korea’s Kim Woojin. I wanted to see Ellison win, but it was a hard-fought match and I think both athletes should be proud of how they performed. It’s not the match itself that I want to write about today, though. Instead, I thought I would rant a little about something I saw in the broadcast. HJ says that half the entertainment value she gets from watching television with me is seeing me complain about things that are wrong or inaccurate. I’d like to say that it’s an occupational hazard—that, as a professor, it is my job to be critical—but I think it’s more likely that I became a professor because I am normally critical, not the other way around.
I’ll preface this little rant by saying that I don’t remember which particular channel we were watching over the weekend, as all the big channels generally showed the same events, so this might have been limited to one broadcast. At any rate, whenever a Korean archer was on the line getting ready to shoot, there would be a little statistic on the screen that read something to the effect of: “10-point probability: 53.4%.” That is, the probability of this particular athlete shooting a ten was 53.4%. This confused me greatly at first, because I couldn’t figure out how they could possibly calculate something like that. I soon realized that what they were actually showing was the percentage of all shots made by the athlete so far during this Olympics that had been tens.
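To make that concrete, here is a minimal Python sketch of what the on-screen number appears to have been: a running tally of tens over all arrows shot so far. The scores are made up for illustration, and the function name `ten_rate` is mine, not anything the broadcaster published.

```python
def ten_rate(scores):
    """Share of arrows shot so far that scored a ten.

    This is an observed frequency over past shots, not a forecast
    of the next one.
    """
    if not scores:
        raise ValueError("no arrows shot yet")
    return sum(1 for s in scores if s == 10) / len(scores)


# Hypothetical arrow scores, invented for illustration only.
scores = [10, 9, 10, 10, 8, 9, 10, 10, 9, 10]
print(f"Tens so far: {ten_rate(scores):.1%}")
```

Nothing in that calculation knows anything about the wind, the opponent, or the state of the match; it only counts what has already happened.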
I shouldn’t need to say this, but I will: This is very much not the probability that they will shoot a ten. There is a saying: “Past performance is no guarantee of future results.” I don’t know where it comes from, but it is applicable here. Whether or not an athlete shoots a ten depends on a lot of different factors. Wind intensity and consistency (that is, is it a constant wind, or are there gusts or swirling winds?) are probably the biggest factors, but other environmental factors like humidity, heat, precipitation, sunshine, etc. might come into play as well.
You could probably create a model that accounted for all such external factors, although it would be very complex and still probably a bit hit or miss (like predicting the weather). But you still wouldn’t be anywhere near a true probability because there are a lot of internal factors that need to be considered as well. Archery, like many sports, is about mental fortitude as much as it is about technique and physical prowess. Maybe the athlete got up on the wrong side of the bed that day, or something happened that put them in a bad mood, or they’ve been going through a rough time in general. These things seem impossible to model, mainly because you can’t really measure them, but if you wanted to put together as complete a model as possible you would need to factor in everything you could measure. For example, what is the athlete’s average score on an arrow after their opponent shoots a ten? Even there, you would need a finer approach: Having to shoot a ten in order to win the match is very different from only having to shoot a seven or higher.
Ideally, though, if you got deep enough into the nitty-gritty, you probably could put together a somewhat reasonable model, right? Well, there’s a problem with that. Archers do end up shooting a lot of arrows over the course of an Olympics (especially Korean archers, who generally reach at least the quarterfinals in the team, individual, and mixed events), but the finer-grained you get, the smaller your sample sizes are going to be for each situation. For example, how many times is an athlete going to be in a situation where they only have to shoot a seven to win the set, and where winning the set gives them the match? A few times at best, I would imagine. Given all the variables you would need to account for, you’re probably not going to have enough data to reach any meaningful conclusion.
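A rough way to see the sample-size problem is to put an uncertainty range around an observed ten rate. The sketch below uses the Wilson score interval, a standard textbook method for a proportion. It assumes each arrow is an independent shot with the same chance of a ten, which is exactly the simplification I have been complaining about, so treat it as an illustration only. The counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumes independent trials with a constant success probability,
    which real archery shots almost certainly do not satisfy.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical counts: the same 60% ten rate, seen over many arrows
# and then over the handful that fit one narrow match situation.
for tens, arrows in [(60, 100), (3, 5)]:
    low, high = wilson_interval(tens, arrows)
    print(f"{tens}/{arrows} tens: plausible rate roughly {low:.0%} to {high:.0%}")
```

With a hundred arrows the range is fairly tight. With five it covers most of the scale, which is another way of saying that the data tell you almost nothing.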
When it comes down to it, trying to calculate the probability that an archer shoots a ten seems like a lost cause. That being said, I don’t think that the statistic they showed on screen was useless or uninteresting; it was just labeled improperly. I think it is actually quite interesting and helpful to know that a certain archer shot tens around 63% or 64% of the time (as Kim Woojin did). While past performance may not be a guarantee of future results, it is certainly an important factor. If two competitors stepped up to the line and you saw that one shot tens 60% of the time while the other only shot tens 40% of the time, you would be perfectly justified in expecting, all other things being equal, the first athlete to win the match. So it’s a good stat. I just wish they had not misrepresented it, that’s all.
I could say more about the Olympics, but I’ll leave it at that. HJ’s heard enough of my rants about various little things that are ultimately unimportant, so my goal here today was just to spare her yet another one.