I’ve been reading (like everyone else) “Game of Shadows,” the book about Barry Bonds and steroids and the BALCO scandal. It’s quite a remarkable feat of reporting. What’s striking about it is something that you might not notice unless you’re a journalist: the absence of obvious lawyering. If you ever write something even remotely critical of someone, the lawyers invariably go through it, and before you know it your prose is strewn with “apparently” and “allegedly,” and “according to” and so-and-so “denies. . .” There’s almost none of that in “Game of Shadows,” which is amazing considering the book accuses, in devastating detail, several of the biggest names in sports (Barry Bonds, Gary Sheffield, Tim Montgomery and Marion Jones, among others) of being serious steroid users. The two writers, Mark Fainaru-Wada and Lance Williams, must have had really impeccable sourcing. When the book first came out, and several baseball writers predicted that Bonds’ reputation was destroyed and his chances of getting into the Hall of Fame seriously damaged, I thought they were overstating things. Now I’m not so sure. “Game of Shadows” is a death sentence for Bonds. More to the point, it’s impossible to read the book and accept that Bonds has a right either to the single-season home run record or, assuming he keeps playing, the career home run mark.
So what should we do? I think we need to set the bar a little higher for record-setters. Justin Wolfers, an economist at Penn, just did a study analyzing college basketball scores, concluding that there is ample statistical evidence for point-shaving in about five percent of college games. Steven Levitt (I know, I know. I’m obsessed with him) has done the same kind of work on student test scores. Forensic economists look at large data sets and draw surprisingly sophisticated inferences about behavior and intention. I think we should loose the forensic economists on all record-setters, and require that athletes pass a statistical plausibility test in the wake of their achievements.
Obvious example:
Florence Griffith Joyner, in 1988. Before that year, her best times in the hundred meters and the two hundred meters were, respectively, 10.96 and 21.96. In 1988, at the advanced (for a sprinter) age of 28, a suddenly huskier FloJo ran 10.49 and 21.34, times that no runner since has even come close to equaling. At the time, people in the track world just rolled their eyes. But since FloJo never failed a drug test, there was nothing they could do. Well, there is something we can do. We can bring in the forensic economists, and any statistical analysis of the career marks of world-class sprinters would have told us that marginally world-class 28-year-olds do not, in the absence of some kind of help, suddenly turn into the greatest runners the world has ever seen.
Bonds falls into the same category. From the moment he started his late-career surge, everyone who knew anything about baseball suspected mischief. “Game of Shadows” points out that Bonds had the second, ninth and tenth greatest offensive seasons in baseball history at the ages of 36, 37, and 39 respectively, and the average age of everyone else on that list (Gehrig, Foxx, Ruth and Hornsby) is 27. No one, no one, turns himself into one of the greatest hitters of all time in his late 30’s. His home run record should have been denied as statistically implausible.
Will raising the bar this way mean we occasionally deny a genuine record? It’s certainly possible. Bob Beamon jumped 29 ft, 2.5 inches at the Mexico City Olympics, and had never jumped more than 27 ft, 3 inches before that, and never again jumped more than 27 feet. No one has ever doubted that Beamon was clean. But it’s a totally weird performance. On the other hand, it was at altitude. Because of the difficulty in hitting the board, long jump performances are highly variable. And the effect of drug enhancement is sufficiently long-lived, that a single anomalous performance in an otherwise quiet career is more statistically plausible than a string of closely-linked anomalous performances in an otherwise quiet career. FloJo had a fantastic year in 1988, which is why she raised so many eyebrows. She wasn’t Beamon. She was Bonds. I think if we’re smart about it, we can learn to distinguish the fluke performances from the phony performances.
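As a rough illustration of the marks quoted above, here is a minimal Python sketch that measures how far each performance sat beyond the athlete's previous best. It is not a plausibility test in itself: a real one would need career-wide distributions for comparable athletes, which this post does not supply. The helper names `to_inches` and `relative_improvement` are hypothetical, chosen only for this example.

```python
def to_inches(feet: float, inches: float) -> float:
    """Convert a feet-and-inches mark to inches."""
    return feet * 12 + inches


def relative_improvement(previous_best: float, new_mark: float,
                         lower_is_better: bool) -> float:
    """Fractional improvement of new_mark over previous_best (hypothetical helper)."""
    if lower_is_better:
        # Sprint times: a smaller number is the better mark.
        return (previous_best - new_mark) / previous_best
    # Jump distances: a larger number is the better mark.
    return (new_mark - previous_best) / previous_best


# Marks exactly as quoted in the post.
improvements = {
    "FloJo 100m (10.96 -> 10.49)": relative_improvement(10.96, 10.49, True),
    "FloJo 200m (21.96 -> 21.34)": relative_improvement(21.96, 21.34, True),
    "Beamon long jump (27 ft 3 in -> 29 ft 2.5 in)": relative_improvement(
        to_inches(27, 3), to_inches(29, 2.5), False),
}

for label, gain in improvements.items():
    print(f"{label}: {gain:.1%} better than previous best")
```

By this crude measure Beamon's jump is the largest single leap of the three, which is the point: the size of one jump cannot separate the fluke from the phony. The distinction drawn above rests on how many anomalous marks cluster together, and that needs richer data than three numbers.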
One obvious objection to this idea is that we have a tradition of presuming people innocent until proven guilty, and prima facie statistical tests violate that. But the presumption of innocence is a legal principle. We’re dealing with sports records here, and it seems reasonable, particularly in this day and age of advanced athletic chemistry, to ask a bit more of record holders.
I actually kind of like this idea. I think it would be a useful tool to use in consideration of records, but I am not sure it should be an up or down disqualifier.
Having said that, what about Lance Armstrong? He was a "marginally world-class" cyclist prior to his cancer treatments and came out of that as arguably the greatest cyclist in history. His long reign at the top suggests either that he was truly gifted (and just a bit lazy early in his career) or that he was the best cheater in sports history, since cycling, unlike baseball, required extensive testing throughout Armstrong's entire string of wins.
Posted by: Will McKenna | April 07, 2006 at 06:20 PM
Armstrong's first Tour win came at age 27, the same age as Miguel Indurain's first win. There's nothing statistically out of the ordinary there, aside from the recovery from cancer, of course. The point is, it's typical for a bike racer to mature into a Tour winner right around that point in their life.
Posted by: Phil Aaronson | April 07, 2006 at 07:20 PM
"I think we should lose the forensic economists on all record-setters, and require that athletes pass a statistical plausibility in the wake of their achievements."
I think that a lot of the time we don't seem to care very much about the statistical significance of some result. Occasionally someone will calculate Chess-style ratings for players in some sport, but no-one ever seems to care all that much about the result. Who actually *did* win seems to be more important than who, hypothetically speaking, would win most of the time.
In fact, individual races seem particularly important in athletics--to win the 100m at the Olympics, for example, you don't have to win the race a statistically significant number of times, you just have to win it once. (If the race were repeated 100 times, how many times would the actual winner have won?)
Posted by: Michael S. | April 07, 2006 at 07:30 PM
It's certainly an interesting idea, but I have mixed feelings about taking records away from athletes absent a proverbial smoking gun. In the case of Bonds, if you take away his single-season HR record, to whom do you give it? Mark McGwire? Um, no. Sammy Sosa? That's pretty dicey. Sosa has long been *suspected* of using steroids, but his career-high 66 HR came at the age of 29, which isn't too far outside the statistical norm for a player to have his peak season. There are all sorts of other records that could come toppling down if we were to apply this principle consistently. Furthermore, we really don't have a very good idea of what players in the past might have done, nor do we have anything close to an understanding of the extent to which performance-enhancing drugs and legal, over-the-counter supplements help athletes. And even if we did have an idea, what would we do about it? For instance, do we discount modern achievements because athletes now have access to legal supplements like creatine, or even protein shakes? Hank Aaron is/was a vegetarian, for goodness sake; imagine what he might have accomplished if he'd had access to a weight room and a bucket of whey protein isolate. While we're comparing records, should we discount everything Babe Ruth did because he played in a lily-white league and smacked home runs off of guys who worked as coal miners in the off-season? I'm aware that I'm sort of all over the place, so I'll say that my point is that when you compare records (in baseball, anyway), you have to consider the era in which they occurred. The sport is constantly changing, whether it is by rule changes, improved training, better equipment, or a new drug testing policy. These changes put some records out of reach, and render others easily breakable. Instead of trying to figure out which numbers we should toss out and which we should keep, I think it's much nicer to consider statistics only in the context in which they occurred.
For me, that means that Hank Aaron's 755 will always be more impressive than whatever Barry Bonds finishes with. For that matter, Roger Clemens' 341 is more impressive than Cy Young's 511.
Posted by: Jeff Watts | April 07, 2006 at 07:57 PM
To Malcolm's point, we have to be careful in labeling something as statistically abnormal.
Lance Armstrong has been a gifted cyclist throughout his entire career. One of his most impressive victories was in 1993 when, at the tender age of 21, he won the World Championship Road Race in Oslo, Norway. And he did so amidst an impressive field of seasoned cycling veterans, including would-be 5-time Tour winner Miguel Indurain.
Sure, Lance couldn't have won the Tour back then, but for those who have followed cycling then and now, it's really not a big surprise to see Lance find success in the Tour. He is physiologically better than just about everyone else (cyclist or not), but his physiology is consistent with other great cyclists, including Indurain and Greg LeMond. We should really only be surprised if someone like Lance does not produce amazing results.
Posted by: Kaan | April 07, 2006 at 09:03 PM
I completely agree, but it's very hard to convince people that statistical analysis can serve as a good indicator of cheating. Many of the people I've met hate numbers, primarily because they often get proven wrong by them.
I've got the book on order from Amazon. If it lives up to the hype and praise you've given it, it should be as fantastic as I'm expecting it to be.
Posted by: xian | April 07, 2006 at 10:29 PM
Mark is out, Bonds is out, but man, Sosa is a tough out; he would be the record holder under your prescription.
There is way too much evidence of the work that Armstrong did to increase his oxygen carrying capacity, as well as the number of times that he won, as well as the support that he had. Armstrong is clean.
What about Michael Jordan? Not exactly an upstanding citizen in many ways.
Posted by: Dale | April 07, 2006 at 10:52 PM
I think such a tool would be useful to force a mandatory blood test after an out-of-the-ordinary performance. However, in no way should it be enough grounds to disqualify a record. In the Armstrong case, there is a smoking gun: a sample taken in 2000. The investigation blew the case by not having the proper control samples.
Posted by: rakoto | April 07, 2006 at 10:52 PM
The idea's no good. There is no end to the controversy if you let those guys loose on the record books.
Test these athletes constantly (on their nickel) and penalize them with real penalties. Rip that fat cash right out of their hands.
Posted by: Fajita | April 07, 2006 at 10:53 PM
I don't think it can be applied to cycling. Cycling is a bizarre team sport in that there are individual times, but these are sacrificed for the benefit of the leader. Lance Armstrong did not get a team position that made it possible for him to win the Tour until his mid-twenties. Comparing his statistics as a team rider and as a team leader is meaningless.
Forget baseball, though. Why not do measurements that really matter? Take a look at the schemes that Katherine Harris used to rig the 2000 election ballot in Florida before a vote was cast. The configuration of the voting machines differed by precinct. In white Republican precincts the optical scanners were configured to beep when a ballot was misread. In black Democratic precincts they simply ate the ballot without complaint.
Similar games were played in 2004 in Ohio. That's the reason there were people queued up to vote at 3 a.m.
If you take a hard look at the 2000 election count the precincts fell to Bush in a pretty improbable fashion.
Posted by: Phill | April 07, 2006 at 11:57 PM
"I think we should lose the forensic economists on all record-setters"
Shouldn't we loose them?
Posted by: Harry Adler | April 08, 2006 at 12:31 AM
The problem is that if an athlete manages to avoid detection of illegal performance-enhancing drugs at the time of a world-beating performance, then regardless of how dubious it may appear from circumstantial evidence (statistical implausibility, newfound huskiness of voice, early death from heart failure...), that record stands unchallengeable in perpetuity.
But previously undetectable drugs routinely become detectable in later years. Compulsory blood and tissue sampling of any world-record athlete, with the samples kept for future testing, would perhaps provide a more robust mechanism for conclusively detecting the use of illegal performance-enhancing drugs (perhaps in conjunction with statistical analysis to identify records that deserve more scrutiny). That would leave very little room for debate, which unfortunately the statistical approach always will.
john
Posted by: john Allsopp | April 08, 2006 at 01:03 AM
Here is an interesting related post on the use of statistical sampling in professional cycling:
http://www.tdfblog.com/2005/07/stage_1_tt_grap.html
It might perhaps be interesting to correlate outlier performances with subsequent detection of illegal drug use to test the efficacy of the technique.
j
Posted by: john Allsopp | April 08, 2006 at 01:05 AM
The approach suggested in the post fails, very badly, for the Australian cricketer Don Bradman, whose career batting average was several standard deviations further above the mean than the next best batsman in history.
You can see this really strikingly at Wikipedia:
http://en.wikipedia.org/wiki/Batting_average
The little blip on the very far right of the Wikipedia graph is Bradman. Corresponding graphs for, e.g., basketballers and baseballers, are much less striking.
Despite this, so far as I'm aware, no-one has ever suggested Bradman was anything but clean, perhaps because his batting career was in the 1930s and 1940s.
Posted by: Michael Nielsen | April 08, 2006 at 01:21 AM
I feel like there's a lot of complexity here. Do we need to distinguish an athlete who is doping from an athlete who has a glandular cancer that is generating massive amounts of testosterone or human growth hormone? How do we distinguish humans who have been produced with a cornucopia of supplements from other humans who have been sustained on a less-designed diet?
If you can run better because you have been biologically and genetically engineered by your country, does that mean you shouldn't be allowed to run in the Olympics?
Do we even need to distinguish things like this? Could we let everyone compete as best they can? Could we allow that the level playing field is just the universal laws of physics?
Posted by: Jayakumar | April 08, 2006 at 04:03 AM
There was an article in Wired about two years ago (I can't seem to find it at the moment) that argued for a two-tiered structure in sports. Athletes would be separated into dope-free and anything-goes categories and they would not compete with each other. Kind of like the Olympics vs. professional sports, before it became OK for pros to compete in the Olympics. There will always be athletes who'll experiment with doping and there will always be an audience that's eager to see just how far the human body can be pushed. The same goes for so-called clean athletes, although drawing the line might prove to be tricky (think blood doping or EPO). Instead of spending all the time and effort to fight doping and keep up with the ever-advancing technology, why not let the market sort it out?
Posted by: Peter Orosz | April 08, 2006 at 04:20 AM
"Athletes would be separated into dope-free and anything-goes categories and they would not compete with each other."
I for one welcome our new mutant athlete overlords.
Posted by: doug | April 08, 2006 at 07:30 AM
Mr. Gladwell-
You quote the article as saying 5% of games are "shaved", but that's not what the article says:
"These data suggest that point shaving may be quite widespread, with an
indicative, albeit rough estimate suggesting that around 6 percent of strong favorites have
been willing to manipulate their performance. Given that around one-fifth of all games
involve a team favored to win by at least 12 points, this suggests that around one percent
of all games (or nearly 500 games through my 16 year sample) involve gambling-related
corruption."
Posted by: DZOP | April 08, 2006 at 09:10 AM
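The arithmetic in the passage DZOP quotes can be checked in a few lines. The sketch below, in Python, multiplies the two shares given in the quotation. The implied sample size at the end is a back-calculation for illustration only and is not a figure taken from the paper.

```python
# Figures as quoted from the paper in the comment above.
share_of_strong_favorites_shaving = 0.06     # "around 6 percent of strong favorites"
share_of_games_with_strong_favorite = 1 / 5  # "around one-fifth of all games"

# Share of ALL games affected = share of games with a strong favorite
# times the share of those favorites willing to shave points.
share_of_all_games = (share_of_strong_favorites_shaving
                      * share_of_games_with_strong_favorite)
print(f"Share of all games affected: {share_of_all_games:.1%}")

# Back-calculation (an assumption, not a figure from the paper): the total
# number of games at which "nearly 500 games" would equal that share.
affected_games = 500
implied_total_games = affected_games / share_of_all_games
print(f"Implied sample size: about {implied_total_games:,.0f} games")
```

The product comes to roughly one percent of all games, which matches the quoted passage rather than the five percent cited in the post.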
Someone has already taken up the challenge. Economics professor Art De Vany, retired from UC-Irvine, presents his results here.
(http://www.arthurdevany.com/webstuff/images/DeVanyHomeRunMS.pdf)
Abstract: The greatest home run hitters are as rare as great scientists, artists, or composers.
The greatest accomplishments in these fields all follow the same universal law of genius,
as I show in this paper. There is no evidence that steroid use has altered home run
hitting but there is great confusion, plain ignorance, and deception (sports writers and
politicians are particularly guilty) in the criticisms addressed at MLB and the great
home run hitters of the present era. The same universal law holds now that held 40
years ago — there has been no change in the distribution of home runs. To argue
that the great achievements of McGwire, Sosa, and Bonds (they did it in that order) is
due to steroids is about as silly as saying you could create a Mozart or Beethoven by
injecting them with a music drug. And, it is to deny them their fair due for their great
accomplishments.
Posted by: JC | April 08, 2006 at 09:26 AM
There's just one problem with this idea: If Bonds took steroids, he wasn't breaking any baseball rules at the time. You can't apply the new rules ex post facto.
Posted by: JW | April 08, 2006 at 09:42 AM
Someone probably already pointed this out, but if drug use is the problem, why use statistical analysis as a proxy for drug testing? Why not beef up the drug testing procedures and penalties instead?
Posted by: ML | April 08, 2006 at 10:40 AM
>His home run record should have been denied as >statistically implausible.
The problem with this analysis is that you run into the Law of Truly Large Numbers:
"The law of truly large numbers says that with a large enough sample many odd coincidences are likely to happen."
With the large number of baseball players (past and present), you could reasonably expect one player to have a career path such as Bonds.
Posted by: Mark S. | April 08, 2006 at 11:43 AM
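The point about large samples in the comment above can be put in numbers. This is a minimal Python sketch, assuming each player independently has some small chance of a Bonds-like late-career surge. Both inputs are hypothetical placeholders chosen only to show the shape of the argument; neither is an estimate drawn from baseball data.

```python
def prob_at_least_one(p_single: float, n_players: int) -> float:
    """Probability that at least one of n independent players shows an
    event that each has probability p_single of showing."""
    return 1.0 - (1.0 - p_single) ** n_players


# HYPOTHETICAL inputs, for illustration only.
p_single = 1 / 10_000   # assumed chance of a natural late-career surge per player
n_players = 15_000      # assumed number of players in the historical pool

print(f"Chance at least one player does it: "
      f"{prob_at_least_one(p_single, n_players):.0%}")
```

Under these assumed numbers, a one-in-ten-thousand career becomes more likely than not to appear somewhere in the pool. Whether Bonds is that coincidence depends entirely on the true per-player probability, which the sketch does not estimate.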
"The sport is constantly changing, whether it is by rule changes, improved training, better equipment, or a new drug testing policy. These changes put some records out of reach, and render others easily breakable."
This is exactly on the money. There are certain records in baseball, such as the single season wins record for pitchers, that are completely out of reach for today's pitchers simply because of how the game is played. Randy Johnson could win every single one of his starts this year and he still wouldn't come close to that record. There are many more records that are similar. Every era is unique, thus creating records that reflect that era. Baseball recently got out of an offensive era, and has the records to show for it.
And besides, how can baseball take away his records? Bonds wasn't breaking any MLB rules.
If you want to look at statistical improbability, go look at Ruth's numbers compared to his "whites only" peers. Ruth was hitting more HRs than entire teams back then, in what was considered a "dead ball" era. Should Ruth's numbers be ripped away?
Posted by: ThetaFarm | April 08, 2006 at 11:50 AM
How does all this apply to student test scores? I was in a high school that desperately needed to keep its test scores high, and so students who were exempt from certain tests (due to already having obtained the highest score in a previous year) were required to take the test again, in order to bring up the average grade. Would this be the equivalent of steroid use?
Or do you mean this is about examining the test scores of individual students to see if their scores over the course of a year or more seem to point to cheating? In the case of non-athletic performances, an individual's performance is dependent on so many personal and unmeasurable variables.
And as for Bonds, as his steroid use wasn't actually illegal at the time, he shouldn't be punished. I believe he should punish himself, though, and withdraw himself from consideration for any honors.
Posted by: newyorkette | April 08, 2006 at 12:02 PM
Babe Ruth's entire career would almost certainly have been considered statistically implausible in the 1920s.
Ned Williamson's single season home run record of 27 in 1884 would most certainly have been considered statistically implausible. Heck, it would have been literally impossible had they not made the ground rules of his park ridiculous that year, resulting in Williamson hitting 25 home runs at home and 2 on the road. Yet he held the record for 35 years.
Heck, are we going to take the 1986 World Series and the 2004 ALCS from their winners, because the results of those series were certainly statistically implausible?
If you start running this kind of test, you are going to be throwing out whole chunks of the record book. Including many of the most interesting parts.
Anyway, 90% of the major league baseball players of the past half century have used some kind of illegal performance-enhancing drug. Why single out Bonds?
And the two writers of Game of Shadows do not have impeccable sources. I'm not arguing that they're wrong, but two of their main sources have major credibility problems.
Baseball records cannot be wiped out like track records anyway. If you delete the home runs, you have to delete all the runs that resulted from those home runs, changing the results of many past games and seasons. Some pitchers would end up with better ERAs. Of course, you'd have to deal with the results of match-ups featuring steroid-aided pitchers vs. steroid-aided batters, and I have no idea what one would do with those. It would be a nightmare.
Records are really just a recording of the past. If we try and create an alternate history where different stuff happened than really happened, it's going to lead somewhere none of us like.
Posted by: Greg Spira | April 08, 2006 at 01:12 PM