To the best of my knowledge, the kind of run participation by Jackson and Dirks described above has never been formally tracked like runs scored and RBI. My goal is to track this run involvement for all players with the help of play-by-play data at Retrosheet.org. I want to account for every single instance of a player helping to create a run, whether it be a run scored, run batted in or an indirect contribution for all games where play-by-play data are available.
Limitations of Runs Scored and RBI
The above example illustrates that the runs scored and RBI statistics do not always give players the credit they deserve for participation in run scoring, but that is not their only limitation. Many analysts eschew these metrics because they measure things that are, to some extent, outside the batter's control. Unless a batter hits a home run or steals home, he needs teammates to help him score runs. Even a relatively poor baserunner will score a lot of runs if he gets on base frequently and has good hitters behind him. Who bats behind him in the line-up is as important as baserunning skill in determining how many runs a player will score.
The RBI statistic has similar limitations to runs scored. Unless he smacks a home run, a player needs teammates on base in order to drive in runs. If a player has hitters batting in front of him who frequently get on base, then he is more likely to drive in runs than if he has weaker hitters setting him up. Thus, a player on a good hitting team has more chances to drive in runs than a player on a poor hitting team.
A batter’s position in his line-up also influences his runs scored and RBI totals. For example, a leadoff hitter usually has fewer opportunities to drive in runs than a clean-up hitter, since the generally weaker 7-8-9 hitters bat in front of him. The RBI leaders at the end of a season are as likely to be the players with the most opportunities as the players most proficient at hitting with men on base.
Many mathematically-minded fans would like to see RBI and runs scored become extinct in favor of statistics such as on-base percentage, Weighted On-Base Average (wOBA) and Batting Runs, which isolate a player's contribution from those of his teammates. Despite the shortcomings of runs scored and RBI, however, most traditional fans still like their concreteness. Players like them too, which is understandable. A batter does not want to reach base to improve his on-base percentage, but rather to put himself in position to score a run. Similarly, a batter up with a runner in scoring position is not focused on his slugging average. He's thinking about driving in the run.
The Origins of Runs and RBI
The runs scored and RBI statistics both have long histories. Shortly after Alexander Cartwright and the New York Knickerbockers established the first set of modern baseball rules, the first box score appeared in the New York Morning News on October 25, 1845. The only statistics included in this box score were hands out (today simply called “outs”) and runs for batters. Some of the early baseball writers had ties to cricket, a relative of baseball, and early box scores reflected that association. Hits that did not result in runs were not included because, in cricket, one either scores a point by reaching the opposite wicket or is out.
The runs batted in statistic was recorded in newspapers in 1879 and 1880 and was an official statistic in the National League in 1891. However, fans complained that the measure was unfair to leadoff batters and too dependent on opportunity, and it was quickly dropped. Ernie Lanigan, an important baseball statistician in the early 20th century, personally tracked runs batted in and included the statistic in New York Press box scores starting in 1907. It became an official statistic again in 1920 under the name “Runs Responsible For”. The RBI statistic gradually gained acceptance and eventually became even more popular than the runs scored metric.
Runs Assisted
Because of their extensive history and their popularity with fans, media and players, the runs scored and RBI metrics are not going to disappear as some in the sabermetric world would like. I would argue that they really shouldn't be eliminated altogether even from the sabermetric community. While they should not be used as overarching player evaluation measures, it is good to know how actual runs were scored along with how they theoretically should have been scored.
If one is going to use actual runs scored in any analysis of players, though, it is a good idea to consider the entire run as opposed to the popular practice of just looking at RBI. To that end, I have created the Runs Assisted statistic (abbreviated RAS to distinguish it from the pitching metric "Run Average"), which gives players credit for contributing to runs without a run scored or RBI. Here are the ways a batter can get a Run Assisted:
- A batter advances a runner on first to either second or third with a single, double, base on balls, hit batsman, error, sacrifice bunt, or another kind of out. If that runner then scores, the batter who advanced him is given a Run Assisted. If the run scored on a triple or home run, a Run Assisted would not be credited, because the advancement would be unnecessary in scoring the run.
- A batter advances a runner on second to third with a single, base on balls, hit batsman, error, sacrifice bunt, or another kind of out. If that runner then scores, the batter who advanced him is given a Run Assisted. If the run scored on a double, triple or home run, a Run Assisted would not be credited, because the advancement would be unnecessary in scoring the run.
- A batter reaches base and is removed for a pinch runner or is replaced by another runner on a force out. If the new runner then scores, the batter who originally reached base is given a Run Assisted.
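As a rough illustration, the first two rules above could be sketched in Python as follows. This is only a minimal sketch under my own assumptions, not the code used to produce the tables in this post: the function name `earns_run_assisted`, its arguments and the play labels such as `"home_run"` are all hypothetical, and the third rule (pinch runners and force outs) is not modelled.

```python
# Hypothetical sketch of the first two Run Assisted rules.
# Names and play labels are illustrative assumptions, not Retrosheet codes.

# Scoring hits on which the runner would have scored from his
# original base anyway, making the earlier advancement unnecessary.
UNNECESSARY_ADVANCE = {
    1: {"triple", "home_run"},
    2: {"double", "triple", "home_run"},
}


def earns_run_assisted(start_base, end_base, runner_scored, scoring_play):
    """Return True if the batter who moved the runner gets a Run Assisted."""
    if not runner_scored:
        return False  # no run, no assist
    if start_base not in UNNECESSARY_ADVANCE:
        return False  # only runners starting on first or second are covered
    if not (start_base < end_base <= 3):
        return False  # the runner must actually advance, but not score
    return scoring_play not in UNNECESSARY_ADVANCE[start_base]


print(earns_run_assisted(1, 2, True, "single"))   # True
print(earns_run_assisted(1, 3, True, "triple"))   # False
print(earns_run_assisted(2, 3, True, "sac_fly"))  # True
print(earns_run_assisted(2, 3, True, "double"))   # False
```

Keying the exclusion table on the runner's original base mirrors the wording of the rules: what matters is whether the later scoring hit would have brought him home from where he started.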
The 2012 American League Runs Assisted Leaders are listed in Table 1 below. Catcher Joe Mauer of the Twins led the league with 59 Runs Assisted. Mauer's teammate Ben Revere was one of the more surprising names among the leaders. Revere's runs scored (70) and RBI (32) would suggest that he was not very involved in team runs scored, but his 48 assists tell a different story. My first thought was that he had a lot of sacrifice bunts, but he only had six, so that does not explain it. In later analyses, I will look more at what kinds of players accumulate a lot of Runs Assisted.
Table 1: AL Runs Assisted Leaders, 2012
Data Source: Retrosheet.org
Player | Team | PA | R | RBI | RAS |
Joe Mauer | MIN | 641 | 81 | 85 | 59 |
Elvis Andrus | TEX | 711 | 85 | 62 | 51 |
Robinson Cano | NYA | 697 | 105 | 94 | 50 |
Paul Konerko | CHA | 598 | 66 | 75 | 48 |
Ben Revere | MIN | 553 | 70 | 32 | 48 |
Billy Butler | KCA | 678 | 72 | 107 | 47 |
Jason Kipnis | CLE | 672 | 86 | 76 | 47 |
Asdrubal Cabrera | CLE | 616 | 70 | 68 | 44 |
Alcides Escobar | KCA | 648 | 68 | 52 | 44 |
Josh Willingham | MIN | 615 | 85 | 110 | 42 |
Michael Brantley | CLE | 609 | 63 | 60 | 41 |
Prince Fielder | DET | 690 | 83 | 108 | 40 |
Miguel Cabrera | DET | 697 | 109 | 139 | 39 |
Torii Hunter | ANA | 584 | 81 | 92 | 38 |
David Murphy | TEX | 521 | 65 | 61 | 38 |
Runs Participated In
The addition of Runs Assisted allows us to expand the Runs Participated In (RPI) measure. The current RPI definition is the number of runs to which a player made a direct contribution. It is calculated by adding runs scored and RBI and then subtracting home runs:
RPI = RS + RBI - HR
RPI was first introduced as runs produced in the 1950s by Sports Illustrated writer Bob Creamer but was more recently renamed RPI by Tom Tango. If Boston Red Sox second baseman Dustin Pedroia doubles and then scores on a single by David Ortiz, neither player actually produces the run by himself. Both participate in creating the run but neither is 100% responsible for producing the run. Thus, the name “runs participated in” is more appropriate than "runs produced". Home runs are subtracted in the RPI formula, so that a player does not get credit for two runs (an RBI and a run scored) when he only participated in one team run.
Adding Runs Assisted to the RPI formula yields:
RPI = RS + RBI + RAS - HR
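To make the arithmetic concrete, here is a small Python sketch that applies both versions of the formula to figures from Table 2. The function name `rpi` and its parameter names are my own, chosen only for illustration.

```python
def rpi(runs, rbi, home_runs, runs_assisted=0):
    """Runs Participated In: RS + RBI + RAS - HR.

    Leaving runs_assisted at 0 gives the older definition (RS + RBI - HR).
    """
    return runs + rbi + runs_assisted - home_runs


# 2012 figures taken from Table 2.
cabrera_rpi = rpi(109, 139, 44, runs_assisted=39)
mauer_rpi = rpi(81, 85, 10, runs_assisted=59)
mauer_old_rpi = rpi(81, 85, 10)

print(cabrera_rpi)    # 243
print(mauer_rpi)      # 215
print(mauer_old_rpi)  # 156
```

The gap between Mauer's two totals is exactly his 59 Runs Assisted, which is why players with few home runs but many assists gain the most ground under the expanded definition.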
One might question whether a Run Assisted should count as much as a run scored or an RBI, since it is more likely to also produce an out. I would guess that a player getting an assist typically contributes less to the run than a player with a run scored or RBI (although the opening example shows that is not always the case). More complicated statistics involving linear weights are better for answering that question. By definition, runs scored, RBI and Runs Assisted will count the same in the Runs Participated In measure.
Also, remember that RPI does not address the biases of runs scored and RBI (and RAS for that matter). It is still the case that some players have more opportunities to contribute to runs based on their teammates and batting order position. RPI is not a replacement for something like Batting Runs, but rather a simple alternative for those who prefer to look at actual runs scored.
Keeping the above caveats in mind, the American League RPI Leaders are listed in Table 2 below. AL MVP winner Miguel Cabrera led the league with 243 RPI, well ahead of Rangers slugger Josh Hamilton at 220. Mauer, who would have finished 14th by the old definition of RPI, was 5th with 215.
Table 2: AL Runs Participated In Leaders, 2012
Data Source: Retrosheet.org
Player | Team | PA | R | RBI | RAS | HR | RPI |
Miguel Cabrera | DET | 697 | 109 | 139 | 39 | 44 | 243 |
Josh Hamilton | TEX | 636 | 103 | 128 | 32 | 43 | 220 |
Robinson Cano | NYA | 697 | 105 | 94 | 50 | 33 | 216 |
Mike Trout | ANA | 639 | 129 | 83 | 34 | 30 | 216 |
Joe Mauer | MIN | 641 | 81 | 85 | 59 | 10 | 215 |
Josh Willingham | MIN | 615 | 85 | 110 | 42 | 35 | 202 |
Prince Fielder | DET | 690 | 83 | 108 | 40 | 30 | 201 |
Billy Butler | KCA | 678 | 72 | 107 | 47 | 29 | 197 |
Albert Pujols | ANA | 670 | 85 | 105 | 36 | 30 | 196 |
Curtis Granderson | NYA | 684 | 102 | 106 | 31 | 43 | 196 |
Elvis Andrus | TEX | 711 | 85 | 62 | 51 | 3 | 195 |
Jason Kipnis | CLE | 672 | 86 | 76 | 47 | 14 | 195 |
Torii Hunter | ANA | 584 | 81 | 92 | 38 | 16 | 195 |
Adrian Beltre | TEX | 654 | 95 | 102 | 34 | 36 | 195 |
Edwin Encarnacion | TOR | 644 | 93 | 110 | 32 | 42 | 193 |
Now that they have been defined, other analyses can be done with RAS and RPI. These statistics will probably be more useful over longer careers, where noise created by team environment tends to be minimized. Thus, I plan to determine career totals for players going back to 1950, the first year of complete Retrosheet data. I'd also like to investigate whether certain types of players accumulate a lot of RAS and whether they do it consistently from year to year. Correlations between RPI and numbers like Batting Runs would also be interesting. Finally, I might attempt to create a simple rate statistic which somehow takes opportunities into consideration. You can expect multiple articles on this topic throughout the winter.
The information used here was obtained free of charge from and is copyrighted by Retrosheet.
Interested parties may contact Retrosheet at "www.retrosheet.org".