While working for the ESPN Stats and Information department from 2007 to 2010, I had countless conversations with my colleagues on the lack of a quality all-encompassing pitching statistic. Right around this time stats like WAR were becoming more mainstream and we were doing our best to play a role in their inclusion on shows like Baseball Tonight and SportsCenter. And while we had some success pitching ideas with advanced hitting statistics such as OPS+ and Runs Created, and some fielding stats like Defensive Runs Saved, there was a noticeable lack of pitching statistics available.
The issue with advanced pitching stats is primarily that they all focus on the big picture. Stats like WAR and xFIP do a better job of evaluating pitchers than the baseball card stats that came before them, but the scope is so widespread that it actually limits their use. Let’s say we were working on a Baseball Tonight show in May. How can we use advanced stats to tell a story about a pitcher? WAR and xFIP are options, but the sample size is so small. Additionally, there’s a certain amount of guesswork involved in each stat.
What I wanted to see was a stat I could use both on a daily basis and over an extended period of time. Something I could use to judge a pitcher’s performance in an individual game, but also over the course of a season, or a career.
The closest thing that existed was the Quality Start (and for an extended period of time, Quality Start Percentage). But as a former major leaguer and ESPN colleague of mine said in an email exchange about Quality Starts a few years ago: “that’s ass.”
The main issue with Quality Starts is that it’s a yes or no stat. Three runs over six innings is equal to nine scoreless frames. Quality Start percentage can start to separate pitchers, but it still leaves a lot to be desired.
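To make the yes-or-no problem concrete, here is a minimal Python sketch of the standard Quality Start rule (at least six innings and no more than three earned runs). The function name and the choice to pass outs instead of innings are only for illustration.

```python
def is_quality_start(outs_recorded: int, earned_runs: int) -> bool:
    """Standard rule: at least six innings (18 outs) and three or fewer earned runs."""
    return outs_recorded >= 18 and earned_runs <= 3


# Six innings with three earned runs, and nine scoreless innings.
print(is_quality_start(18, 3))
print(is_quality_start(27, 0))
```

Both calls return the same answer, so the stat has no way to tell the two starts apart.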
Game Score also exists to evaluate pitchers on a game-by-game basis, but it has substantial flaws, mainly that it favors high-strikeout pitchers. It’s far more of a fantasy baseball stat than a true measure of a pitcher’s value.
So when I left ESPN in 2010, I decided to spend some of my own time developing a quality start alternative—something that could be used on a per-game basis but also as a season-long statistic.
What I came up with was what I have called Pitching Efficiency Rating (or PER).
PER is an attempt to simply measure how often a pitcher does his job. Unlike Game Score, it doesn’t care how the job gets done and unlike Quality Starts, it places no judgement on what is and isn’t quality (at least not without substantial statistical backing, but we’ll get to that later).
What PER measures is simply how efficiently a pitcher does his job, and it does so by evaluating three major components:
Percent of Batters Retired: (IP*3)/((IP*3)+Hits+Walks+HBP)
Technically this isn’t an exact percentage. It removes errors from the equation in order to attempt to neutralize the main factor which is out of the pitcher’s control. The (IP*3) portion of the equation converts innings into outs, so the denominator works out to the number of batters the pitcher should have faced, had his defense held up behind him. For example, if a pitcher throws 7 innings, allowing 3 hits, 2 walks, and 1 HBP, his equation would look like this:
21/(21+3+2+1) = .778 (77.8 percent)
Percent of Batters Failing to Score: 1-(ER/BF)
This portion of the formula is a little more straightforward. Only earned runs are included, again to neutralize any poor defense of which the pitcher may be a victim.
Percent of Game Pitching: IP/9
A complete game gives a pitcher a full score in this category. I’m sure someone will be quick to point out that a pitcher doesn’t always have the opportunity to pitch nine innings. This is true, and it is a minor flaw in the formula. Unfortunately I haven’t found a way to account for this without manually going through 1,000s of box scores to determine if the pitcher tossed the full amount of innings. Also, I’m not sure if it would be fair to do that even if we could. If a pitcher has a perfect game through 5 innings and the game is called, should that really receive the same score as a nine-inning perfect game?
You may have noticed that each category will give you a number no greater than 1 (unless a pitcher throws more than nine innings). This allows us to add the three categories together and divide by three giving a score which, except for a few rare occasions, will be no greater than 1.00 – essentially looking like a batting average.
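Putting the three components together, a single start can be scored with a few lines of Python. This is a rough sketch, not a reference implementation: it reads BF as batters faced, assumes innings arrive in box-score notation (where "6.2" means six and two-thirds), and the function names are made up for the example. It does not guard against a start with zero batters faced.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings ("6.2" means 6 and 2/3) to outs recorded."""
    whole, _, thirds = ip.partition(".")
    return int(whole) * 3 + int(thirds or 0)


def pitching_efficiency_rating(ip: str, hits: int, walks: int, hbp: int,
                               earned_runs: int, batters_faced: int) -> float:
    outs = innings_to_outs(ip)
    retired = outs / (outs + hits + walks + hbp)   # percent of batters retired
    not_scoring = 1 - earned_runs / batters_faced  # percent of batters failing to score
    game_share = outs / 27                         # percent of game pitched (IP/9)
    return (retired + not_scoring + game_share) / 3


# The 7-inning line from the example above (3 hits, 2 walks, 1 HBP).
# The 2 earned runs and 27 batters faced are assumed values to complete the line.
print(round(pitching_efficiency_rating("7.0", 3, 2, 1, 2, 27), 3))
```

With those assumed values the start comes out around .827, and a nine-inning perfect game returns exactly 1.0.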
So how does PER look in practice? Here’s a look at the top average PER over the past three seasons (2011-2013) with a minimum of 50 starts.
Clayton Kershaw – .824
Cliff Lee – .818
Justin Verlander – .810
James Shields – .804
Chris Sale – .800
And the worst…
Juan Nicasio – .702
Jeff Francis – .704
Tyler Chatwood – .705
Edinson Volquez – .708
Derek Lowe – .709
And the Indians in that span (min 25 starts)…
Justin Masterson – .761
Josh Tomlin – .752
Corey Kluber – .735
Ubaldo Jimenez – .727
Zach McAllister – .726
Scott Kazmir – .725
Fausto Carmona – .719
Carlos Carrasco – .710
Jeanmar Gomez – .701
Now that you’ve been introduced to the stat, let’s address some questions that you may be wondering about.
Sounds great, but does it have any value as an evaluation tool?
I’ll get to this in detail in a bit, but the short answer is yes. There is an extremely strong correlation between a team’s winning percentage and a high PER from the starting pitcher.
So a perfect game isn’t the best possible score?
The formula is set up so that a perfect game yields a score of 1.000—sort of like a perfect batting average. But since innings pitched is a factor in the equation, technically you can end up with a value higher than 1.000. However, it’s almost impossible to reach this level in today’s era. Cliff Lee was the last pitcher to go beyond nine innings in 2012, and produced a PER of .974 with seven base runners and zero runs in 10 innings of work. I don’t love the fact that you can go over 1.000, but it makes sense. My goal was to produce a formula to essentially show what percentage of his job the pitcher did in any given start. Asking a pitcher to go beyond 27 outs is asking him to do far more than what is expected of him on a given day.
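The Cliff Lee line can be checked by hand. The short Python sketch below plugs in only the numbers quoted above (10 innings, seven base runners, zero runs); the variable names are illustrative.

```python
innings = 10
baserunners = 7   # hits + walks + hit batters combined
outs = innings * 3

retired = outs / (outs + baserunners)  # 30 / 37
not_scoring = 1.0                      # no earned runs, so 1 - (0 / BF) is 1 whatever BF was
game_share = innings / 9               # 10 / 9, the only component that can exceed 1

per = (retired + not_scoring + game_share) / 3
print(round(per, 3))  # 0.974

# A hypothetical 10-inning perfect game would push past 1.000.
ten_inning_perfect = (1.0 + 1.0 + 10 / 9) / 3
print(round(ten_inning_perfect, 3))  # 1.037
```

The extra inning is the only reason the total can climb above 1.000, since the other two components top out at 1.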
Aren’t you over-valuing pitching deep into games?
This was an initial concern I had also, and I toyed around with other versions of the formula with this portion watered down. But I quickly discovered the correlation between PER and winning percentage dropped significantly when innings pitched wasn’t a factor. It sounds almost too simplistic, but your odds of winning increase substantially the deeper your starting pitcher goes into the game.
Why count percent of batters retired? Isn’t not allowing runs really all that matters?
When looking strictly at an individual game, this is a fair criticism. If your pitcher throws a shutout, ultimately you don’t care how many base runners he allowed. But I wanted this to also be a way to measure a pitcher’s efficiency over an extended period of time. Two pitchers with an ERA of 2.50 over 100 innings might be considered equal. But the pitcher with the higher PER is more likely to continue his success. Basically, more base runners equals more opportunities for something to go wrong. You want the guy who limits those opportunities.
Wouldn’t one bad game destroy a pitcher’s PER?
Absolutely. This is the one area where I think Quality Start Percentage does a better job of gauging long-term performance. One bad start doesn’t hurt the team as much as it hurts the pitcher’s PER, and that’s a flaw. But it’s no different than the effect a bad game has on ERA and other similar statistics.
What about relief pitchers?
I’ve played around with a version for relievers, but it’s far more difficult due to the number of variables involved. Accounting for inherited runners, leverage and other variables makes it more complicated than with starters.
Have other questions? Let me know @TribeFanMcC
As I referenced earlier, there is a strong correlation between PER and winning percentage, which is why I love the statistic.
The chart below shows the team’s winning percentage when its starting pitcher produced a certain PER (PER is on x-axis, winning percentage on y-axis).
To get the largest and most relevant sample size, I used over 14,000 games from the past five years (a similar chart from the 1960s or mid 1990s would yield similar but slightly different results).
Without going too deep into the nerdy details of statistics, this is about as close to a perfect correlation as you could hope to find. In the center of the chart you’ll see “R² = .9896”, which essentially means the given equation (demonstrated by the trend line) explains 98.96 percent of the variation in the data.
Based on this chart, we can say with confidence that when a pitcher produces a PER of .900 or higher, his team will win over 90 percent of the time. Similarly, we can say that a pitcher with a PER under .600 will cause his team to lose at least 80 percent of the time.
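A chart like this comes from grouping starts by PER and checking how often the team won in each group. The exact grouping isn’t spelled out here, so the Python sketch below assumes 50-point (.050) buckets, and the handful of starts in it are made up to show the mechanics. They are not real data.

```python
from collections import defaultdict


def win_pct_by_per_bucket(starts, width_points=50):
    """Group (per, team_won) pairs into PER buckets and return win percentage per bucket.

    width_points is the bucket width in thousandths of PER (50 means .050).
    """
    tallies = defaultdict(lambda: [0, 0])  # bucket floor -> [wins, games]
    for per, team_won in starts:
        points = round(per * 1000)  # work in whole points to avoid float edge cases
        floor = points // width_points * width_points
        tallies[floor][0] += int(team_won)
        tallies[floor][1] += 1
    return {floor / 1000: wins / games
            for floor, (wins, games) in sorted(tallies.items())}


# Made-up starts, purely to show the mechanics.
sample = [(0.910, True), (0.930, True), (0.780, True), (0.760, False), (0.550, False)]
print(win_pct_by_per_bucket(sample))
```

Each bucket’s win percentage becomes one point on the chart, and the trend line is then fit through those points.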
Using the equation on the chart, we can manipulate PER to give us numbers that are easier to digest. Let’s use Justin Masterson’s 2013 season as an example.
Masterson produced an average PER of .774 in 2013. This sounds good, but unless you’re familiar with PER the number itself doesn’t mean much.
But if we plug each of his starts into the equation provided in the chart, we find that on average he gave the Indians an expected winning percentage of 60.8 percent.
If we want to take it a step further, we can also evaluate to what extent he increased their chances of winning (assuming every game starts with 50/50 odds for each team). Since Masterson gave his team a 60.8 percent chance of winning, that equates to an increase of 21.5 percent.
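Note that the increase is relative to the 50 percent starting point, not a gain in percentage points. A small Python sketch of that step, with an illustrative function name:

```python
def relative_win_increase(expected_win_pct: float, baseline: float = 0.50) -> float:
    """How much the expected win percentage improves on the baseline, as a fraction of it."""
    return (expected_win_pct - baseline) / baseline


print(round(relative_win_increase(0.608), 3))
```

Starting from the rounded 60.8 figure this lands at roughly 21.6 percent, so the 21.5 quoted above presumably comes from the unrounded winning percentage.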
These winning numbers allow us to better compare pitchers. As you can see from the chart, the correlation between winning percentage and PER isn’t a perfect linear correlation (a straight line). So it’s tough to know just how much better a PER of .850 is compared to a PER of .600. But PER Win % puts that into an easier to understand context.
Here are the 2013 PER Win % leaders (it’s the exact same order as PER leaders, just an easier to understand number)…
Clayton Kershaw – 74.0%
Cliff Lee – 70.8%
Matt Harvey – 70.3%
Adam Wainwright – 69.8%
Chris Sale – 68.8%
And the five worst in 2013…
Barry Zito – 42.9%
Wade Davis – 43.3%
Lucas Harrell – 44.6%
Mike Pelfrey – 44.9%
Dylan Axelrod – 45.3%
So how will PER be used?
I’ll post updates on this year’s PER leaders every so often on IPL and provide some analysis on the Indians pitchers. If you want the data for a specific team or pitcher, send me a note on Twitter and I’d be happy to provide it.
I also have data for each of the years in the chart below. I’m not going to post the full sheets, but I do have them in Excel files if you’re interested in viewing them.
Year | American League | National League |
2013 | Chris Sale (CHW) – .810 | Clayton Kershaw (LAD) – .835 |
2012 | Justin Verlander (DET) – .821 | R.A. Dickey (NYM) – .813 |
2011 | Justin Verlander (DET) – .838 | Roy Halladay (PHI) – .829 |
2010 | Felix Hernandez (SEA) – .827 | Roy Halladay (PHI) – .838 |
2009 | Roy Halladay (TOR) – .824 | Chris Carpenter (STL) – .815 |
2008 | Roy Halladay (TOR) – .824 | Johan Santana (NYM) – .805 |
2007 | CC Sabathia (CLE) – .805 | Brandon Webb (ARI) – .800 |
2006 | Johan Santana (MIN) – .809 | Brandon Webb (ARI) – .806 |
2005 | Johan Santana (MIN) – .816 | Chris Carpenter (STL) – .822 |
2004 | Curt Schilling (BOS) – .811 | Randy Johnson (ARI) – .822 |
2003 | Roy Halladay (TOR) – .819 | Jason Schmidt (SFG) – .826 |
2002 | Derek Lowe (BOS) – .813 | Randy Johnson (ARI) – .831 |
2001 | Freddy Garcia (SEA) – .804 | Curt Schilling (ARI) – .825 |
2000 | Pedro Martinez (BOS) – .861 | Greg Maddux (ATL) – .814 |
1999 | Pedro Martinez (BOS) – .831 | Randy Johnson (ARI) – .845 |
1998 | Pedro Martinez (BOS) – .812 | Greg Maddux (ATL) – .835 |
1997 | Roger Clemens (TOR) – .848 | Pedro Martinez (MON) – .855 |
1996 | Pat Hentgen (TOR) – .819 | Kevin Brown (FLA) – .834 |
1995 | Randy Johnson (SEA) – .822 | Greg Maddux (ATL) – .852 |
1994* | David Cone (KC) – .826 | Greg Maddux (ATL) – .873 |
1993 | Randy Johnson (SEA) – .814 | Greg Maddux (ATL) – .830 |
1992 | Roger Clemens (BOS) – .840 | Curt Schilling (PHI) – .844 |
1991 | Roger Clemens (BOS) – .843 | Tom Glavine (ATL) – .820 |
1990 | Roger Clemens (BOS) – .828 | Dennis Martinez (MON) – .815 |
1989 | Bret Saberhagen (KC) – .835 | Orel Hershiser (LAD) – .827 |
1988 | Mark Gubicza (KC) – .832 | Orel Hershiser (LAD) – .845 |
1987 | Jack Morris (DET) – .831 | Orel Hershiser (LAD) – .821 |
1986 | Roger Clemens (BOS) – .843 | Mike Scott (HOU) – .841 |
1985 | Ron Guidry (NYY) – .837 | Doc Gooden (NYM) – .861 |
1984 | Dave Stieb (TOR) – .829 | Fernando Valenzuela (LAD) – .822 |
– | – | – |
1968 | Denny McLain (DET) – .869 | Bob Gibson (STL) – .915 |
– | – | – |
1940 | Bob Feller (CLE) – .864 | Bucky Walters (CIN) – .865 |
– | – | – |
1927 | Ted Lyons (CHW) – .872 | Pete Alexander (STL) – .867 |
I’m slowly but surely working my way back. Eventually I’ll have stats back as far as Retrosheet’s data goes. To break up the monotony of the project I may be inputting data in a random order, so don’t be alarmed if certain years appear to be missing. They’ll show up eventually. [Leaders are with a min. 25 starts]
good stuff man. but was wondering about the name. “per” sounds too basketbally, since they already have a stat of the same name in that sport. how about “rep” or Rating the Efficiency of Pitching. sounds more sporty too. getting your reps in. quality reps, etc
I know Ubaldo finished 2013 strong, but before that he was horrendous. I’m surprised his number is so high.