Thursday, May 2, 2024
MLB

An introduction to pBB%+ (Pitchers)

About a month ago, I wrote an article introducing pK%+ (predictive strikeout rate plus).

Today, I plan to introduce another metric for pitchers: pBB%+ (predictive walk rate plus).

The goal of predictive walk rate plus is to do a better job of predicting a pitcher’s walk rate the following year than regular walk rate does while maintaining a fairly strong correlation to walk rate during the season.

pBB%+ factors in five variables:

  • Strike% (all strikes!)
  • % of strikes that are foul balls (swings only)
  • % of 3-1 and 3-2 pitches that are swung at
  • % of pitches thrown when the pitcher is ahead in the count that result in swings
  • O-Contact% (% of pitches swung at outside the zone that the hitter makes contact with)

The formula for pBB%+ is…

Combined Z-score = (Strike% Z-score * 0.4875) + (% of strikes that are foul balls Z-score * 0.1225) + (% of 3-1 and 3-2 pitches that are swung at Z-score * 0.0825) + (Swing% when pitcher is ahead in count Z-score * 0.1525) + (O-Contact% Z-score * 0.155)

pBB%+ = (Combined Z-score * 54.266) + 100.09
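As a rough illustration of the two formulas above, here is a minimal Python sketch. The weights and the 54.266 / 100.09 constants come straight from the formulas; the dictionary keys, the function names, and the example Z-scores are hypothetical and only there to make the arithmetic concrete.

```python
# Weights and scaling constants copied from the formulas above.
# The key names are illustrative labels, not official stat names.
WEIGHTS = {
    "strike_pct": 0.4875,
    "foul_share_of_strikes": 0.1225,
    "swing_pct_3_1_and_3_2": 0.0825,
    "swing_pct_pitcher_ahead": 0.1525,
    "o_contact_pct": 0.155,
}
SLOPE = 54.266
INTERCEPT = 100.09


def combined_z(z_scores):
    """Weighted sum of the five (already oriented) Z-scores."""
    return sum(WEIGHTS[name] * z_scores[name] for name in WEIGHTS)


def pbb_plus(z_scores):
    """Scale the Combined Z-score onto the pBB%+ scale."""
    return combined_z(z_scores) * SLOPE + INTERCEPT


# Made-up Z-scores for a hypothetical pitcher (not real data).
example = {
    "strike_pct": -1.5,
    "foul_share_of_strikes": 0.5,
    "swing_pct_3_1_and_3_2": -1.0,
    "swing_pct_pitcher_ahead": 0.2,
    "o_contact_pct": -0.8,
}
print(round(pbb_plus(example), 1))
```

A pitcher who is exactly average in all five inputs has a Combined Z-score of 0 and therefore a pBB%+ of 100.09.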

I’ll delve deeper into each individual variable at this point.

Strike%

Strike percentage is very simple: strikes divided by total pitches, with balls in play counted as strikes. Pitchers who throw strikes at a higher frequency tend to walk batters at a lower rate.

Top 10 highest from last year

  1. Javy Guerra (70.1)
  2. Emilio Pagan (70.0)
  3. Sean Doolittle (69.8)
  4. Matt Strahm (69.7)
  5. Chris Paddack (69.6)
  6. Josh Hader (69.4)
  7. Nick Anderson (69.4)
  8. Kenley Jansen (69.4)
  9. Taylor Rogers (69.1)
  10. Max Scherzer (69.0)

% of strikes that are foul balls

This stat is also easy to understand: it is the number of swings that result in foul balls divided by the overall number of strikes. The linear relationship between % of strikes that are foul balls and BB%+ is very weak, but including it alongside the other variables strengthens the model. In the equation, a higher % of strikes that are foul balls is associated with higher walk rates, which makes intuitive sense: a foul ball keeps a plate appearance going most of the time, whereas the other three types of strikes (called strike, swinging strike, ball in play) can or do end a plate appearance.
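The two ratios described so far can be sketched in a few lines of Python. The `PitchCounts` class, its field names, and the counts in the example are all hypothetical. I am also assuming here that the `fouls` field already holds only fouls on swings (the "swings only" part of the definition), since the exact handling of things like foul bunts is not spelled out.

```python
from dataclasses import dataclass


@dataclass
class PitchCounts:
    """Season pitch totals for one pitcher (hypothetical container)."""
    balls: int
    called_strikes: int
    swinging_strikes: int
    fouls: int      # assumed to be fouls on swings only
    in_play: int    # balls in play count as strikes

    @property
    def strikes(self):
        return (self.called_strikes + self.swinging_strikes
                + self.fouls + self.in_play)

    @property
    def total_pitches(self):
        return self.balls + self.strikes


def strike_pct(c):
    """Strikes divided by total pitches, as a percentage."""
    return 100.0 * c.strikes / c.total_pitches


def foul_share_of_strikes(c):
    """Foul balls divided by all strikes, as a percentage."""
    return 100.0 * c.fouls / c.strikes


# Made-up counts, not a real pitcher.
counts = PitchCounts(balls=300, called_strikes=170,
                     swinging_strikes=110, fouls=180, in_play=240)
print(strike_pct(counts), round(foul_share_of_strikes(counts), 1))
```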

Top 10 highest from last year

  1. Sean Doolittle (41.1)
  2. Chad Green (34.7)
  3. Josh Hader (33.5)
  4. David Hess (33.4)
  5. Javy Guerra (33.4)
  6. Brandon Woodruff (32.9)
  7. Shawn Armstrong (32.7)
  8. Daniel Hudson (32.6)
  9. Chris Paddack (32.5)
  10. Vince Velasquez (32.4)

% of 3-1 and 3-2 pitches that are swung at

When it comes to the % of 3-1 and 3-2 pitches that are swung at, a higher percentage tends to be associated with a lower BB%+. If a hitter does not swing, there is a chance that the pitch taken is ball four. If it is a called strike instead, that could end the PA as well when the count is 3-2.

Top 10 highest from last year

  1. Josh Tomlin (83.8)
  2. Sean Doolittle (81.5)
  3. Hyun-Jin Ryu (80.5)
  4. Ryan Yarbrough (80.5)
  5. Max Scherzer (80.2)
  6. Madison Bumgarner (79.4)
  7. Josh Osich (79.2)
  8. Cal Quantrill (79.0)
  9. Roberto Osuna (79.0)
  10. John Means (78.8)

% of pitches thrown when the pitcher is ahead in the count that result in swings

In the pBB%+ equation, a higher swing rate when the pitcher is ahead in the count is associated with a higher walk rate. I don’t know exactly why this is the case, but one explanation could be that pitchers at the top of this category tend to have better swing-and-miss stuff, so they often try to get hitters to chase out of the zone. A pitch taken out of the zone is (almost always) a ball.

Top 10 highest from last year

  1. Javy Guerra (65.4)
  2. Josh Hader (62.9)
  3. Kevin Gausman (61.5)
  4. Sean Doolittle (61.4)
  5. Robert Stephenson (61.3)
  6. Kevin McCarthy (60.4)
  7. Emilio Pagan (60.3)
  8. Blake Treinen (59.5)
  9. Brett Martin (59.3)
  10. Joe Jimenez (58.9)

O-Contact%

O-Contact%, once again, is the percentage of time a hitter makes contact with a pitch given that he swung at it outside the strike zone. A lower O-Contact% is what you want for a pitcher (a higher whiff%), but in terms of walk rate, a higher O-Contact% is “better” because there is a higher chance of the plate appearance ending (a ball is being put into play or being fouled off).

Top 10 highest from last year

  1. Dario Agrazal (80.5)
  2. Marco Gonzales (77.3)
  3. Ross Detwiler (77.0)
  4. Zach Davies (76.0)
  5. Rick Porcello (75.8)
  6. Jesse Chavez (75.6)
  7. Nick Kingham (75.4)
  8. Daniel Mengden (75.3)
  9. Thomas Pannone (74.0)
  10. Jeff Samardzija (73.9)

The coefficients for the Combined Z-score formula from the beginning of this article were determined by creating a linear model where the y-variable was BB%+ in season n+1 (2018) and the explanatory variables were the five Z-scores from season n (2017).

The pBB%+ formula (in the form of y = mx + b) was generated from another linear model, this time with the Combined Z-scores as the x-variable (2017-2019 single seasons) and BB%+ as the y-variable (2017-2019 single seasons).
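For readers who want to see what a y = mx + b fit like this involves, here is a minimal ordinary least squares sketch in plain Python. The `fit_line` helper and the data points are hypothetical; the points are synthetic and were chosen to sit exactly on a line, so this does not reproduce the actual 54.266 and 100.09 values.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic Combined Z-scores (x) and BB%+ values (y), not real data.
combined_z_scores = [-1.0, -0.5, 0.0, 0.5, 1.0]
bb_plus_values = [50.0 * x + 100.0 for x in combined_z_scores]

m, b = fit_line(combined_z_scores, bb_plus_values)
print(m, b)
```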

Here are two graphs in which season n is 2018 and season n+1 is 2019 (this was my out-of-sample testing).

The correlation for pBB%+ in season n to BB%+ in season n+1 is higher than for BB%+ in season n to BB%+ in season n+1 (0.620 vs 0.578). The RMSE is also lower (22.3 vs 25.6).
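The two comparison numbers above are a Pearson correlation and a root mean squared error. A minimal sketch of both calculations follows; the helper names and the sample values are hypothetical, and the sample values are made up rather than taken from the 2018-2019 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rmse(predicted, actual):
    """Root mean squared error between predictions and actual values."""
    n = len(predicted)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Made-up season n predictions and season n+1 outcomes (not real data).
season_n_metric = [80.0, 95.0, 110.0, 130.0]
season_n1_bb_plus = [85.0, 90.0, 120.0, 125.0]
print(round(pearson_r(season_n_metric, season_n1_bb_plus), 3))
print(round(rmse(season_n_metric, season_n1_bb_plus), 1))
```

Squaring a correlation gives R^2 for a one-variable linear fit, so a correlation of 0.578 corresponds to an R^2 of roughly 0.334.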

In spite of the fact that I maximized pBB%+’s predictive value, the metric still has a very strong correlation to BB%+ in-season (0.887). In fact, I have yet to see a higher one.

Unsurprisingly, pBB%+ is also a considerably more stable metric than BB%+ (which had an R^2 of .3344 over this same time frame [2018 is season n, and 2019 is season n+1]).

Here are the pBB%+ leaders from 2019

  1. Matt Strahm (32)
  2. Josh Tomlin (39)
  3. Mike Leake (41)
  4. Taylor Rogers (42)
  5. Zack Greinke (48)
  6. Kyle Hendricks (50)
  7. Yusmeiro Petit (50)
  8. Ryan Yarbrough (50)
  9. Chris Paddack (51)
  10. Hyun-Jin Ryu (52)

Pitchers w/ the biggest positive difference between pBB%+ and BB%+ from 2019

(pBB%+/BB%+)

  1. Bryan Shaw (140/110)
  2. T.J. McFarland (99/74)
  3. Brett Martin (95/70)
  4. Pablo Lopez (87/73)
  5. Robert Gsellman (115/93)

Pitchers w/ the biggest negative difference between pBB%+ and BB%+ from 2019

(pBB%+/BB%+)

  1. Jeurys Familia (138/170)
  2. Derek Holland (109/140)
  3. Brandon Workman (145/176)
  4. Joe Jimenez (75/105)
  5. Aaron Nola (86/111)

Lowest pBB%+ single-seasons since 2017

  1. 2017 Kenley Jansen (26)
  2. 2019 Matt Strahm (32)
  3. 2018 Miles Mikolas (34)
  4. 2017 Bartolo Colon (34)
  5. 2017 Josh Tomlin (35)
  6. 2018 Yusmeiro Petit (37)
  7. 2019 Josh Tomlin (39)
  8. 2018 Bartolo Colon (40)
  9. 2017 Alex Claudio (40)
  10. 2017 Nick Vincent (40)

The 2019 pBB%+ leaderboard can be accessed here.

To summarize, pBB%+ is more descriptive, predictive, and reliable than any other walk rate metric I have seen.

Notes

  • The extent to which pBB%+ regresses is independent of volume (I want pBB%+ to be a stat just like BB%, where TBF does not play into the calculations)
  • pBB%+ and BB%+ exclude intentional base on balls
  • I only went back to 2017; prior to that point, pitchers had to actually throw the pitches to intentionally walk a hitter (those pitches would be a pain to clean out of the data)
  • The minimum total batters faced requirement to be included was 250
  • I made all the Z-scores so that a negative is associated with a lower walk rate
  • A higher (p)BB%+ means more walks
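The note about Z-score orientation can be illustrated with a short sketch. The `oriented_z_scores` helper, its flag, and the sample values are hypothetical, and using the population standard deviation is my assumption, since the notes do not say which version was used. The idea is simply to flip the sign for any variable where a higher raw value goes with fewer walks (Strike%, for example), so that a negative Z-score always points toward a lower walk rate.

```python
from statistics import mean, pstdev


def oriented_z_scores(values, higher_means_more_walks):
    """Z-scores signed so that negative always means fewer walks.

    Uses the population standard deviation (an assumption).
    """
    mu = mean(values)
    sigma = pstdev(values)
    sign = 1.0 if higher_means_more_walks else -1.0
    return [sign * (v - mu) / sigma for v in values]


# Made-up Strike% values for three pitchers (not real data).
strike_pcts = [60.0, 65.0, 70.0]

# A higher Strike% goes with fewer walks, so the sign is flipped:
# the 70.0 pitcher ends up with the most negative Z-score.
z = oriented_z_scores(strike_pcts, higher_means_more_walks=False)
print([round(v, 2) for v in z])
```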