With the season so, so close to its beginning and all the relevant roster moves decided upon, I think a short primer is called for. Early season baseball means wild stat swings as each plate appearance carries lots of weight. But it's hard to know at what point to be actually concerned about a player or a particular component of his performance. Fortunately, research has been conducted by Russell Carleton (aka Pizza Cutter) for our benefit, summarized here.
The idea behind the research is to find the correlation of, for instance, a hitter's K/PA to itself given a certain number of samples. If the correlation is weak, we can say there's a lot of noise in the data and drawing strong conclusions from that data is a bad idea. If it's strong, we can say there's a good chance that what we're seeing is the player's actual skill level when it comes to making contact. Remember: what happens on the field is some part Actual Skill and some part Crazy Variance.
In particular, the threshold Russell looked for was the point at which it's proper to regress 30% to the league mean.* So at 150 PA, a good estimate of a hitter's K/PA skill = 70% of his current rate and 30% of the league average rate. If Josh Fields is striking out in 50% of his 150 PA and the league average is 20%, then a good estimate is 0.7*0.5 + 0.3*0.2 = 0.41 K/PA.
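For anyone who wants to play with the arithmetic, here's a minimal sketch of that blend in Python. The function name and the default 70/30 weighting are my own shorthand for the example above, not anything from Russell's research, and the weighting only applies once a hitter has reached the threshold for that stat.

```python
def regress_to_mean(observed_rate, league_rate, weight_observed=0.7):
    """Blend a player's observed rate with the league average.

    weight_observed=0.7 is the illustrative 70/30 split from the text;
    it assumes the player has just reached the stat's PA threshold.
    """
    return weight_observed * observed_rate + (1 - weight_observed) * league_rate


# The Josh Fields example: 50% K rate vs. a 20% league average.
estimate = regress_to_mean(0.50, 0.20)
print(round(estimate, 2))
```

Swap in a different league rate (or a narrower cohort's rate) and the estimate moves accordingly.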
I bring up K/PA because it's one of the first component stats to stabilize and it's very important in determining whether a prospect or aging vet can cut it at the big league level. For a guy like Brent Morel, it's going to be imperative that he limit his strikeouts because his history and scouting reports suggest he doesn't have a ton of skill walking or hitting for power. Moreover, when prospects flame out, it's not usually because they didn't take enough walks or hit home runs. Yeah, patience and power separate the good MLBers from the bad. But getting the call to the Show at all is a matter of making contact in the first place.
And while 150 plate appearances doesn't seem like that much, it's still about a month and a half's worth of games for an everyday player. Even once we get that far into the season, all we have is K rate for hitters. It takes another month and a half for the rest of the really important hitting stats to stabilize. I tend to focus on walk rate and HR/FB since the other stats are either some subset of those or have other issues. And then of course there's BABIP, which after 650 PA still needs to be regressed more than 40% of the way to the mean. For pitchers, it's K rate and groundball rate, and then you don't really get walk rate until well into the season. So if you're thinking about caterwauling after a golden sombrero from AJ Pierzynski, at the very least don't cite his K rate a week into the season. Tell us about what you think you see that's different.
Tell us what you think you see. That's the biggest takeaway here. The stats only tell us so much. They take a long time before they are really meaningful. But between a bunch of baseball obsessives who talk to each other constantly, a kind of community scouting arises, one that's a lot more powerful than what any one of us thinks he or she sees. Sure, yeah, we'll be biased in various ways. We're all White Sox fans, and a certain kind of Sox fan for that matter. But by this point we've figured out often enough when to scoff at the experts and their fancy stats.** Because the stats only exist to confirm our theories. We don't have to wait for them to start complaining.
Here's the full list:
Offense Statistics:
- 50 PA: Swing%
- 100 PA: Contact Rate
- 150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
- 200 PA: Walk Rate, Ground Ball Rate, GB/FB
- 250 PA: Fly Ball Rate
- 300 PA: Home Run Rate, HR/FB
- 500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
- 550 PA: ISO
Pitching Statistics:
- 150 BF: K/PA, Ground Ball Rate, Line Drive Rate
- 200 BF: Fly Ball Rate, GB/FB
- 500 BF: K/BB, Popup Rate
- 550 BF: BB/PA
*Or whatever more specific cohort you think is reasonable, but league average is perfectly fine.
**Jose Lopez.