It's pretty egregious that I've never sat down and explained any of the stats I use on a regular basis. Having been duly browbeaten by larry, I'm giving it a go. Fortunately for everyone, better/smarter folks have already taken a shot at such things. I'll do as many of these as there's interest for, so let me know. Links and quotes, interspersed with commentary to follow. I'll do my best to make sure liberal arts kids don't feel out of place.
Note: all of the pieces quoted have much much more at the link, including comments sections. So click through, dadgummit.
Note The Second: I got an email indicating that any time I thought I was quoting Scott from WFNY, I was actually quoting Jon Steiner, whom WFNY brought on to do sabr-style articles such as the one linked.
Why begin with wOBA? I use it a lot, for one, so this will hopefully lead to readers understanding what I'm going on about. But, second, it's not really that complicated. wOBA, essentially, is runs/PA. The rate at which the batter produced runs for his team. What else matters? Runs are the bottom line in baseball.
But before we get to wOBA, it's worth justifying the necessity of a new stat. So let's look at what folks have typically used to evaluate hitters. I'm going to ignore runs and RBI, but Scott covered them too.
The first of three components to the oft-used triple slash AVG/OBP/SLG. Here's Scott at Waiting For Next Year (inexplicably not a Cubs blog):
Batting average measures the rate at which a player hits his way safely onto base per official AB. Seems straightforward, simple, and useful. In fact, it is. But it certainly misses some important components of a hitter’s job. First, it does not account for the value of the walk, since a walk is not counted as an official AB. And, believe it or not, a walk can be quite valuable. Second it doesn’t account for the difference between a double and a single, or the difference between a single and HR for that matter. It’s a good stat that can tell us a good deal about a player, but it obviously isn’t a solid measuring stick if it tells us that Paul O’Neill’s 1997 season (.324 avg) was more valuable than Jim Thome’s (.286 avg). Keep going.
Batting average just doesn't tell us a whole lot. Avoiding outs via hits is important, but there are other ways to get on base and there are very different kinds of hits. Not too many folks will argue a single should be counted the same as a home run, but that's exactly what average does. If we want to describe a hitter's worth to his team, we need to be able to make these kinds of distinctions. And, moreover, think about how precisely we can make them. A home run with the bases empty is worth 1 run. Exactly how much more or less is a single worth in the same circumstances? And how often are the bases empty anyway?
So batting average tells us something. But more is needed.
On Base Percentage
If your team doesn’t make outs, you score an infinite number of runs. It’s hard to lose when you score infinity runs!
Not making outs is a big deal. Baseball games don't end after some period of time; they end once a certain number of outs has been recorded. Tango says much the same:
So, OBP becomes the stat of choice. It describes each PA rather clearly: safe or out. And that’s what baseball is about at its core. OBP, while not as popular as BA yet, has staying power. While the continued existence of BA is iffy (a large part of its being is simply inertia), OBP will always exist. If it didn’t exist, it would have been invented. BA enjoys no such fundamental truth.
If you want to keep scoring runs, avoid outs! And on base percentage measures the rate at which a given hitter reaches base safely per plate appearance. So its import is fairly obvious in that light. On the other hand, it doesn't differentiate between ways of reaching base. Which leads to...
Slugging has the same problem as batting average in that it considers hits but not walks or HBP. But it does differentiate between hits. Since SLG = TB/ABs, singles are counted as 1, doubles 2, etc. So we have two measures, one that can't tell between different kinds of hits and one that excludes other ways on base. I wonder what could be next.
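For anyone who'd rather see the triple slash as arithmetic, here's a minimal Python sketch. The function name and the season line are made up for illustration. One assumption to flag: the official OBP denominator is AB + BB + HBP + SF, which is close to plate appearances but not identical.

```python
def triple_slash(ab, singles, doubles, triples, hr, bb, hbp, sf):
    """Return (AVG, OBP, SLG) from a hitter's counting stats."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = hits / ab                      # every hit counts as 1, walks ignored
    # Official OBP denominator (assumption noted above): AB + BB + HBP + SF.
    obp = (hits + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab               # hits weighted 1/2/3/4, walks ignored
    return avg, obp, slg


# Made-up season line, for illustration only.
avg, obp, slg = triple_slash(ab=500, singles=100, doubles=30, triples=5,
                             hr=25, bb=70, hbp=5, sf=5)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}")
```

Note that AVG and SLG share a denominator (AB) while OBP uses a different one.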
CPPB is the standard in candy quality measurement
On Base Plus Slugging (OPS)
So, obviously, the thought behind adding OBP to SLG is to cancel out the obvious downsides to each I've mentioned. OBP can't tell the difference between a single and a home run? No problem. SLG forgot about walks? Psh, whatevs. Add 'em together, problem solved. But just like arsenic turned out to be the secret ingredient in Reese's*, OPS has its own dark secrets.
First, there's the denominator problem. Here's Scott:
The problems here are less obvious to the casual reader, but for starters, we’re adding metrics that are measuring different samples: slugging is measured in ABs and OBP is measured in PAs. Therefore, the result is a bit difficult to interpret.
And now Joe, from River Avenue Blues:
The denominator in OBP is plate appearances, while the denominator in SLG is at-bats. True, they’re expressed in decimal format, and that might make it easier to slap them together. That doesn’t mean it is correct.
This doesn't exactly explain why it's a problem, only that it is. Here's my half-assed** explanation: If we wanted to add 1/2 to 1/3, we'd have to find the lowest common denominator, which then changes the numerator. That step is not taken in mashing together OBP and SLG, so the weighting of OBP to SLG should be somewhat affected. Here's Joe again:
OBP is almost always going to be lower than SLG, because OBP is binary. You either reached base or you didn’t, meaning you get a 1 if you succeed and a 0 if you fail. SLG, on the other hand, measures total bases, so a player receives 4 for a home run, 3 for a triple, 2 for a double, and 1 for a single. And, again, it works with a smaller denominator, since at-bats is a subset of plate appearances.
That gets at it to some degree, I think. But let's go on, since that's not really the biggest problem with OPS anyway. That would be weighting OBP and SLG equally. Which is to say, OBP is half the equation, SLG the other half. But that doesn't value OBP properly. Scott:
But the underlying flaw in OPS is even more interesting: a player’s ability not to make an out is actually more valuable than a player’s ability to hit for power (slugging) by a factor of about 2 to 1. So if Player A has an OBP of .385 and a slugging percentage of .400, he is more valuable than Player B, who has an OBP of .360 but a slugging of .415, even though they both posted OPS’s of .785. Why?
Michael Lewis answered this by asking his readers in Moneyball to think about which team will score more runs, the one that posts 1.000 OBP or the one that posts 1.000 SLG. Like Scott said, if you never make an out, you'll never stop scoring. So we should probably give getting on base more weight than slugging. But Scott was specific, mentioning a 2:1 ratio. Where'd he get that figure?
[EDIT 2/18: Updated to include useful comments]
Sky Kalkman here:
True, there’s the denominator problem. And you need to weight OBP more. But it’s not because OBP is more important (as shown by regressing OBP and SLG against team scoring). It’s because one point of OBP means more than 1 point of SLG. Basically, you need to spread OBP values out so that a gap of 1 point of adjusted OBP means as much as 1 point of SLG.
And then I’d talk about how (if you ignore denominators), you’re basically doing…
OPS = OBP + SLG
= times on base + total bases
= (BB + 1B + 2B + 3B + HR) + (1B + 2×2B + 3×3B + 4×HR)
= BB + 2×1B + 3×2B + 4×3B + 5×HR
That’s obviously not a bad estimate, but is a HR really worth five times as much as a walk? Is a triple twice as good as a 1B? Isn’t it just a little too convenient that the relative worth of these events lines up like this? Wouldn’t it be nice to know how valuable each thing is to the others, exactly? Whoila, linear weights and wOBA.
Which echoes the comment below by Tango:
Basically, when you add OBP and SLG, you are getting something in-between the weights of OBP and SLG. wOBA simply tells you what those weights actually should be.
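Sky's expansion is easy to check for yourself. Here's a rough Python sketch that, like his version, ignores denominators and leaves out HBP. The function names and the stat line are mine and purely illustrative.

```python
# Implied weights when OBP and SLG numerators are simply added together.
OPS_IMPLIED_WEIGHTS = {"BB": 1, "1B": 2, "2B": 3, "3B": 4, "HR": 5}


def ops_numerator(line):
    """Times on base plus total bases, ignoring denominators and HBP."""
    times_on_base = line["BB"] + line["1B"] + line["2B"] + line["3B"] + line["HR"]
    total_bases = line["1B"] + 2 * line["2B"] + 3 * line["3B"] + 4 * line["HR"]
    return times_on_base + total_bases


def weighted_events(line, weights):
    """Sum of each event count multiplied by its weight."""
    return sum(weights[event] * line[event] for event in weights)


# Made-up stat line, for illustration only.
line = {"BB": 70, "1B": 100, "2B": 30, "3B": 5, "HR": 25}
print(ops_numerator(line), weighted_events(line, OPS_IMPLIED_WEIGHTS))
```

Both numbers come out the same, which is the point: adding the two stats quietly decides that a home run is worth five walks.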
Enter Linear Weights
This whole thing basically comes down to how to weight certain events (outs, singles, walks, etc.) in terms of their contributions to runs scored. OBP weights a single the same as a home run. SLG neglects to weight walks, and OPS weights OBP more or less the same as SLG. And the implication, if there is to be a new stat, is that this weighting is not done correctly.
So we need to look at how teams actually score runs and how, say, an additional triple contributes to runs. As Tango lists, there are multiple ways to compute this. But at its most basic, it's a matter of run expectancy. If there are runners on first and second with two outs and the batter walks, how many runs should you expect your team to gain? Well, if you take a play by play database of past MLB seasons, you can compute the average change in runs scored after a given event in a given base/out state. If you know how often on average a given base/out state will occur, then you can compute the average value of an event independent of the base/out state.*** Here's Tango:
There are 24 base-out situations that a batter faces (8 different combinations of men on base, and the 3 outs). Each of those 24 situations has a particular Run Expectancy (RE). For example, at the start of each inning, the average team will score on average 0.56 runs. This is simple enough to figure. If the average team scores 5 runs per 9 inning game, then the average R/I is 5/9 or 0.56. To fill out the rest of the matrix, you need play-by-play data, or a simulator. For example, with the bases loaded and 0 outs, the average team will score 2.4 runs from that point on, to the end of the inning.
Every Plate Appearance (PA) has a start state and end state. That is, before the PA, the batter is facing one of the 24 base-out states, and after the PA is over, there is (possibly) a new base-out state. Since each base-out state has its own RE, the difference in RE (plus any run scored) is the run impact of that batting event that caused that change in state.
For example, with a man on 2B, and 0 outs, the RE for that situation (the start state) is 1.2 runs. If the batter hits a double, the RE for the end state is of course 1.2 runs. As well, a run scored. So, the run impact of this particular batting event is 1.0 runs. If you have a man on 1B with 1 out, the RE is 0.57 runs. A double-play brings us to the end of the inning, and an RE of 0 runs. The double-play in this case is worth -.57 runs.
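Tango's bookkeeping boils down to one subtraction and one addition. Here's a small Python sketch of it, using only the RE figures he quotes. The function name is my own.

```python
def run_value(re_start, re_end, runs_scored=0):
    """Run impact of one plate appearance: change in RE plus runs that scored."""
    return re_end - re_start + runs_scored


# Man on 2B, 0 outs (RE 1.2). A double scores him and leaves the same state.
double_value = run_value(re_start=1.2, re_end=1.2, runs_scored=1)

# Man on 1B, 1 out (RE 0.57). A double play ends the inning (RE 0).
double_play_value = run_value(re_start=0.57, re_end=0.0)

print(double_value, double_play_value)
```

Average that number over every double in a play-by-play database and you have the linear weight for a double.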
Sweet. We've got the weights. We can't be too far from...
Weighted On Base Average (wOBA)
Tom Tango, the creator of wOBA, on how he did it:
From the preceding section, we know the run values of each event. For example, we know that the run value of the HR is 1.4 runs above average, and 1.7 runs above the run value of the out. In rate measures, like OBP, the value of the out in the numerator is zero. If we recast the run values of the most common events relative to the out (rather than relative to the result of an average plate appearance), we get the following:
HR 1.70, 3B 1.37, 2B 1.08, 1B 0.77, NIBB 0.62.
Those numbers are the values of each of our events (again, relative to an out, which now has a value of zero). If we apply these weights to the statistics of a league-average hitter, and divide by plate appearances, we end up with a rate of almost 0.300. This is a fairly convenient number for an average, but we can do better. Since we like OBP as a measure of a batter’s effectiveness, let’s scale our new statistic so that the resulting values are similar to OBP values. It turns out that, if we add 15% to this 0.300 figure, we get the league-average OBP. Therefore, we will add 15% to the weights of each event and define our new statistic as follows:
(0.72xNIBB + 0.75xHBP + 0.90x1B + 0.92xRBOE + 1.24x2B + 1.56x3B + 1.95xHR) / PA
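In code, the formula is just a weighted sum divided by PA. A minimal Python sketch follows, with the weights copied from the formula above. The function name, the dictionary layout, and the season line are my own and only illustrative.

```python
# Weights taken from the wOBA formula quoted above.
WOBA_WEIGHTS = {
    "NIBB": 0.72, "HBP": 0.75, "1B": 0.90, "RBOE": 0.92,
    "2B": 1.24, "3B": 1.56, "HR": 1.95,
}


def woba(counts, pa):
    """Weighted sum of a hitter's events, per plate appearance."""
    weighted = sum(w * counts.get(event, 0) for event, w in WOBA_WEIGHTS.items())
    return weighted / pa


# Made-up season line, for illustration only.
season = {"NIBB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 5, "HR": 25}
print(f"{woba(season, pa=600):.3f}")
```

Outs never show up in the numerator because their weight is zero; they only cost the hitter through the PA in the denominator.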
Uh-oh. Scaaaary numbers. How about Alex Rimington:
The run value of each outcome is compared to the run value of an out, which is defined as zero. (We're ignoring the notion of "productive" outs, because we're only concentrating on the person at bat.) The values are then multiplied by 15 percent to scale wOBA so that average wOBA is defined as equal to league OBP. As Tom Tango says: "In other words, an average hitter is around 0.340 or so, a great hitter is 0.400 or higher, and a poor hitter would be under 0.300."
MLB average wOBA has actually been around .330 or so the past couple seasons. Tango thought it would be more user friendly to scale wOBA to OBP. This means that wOBA isn't exactly Runs/PA. But finding runs contributed isn't that hard, either:
Runs Above League Average = ((wOBA - lgAVGwOBA) / 1.15) × PA
The 1.15 divisor undoes the inflation done to put wOBA on an OBP scale. Aside from that, it's pretty straightforward. "Above" indicates that you should subtract league average. The basic form for wOBA is runs/PA, so to get runs you have to multiply by PA. If this doesn't make sense to someone who passed high school math, it's because I can't write. And that's what the other links are for, so no excuses. You are now required to know what wOBA is.
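And for the folks who'd rather read code than my prose, here's the runs conversion as a small Python sketch. The function and parameter names are mine, and the hitter is hypothetical. The .330 league figure is the rough recent average mentioned above.

```python
def runs_above_average(woba, lg_woba, pa, scale=1.15):
    """Convert wOBA back to runs: undo the OBP scaling, then multiply by PA."""
    return (woba - lg_woba) / scale * pa


# Hypothetical hitter: .400 wOBA over 600 PA in a .330 league.
raa = runs_above_average(woba=0.400, lg_woba=0.330, pa=600)
print(f"{raa:.1f} runs above average")
```

A league-average hitter comes out at zero no matter how many times he bats, which is what "above average" should mean.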
*Totally, definitely accurate. Wikipedia said so.
**Explaining this concisely in a way that makes sense to me and the reader is proving difficult. Most folks seem to skip over this as obvious, prima facie, so maybe I'm missing something stupid/apparent.
***Hence "linear" in linear weights. Each marginal event is worth the same regardless of the situation (it goes up linearly, instead of being changed by the situation). This is not exactly accurate, but it's close enough.