Tuesday, July 1, 2014

Where Advanced Stats Fail

I am a stat nerd. I look up advanced statistics and use them to figure out which players I want the Cubs to target, and more importantly, which players I want to target for fantasy baseball. I am not smart enough to do the math behind most of them, but I am a huge proponent of using them for a greater understanding of player performance. However, I do feel like advanced stats fail when it comes to describing how a player is actually performing right now.

There are two examples I want to use to illustrate my point. I was listening to the Effectively Wild podcast, and they made the statement that Brandon McCarthy has been pitching very well this year. A listener called them out, and they had to defend their position. He's been striking out a decent number of guys, and has been walking very few. His big issue is that a lot of balls in play have fallen for hits, and when guys get on base, they have been scoring at a very high rate. Those things tend to regress towards the mean, so you would expect McCarthy to improve later on this season. Honestly, I think that you can expect him to improve later on this season, but that still doesn't mean he has been pitching well this year. His job is to prevent runs, and he has not done that, as he has an ERA over 5.00. It is partially due to bad luck, but those balls are actually falling in for hits, and it is only a hypothetical that they will stop falling in for hits. Results matter, and although advanced stats are great for predicting the future, they sometimes lose sight of real things that have actually happened.

In 2011, Justin Verlander won the Cy Young award. However, I think a teammate of his may have had just as strong a case. That teammate? Jose Valverde. I'll wait for your laughter to die down, but just hear me out. Jose Valverde, by all rational thinking, had a lot of luck in 2011, but luck doesn't matter. What really matters is that he came in for 49 save chances, and he converted all 49 of them. He did not have a single blown save throughout the entire season. Now the statistically inclined will argue that he didn't strike out a batter an inning, allowed a lot of walks, and even with all of that his 2.24 ERA was good but not anything great for a closer. These are all completely true, but at the same time, a closer's role is very specific: come in and protect a lead in the final inning. He was perfect in that role. Hypothetically, he shouldn't have been, but the actual results were him being a perfect closer. It doesn't matter that everybody knew he wouldn't be as good the next season; for the 2011 season, he was the best closer in the game.

So the question becomes, what would you rather have? A workhorse starting pitcher who will give you 250 innings, with an ERA in the mid 2s and tons of strikeouts, or a guy who, when you put him out there in the 9th inning with a lead, will protect that lead? Ultimately, the latter pitcher could give up zero runs with a one-run lead, one run with a two-run lead, two runs with a three-run lead, and so on and so forth. You're probably going to have a guy with an ERA in the 5-6 range, but at executing his job, he would be perfect. Your team would win every game in which it had a lead in the final inning. I'm not smart enough to actually figure out what would be more valuable, and I'm still leaning dominant starter, but for actually providing wins for the team, it's got to be pretty close, right?
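For anyone who wants to see where an ERA like that comes from, here is a rough sketch of the arithmetic in Python. It assumes the closer pitches exactly one inning per save chance, that every run is earned, and that he always gives up exactly one run fewer than the lead he was handed. The mix of lead sizes is made up for illustration (it is not real data), and the function names are my own.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def just_enough_closer_era(lead_counts):
    """ERA of a hypothetical closer who always allows exactly (lead - 1) runs.

    lead_counts maps the size of the lead (in runs) to the number of
    one-inning save chances that started with that lead.
    """
    innings = sum(lead_counts.values())
    runs = sum((lead - 1) * n for lead, n in lead_counts.items())
    return era(runs, innings)


# Made-up mix of 49 save chances (an assumption, not real data):
# 25 one-run leads, 15 two-run leads, 9 three-run leads.
hypothetical_leads = {1: 25, 2: 15, 3: 9}
print(round(just_enough_closer_era(hypothetical_leads), 2))  # prints 6.06
```

Under these assumptions the number depends entirely on how often the leads are small. An even split between one-, two-, and three-run leads would push the ERA to 9.00, so landing in the 5-6 range requires a closer who sees mostly one-run leads.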

And that's where advanced stats fail. Don't get me wrong; I love using them, but sometimes it is beneficial to just dumb it down and look at archaic categories like ERA and Saves. Those things actually happened, and what actually happened is far more valuable than what should have happened.

Just don't bet your future on it.

