With the Statcast data now making its way out to the public, we’re going to see new metrics proposed. This is going to be a turnoff to some people and that’s okay. What you have to ask yourself is: Am I happy with my current level of knowledge and understanding of the game or am I open to the idea that there may be things out there that would push my knowledge and understanding forward?
Whatever you’re talking about – whether it be sports or music or religion or politics or whatever – there are going to be people who are stuck in a certain time period or with a certain mindset. Shoot, I still think of the Smashing Pumpkins as a new band. My music horizons have not expanded much past the early 1990s. There are people my age who still follow new music and can tell you about Paramore or Deafheaven or Deerhunter. They don’t think that anything that’s new stinks and they’re undoubtedly right. They’re open to hearing new music and they’re richer for it.
The idea that someone else would put down or dismiss people for listening to music made in this century is just sad. I don’t put anyone down for listening to it; it’s just not where I estimate my time being best spent. To me hearing, “She had hair like Jeannie Shrimpton back in 1965” still makes me happy and fills my music needs.
But this isn’t a music blog – it’s a baseball blog whose focus is on the past, present and future. We’re not stuck in 1969 or 1986 or even 2015. We like to visit those times, for sure. But we also live in the present and think about the future. And new metrics are a big part of both now and tomorrow. To ignore them would be foolish.
We use statistics to try to help us understand the game better. Some stats are used primarily to understand what has happened, while others also have utility in forecasting what might happen in the future. Those latter stats are focused on ability, rather than 100% actual performance. One example of that is FIP, which attempts to put a pitcher’s ability on the familiar scale of ERA. Unlike ERA, FIP focuses on the things that are assumed to be in the pitcher’s control – strikeouts, walks/hit by pitch and homers.
The beauty of FIP is in its simplicity. It takes a few ordinarily available inputs and is able to create something that does a better job of predicting future ERA than a pitcher’s actual ERA does. If you were going to create a stat from scratch, you would make it simple and you would make it predictive. FIP accomplishes both of these things.
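For readers who want to see how few inputs are involved, here is a minimal sketch of the calculation. It uses the commonly published form of the FIP formula (13 for homers, 3 for walks plus hit batters, minus 2 for strikeouts, divided by innings, plus a league constant). The weights and the 3.147 constant for 2016 are not spelled out in this post, so treat both as assumptions, and the function name `fip` is just for illustration.

```python
def fip(hr, bb, hbp, so, ip, constant):
    """Fielding Independent Pitching on the ERA scale.

    Uses the commonly published weights (13, 3, 2); the league constant
    is passed in because it changes from season to season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Bartolo Colon, 2016: 24 HR, 32 BB, 3 HBP, 128 SO in 191 2/3 innings.
# 3.147 is an ASSUMED 2016 league constant, not a figure from this post.
colon = fip(hr=24, bb=32, hbp=3, so=128, ip=191 + 2 / 3, constant=3.147)
print(round(colon, 2))  # 3.99
```

Everything the function needs is on the back of a baseball card, which is a big part of the stat's appeal.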
So it may be somewhat surprising that we are going to talk about making FIP more complicated in order to potentially make it better.
Not all home runs are created equal. You can have a Yoenis Cespedes upper deck shot that would be out in any park in the majors. And then you have wall scrapers that would be outs in many parks. On the flip side, you can have a ball that was absolutely crushed but it was hit to the deepest part of the park and caught up against the wall. Does it not seem at least possible that these blasts are more predictive of future performance than just counting them as the same as any other out?
Here’s a snippet from an interview with Taylor Tankersley, a guy the Mets signed to compete for a LOOGY spot prior to the 2012 season. Tankersley went to Spring Training with the club but never played for the Mets. Still, he was a likeable, intelligent guy and if you have time, you should read the whole interview. Here’s the relevant part:
Your last 8 games you got hit pretty hard (.412/.450/1.059) including 3 HR to your last 20 BF. What happened in this stretch?
Fractions of an inch. If you were to go back and look at video of those splits that you spoke of, I can remember getting Adrian Gonzalez out twice in Dolphins Stadium and combined he hit the ball about 750 feet and both got caught on the warning track in center field. I got Josh Hamilton out; he hit a screaming line drive right at the left fielder. Things like that, the ball was bouncing my way, I had good fortune. The last half of my big league outings last year the ball was falling in the gap or sinking over the fence. On paper it looked like a dramatic difference but there really wasn’t.
Shouldn’t we be interested in those blasts allowed to Gonzalez and Hamilton, even if they didn’t result in homers at the time? In the past, we couldn’t do it; we were limited by the data. But with Statcast, we can now do it.
Casey Boguslaw from RO Baseball recently had a great article where he talked about using barrels instead of home runs in calculating the FIP statistic. Barrels are a stat created by Tom Tango and Daren Willman. Here’s how Boguslaw put it:
Simply put, a barrel attempts to define the best outcome a hitter can produce. Tango and Willman set it as any combination of batted ball exit velocity and launch angle which leads to an expected batting average of at least .500 (calculated from historical data) and an expected slugging percentage of at least 1.500; an almost sure base hit. In 2016, barrels did much better than those minimums; resulting in a batting average of .822 and a 2.386 slugging percentage; an almost sure extra-base hit.
You can find information on barrels over at Willman’s excellent Baseball Savant site.
Meanwhile, Boguslaw calculated FIP with barrels instead of homers for both the top 10 and bottom 10 FIP performers. There wasn’t a ton of new information there. Perhaps where it became most interesting was when he compared the biggest changes between FIP and what he called BarrelFIP. Here’s one of his interesting examples:
J.A. Happ received Cy Young votes this season due to his back-of-the-baseball-card stats sparkling with 20 wins and a 3.18 ERA. Happ’s 22 homers were acceptable, but his 37 barrels are far from.
Happ went from a 3.96 FIP to a 4.45 BarrelFIP. And recall that his FIP was already substantially above his ERA. And for guys in the other direction, we have Anibal Sanchez, Carlos Rodon and Jon Lester, among others. Rodon goes from a 4.04 ERA and a 4.01 FIP to a 3.35 BarrelFIP.
We only have barrels for the 2015 and 2016 seasons. To the best of my knowledge, we don’t have a ton of information on the reliability of BarrelFIP. You shouldn’t throw out FIP or ERA at this point. But it’s a potentially exciting area and one worth keeping an eye on going forward.
So, how do the Mets pitchers rank with this new information? Here are the 15 guys who pitched at least 35 innings for the club in 2016:
Name | ERA | IP | HR | BB | SO | HBP | FIP | Barrels | BarrelFIP |
---|---|---|---|---|---|---|---|---|---|
Bartolo Colon | 3.43 | 191.2 | 24 | 32 | 128 | 3 | 3.99 | 29 | 3.83 |
Noah Syndergaard | 2.60 | 183.2 | 11 | 43 | 218 | 2 | 2.29 | 14 | 1.99 |
Jacob deGrom | 3.04 | 148 | 15 | 36 | 143 | 3 | 3.32 | 24 | 3.61 |
Steven Matz* | 3.40 | 132.1 | 14 | 31 | 129 | 5 | 3.39 | 15 | 2.98 |
Matt Harvey | 4.86 | 92.2 | 8 | 25 | 76 | 1 | 3.47 | 12 | 3.53 |
Logan Verrett | 5.20 | 91.2 | 16 | 43 | 66 | 4 | 5.51 | 27 | 6.59 |
Jeurys Familia | 2.55 | 77.2 | 1 | 31 | 84 | 1 | 2.39 | 4 | 2.38 |
Addison Reed | 1.97 | 77.2 | 4 | 13 | 91 | 0 | 1.97 | 8 | 2.14 |
Hansel Robles | 3.48 | 77.2 | 7 | 36 | 85 | 1 | 3.56 | 10 | 3.56 |
Seth Lugo | 2.67 | 64 | 7 | 21 | 45 | 4 | 4.33 | 18 | 6.07 |
Robert Gsellman | 2.42 | 44.2 | 1 | 15 | 42 | 1 | 2.63 | 3 | 2.71 |
Antonio Bastardo* | 4.74 | 43.2 | 8 | 21 | 46 | 3 | 5.07 | 11 | 5.49 |
Jerry Blevins* | 2.79 | 42 | 4 | 15 | 52 | 1 | 3.05 | 3 | 2.24 |
Erik Goeddel | 4.54 | 35.2 | 5 | 14 | 36 | 1 | 4.21 | 10 | 5.57 |
Jim Henderson | 4.11 | 35 | 7 | 14 | 40 | 2 | 4.83 | 8 | 4.70 |
Note that FIP has a constant added to the results to put it on the same scale as ERA. BarrelFIP does too, but it’s not the same constant as regular FIP, as there were many more barrels than HR hit last year.
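To make the two constants concrete, here is a rough sketch that swaps barrels in for homers. It assumes BarrelFIP keeps the same 13/3/2 weights as the commonly published FIP formula, and it uses a constant of 2.64 that was back-solved from several rows of the table above. Neither the weights nor that constant is stated in this post, so both are assumptions; the helper names `innings` and `barrel_fip` are hypothetical.

```python
def innings(ip_notation):
    """Convert box-score innings (191.2 means 191 and 2/3) to a true decimal."""
    whole = int(ip_notation)
    thirds = round((ip_notation - whole) * 10)
    return whole + thirds / 3


def barrel_fip(barrels, bb, hbp, so, ip, constant=2.64):
    """FIP with barrels in place of homers.

    ASSUMPTIONS: same 13/3/2 weights as regular FIP, and a constant of
    2.64 inferred from the table rather than taken from the article.
    """
    return (13 * barrels + 3 * (bb + hbp) - 2 * so) / ip + constant


# (name, IP as listed, BB, SO, HBP, barrels, BarrelFIP as listed in the table)
rows = [
    ("Colon", 191.2, 32, 128, 3, 29, 3.83),
    ("Syndergaard", 183.2, 43, 218, 2, 14, 1.99),
    ("deGrom", 148, 36, 143, 3, 24, 3.61),
    ("Matz", 132.1, 31, 129, 5, 15, 2.98),
    ("Lugo", 64, 21, 45, 4, 18, 6.07),
]

for name, ip, bb, so, hbp, barrels, listed in rows:
    estimate = barrel_fip(barrels, bb, hbp, so, innings(ip))
    print(f"{name:12s} listed {listed:.2f}  sketch {estimate:.2f}")
```

Under those assumptions the sketch lands within a couple of hundredths of the listed values for these rows. The implied constant is roughly half a run lower than the regular FIP constant, which is what you would expect when the barrel count is larger than the homer count.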
We see Syndergaard’s ERA was really good last year. His FIP was even better and his BarrelFIP was best of all. On the flip side of that is Lugo, as his ERA was very good but his FIP was not and his BarrelFIP was even worse. Blevins has some interesting results, too. He had a very good ERA but his FIP was higher. Yet his BarrelFIP was even better than his ERA. Not only that, it was better than Familia’s and almost as good as Reed’s.
It’s hard to argue that Bastardo was worth a 2/$12 deal last year and Blevins isn’t now. In 2015, Bastardo had a 2.98 ERA, a 3.33 FIP and a 3.29 BarrelFIP. Last year Blevins had 2.79/3.05/2.24 marks. But does Blevins have visions of a three or four-year deal dancing in his head after seeing some of the contracts that lefty relievers have received this offseason?
Matz also looks really good by this measure. Gsellman’s barrel numbers are pretty close to his FIP. By this measure, the difference between him and Lugo looks even bigger. Perhaps the biggest surprise to me is that Robles doesn’t measure worse here. He seems to give up a lot of hard hit balls but obviously not enough to register a ton of barrels, as he surrendered 10 barrels compared to seven homers. If not Robles, the biggest surprise is deGrom, who does not fare particularly well by this measure.
Again, let’s be clear that to the best of my knowledge this has not been tested rigorously yet. But it makes theoretical sense so let’s keep tabs on it and check in on it again at the end of the 2017 season. Maybe it’s not better than the original FIP and since it requires a stat not found on every website, it wouldn’t be worth using.
Yet maybe it will be worthwhile. And that chance is very exciting to me.
Consider me one of the “unenlightened” followers of baseball who trusts his eyes more than cybermatrics. With that being said, this stat is unnecessary unless you are an agent or someone who really doesn’t “see” baseball. Essentially this article is about who gives up the hardest hits and who doesn’t, and these numbers are redundant to a baseball man. Back when Pete Rose was managing the Reds, a reporter asked him a random question that he thought would catch Rose off guard, and that was “who hits the most fly balls in the National League,” and Rose thought for about a second and a half and said “Gary Redus.” The reporter was stunned that Rose knew that statistic off the top of his head but that’s what baseball people do. Trust your eyes! I have been evaluating minor league players, on paper and off of statistics, for over 40 years and here is what I’ve been using long before they became official stats: WHIP, K/BB ratio, K/9 innings. WHIP is simple: the fewer base runners, the fewer runs scored. K/BB ratio is simple. As a former baseball coach I knew that walks lead to runs, and if walks are over 50% of K’s the strikeouts won’t matter. But my biggest was K/9, because if a minor leaguer can’t miss bats in the minors he won’t be able to miss them in the majors, and soft line drives and warning track fly balls become screaming line drives and home runs in the majors. If those numbers track over the years, without benefit of seeing these players in person, the statistic you outlined isn’t needed. The reason minor league managers call in their reports every night is they see these players every day and the fly ball out in historic Grayson Stadium will be relayed back to the big club as a home run anywhere else because they “see” what is happening with their eyes. My apologies for the long windedness of this reply that also veered off track a bit but I am avoiding shoveling snow.
Holmer – thanks for the great insight.
If you watch every game and have total recall, you don’t need any stats whatsoever.
I don’t know anyone who falls into this category.
Rain Man will remember all of the stats.
In theory, a closer, more nuanced accounting of pitchers would yield us more “knowledge.”
The “barrel” info seems pretty weak at this point. Matt Harvey gave up only 12 last year? It seemed louder than that. Noah gave up 11 HRs but only 14 “barrels”?
I like stats, but I am very much a believer that some folks confuse “data” for “insight,” and that the signal is becoming increasingly lost in the noise. Chasing every new stat won’t necessarily make you smarter. In fact, I see the reverse as more perilously true.
As an old time fan, it never ceases to amaze me all the advances in technology which then result in newer sabermetrics. Thanks Brian for sharing. The bFIP is interesting and based on having watched many Met games it passes my personal “eye test.” Thanks Holmer ^ for sharing your insights and experience as well. Syndergaard is just so awesome. Matz was surprising. Lugo was just as expected which doesn’t bode well for his future. I am on board the Gsellman band wagon.
This new stat indeed reinforces the value of Jerry Blevins. Of 105 NL relief pitchers that pitched at least 30 innings, Blevins produced the following line: FIP 3.05 (22), K/9 11.14 (14), WHIP 1.21 (40), HR/9 0.86 (54). The number in parentheses is his rank. He has pitched well in NY. He should not be pigeon holed as a LOOGY. If it takes a two or three year contract so be it. If it is a choice of salary dumping Bruce and signing Blevins then that is what Alderson should do.
This seems like a variation of xFIP in that it tries to normalize HR rates.
It’s interesting that Blevins has 1 fewer barrel than HR… I don’t suppose we can isolate which HR that was and why it didn’t qualify?
I just did a quick comparison of whether BarrelFIP was better/worse at estimating the season ERA and the results are pretty stark
For 3 pitchers, BarrelFIP was closer
For 11 pitchers, FIP was closer
For 1 pitcher, FIP and BarrelFIP were the same.
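That tally can be reproduced straight from the table in the post. The sketch below simply checks, pitcher by pitcher, whether FIP or BarrelFIP sits closer to the 2016 ERA. The numbers are copied from the table; the names `METS_2016` and `closer_estimator` are made up for this illustration, and gaps are rounded to two decimals so that equal gaps register as ties.

```python
from collections import Counter

# (name, ERA, FIP, BarrelFIP), copied from the table in the post
METS_2016 = [
    ("Colon", 3.43, 3.99, 3.83),
    ("Syndergaard", 2.60, 2.29, 1.99),
    ("deGrom", 3.04, 3.32, 3.61),
    ("Matz", 3.40, 3.39, 2.98),
    ("Harvey", 4.86, 3.47, 3.53),
    ("Verrett", 5.20, 5.51, 6.59),
    ("Familia", 2.55, 2.39, 2.38),
    ("Reed", 1.97, 1.97, 2.14),
    ("Robles", 3.48, 3.56, 3.56),
    ("Lugo", 2.67, 4.33, 6.07),
    ("Gsellman", 2.42, 2.63, 2.71),
    ("Bastardo", 4.74, 5.07, 5.49),
    ("Blevins", 2.79, 3.05, 2.24),
    ("Goeddel", 4.54, 4.21, 5.57),
    ("Henderson", 4.11, 4.83, 4.70),
]


def closer_estimator(era, fip, barrel_fip):
    """Return which estimator was nearer to ERA, or 'tie' if equally near."""
    fip_gap = round(abs(fip - era), 2)
    barrel_gap = round(abs(barrel_fip - era), 2)
    if fip_gap < barrel_gap:
        return "FIP"
    if barrel_gap < fip_gap:
        return "BarrelFIP"
    return "tie"


tally = Counter(closer_estimator(e, f, b) for _, e, f, b in METS_2016)
print(tally)  # FIP: 11, BarrelFIP: 3, tie: 1
```

The three pitchers where BarrelFIP comes closer are Colon, Harvey and Henderson, and Robles is the tie. This only measures same-season fit on 15 pitchers, so it says nothing yet about which one forecasts better.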
But it’s just for 1 season and we know that FIP/BarrelFIP is more relevant over longer sample periods. We only have 2 years, but it would be interesting to know which one is a better estimate if we combine those 2 years together, and for more players than just the Mets as well
Also, the name is a misnomer. If you use barrels, it’s not “Fielding Independent” anymore.
xFIP looks to normalize just in proportion to the # of fly balls. This looks to normalize based on type of contact, regardless of where or how many.
Actually one of the barrels against Blevins came on a double. Judging by the counts, it looks like the homers he allowed to Oscar Hernandez and Adonis Garcia did not qualify as barrels. The one to Garcia had an exit velocity of 97.3 while the one to Hernandez had an EV of 100.6
There’s a variable scale of EV and launch angle used to determine barrels. You need to have a minimum EV of 98, which is why Garcia’s didn’t count. Hernandez’ must not have had the launch angle necessary.
Brian, reading this article reminded me of my subscription to Scientific American, and I asked myself: “Is Brian just showing off today”? This was really cool.
One thing to add to your analysis is deGrom’s bFIP. DeGrom had a tough year and pitched injured through at least half of it. While every, sorry, most pitchers throw with some discomfort, between the baby and getting hurt his first start, then healing by about June to only get hurt with a month left, he also wasn’t as bad as I would’ve expected.
On Facebook, Jon Springer also wondered about deGrom. He wanted to know what his numbers looked like here before the last 3 or 4 starts, where according to Jon, he was pitching in obvious pain.
The information that Baseball Savant has is tremendous and we’re lucky it’s in the free domain. However, it’s not necessarily lined up in an easy to access format, like a B-R page might be.
OK, I eliminated JDG’s last 3 outings and it looks like he would have a BarrelFIP of 3.38 compared to a 2.30 ERA.
Ok, better for sure. Thanks.
Went to the baseball savant site yesterday evening and put in a couple of Mets to see their numbers, but after pressing on first the statcast then the spray charts, nothing would open. Launch angle gave me nothing either. Tried a couple of things for a few minutes with no results, so I gave up. Maybe later again.
Did notice on their top 30 guys that Miguel Cabrera led the league in exit velocity, followed by Nelson Cruz. How did Cruz become such a superstar?
Looking at his FG page, it appears he went from a good fastball hitter to a great one. And just as impressively, he hasn’t become susceptible to offspeed stuff, as you’d expect if he were just selling out for the FB on every pitch.
Robles was tremendously inconsistent from one time frame to the next. Sucked for weeks, great for weeks, on and off, with repeats. For pitchers, the consistency of quality may be the most important factor.
I’m not as big a stats geek as most of the other Mets360 writers but you’ve got my attention with this one. The pitching side could use a few more impartial measures. Wins are too reliant on run support and bullpen support. Even ERA is a bit flawed. I always point to guys like Colon and Arroyo as good examples. They’ll deliver quality starts and give their team a good chance to win 4 outings out of 5, but they’ll get shellacked in that 5th outing and it will skew their ERA. That’s how you wind up with 15 game winners with ERAs of 3.8 or even 4.0 with solid run support. I happen to like quality starts as another measure. If I were looking at free agent starters I would see what the ERA looks like once you remove their worst 4 or 5 starts. One start where a guy gets shelled and knocked out in the first inning can make a big difference.
All these metrics are quite fascinating. I always pose the question: to what end? As someone that deals with numbers daily at a reasonably high level, one thing is clear: if data analysis is not directed at a clearly defined problem or hypothesis and with variables in the system accounted for, then really the crunched answers have zero value. This certainly makes FIP interesting for pitchers because variables related to other players, which lie to a large degree outside the pitcher’s control, are not part of the assessment. That certainly has appeal. But what question does FIP answer? It favors high strikeout guys over contact pitchers that may go 20 outs but only get 3 Ks. If you run a simple example of something like that where 2 pitchers yield 1 HR, 2 BB, and 0 HBP, but one has 9 Ks and the other 2 Ks with the remainder of outs on balls in play, then they get radically different FIP values (assuming a FIP constant of 2 and 7 innings pitched), namely 2.14 and 4.14. In reality the contact pitcher easily may have thrown fewer pitches. The outcome in the game is identical, however. Is the high K guy “better”? Surely the eye test would get you the same answer in this situation. Furthermore, the expression fails to account for other variables outside balls in play that really do matter: day v night? wind out/in or calm? cold v hot? All things that figure strongly into how a pitcher performs. Lastly, it is essential to keep in mind that reducing the total number of PAs to those only involving the pitcher (and catcher…is there a catcher-normalized FIP value?) unfortunately reduces the number of total inputs that are measured. Say in a game where there are 40 PAs for a team, only a third of the PAs will contribute to calculating FIP. As a result, it gives a considerably downsampled data set.
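The arithmetic in that two-pitcher example can be checked with a few lines. This is only a sketch using the commonly published FIP weights (13, 3, 2) and the commenter's constant of 2; the function name is hypothetical. The quoted 2.14 and 4.14 line up with a 7-inning outing, while a literal 20 outs (6 2/3 innings) comes out slightly higher.

```python
def fip(hr, bb, hbp, so, ip, constant):
    """FIP with the commonly published 13/3/2 weights and a caller-supplied constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Same line for both pitchers (1 HR, 2 BB, 0 HBP); only the strikeouts differ.
for label, ip in (("7 innings", 7.0), ("20 outs", 20 / 3)):
    high_k = fip(hr=1, bb=2, hbp=0, so=9, ip=ip, constant=2.0)
    low_k = fip(hr=1, bb=2, hbp=0, so=2, ip=ip, constant=2.0)
    print(f"{label}: high-K {high_k:.2f}, low-K {low_k:.2f}")
```

Either way the gap between the two pitchers is a bit over two runs, and it comes entirely from the seven extra strikeouts.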
This century there are 181 pitchers who’ve logged at least 1,000 IP. The top 3 in K/9 ratio are Randy Johnson, Chris Sale and Kerry Wood. The bottom three are Kirk Rueter, Aaron Cook and Carlos Silva. I don’t believe anyone would confuse one group with the other, and if you polled 100 people who have been watching baseball since 2000, all 100 would prefer the first group. The first group had an ERA of 3.36 while the second group had an ERA of 4.55
While you might have been able to have similar results over one game without striking out as many batters, you cannot have it over a large group of innings. The much ballyhooed eye test might have told you they were just as good over one or a small group of starts but clearly the low K guys could not sustain that pace over time.
A great thing about FIP is that it needs so few inputs to produce meaningful results. Of the top 20 Hi K guys, 18 of them had a career FIP within .25 of their career ERA. It does not fare nearly so well in the bottom 20. These guys were able to survive thanks to something not picked up by our three simple inputs. Which is why we should consider if there are ways to improve the FIP formula.
Is the ability to restrict hi quality contact – as measured by barrels – what allowed pitchers like Tom Glavine or Mark Buehrle to outperform their FIP? That seems like a reasonable assumption. On the flip side, did Tim Lincecum and Francisco Liriano underperform their FIP because they gave up too much hi quality contact, beyond just homers? Would incorporating barrels instead of homers allow you to make better decisions on guys who still had strong strikeout rates?
A few years ago, a lot of people wanted the Mets to chase Josh Johnson. His K rate was still good but he noticeably underperformed his FIP in 2012 and was not remotely as good as his FIP in 2013. Would barrels have allowed us to pick up on that after 2011 or 2012?
Not every soft tossing lefty is Glavine. Not every fireballing righty is Tom Seaver. Anything that allows us to identify which lefty is Glavine and which one is Kenny Rogers (ERA 4.51 vs FIP 4.53) sooner is a desirable thing.
Some people are wondering Lugo versus Gsellman if the Mets do indeed start the year with Wheeler in the bullpen. Both had great ERAs but FIP showed a significant difference between the two. If Gsellman had Lugo’s total of barrels, it would be a strong indication that he was lucky with his HR rate. But he didn’t. Now, it doesn’t mean that Gsellman is bound for glory. But it does mean that including a more precise measure of hard contact into our equation did not change the results.
If barrels turn out to be a meaningful device, it’s easy enough to identify and there’s no reason that B-R and FG wouldn’t include it on their standard player page. So, what’s a tad difficult to apply today could be easily accepted as standard in a few years.
There still is a statistical issue when 2/3 of the actual events cannot be accounted to the formula. The math needs to address a specific question.
If you said a high strikeout guy that doesn’t give up HR or many walks has an excellent FIP, then great, but I can see that easily otherwise. I do not discredit the notion of such a calculation, but only that it talks to those things. It has no mathematical way of capturing the totality of events on a field. The idea that a single metric of any sort can record all the complexities of a game with a high number of variables is folly. That said, dissecting out the nuggets that monitor certain things is important, and provides part of the picture. Anyone that walks into the doctor and is proclaimed healthy because their BP is 120/70 would be insane. Such is the case with any single metric.
The base reality is that it is a team sport.
But we’re not trying to capture the totality of events on a field.
We’re interested if we can devise something that will tell us what ERA will (eventually) tell us. Only we’re hoping it will tell us that sooner. Having only a few inputs is a feature not a bug.
Having only a few inputs is a feature not a bug.
Not statistically!
The barrel data is extremely weak. There’s just not a lot of meat to this. And the numbers are so low as to be meaningless. Twelve barrels for Matt Harvey in 2016? Moreover, those “game events” are already accounted for in traditional statistics: BA, SLG. It’s not like those shots were missed by previous accounting methods.
But ultimately, there’s this: I don’t see what I’ve learned here besides some minutiae. My sense of the pitchers has not changed. And all the data in the world is not going to teach me that Tom Glavine was better than Kenny Rogers. That information already exists in countless, multi-faceted ways.
I’m not against all new data. The exit speed stuff is interesting, and I think it will be instructive over time. But mostly it’s just a way to quantify the qualitative, to put a number on it. All in all, I find it rather tedious and boring. Sorry, I know you hate that, Brian.
I could not care less if you like this.
I let your first comment go without a response because you should have the right to like or find value in whatever you choose. But when you show up more than once in the same thread to contribute the baseball equivalent of “today’s music sucks!” I’m not going to let that pass.
The value is not in determining, after their careers are over, that Tom Glavine was better than Kenny Rogers. If that’s your takeaway here then at best I think you’re being deliberately obtuse.
You can check the archives here and see how many people were advocating for the Mets to sign Josh Johnson because his K rate was still strong.
Edit – This got published before I was close to being through. Here’s the rest.
If barrels had existed in 2011, would that have changed how those people felt about Johnson? Obviously we can’t answer that question. What we do know is that he wasn’t good in either 2012 or 2013 and there’s a chance that this would have led people to that conclusion following the 2011 season. And to me that’s worthwhile.
And there are other potential uses for this. I’m a big Robert Gsellman fan. But we know he had some fortune last year that’s likely unsustainable, one being his HR rate allowed. But we can see that he didn’t just luck out with well struck balls that stayed in the park. His rate of barrels was right in line with what FIP showed looking only at homers. Compare and contrast that with Lugo. If you don’t find that interesting, sorry but that’s your loss.
You come in with such an extreme distaste for anything that’s a new metric. Everyone should approach things with a healthy skepticism. But you blow past that to outright hostility. And ultimately that says more about you than the metric.
At no point did I present this as groundbreaking or revolutionary or “forget everything you thought you knew about baseball!” Instead, I said that this was something that showed promise that we should investigate once more results came in.
But you couldn’t give it that. You made an immediate snap judgment that because it was new that it sucked and it wasn’t worthwhile. You were judge, jury and executioner and you acted before all the evidence was in.
And that’s my definition of anti-intellectualism.
At the risk of getting my head chewed off, please allow me to intervene. It’s nice that we all get to discuss our feelings here and we seem to get along. Hurt feelings are blown off, after all, this is cyberspace. But, Brian has to write an article or two during the off-season every week, and don’t you guys want to read something different other than “Let’s trade for McCutchen” or “We need an upgrade at catcher”?
Even if you think this stat is rubbish, it’s different! Bet you if there was a segment on it on MLB Now, you’d watch the whole thing. I’m not a saber guy, and while I don’t understand how they come up with all these stats, I can at least compare them for some kind of conclusion. Wish it was like figuring out slugging percentage or earned run average, but nothing is so simple anymore.
All of these Stats provide information that I don’t see with my own eyes..after all, I see most players just a handful of times (when they’re playing against “My Team”).
The fact that Jimmy and Brian may have an argument about stats is as old as Baseball itself…. You can go to Baseball Reference and argue about Frank Robinson and Roberto Clemente all day Long….or you can argue about Eaton and Cutch.
It’s good stuff.