Friday, July 11, 2008

Team Stat Crunching

Okay, I've built a database for the 2007-08 season. Right now, I think my primary focus is going to be on whole teams and goalies, rather than breaking everything down for each skater. I'm working on a big database for the skaters, but it's going to be a much longer, slower process, and I want to get some information out there.

So, thirty teams. That's the easy part. Using the NHL's Stat Machine (not to be confused with the NHL Highlight Machine), I broke out the following statistics:

Wins, Losses, OTLs, Points, Goals For, Goals Against, 5 on 5 ratio, Power Play Scoring %, Penalty Killing %, Shots for, Shots against, Minor Penalties, Major Penalties, Total Penalties in Minutes, # of Power Plays, # of times Short Handed, Power Play Goals, Short Handed Goals, Short Handed Goals Against, Goal Differential, Total Cap Hit, Cap Hit for Forwards/Defensemen/Goalies, Face Off Percentage, and Playoff Wins.

So, there are two goals here: what lines up with Points (regular season success) and what lines up with Playoff Performance (playoff wins).

At this point, we've got a hot, steaming mess of a spreadsheet to look at. Let's see what comes out of the pile:


All right, it may be the New NHL, but defense still ruled the day in 2007-08. In almost every category, a team's defensive performance was more important than its offense - that is, it correlated more strongly with both points and playoff wins. Goals For had a correlation coefficient of only 0.37 with points and 0.40 with playoff wins, while Goals Against came in at -0.78 and -0.54 respectively.

Math aside: What this means - Correlation is a number that shows how closely linked two different sets of numbers are - how consistently they go up together, go down together, or move in opposite directions. A perfect correlation is 1.0: Your age in years and your age in months have a correlation of 1.0. It means the two numbers are giving the same information, basically. At the same time, a correlation of -1.0 is also very meaningful. It means that as you accumulate more of one thing, you lose something else at the exact same rate. The correlation between the number of Pringles in the can and the number you've eaten is -1.0. Anything in between means the link is weaker, and the closer you get to 0.0 the weaker it is. At 0.0 there is no straight-line connection at all between the two sets you're looking at.
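For anyone who wants to see the arithmetic, here is a minimal sketch of the standard (Pearson) correlation coefficient in Python, run on the two examples above. The sample numbers and the 100-chip can size are made up purely for illustration, and I'm assuming the usual Pearson formula is the one a spreadsheet's correlation function uses.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two lists move together around their averages.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each list moves on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Age in years vs. age in months: same information, so a perfect +1.0.
age_years = [1, 2, 3, 4, 5]
age_months = [12 * y for y in age_years]
r_age = pearson(age_years, age_months)

# Pringles eaten vs. Pringles left in the can: a perfect -1.0.
CAN_SIZE = 100  # hypothetical can size, for illustration only
eaten = [0, 10, 25, 60, 100]
left = [CAN_SIZE - e for e in eaten]
r_pringles = pearson(eaten, left)

print(r_age, r_pringles)
```

The sign tells you the direction of the link and the size tells you how tight it is, which is all the numbers in this post are reporting.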

So with that established, what does it mean? Well, -0.78 and -0.54 are both strong correlations for goals against - negative, here, because it means that teams that give up fewer goals win more games. It doesn't hurt that the three best defensive teams this year were Detroit, Anaheim, and San Jose, in that order. You have the Cup team/Presidents' Trophy winner leading the way, followed by teams #2 and #4 in points. Tampa Bay is at the bottom, so that rush of offense they've added this year may not help them so much.
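The comparison itself (which stat lines up best with points) can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual spreadsheet: the five teams and all of their numbers are invented toy data, and the names `toy_teams` and `rank_by_correlation` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_by_correlation(rows, target):
    """Correlate every other column with `target`, strongest link first."""
    target_values = [row[target] for row in rows]
    results = []
    for column in rows[0]:
        if column in (target, "team"):
            continue
        values = [row[column] for row in rows]
        results.append((column, pearson(values, target_values)))
    # Sort by absolute value: -0.78 is a stronger link than +0.37.
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)

# Toy data, NOT real NHL numbers.
toy_teams = [
    {"team": "A", "points": 110, "goals_for": 250, "goals_against": 180},
    {"team": "B", "points": 100, "goals_for": 230, "goals_against": 200},
    {"team": "C", "points": 90, "goals_for": 260, "goals_against": 230},
    {"team": "D", "points": 80, "goals_for": 220, "goals_against": 240},
    {"team": "E", "points": 70, "goals_for": 240, "goals_against": 270},
]

ranked = rank_by_correlation(toy_teams, "points")
for column, r in ranked:
    print(f"{column}: {r:+.2f}")
```

Sorting on the absolute value is the design choice that matters here: a strongly negative number like the one for goals against should rank ahead of a weakly positive one.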

Detroit had a good offense too, though: #3 in goals scored. But they didn't have much company up there. Ottawa, Montreal, Buffalo, and Carolina round out the top five, with five playoff wins between them. Anaheim outscored only Columbus and the Islanders, yet ran up over 100 points, and San Jose got to 108 points with only pedestrian scoring.

As we expand the sample, we'll see if this was a one-year quirk, but there is strong evidence in other categories that defense wins across the board.
