Saturday, July 14, 2007

Hurricane Forecast

Miami's projections are ready.

I'm going to take a few days off from generating the projections. Half of them are up, and with the season still almost 4 months away I don't feel any real rush to get the rest out there. Instead I'm going to spend some time on a new project.

In developing the similarity scores used to generate the player projections, I've been struck by how closely a player's size correlates with the statistics he generates. Height and weight are not used at all in generating the similarity scores, yet the top comps are almost always dead ringers, from a body-type perspective, for the base player.

I've been using player size as a proxy for position in PAPER, but that leads to a few different kinds of problems. First, sometimes there are guys--Ishmael Smith and Will Bowers, to cite one from each extreme off the top of my head--who are so far outside the normal range of players that the regressions that tell me what the "league average" player of their size should be doing just don't have enough good data to work with. They wind up with funky results (a player T.J. Bannister's size in 2005, for example, should have blocked -0.006 shots per defensive possession according to the model) that I've either got to just roll with or jury-rig out somehow. In the grand scheme of things, it's not a huge deal, and it barely affects the bottom-line number, if at all, but I've never cared for it.
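To show how a negative block rate can fall out of a size-based regression, here is a minimal sketch. It is not the actual PAPER model: the straight-line fit, the function names, and every number in the sample data are made up for illustration. The point is only that a line fitted to players in the normal height range keeps sloping downward when it is asked about someone far shorter than anyone in the sample.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data, NOT real player numbers:
# height in inches vs. blocks per defensive possession.
heights_in = [75, 77, 79, 81, 83]
blocks_per_poss = [0.004, 0.008, 0.014, 0.022, 0.032]

slope, intercept = fit_line(heights_in, blocks_per_poss)


def expected_block_rate(height_in):
    """Raw regression output; can go negative outside the sampled range."""
    return slope * height_in + intercept


def floored_block_rate(height_in):
    """One hypothetical patch: never predict fewer than zero blocks."""
    return max(0.0, expected_block_rate(height_in))


# A 5'9" player sits well below every height in the sample.
print(expected_block_rate(69))
print(floored_block_rate(69))
```

Flooring at zero is just one possible patch, and it hides the symptom without fixing the thin data at the extremes.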

The second problem is that it treats players who are big or small for their position differently than they probably should be. There's an adjustment using BMI that refines the raw height number into something a little bit closer to the truth, essentially adding an inch or so to the height of the stouter players while shaving one off of the scrawnier ones, but PAPER still treats Mamadi Diane and Greivis Vasquez exactly the same, even though they have vastly different responsibilities on the court.
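As a rough sketch of what a BMI-based height adjustment could look like (the baseline BMI, the inches-per-BMI-point scale, and the one-inch cap below are all assumed values chosen for illustration, not the numbers PAPER uses):

```python
def bmi(height_in, weight_lb):
    """Body mass index from inches and pounds (703 is the standard conversion factor)."""
    return 703.0 * weight_lb / height_in ** 2


def effective_height(height_in, weight_lb,
                     baseline_bmi=24.5,    # assumed "typical" player BMI
                     inches_per_bmi=0.5,   # assumed scale factor
                     cap=1.0):             # assumed limit: an inch or so either way
    """Nudge listed height up for stouter players and down for scrawnier ones."""
    shift = inches_per_bmi * (bmi(height_in, weight_lb) - baseline_bmi)
    shift = max(-cap, min(cap, shift))
    return height_in + shift


# Two hypothetical 6'6" players with very different builds.
print(effective_height(78, 250))  # stout: treated as taller than listed
print(effective_height(78, 190))  # lean: treated as shorter than listed
```

Any adjustment of this shape still only sees height and weight, so two players with the same build get the same effective height no matter what they are actually asked to do on the court.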

Getting to the point, I think there may be a way to make PAPER better by scrapping the size adjustments and using positional adjustments instead. I'm already making what I think is a pretty safe assumption: players of similar size generally tend to play the same position. But if I can break that down, to find some markers in the numbers that say, "This guy is a shooting guard," or, "That guy is a defensive specialist," it will only make the system better.

I don't know if this will go anywhere or not, but I'm going to think on it for a few days and see what happens. In the meantime, if there's a player on a team that hasn't been posted yet, and you're just dying to see what the Magic Spreadsheet says he's going to do next year, drop me an email. Running the individual projections takes like 5 seconds, and I do them out of curiosity all the time; it's formatting everything into a nice little package that takes a couple of hours for each team.
