Tuesday, January 22, 2008

Tale of the TAPE

One of the things I was hoping to do with the site this season was to produce PAPER numbers for the entire season, not just the conference portion of the schedule. The problem is that schedules vary so much that it's hard to make an apples-to-apples comparison. I thought about using items that were already in the ACC database--specifically, treating each opponent from a given conference as the same team--but rejected that idea pretty quickly. There's just too much variation in quality within conferences for that to work.

With no easy solution available, I decided to go big. I've entered all the linescore data from every game between Division I teams into the database, and I've used that to adjust every team's offensive and defensive rates across all the statistical categories that relate to team scoring. That data, which represents what each team would have done over the course of the season if all games were played against a Division I-average opponent on a neutral floor, can be found here.

With all that done, it's just a matter of plugging numbers into the same probabilistic model used for PAPER to arrive at an estimate of the number of points each team in Division I would score and allow against neutral competition. The Pythagorean expectation derived from those numbers (an exponent of 9.2 maximizes the r-squared in this model) is the Team Adjusted Probabilistic Effectiveness, or, in keeping with the office supplies theme, TAPE. The current Top 25 (through the games of 1/21/08) are:

The full list of TAPE ratings is available here. Other information on the spreadsheet includes each team's full raw winning percentage and Pythagorean expectation (exponent 8.5), as well as its adjusted record, which converts each opponent into a hypothetical average team.
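
For anyone who wants to play with the exponents, here is a minimal Python sketch of a Pythagorean expectation. It assumes the standard form, points scored raised to the exponent divided by the sum of points scored and points allowed each raised to the exponent; the post doesn't spell the formula out, so treat that as an assumption. The function name and the 75/68 scoring line are made up for illustration and don't come from the spreadsheet.

```python
def pythagorean_expectation(points_for, points_against, exponent):
    """Expected winning percentage from points scored and allowed.

    Assumes the standard Pythagorean form:
        PF**x / (PF**x + PA**x)
    """
    if points_for <= 0 or points_against <= 0:
        raise ValueError("point totals must be positive")
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


# Hypothetical team: 75 points scored and 68 allowed per game (made-up numbers).
tape_style = pythagorean_expectation(75.0, 68.0, 9.2)  # exponent used for TAPE
raw_style = pythagorean_expectation(75.0, 68.0, 8.5)   # exponent used for the raw numbers

print(f"exponent 9.2: {tape_style:.3f}")
print(f"exponent 8.5: {raw_style:.3f}")
```

A larger exponent pushes the expectation further from .500 for the same scoring margin, so the 9.2 figure for a team that outscores its opponents will always come out a bit higher than the 8.5 figure.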

Coming up next: using the component pieces of TAPE to predict the future.

