The Matchup Zone Blog

Tuesday, April 12, 2016

New & Improved Projection Model

The projections for the 2017 season are now live. My goal is to try as best as I can to keep up with all the transfer and early-entry decisions, and run an update at least 2 or 3 times a week through the end of the academic year, after which everything will calm down until the start of practice in September. 

For now, I've made the decision to remove any player from the rosters who has declared himself eligible for the NBA draft, whether or not he's hired an agent. As the inevitable returning-to-school announcements are made, I'll add them back to the rosters, and try to flag any big movers on Twitter.

This year's projections should be better than ever.

For the past three seasons, I've used a similar-player model to find comparable players for every returning player, and then used those players' collective year-over-year changes to make predictions for how the returnees would improve or decline. Those individual projections were then run through a program which combined them with all the other players on each team to generate a team projection.

This method worked pretty well, especially for the teams in power conferences and other top-100 type teams, but it had a nasty habit of overestimating everyone else's prospects. And while it did an okay job of predicting conference wins, the fact that I was missing so badly on the mid- and low-major teams' predicted TAPE ratings, and that other systems were consistently doing better at predicting conference records, was enough to send me back to the drawing board.

The result is a system which is similar to the old method in that it uses individual projections to build team predictions, but differs in a couple of key areas. The big difference is that the comparable-player model has been scrapped. My hypothesis when building this model was that it would be a good one for identifying potential breakout candidates. If Player X looks a lot like these other players, the thinking went, and a lot of them broke out, then this guy should, too.

Three years later, it's clear that this just didn't work. With short seasons, small sample sizes, and the unevenness of development inherent in 19- to 22-year-old basketball players, there was just too much noise, and it wound up being a model that said, in essence, everybody's probably going to get a little bit better. Problem is, it did a pretty lousy job of predicting even the extent of that improvement, especially among the bottom two-thirds of Division I players.

In its place now is a simpler regression model in which the year-zero performance of a similar cohort of players--"high-major sophomore big men who played starter minutes," for example--forms the basis of every returnee's projection. But whereas the old system merely had a sanity-check at the end which would, for example, nudge up Shaka Smart's players' steal rate if it wasn't sufficiently high based on historical norms, the new model uses team and coach development history as a much bigger factor at the front end of the process. Likewise, incoming players have a projection built on how, say, other true freshman consensus top-30 shooting guards coming into high-major programs have performed in the past, with a further refinement based on the numbers that others coming into that same program have posted.
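For the curious, here's a rough sketch of the cohort idea, not the actual model. It swaps the regression for a plain cohort average, and every name in it (the cohort keys, `cohort_baselines`, `project`, the `program_adjustment` value) is made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def cohort_baselines(history):
    """Average a stat within each cohort.

    `history` is an iterable of (cohort_key, stat_value) pairs, where the
    cohort key is a hypothetical tuple such as
    ("high-major", "sophomore", "big", "starter").
    """
    groups = defaultdict(list)
    for cohort, value in history:
        groups[cohort].append(value)
    return {cohort: mean(values) for cohort, values in groups.items()}

def project(cohort, baselines, program_adjustment=0.0):
    """Start from the cohort baseline, then shift it by a program or coach
    adjustment (an assumed additive term, applied up front)."""
    return baselines[cohort] + program_adjustment

# Made-up historical offensive ratings for two cohorts.
history = [
    (("high-major", "sophomore", "big", "starter"), 110.0),
    (("high-major", "sophomore", "big", "starter"), 106.0),
    (("mid-major", "junior", "guard", "reserve"), 98.0),
]
baselines = cohort_baselines(history)
key = ("high-major", "sophomore", "big", "starter")
print(project(key, baselines, program_adjustment=1.5))
```

The point is only that the program's history enters at the start of the calculation rather than as a sanity check at the end.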

The end result is a system that's far more influenced by a program's own history than the previous one was, while still, I think, accurately reflecting roster quality. Most importantly, it should just do a better job of predicting how teams will play in the upcoming season. The average error for predicted 2015-2016 conference wins was 2.20 under the old system. Re-projecting the season under the new method (using only information that was in the database as of October 2015, of course) yielded an average error of 2.04 conference victories.
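If you want to run the same kind of check on your own predictions, here's a minimal sketch. It assumes "average error" means mean absolute error across teams, which is my reading rather than a stated definition, and the win totals below are invented.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| across teams."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    if not predicted:
        raise ValueError("need at least one team")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(predicted)

# Made-up conference win totals for four teams.
predicted_wins = [12.4, 9.1, 6.0, 14.2]
actual_wins = [11, 10, 8, 13]
print(round(mean_absolute_error(predicted_wins, actual_wins), 3))
```

Running both the old and new projections through the same function against the same actual results is what makes a 2.20 versus 2.04 comparison apples-to-apples.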

Monday, March 14, 2016

2016 NCAA Advancement Odds

The 2016 NCAA Tournament Advancement Odds are up and running. Kansas is the favorite, with a 1-in-6 chance to cut down the nets. The long shot of all long shots is Fairleigh Dickinson, who would win the championship in 1 out of every 5.2 billion parallel universes.

Tuesday, January 19, 2016

(Re-)Introducing STAPLE

No college basketball-focused website is complete without a periodic prediction of the NCAA Tournament bracket. And this site is no exception. For the past several years I've done bracket watch posts whenever my schedule has allowed me to.

Unfortunately, as much as I love compiling those posts--and I really do; looking at who's trending up and down on a weekly basis is a great way to stay engaged with the totality of what's going on in college basketball--with a wife, two kids, and a day job, I'm just not able to write them up with the regularity I'd like. So I did what any good nerd would do: I plowed the time I would've spent working on those posts over the past few months into writing a program which would do the work for me.

The result is the new-and-improved STAPLE (Simple Tournament Algorithm for Predicting the Likelihood of Election--because I loves me some backronyms). Now, with each run of the TAPE ratings, there will be an updated predicted NCAA Tournament bracket and sortable leaderboard of who's likely to be in or out. 

Unlike most other brackets out there, this one is not based on an if-the-season-ended-today model. It takes the long view of the season, incorporating future schedules and projected future results. As such, it's a lot less volatile than most bracket predictions tend to be. It reacts to results; it just doesn't over-react to them.

At the top of the STAPLE page is the most recently updated bracket prediction. It's assembled programmatically and follows all bracketing principles set forth by the Selection Committee. Below that is a sortable table which includes all teams with non-zero at-large chances.

As the name implies, STAPLE is pretty simple. Its inputs are few: RPI, TAPE, auto-bid, and scaled bonuses and penalties for good wins and bad losses, respectively. These inputs feed a model which attempts to match the always-moving target of the Selection Committee's at-large selection and seeding criteria, and they're weighted into a single number--listed in the table below the bracket simply as "Points."
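To give a feel for what "weighted into a single number" can look like, here's a toy sketch. It is not the STAPLE formula: the weights, the 0-to-1 scaling of RPI and TAPE, the flat auto-bid bonus, and the names `TeamResume` and `staple_points` are all assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class TeamResume:
    name: str
    rpi_score: float         # assumed pre-scaled to 0-1, higher is better
    tape_score: float        # assumed pre-scaled to 0-1, higher is better
    auto_bid: bool           # currently projected to win its league's bid
    good_win_bonus: float    # assumed already scaled to points
    bad_loss_penalty: float  # assumed already scaled to points

# Hypothetical weights; the real ones aren't published here.
WEIGHTS = {"rpi": 40.0, "tape": 40.0, "auto_bid": 10.0}

def staple_points(team):
    """Combine the inputs into one number using the assumed weights."""
    points = WEIGHTS["rpi"] * team.rpi_score
    points += WEIGHTS["tape"] * team.tape_score
    points += team.good_win_bonus - team.bad_loss_penalty
    if team.auto_bid:
        points += WEIGHTS["auto_bid"]
    return points

# Made-up teams, sorted the way the leaderboard would be.
teams = [
    TeamResume("Team A", 0.75, 0.5, False, 6.0, 2.0),
    TeamResume("Team B", 0.5, 0.25, True, 1.0, 4.0),
]
for team in sorted(teams, key=staple_points, reverse=True):
    print(team.name, staple_points(team))
```

Whatever the real weights are, the shape is the same: a handful of inputs, one sortable number per team.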

The second column unique to the STAPLE listings, labeled "STAPLE," is each team's chances, as of the most current system run, of qualifying for an at-large bid.

As always, if I'm missing anything, or you see a glaring hole in my program's interpretation of the bracketing principles, let me know. 

Tuesday, November 10, 2015

The Calendar Is Turned

I've turned on all of the lights for the 2015-16 season, and everything should default to this season without incident when navigating the site now. 

The final projection run has been completed, and the 2016 Projections are now locked. I'm still working on adding some new features to display--some visualizations to go on each team's projection page are in the works. One feature that's already working, and that I think is pretty cool, is the lineup projections.

In the process of running the player and team projections, I've also run every combination of five-on-the-floor for each team. They're included under each team's projections page, and can be accessed by clicking "Lineups" in the sub-navigation bar. (Villanova's can be found here, for example. You can see there why Daniel Ochefu is probably one of the most important players in the country.) Each lineup has its own TAPE rating, as well as adjusted offensive and defensive efficiencies and normalized four factors. They're all sortable, too, so if you're curious about which of your team's lineups is going to force the most turnovers or be best on the boards, there you go.
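Running every five-man combination is less work than it sounds. Here's a minimal sketch of the enumeration step. The player names and ratings are invented, and averaging the individual ratings is just a stand-in for scoring a lineup; the real lineup TAPE ratings come out of the projection model, not a simple mean.

```python
from itertools import combinations

# Hypothetical per-player ratings; names and numbers are made up.
player_ratings = {
    "Guard A": 4.1, "Guard B": 2.7, "Wing C": 3.3, "Wing D": 1.9,
    "Forward E": 2.2, "Center F": 5.0, "Center G": 0.8,
}

def rank_lineups(ratings, size=5):
    """Score every `size`-player combination and sort best-first.

    The score is a placeholder: the mean of the players' ratings.
    """
    scored = []
    for lineup in combinations(sorted(ratings), size):
        score = sum(ratings[player] for player in lineup) / size
        scored.append((score, lineup))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

lineups = rank_lineups(player_ratings)
print(len(lineups))   # 7 players taken 5 at a time gives 21 lineups
print(lineups[0][1])  # the best lineup under the placeholder score
```

A 13-man roster produces 1,287 lineups, so even across all of Division I this is a small job for a computer.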

Once we get into the season and the PAPER ratings are available (usually sometime in mid-December), I'm planning on using the same model to build on-the-floor ratings that will use the actual season data as well. 

Monday, April 13, 2015

2016 Projections Are Live

The 2016 Season Projections page is now live. Unless a player has formally announced that he will make himself eligible for the NBA draft, he is still included in his team's 2016 projection; Duke's projection, for example, does not include Jahlil Okafor but includes Justise Winslow and Tyus Jones. As players are added to and removed from next year's expected rosters, I'll re-run the projections and try to keep them as up-to-date as possible. With several high-profile players (both incoming and outgoing) expected to make their eligibility decisions known over the next couple weeks, there should be some shifting at the top before the roster news slows to a trickle around the end of the academic year.

These projections use historical player comparisons to generate a statistical projection for each returning and incoming player for every team in Division I. Those player projections are then combined into a team projection based on the strengths of that team's players and how their profiles tend to interact with each other.

If there's a player missing from a team's roster, or someone there who doesn't belong, please let me know (by email, Twitter, or the contact form) so that I can make the necessary changes. If the player is there but you think his numbers look out of whack, feel free to inquire about that as well. The answer is probably as simple as, "The computer thinks your favorite player isn't as good at basketball as you think he is," but it never hurts to ask.

Sunday, March 15, 2015

Final Bracket Projection

The outcome of the Wisconsin/Michigan State game will not affect this bracket. The Duke pod in the Midwest region would be in Charlotte, not Portland.

Locks (51):

Kentucky, Villanova, Wisconsin, Arizona, Duke, Kansas, Iowa St., Virginia, Gonzaga, Notre Dame, North Carolina, Baylor, Utah, Oklahoma, SMU, Michigan St., Arkansas, Northern Iowa, Louisville, Maryland, Wichita St., Georgetown, Providence, West Virginia, Xavier, VCU, Butler, Ohio St., Iowa, Stephen F. Austin, Buffalo, Georgia State, Valparaiso, Harvard, Wofford, Wyoming, UC Irvine, Northeastern, New Mexico State, Eastern Washington, North Dakota State, Albany, Belmont, Coastal Carolina, Lafayette, Texas Southern, North Florida, Manhattan, Robert Morris, Hampton

Near-Locks (9):

N.C. State (99%), Davidson (99%), Georgia (99%), Oregon (98%), San Diego St. (98%), Texas (98%), Cincinnati (97%), St. John's (95%), BYU (93%)

Probably In (3):

Dayton (82%)
Indiana (82%)


On The Bubble (9 teams for 5 spots):

Boise St. (78%)
Temple (72%)
Purdue (59%)
UCLA (57%)
Stanford (43%)
LSU (28%)
Tulsa (26%)


Long Shots:

Florida (8%)
Richmond (8%)

Monday, March 9, 2015

Bubble Babble: Tourney Time


Usually, the bubble picture is crystallizing by this point, but this year it seems that the bubble is actually expanding as we enter Championship Week. There's lots of movement at the top, where the top two seed lines--and who gets the dreaded #2 seed in the Midwest opposite Kentucky--will be decided by who wins their conference Tournaments. Right now, after Virginia's loss at Louisville over the weekend, Villanova, Arizona, and Wisconsin would seem to have the inside track on #1 seeds ahead of Duke and Virginia.

Locks (29):


Near-Locks (10):

San Diego St. (99%), Oregon (99%), Texas (99%), Cincinnati (98%), Georgia (98%), St. John's (98%), N.C. State (98%), Boise St. (96%), Oklahoma St. (95%), Colorado St. (94%)

Good Shape (2):

BYU (91%) - Avoid a loss to Portland this evening, and the Cougars will be dancing.

Temple (81%) - The Owls are probably in even with a loss to Memphis in the AmCon quarters, but I'm sure they'd rather not test that hypothesis.


On The Bubble (11 teams for 5 spots):

Dayton (79%) - The Flyers' win at VCU last week filled in a big hole on their resume. So long as they're able to avoid an embarrassing loss to St. Bonaventure or St. Joe's, they'll be dancing.

LSU (70%) - After taking a home loss to Tennessee, things looked bleak for the Tigers; a road win at Arkansas brightened their prospects considerably. 

Mississippi (63%) - A matchup with South Carolina could be a tough test for the Rebels in the SEC Tournament. Pass and they'll earn a trip to Dayton.

Old Dominion (61%) - Teams with top-35 RPIs are usually dead-certain locks for inclusion, but the Monarchs, whose best wins are a neutral-site victory over LSU and a home win over VCU--both in November--might be an exception to that rule.

Indiana (50%) / Purdue (50%) - The cut line is right here as of today, with room for only one of these teams. With a sweep of the season series, Purdue would likely have the edge if it came down to these two for the final bid.

Stanford (47%) - Losing 7 of 10 to close out the season is a fine way to move oneself from near-lock status to the wrong side of the bubble. Without a run to the championship game in Las Vegas, the Cardinal will be NIT-bound.

Tulsa (32%) - Ten days ago it appeared that winning one of their final three games would be enough to push Tulsa into the dance; an overtime win at Memphis and two losses later, though, that doesn't seem to be the case. They absolutely cannot afford to lose their opening game in the AmCon Tournament against Tulane or Houston, and they likely need to show well against Cincinnati in the conference semis as well.

Stephen F. Austin (30%) - The Lumberjacks get a bye all the way to the semifinals of the Southland tournament, and thus only need to win two games against vastly inferior competition to earn the automatic bid. A loss in either of those games would essentially disqualify them from consideration. It's auto-bid or NIT.

UCLA (27%) - The Bruins kept the bubble in sight by winning their final three games of the season, but home victories over Washington, Washington State, and Southern Cal impress absolutely no one. The real work begins in Vegas, where the Bruins must win their quarterfinal matchup against Arizona State in order to set up a must-win game against Arizona in the semifinals.

Buffalo (23%) - There are probably 80 teams better than Buffalo this season, but the Bulls have gamed the RPI beautifully and will, unless they flame out early in the MAC tournament, have an RPI ranking in the top 35. Their best win is a loss at Kentucky; their second-best win was a loss at Wisconsin; their third-best win was a win in the season opener against South Dakota State. The Bulls won't be dancing without winning the MAC championship, but their high RPI ranking ought to be held up as Example A in the case against the NCAA's preferred metric.


Work To Do:


Texas A&M (14%) - The Aggies likely need to reach the SEC finals to earn an at-large bid. That would require a victory over Kentucky.

Miami (FL) (13%) - The Hurricanes' victory at Pittsburgh essentially meant that the two ACC schools traded places in the bubble conversation, but there 

Richmond (10%) - This year's "Atlantic 10 team who gets hot at the end of the season to insert themselves into the Bubble discussion" is Richmond. Winners of their final six games of the regular season, the Spiders could earn a trip to Dayton with a trip to the A-10 championship game.

Saturday, February 28, 2015

Bubble Watch: The Second Season Nears A Close


I haven't been able to do these (or much of anything with the site, for that matter) as much as I would like this year because, well, life. But we're 15 days away from Selection Sunday, and now's as good a time as any for a better-late-than-never look at where things stand.

Locks (29):



Near-Locks (7):

Oklahoma St. (99%), San Diego St. (99%), N.C. State (99%), Georgia (98%), Davidson (97%), Stanford (96%), St. John's (93%)

Stanford is this high because the laptops--which essentially act as a proxy for the "eye test" in the STAPLE formula--love the Cardinal. Real eyes, probably not so much. Johnny Dawkins's team is probably in the field, just because they've got to scrounge up 68 teams from somewhere, but they're overvalued here.


Good Shape (3):

Colorado St. (88%) - Avoid a loss in Reno tonight, don't wet the bed against Utah State or flame out too early in the MWC Tournament, and the Rams are fine.

Texas (88%) - The Longhorns' recent freefall would be much more problematic for their chances of making the field if the rest of the teams below them on this list weren't doing their damnedest to make UT's landing as soft as possible.

Tulsa (87%) - The Golden Hurricane finish up the regular season with games at Memphis and at SMU sandwiching a home game against Cincinnati. That's about as tough of a three-game stretch as the AmCon can offer this year, and Frank Haith's team can't afford to strike out; win one of three, though, and they'll punch their ticket.


On The Bubble (10 teams for 8 slots):

BYU (78%) - The Cougars' position is not really as strong as their numbers here would indicate. Win or lose, they'll get an RPI bounce from playing Gonzaga tonight and they'll be right around the #50 mark at the conclusion of the WCC regular season. But then they'll get a team with a losing record in the WCC quarters and it'll drop back 10 or so spots again.

Cincinnati (73%) - While Tulsa was able to navigate the AmCon's minefield of lousy second division teams without taking a loss, the Bearcats' slips at East Carolina and against Tulane in Cincinnati could come back to haunt them. They can't afford another loss to Tulane today.

Iowa (70%) - The Hawkeyes need two more wins to punch a ticket to Dayton for the second time in as many years. A third win would put them in the Tourney field proper.

Oregon (63%) - By winning 9 of their last 11 games, the Ducks are in position to need just one more win to hear their name called on Selection Sunday.

Boise St. (59%) - TAPE thinks the Broncos are the best team in the Mountain West, and they're in the bracket above as the league's auto-bid. They can take the pressure off themselves heading into Vegas by winning at San Diego State tonight. The two games after that--at San Jose State and a home game against Fresno State--are must-wins.

Mississippi (59%) - Kentucky's dominance of the SEC has obscured the fact that there are some pretty good teams in that league from 2 through 11. If Ole Miss can win 2 of their final 3 regular season games and show well in Nashville, they'll be deserving of a bid.

Dayton (50%) - A top-40 RPI is a nice thing to have. That and the Flyers' Elite Eight run of a year ago are what likely have them higher on most boards than they are here, where they're the last team in the field. A strength of schedule outside the top-100 and no quality wins are big obstacles to overcome.

Temple (49%) - The Owls destroyed Kansas before the Jayhawks found themselves, and they avoided taking a bad loss in the AmCon. But they're also just 1-5 against the other nationally-relevant teams in their own conference.

LSU (48%) / Texas A&M (46%) - If either of these teams had been able to close the deal against Kentucky they'd be breathing easy. Instead, they both need to keep winning and have a lot of things go right.


Work To Do:

Purdue (19%) - The Boilermakers dug such a deep hole for themselves with losses to Gardner-Webb and North Florida that it's taken an 8-1 stretch in Big Ten play just to get themselves back within shouting distance of the bubble. They'll need to keep it up and win at least three more games to earn a bid.

Pittsburgh (15%) - The Panthers need to win their next four games.

UCLA (12%) - The Bruins' RPI will drop several spots even with wins in their final two regular season games. They'll need a deep run in Vegas--probably to the Pac-12 Championship game--in order to earn an at-large bid.


Long Shots:

Stephen F. Austin (7%) / Buffalo (4%) - It's hard to see either of these teams earning an invitation even with losses in their respective conference championship games. But the Lumberjacks will have a gaudy record, and the Bulls will have an RPI ranked in the top-40, so if the Committee decides to give the last spot to a true mid-major, they'll be able to justify either of these selections without having to do too many gymnastics.