A couple years back, a college-basketball enthusiast took me aside and—in a sign of our eternal friendship—confided his secret for filling out his annual NCAA “March Madness” bracket: he let his seven-year-old daughter pick the teams. She based her selections on the colors of the various teams’ uniforms. And those teams tended to actually win.
The point is, when it comes to predicting which college basketball team will triumph in a field of 68 rivals, one method is supposedly as good as any other. You can sacrifice a chicken and dance around a basketball court dressed in its feathers, screaming to the spirits to give you a sign, or you can devise a complex formula based on all sorts of historical data—neither is a surefire thing, given the tournament’s unexpected upsets and come-from-behind triumphs.
But can analytics software do better? After all, we’re entrusting more and more elements of our lives—from healthcare to overnight shipping—to Big Data. What can an algorithm meant for massive datasets tell us about a couple hundred kids hurling balls at hoops?
The Dance Card
Three business professors are using SAS analytics software to refine their annual “Dance Card,” which tries to predict which NCAA Men’s Basketball teams will receive “at-large” bids—that is, be selected to play in the actual NCAA Tournament. The model takes several different factors into consideration:
- RPI (Rating Percentage Index), which ranks teams based on wins and losses, as well as “strength of schedule.”
- Wins against top 25 teams.
- Wins against teams ranked 26-50.
- “Neutral” court wins.
- Record and rank in-conference.
- Strength of conference.
- Sagarin rankings from USA Today.
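The article doesn’t reveal the professors’ actual model form or coefficients, but a selection model built on factors like these is commonly structured as a logistic regression: a weighted sum of the factors passed through a logistic link to yield a bid probability. Here is a minimal sketch under that assumption—every weight below is hypothetical, purely to illustrate the shape of such a model:

```python
import math

# Hypothetical weights -- the Dance Card's real coefficients are not
# published in the article. Negative weights mean a lower (better) rank
# raises the score; positive weights mean more wins raise it.
WEIGHTS = {
    "rpi_rank": -0.05,
    "wins_vs_top25": 0.40,
    "wins_vs_26_50": 0.25,
    "neutral_court_wins": 0.20,
    "conference_strength": 0.30,
    "sagarin_rank": -0.03,
}
INTERCEPT = 1.0  # also hypothetical


def at_large_probability(team_stats: dict) -> float:
    """Estimate a team's chance of an at-large bid by passing a
    weighted sum of its factors through a logistic function."""
    score = INTERCEPT + sum(
        WEIGHTS[k] * team_stats.get(k, 0.0) for k in WEIGHTS
    )
    return 1.0 / (1.0 + math.exp(-score))
```

With a model in this shape, “Minnesota: 73.34 percent” is simply the logistic output for that team’s factor values, and the ranked Dance Card is the field sorted by that probability.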
According to a March 14 post on the official SAS blog, the professors (Jay Coleman of the University of North Florida in Jacksonville; Allen Lynch of Mercer University in Macon, Georgia; and Mike DuMond of Charles River Associates and Florida State University in Tallahassee) also use the “Dance Card” as a teaching tool for students: “the same analytics used to predict NCAA Tournament teams are also used by businesses and governments” to predict all sorts of things.
The professors’ Dance Card also de-emphasizes certain factors held dear by generations of sports reporters, including a team’s record against teams ranked 50-100 and a team’s record in the last 10 games. “The Dance Card finds that a strong finish is not important to the [NCAA] Selection Committee vs. a team’s overall performance,” the blog added. “Of course, those ‘hot teams’ are often doing some of the other things—winning on neutral courts, and beating teams in the top 25 or top 50—that do help boost their chances according to the Dance Card.”
Over at the University of North Florida’s website, the Dance Card is on display in all its glory; you can compare it to the teams actually picked. Its creators note that “the ‘unbiased’ Dance Card has now correctly predicted 73 of 74 bids over the last two seasons, or 98.6 percent.”
This year’s Dance Card placed Kansas first with a 100 percent chance of being picked, followed in order by Louisville, Indiana, Miami and Duke—no surprise, given those teams’ histories. Things were a little chancier further down the list; for example, Minnesota and Boise St. had a 73.34 percent and a 67.28 percent chance, respectively, of being selected for the tournament, while La Salle had a 65.61 percent chance. The full list is too long to reproduce here in its entirety, but it’s worth checking out, if only to see how accurate the analytics proved this year.
Meanwhile, noted statistician Nate Silver—who talked quite a bit at this year’s South by Southwest show about his methodologies, not to mention the strangeness of fame—has deployed his predictive skills to this year’s NCAA tournament.
Silver’s model remains unchanged from previous years: he develops a “power rating” for each team, which incorporates two human ratings (tournament seeds and the Associated Press pre-season poll) and four computer ratings. The power rating is further adjusted based on three factors: injuries, the team’s performance in tournament games “so far,” and each game’s geographic location. As Silver explained during SXSW, “away” teams have to compensate for the wear and tear of travel, which can affect their performance.
“By comparing two teams’ power ratings, we can estimate the likelihood that any team in the 68-team field will beat any other in any given game,” Silver wrote back in 2011. “This allows us to play out the rest of the tournament and estimate the probability that any team reaches any subsequent round, or wins the national championship.”
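Silver doesn’t publish his exact conversion from a rating gap to a win probability, but the two-step process he describes—compare power ratings to get per-game win odds, then “play out the rest of the tournament” to get championship odds—can be sketched as a Monte Carlo simulation. The logistic curve and its scale parameter below are assumptions for illustration, not his actual formula:

```python
import math
import random


def win_probability(rating_a: float, rating_b: float,
                    scale: float = 10.0) -> float:
    """Chance team A beats team B, assuming win odds follow a logistic
    curve on the power-rating gap (scale is a guessed parameter)."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))


def simulate_bracket(ratings: list, trials: int = 10000,
                     seed: int = 1) -> list:
    """Play out a single-elimination field (size must be a power of
    two) many times; return each team's championship frequency."""
    rng = random.Random(seed)
    titles = [0] * len(ratings)
    for _ in range(trials):
        field = list(range(len(ratings)))
        while len(field) > 1:
            winners = []
            for i in range(0, len(field), 2):
                a, b = field[i], field[i + 1]
                p = win_probability(ratings[a], ratings[b])
                winners.append(a if rng.random() < p else b)
            field = winners
        titles[field[0]] += 1
    return [t / trials for t in titles]
```

Run over the full 68-team field (with play-in games handled first), the fraction of simulated tournaments each team wins is exactly the kind of championship probability Silver reports—22.7 percent for Louisville, and so on.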
Based on those power ratings, Silver has created a series of predictions for this year’s tournament. He feels that Louisville has a 22.7 percent chance of becoming the NCAA champion, while Duke—a favorite of many—has a 5.7 percent chance. Florida and Kansas have a 12.7 percent chance and 7.5 percent chance, respectively, of seizing the big prize. Indiana, another popular pick, is given a 19.6 percent chance of emerging triumphant, while Gonzaga has a 6.1 percent chance and Ohio State has a 5.8 percent chance. Silver breaks things down still further by Sweet 16, Elite 8, Final 4, and so on; while the charts are too large to reproduce here, they’re well worth perusing on his New York Times-hosted FiveThirtyEight blog.
Or you could always sacrifice a fowl and choose teams based on your favorite uniform colors—those methods work for some people, too.