Project author: danvk

Project description: NCAA brackets in JSON form
Primary language: HTML
Project URL: git://github.com/danvk/march-madness-data.git
Created: 2018-03-17T20:14:22Z
Project community: https://github.com/danvk/march-madness-data



March Madness Data

This repo contains JSON files for all the NCAA brackets from 1985–2017.

Results

Sums of Seeds

After UMBC became the first #16 seed to beat a #1 seed, I was curious what the highest sum of seeds in a single game was. This was harder to find out than I expected, so I grabbed some data from Wikipedia and found the answer. It’s 25!

  1. (1989) 25: Minnesota 11 vs 14 Siena
  2. (1991) 25: Eastern Michigan 12 vs 13 Penn State
  3. (1991) 25: Temple 10 vs 15 Richmond
  4. (1991) 25: Connecticut 11 vs 14 Xavier
  5. (1992) 25: New Mexico State 12 vs 13 Southwest Louisiana
  6. (1993) 25: George Washington 12 vs 13 Southern
  7. (1997) 25: Texas 10 vs 15 Coppin State
  8. (1998) 25: Washington 11 vs 14 Richmond
  9. (1998) 25: Florida State 12 vs 13 Valparaiso
  10. (2001) 25: Georgetown 10 vs 15 Hampton
  11. (2001) 25: Gonzaga 12 vs 13 Indiana State
  12. (2008) 25: Villanova 12 vs 13 Siena
  13. (2008) 25: WKU 12 vs 13 San Diego
  14. (2009) 25: Arizona 12 vs 13 Cleveland State
  15. (2011) 25: Richmond 12 vs 13 Morehead State
  16. (2012) 25: Xavier 10 vs 15 Lehigh
  17. (2012) 25: South Florida 12 vs 13 Ohio
  18. (2013) 25: Mississippi 12 vs 13 La Salle
  19. (2014) 25: Tennessee 11 vs 14 Mercer
  20. (2015) 25: UCLA 11 vs 14 UAB
  21. (2016) 25: Syracuse 10 vs 15 Middle Tennessee
  22. (2018) 25: UMBC 16 vs 9 Kansas State
  23. (1997) 24: Chattanooga 14 vs 10 Providence
  24. (1993) 22: Temple 7 vs 15 Santa Clara
  25. (2012) 22: Florida 7 vs 15 Norfolk State
  26. (2013) 22: Wichita State 9 vs 13 La Salle
  27. (2013) 22: San Diego State 7 vs 15 Florida Gulf Coast
  28. (1986) 21: Cleveland State 14 vs 7 Navy
  29. (1998) 21: Rhode Island 8 vs 13 Valparaiso
  30. (2011) 21: VCU 11 vs 10 Florida State
  31. (2014) 21: Dayton 11 vs 10 Stanford

All the 25s are in the Round of 32. This happens whenever the lower seed wins both of two
adjacent first-round games, so that the two underdogs meet each other. A sum higher than 25
is only possible in the Sweet 16 or later, and this has yet to happen. The closest was
14 Chattanooga vs. 10 Providence in 1997.
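
To see where the limit of 25 comes from, here is a minimal Python sketch that computes the largest seed sum that is possible in each regional round. It assumes the standard seed layout within a region (1/16, 8/9, 5/12, 4/13, 6/11, 3/14, 7/10, 2/15, in bracket order). `FIRST_ROUND` and `max_seed_sums` are hypothetical names for illustration and are not part of this repo.

```python
# Assumed standard first-round pairings within one region, in bracket order.
FIRST_ROUND = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]


def max_seed_sums(first_round):
    """Return the largest possible seed sum for each round of one region."""
    # Each bracket slot holds the set of seeds that could still occupy it.
    slots = [{seed} for game in first_round for seed in game]
    sums = []
    while len(slots) > 1:
        # Neighbouring slots play each other in the current round.
        games = [(slots[i], slots[i + 1]) for i in range(0, len(slots), 2)]
        # The worst case is the highest-numbered seed from each side advancing.
        sums.append(max(max(a) + max(b) for a, b in games))
        # Whoever wins, the merged slot can contain any seed from either side.
        slots = [a | b for a, b in games]
    return sums


# Round of 64, Round of 32, Sweet 16, Elite Eight
print(max_seed_sums(FIRST_ROUND))  # [17, 25, 29, 31]
```

Under that assumption, every first-round game sums to 17, the Round of 32 tops out at 25, and only the Sweet 16 (29) and Elite Eight (31) can go higher.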

Sweet 16

  1. (1997) 24: Chattanooga 14 vs 10 Providence
  2. (2013) 22: Wichita State 9 vs 13 La Salle
  3. (1986) 21: Cleveland State 14 vs 7 Navy
  4. (1998) 21: Rhode Island 8 vs 13 Valparaiso
  5. (2011) 21: VCU 11 vs 10 Florida State
  6. (2014) 21: Dayton 11 vs 10 Stanford
  7. (2016) 21: Gonzaga 11 vs 10 Syracuse
  8. (2002) 20: UCLA 8 vs 12 Missouri
  9. (1990) 18: Loyola Marymount 11 vs 7 Alabama
  10. (2001) 18: Temple 11 vs 7 Penn State

Elite Eight

  1. (2000) 15: North Carolina 8 vs 7 Tulsa
  2. (2002) 15: Indiana 5 vs 10 Kent State
  3. (1990) 14: Arkansas 4 vs 10 Texas
  4. (1997) 14: Arizona 4 vs 10 Providence
  5. (2000) 14: Wisconsin 8 vs 6 Purdue
  6. (2002) 14: Missouri 12 vs 2 Oklahoma
  7. (1986) 12: Kentucky 1 vs 11 LSU
  8. (1990) 12: UNLV 1 vs 11 Loyola Marymount
  9. (1994) 12: Boston College 9 vs 3 Florida
  10. (2001) 12: Michigan State 1 vs 11 Temple

Final Four

  1. (2011) 19: VCU 11 vs 8 Butler
  2. (2006) 14: Florida 3 vs 11 George Mason
  3. (1986) 13: LSU 11 vs 2 Louisville
  4. (2000) 13: Florida 5 vs 8 North Carolina
  5. (2016) 11: North Carolina 1 vs 10 Syracuse
  6. (1985) 10: Villanova 8 vs 2 Memphis State
  7. (1992) 10: Michigan# 6 vs 4 Cincinnati
  8. (2010) 10: Michigan State 5 vs 5 Butler
  9. (2013) 10: Louisville 1 vs 9 Wichita State
  10. (2014) 10: Wisconsin 2 vs 8 Kentucky

Finals

  1. (2014) 15: Connecticut 7 vs 8 Kentucky
  2. (2011) 11: Connecticut 3 vs 8 Butler
  3. (1985) 9: Georgetown 1 vs 8 Villanova
  4. (1988) 7: Kansas 6 vs 1 Oklahoma
  5. (1992) 7: Duke 1 vs 6 Michigan#
  6. (1989) 6: Seton Hall 3 vs 3 Michigan
  7. (2000) 6: Florida 5 vs 1 Michigan State
  8. (2002) 6: Maryland 1 vs 5 Indiana
  9. (2010) 6: Butler 5 vs 1 Duke
  10. (1991) 5: Kansas 3 vs 2 Duke

Craziest Final Four

What was the craziest Final Four (i.e. the one with the highest sum of seeds)? It was 2011, with a sum of 26.
The least crazy was 2008’s Final Four, the only one with four #1 seeds.

  1. 26 2011 Kentucky (4) Connecticut (3) VCU (11) Butler ( 8)
  2. 22 2000 Florida (5) North Carolina (8) Michigan State ( 1) Wisconsin ( 8)
  3. 20 2006 LSU (4) UCLA (2) Florida ( 3) George Mason (11)
  4. 18 2014 Florida (1) Connecticut (7) Wisconsin ( 2) Kentucky ( 8)
  5. 18 2013 Louisville (1) Wichita State (9) Michigan ( 4) Syracuse ( 4)
  6. 16 2018 Loyola Chicago (11) Michigan ( 3) Villanova ( 1) Kansas ( 1)
  7. 15 2016 Villanova (2) Oklahoma (2) North Carolina ( 1) Syracuse (10)
  8. 15 1986 Duke (1) Kansas (1) LSU (11) Louisville ( 2)
  9. 13 2010 Michigan State (5) Butler (5) West Virginia ( 2) Duke ( 1)
  10. 13 1992 Duke (1) Indiana (2) Michigan# ( 6) Cincinnati ( 4)
  11. 12 2017 South Carolina (7) Gonzaga (1) Oregon ( 3) North Carolina ( 1)
  12. 12 1990 Duke (3) Arkansas (4) Georgia Tech ( 4) UNLV ( 1)
  13. 12 1985 Georgetown (1) St John's (1) Villanova ( 8) Memphis State ( 2)
  14. 11 2005 Illinois (1) Louisville (4) North Carolina ( 1) Michigan State ( 5)
  15. 11 1996 Massachusetts (1) Kentucky (1) Miss. State ( 5) Syracuse ( 4)
  16. 10 2015 Kentucky (1) Wisconsin (1) Michigan State ( 7) Duke ( 1)
  17. 10 1988 Duke (2) Kansas (6) Oklahoma ( 1) Arizona ( 1)
  18. 10 1987 Syracuse (2) Providence (6) Indiana ( 1) UNLV ( 1)
  19. 9 2012 Kentucky (1) Louisville (4) Ohio State ( 2) Kansas ( 2)
  20. 9 2003 Syracuse (3) Texas (1) Marquette ( 3) Kansas ( 2)
  21. 9 2002 Maryland (1) Kansas (1) Indiana ( 5) Oklahoma ( 2)
  22. 9 1998 North Carolina (1) Utah (3) Kentucky ( 2) Stanford ( 3)
  23. 9 1995 Oklahoma State (4) UCLA (1) North Carolina ( 2) Arkansas ( 2)
  24. 9 1989 Duke (2) Seton Hall (3) Michigan ( 3) Illinois ( 1)
  25. 8 2004 Oklahoma State (2) Georgia Tech (3) Duke ( 1) Connecticut ( 2)
  26. 8 1994 Florida (3) Duke (2) Arkansas ( 1) Arizona ( 2)
  27. 7 2009 Michigan St. (2) Connecticut (1) Villanova ( 3) North Carolina ( 1)
  28. 7 2001 Duke (1) Maryland (3) Michigan State ( 1) Arizona ( 2)
  29. 7 1999 Duke (1) Michigan State (1) Ohio State ( 4) Connecticut ( 1)
  30. 7 1997 North Carolina (1) Arizona (4) Minnesota* ( 1) Kentucky ( 1)
  31. 7 1991 North Carolina (1) Kansas (3) Duke ( 2) UNLV ( 1)
  32. 6 2007 Florida (1) UCLA (2) Georgetown ( 2) Ohio State ( 1)
  33. 5 1993 North Carolina (1) Kansas (2) Kentucky ( 1) Michigan * ( 1)
  34. 4 2008 North Carolina (1) Kansas (1) Memphis ( 1) UCLA ( 1)

Using the data

The data comes from Wikipedia articles. It’s all in data/YYYY.json, one file per year. For example (abridged):

    {
      "year": 1997,
      "regions": [
        [
          [
            [
              {
                "round_of": 64, "seed": 1,
                "team": "North Carolina", "score": 82
              },
              {
                "round_of": 64, "seed": 16,
                "team": "Fairfield", "score": 74
              }
            ],
            ...
          ],
          ...
        ],
        ...
      ],
      "finalfour": [
        [
          [
            {
              "round_of": 4, "seed": 1,
              "team": "North Carolina", "score": 58
            },
            {
              "round_of": 4, "seed": 4,
              "team": "Arizona", "score": 66
            }
          ],
          [
            {
              "round_of": 4, "seed": 1,
              "team": "Minnesota*", "score": 69
            },
            {
              "round_of": 4, "seed": 1,
              "team": "Kentucky", "score": 78
            }
          ]
        ],
        [
          [
            {
              "round_of": 2, "seed": 4,
              "team": "Arizona", "score": 84
            },
            {
              "round_of": 2, "seed": 1,
              "team": "Kentucky", "score": 79
            }
          ]
        ]
      ]
    }

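  • regions is an array of four regions.
  • Each region is an array of four rounds.
  • Each round is an array of games.
  • Each game is an array of two teams.
  • Each team is an object with round_of, seed, team and score keys.
  • finalfour is an array of two rounds (the semifinals, then the championship game) in the same format.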
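
As a rough illustration of walking this structure, here is a minimal Python sketch. It does not use the helpers in utils.py (their contents are not shown here). `iter_games`, `seed_sum` and `final_four_seed_sum` are hypothetical names, and `SAMPLE` is a cut-down copy of the 1997 example above with a single regional game rather than a full bracket.

```python
import json

# Abridged bracket in the data/YYYY.json layout (values from the 1997 example).
SAMPLE = json.loads("""
{
  "year": 1997,
  "regions": [
    [
      [
        [
          {"round_of": 64, "seed": 1, "team": "North Carolina", "score": 82},
          {"round_of": 64, "seed": 16, "team": "Fairfield", "score": 74}
        ]
      ]
    ]
  ],
  "finalfour": [
    [
      [
        {"round_of": 4, "seed": 1, "team": "North Carolina", "score": 58},
        {"round_of": 4, "seed": 4, "team": "Arizona", "score": 66}
      ],
      [
        {"round_of": 4, "seed": 1, "team": "Minnesota*", "score": 69},
        {"round_of": 4, "seed": 1, "team": "Kentucky", "score": 78}
      ]
    ],
    [
      [
        {"round_of": 2, "seed": 4, "team": "Arizona", "score": 84},
        {"round_of": 2, "seed": 1, "team": "Kentucky", "score": 79}
      ]
    ]
  ]
}
""")


def iter_games(bracket):
    """Yield every game (a two-team list) from the regions and the final four."""
    for region in bracket["regions"]:
        for games in region:  # one list of games per round
            yield from games
    for games in bracket["finalfour"]:
        yield from games


def seed_sum(game):
    """Sum of the two seeds in a game."""
    return sum(team["seed"] for team in game)


def final_four_seed_sum(bracket):
    """Sum of the seeds of the four teams in the national semifinals."""
    semifinals = bracket["finalfour"][0]
    return sum(team["seed"] for game in semifinals for team in game)


best = max(iter_games(SAMPLE), key=seed_sum)
print(seed_sum(best), " vs ".join(team["team"] for team in best))
# 17 North Carolina vs Fairfield
print(final_four_seed_sum(SAMPLE))
# 7
```

The final four sum of 7 agrees with the 1997 row in the Craziest Final Four list. To run the same functions on real data, you would replace `SAMPLE` with the result of `json.load` on one of the data/YYYY.json files.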

If you’re working in Python, you can find some helper functions in utils.py and some
example code in find_highest_seeds.py and craziest_final_four.py:

    $ ./craziest_final_four.py data/*.json
    26 2011 Kentucky ( 4) Connecticut ( 3) VCU (11) Butler ( 8)
    22 2000 Florida ( 5) North Carolina ( 8) Michigan State ( 1) Wisconsin ( 8)
    20 2006 LSU ( 4) UCLA ( 2) Florida ( 3) George Mason (11)
    18 2014 Florida ( 1) Connecticut ( 7) Wisconsin ( 2) Kentucky ( 8)
    18 2013 Louisville ( 1) Wichita State ( 9) Michigan ( 4) Syracuse ( 4)
    ...

Updating the data

To regenerate (or update) the data, you’ll need Python 3.6 or later.
Set up your virtual environment and run:

    pip install -r requirements.txt
    ./extract_wiki_source.py pages/*.html
    ./extract_bracket.py pages/*.wiki
    mv pages/*.json data/

To add a new year, use curl to put a new HTML file in pages/YYYY.html. You can
use the URLs in urls.txt as a template.