Resources
This is a semi-structured collection of sports analytics, statistics, data science and programming resources that I’ve found useful and educational.
Sports analytics topics and problems
Compilations
- Presenting Multiagent Challenges in Team Sports Analytics - https://arxiv.org/pdf/2303.13660.pdf
- Methods of performance analysis in team invasion sports: A systematic review - https://www.tandfonline.com/doi/full/10.1080/02640414.2020.1785185
- Devin Pleuler: Soccer Analytics Handbook - https://github.com/devinpleuler/analytics-handbook
- “A curated list of awesome machine learning applications in the sports domain” - https://github.com/AtomScott/awesome-sports-analytics
- SFU seminars: http://www.sfu.ca/sportsanalytics/Seminars.html
- Soccer Analytics 2020 Review - https://janvanhaaren.be/2020/12/30/soccer-analytics-review-2020.html
- Soccer Analytics 2021 Review - https://janvanhaaren.be/2021/12/30/soccer-analytics-review-2021.html
- The collection, analysis and exploitation of footballer attributes: A systematic review - https://content.iospress.com/download/journal-of-sports-analytics/jsa200554?id=journal-of-sports-analytics%2Fjsa200554
- F1 - https://twitter.com/F1DataAnalysis
- Basketball analytics - https://squared2020.com/
- UFC analytics - https://literalfightnerd.com/
Communication
- How to watch basketball: https://cleaningtheglass.com/how-to-watch-basketball/
Reinforcement learning/AI in sport
- How A Bot Made Team New Zealand Faster and Smarter - https://www.sailingworld.com/story/racing/how-a-bot-made-team-new-zealand-faster-and-smarter/
- Discovering Diverse Athletic Jumping Strategies - https://arpspoof.github.io/project/jump/jump.html
- Game Plan: What AI can do for Football, and What Football can do for AI - https://arxiv.org/pdf/2011.09192.pdf
- Advancing sports analytics through AI research - https://deepmind.com/blog/article/advancing-sports-analytics-through-ai
- A Reinforcement Learning Based Approach to Play Calling in Football - https://drive.google.com/file/d/1j0kBqbRUL3HTdEDWYVLVYc6B21MT56G_/view
- TOWARDS OPTIMIZED ACTIONS IN CRITICAL SITUATIONS OF SOCCER GAMES WITH DEEP REINFORCEMENT LEARNING - https://arxiv.org/pdf/2109.06625v1.pdf
- Markov Cricket: Using Forward and Inverse Reinforcement Learning to Model, Predict And Optimize Batting Performance in One-Day International Cricket - https://arxiv.org/ftp/arxiv/papers/2103/2103.04349.pdf
- Learning to play Table Tennis using Multi-agent Reinforcement Learning - https://sowmyavoona96.github.io/csci527/TP%20(2).pdf
- Q-Ball: Modeling Basketball Games Using Deep Reinforcement Learning - https://www.aaai.org/AAAI22Papers/AAAI-8152.YanaiC.pdf
- Deep reinforcement learning in a racket sport for player evaluation with technical and tactical contexts - https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9775086
- How to run a world record? A Reinforcement Learning approach
- TacticAI: an AI assistant for football tactics - https://arxiv.org/pdf/2310.10553.pdf
Strategy proposal and simulation
- Insights from the Application of an Agent-Based Computer Simulation as a Coaching Tool for Top-Level Rugby Union - https://journals.sagepub.com/doi/10.1260/1747-9541.8.3.493
Working in sports analytics
Playing ‘style’ and player ‘similarity’ (teams and players)
- Player Vectors: Characterizing Soccer Players’ Playing Style from Match Event Streams - https://ecmlpkdd2019.org/downloads/paper/701.pdf
- A pilot study to measure game style within Australian football - https://www.tandfonline.com/doi/pdf/10.1080/24748668.2017.1372163
- MEASURING THE SIMILARITY BETWEEN PLAYERS IN AUSTRALIAN FOOTBALL - https://www.researchgate.net/profile/Karl-Jackson-6/publication/305388519_MEASURING_THE_SIMILARITY_BETWEEN_PLAYERS_IN_AUSTRALIAN_FOOTBALL/links/578c16f308ae59aa667c4c91/MEASURING-THE-SIMILARITY-BETWEEN-PLAYERS-IN-AUSTRALIAN-FOOTBALL.pdf
- 6MapNet: Representing Soccer Players from Tracking Data by a Triplet Network - https://arxiv.org/pdf/2109.04720v1.pdf
- Pass2vec: Analyzing soccer players’ passing style using deep learning
- Archetypoid analysis for sports analytics - https://dl.acm.org/doi/abs/10.1007/s10618-017-0514-1
- A scalable framework for NBA player and team comparisons using player tracking data - https://content.iospress.com/articles/journal-of-sports-analytics/jsa0022
- Tired: PCA + kmeans, Wired: UMAP + GMM - https://tonyelhabr.rbind.io/posts/dimensionality-reduction-and-clustering/
- Coach2vec: autoencoding the playing style of soccer coaches - https://arxiv.org/ftp/arxiv/papers/2106/2106.15444.pdf
- The origins of goals in the German Bundesliga - https://www.tandfonline.com/doi/full/10.1080/02640414.2021.1943981
- Classifying ball trajectories in invasion sports using dynamic time warping: A basketball case study - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0272848
Forecasting player performance
- PECOTA - https://en.wikipedia.org/wiki/PECOTA
- CARMELO - https://fivethirtyeight.com/features/how-were-predicting-nba-player-career/
- DARKO - https://apanalytics.shinyapps.io/DARKO//
- Predicting the Potential of Professional Soccer Players - http://ceur-ws.org/Vol-1971/paper-02.pdf
- Predicting the future performance of soccer players - https://onlinelibrary.wiley.com/doi/full/10.1002/sam.11321
- Can Elite Australian Football Player’s Game Performance Be Predicted? - https://sciendo.com/de/article/10.2478/ijcss-2021-0004
- Estimation of player aging curves using regression and imputation - https://link.springer.com/article/10.1007/s10479-022-05127-y and https://arxiv.org/pdf/2110.14017.pdf
- Large data and Bayesian modeling—aging curves of NBA players - https://link.springer.com/article/10.3758/s13428-018-1183-8
- Forecasting basketball players’ performance using sparse functional data - https://onlinelibrary.wiley.com/doi/pdf/10.1002/sam.11436?casa_token=OwxCz_66t3MAAAAA:yqa3qwEjzcj_O0H6MphXXyydwyvZr8S3KmJQqGqJrVBT2VaG35Un_8jvdwHZn-SewKpdC_cJhqiMuQ
- Bayesian Hierarchical Modeling Applied to Fantasy Football Projections for Increased Insight and Confidence - https://srome.github.io/Bayesian-Hierarchical-Modeling-Applied-to-Fantasy-Football-Projections-for-Increased-Insight-and-Confidence/
- Bayesian prediction of winning times for elite swimming events - https://www.tandfonline.com/doi/full/10.1080/02640414.2021.1976485
- Bayesian modelling of elite sporting performance with large databases - https://www.degruyter.com/document/doi/10.1515/jqas-2021-0112/html
- Estimating the effects of age on NHL player performance - https://www.degruyter.com/document/doi/10.1515/jqas-2013-0085/html
- Modelling the dynamics of change in the technical skills of young basketball players: The INEX study - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0257767#abstract0
- Modelling age-related changes in executive functions of soccer players - https://arxiv.org/pdf/2105.01226.pdf
Drafting
- Draft efficiency - https://statsbylopez.com/2017/04/25/evaluating-the-evaluators/
- What Does It Mean to Draft Perfectly in the NHL?
- Major League Draft WARs: An Analysis of Wins Above Replacement in Player Selection - https://content.iospress.com/download/journal-of-sports-analytics/jsa200586?id=journal-of-sports-analytics%2Fjsa200586
- Combine performance, draft position and playing position are poor predictors of player career outcomes in the Australian Football League - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0234400
- Optionality in Australian Football League draftee contracts - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0291439
Recruiter perspectives
- Exploring the skill of recruiting in the Australian Football League - https://journals.sagepub.com/doi/full/10.1177/1747954118809775
- An eye for talent: The recruiters’ role in the Australian Football talent pathway - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0241307
Randomness and skill vs. luck in sport
- HOW OFTEN DOES THE BEST TEAM WIN? A UNIFIED APPROACH TO UNDERSTANDING RANDOMNESS IN NORTH AMERICAN SPORT - https://arxiv.org/pdf/1701.05976.pdf
- IDENTIFICATION AND MEASUREMENT OF LUCK IN SPORT - https://www.researchgate.net/publication/305388606_IDENTIFICATION_AND_MEASUREMENT_OF_LUCK_IN_SPORT
- Meta-analytics: tools for understanding the statistical properties of sports metrics - https://www.degruyter.com/view/journals/jqas/12/4/article-p151.xml
- When can we trust a team’s stats? - https://fansided.com/2017/12/21/nylon-calculus-team-stats-noise-stabilization-thunder/
- How Long Does It Take For Three Point Shooting To Stabilize? - https://fansided.com/2014/08/29/long-take-three-point-shooting-stabilize/
- Baseball Therapy: It’s a Small Sample Size After All - https://www.baseballprospectus.com/news/article/17659/baseball-therapy-its-a-small-sample-size-after-all/
- Are Launch Angles Skills? - Analyzing Baseball Data with R, Second Edition
In-game win probability
NFL:
- https://medium.com/@technocat79/building-a-basic-in-game-win-probability-model-for-the-nfl-54600e57fe1c
- https://statsbylopez.com/2017/03/08/all-win-probability-models-are-wrong-some-are-useful/
- Going deep: models for continuous-time within-play valuation of game outcomes in American football with tracking data - https://www.degruyter.com/document/doi/10.1515/jqas-2019-0056/html
- iWinRNFL: A Simple, Interpretable & Well-Calibrated In-Game Win Probability Model for NFL - https://arxiv.org/pdf/1704.00197.pdf
AFL:
- https://thearcfooty.com/within-game-win-probabilities/
- https://thearcfooty.com/2017/02/07/win-probability-estimates-what-are-they-good-for/
- AFLalytics - Quantifying what makes a good game of footy - https://www.aflalytics.com/blog/2018/7/quantifying-good-match-footy/
- Real time prediction of match outcomes in Australian football - https://www.tandfonline.com/doi/full/10.1080/02640414.2023.2259266
Soccer:
- A Bayesian Approach to In-Game Win Probability in Soccer - https://dl.acm.org/doi/10.1145/3447548.3467194
Basketball:
- A Data Snapshot Approach for Making Real-Time Predictions in Basketball
Match prediction & Team rating models
- Forecasting football matches by predicting match statistics - https://content.iospress.com/download/journal-of-sports-analytics/jsa200462?id=journal-of-sports-analytics%2Fjsa200462
- Soccer. Predict shots on/off target, corners, and goals for each team. Combine those forecasts to predict match result.
- A Critical Comparison of Machine Learning Classifiers to Predict Match Outcomes in the NFL - https://sciendo.com/de/article/10.2478/ijcss-2020-0009
- Multifactorial analysis of factors influencing elite Australian football match outcomes: a machine learning approach - https://sciendo.com/article/10.2478/ijcss-2019-0020
- A Two-Stage Bayesian Model for Predicting Winners in Major League Baseball - http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.124.4257
- Modelling Australian Rules Football as spatial systems with pairwise comparisons - https://www.degruyter.com/document/doi/10.1515/jqas-2021-0035/html
- A Data-Driven Machine Learning Algorithm for Predicting the Outcomes of NBA Games - https://www.mdpi.com/2073-8994/15/4/798
- THE REPLICATION PROJECT: IS XG THE BEST PREDICTOR OF FUTURE RESULTS? - https://www.americansocceranalysis.com/home/2022/7/19/the-replication-project-is-xg-the-best-predictor-of-future-results
Tipping models
- AFL Lab - SOLDIER Model: https://theafllab.wordpress.com/2019/03/02/the-soldier-model/
- AFL Gains: https://ricporteous.netlify.com/post/machine-learning-in-afl/#creating-a-machine-learning-model-to-predict-afl-matches
- AFLalytics - A Brownian Motion Inspired ELO Model: https://www.aflalytics.com/blog/2019/1/brownian-motion-inspired-elo-model/
- Build an AFL Elo with FitzRoy: https://analysisofafl.netlify.com/models/2018-07-23-build-a-quick-elo/
- AFL teams Elo ratings and footy-tipping: http://freerangestats.info/blog/2019/03/23/afl-elo
Causal Inference in sport
- Is Soccer Wrong About Long Shots?
- Modeling Player and Team Performance in Basketball - https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-040720-015536
- “We conclude with a discussion on the future of basketball analytics and, in particular, highlight the need for causal inference in sports.”
- Understanding causal inference: the future direction in sports injury prevention - https://pubmed.ncbi.nlm.nih.gov/17513917/
- What Might a Theory of Causation Do for Sport? - https://www.mdpi.com/2409-9287/4/2/34/pdf
- Derrick Yam, Michael J. Lopez, “What was lost? A causal estimate of fourth down behavior in the National Football League”, Journal of Sports Analytics, 2019.
- https://mladenjovanovic.github.io/bmbstats-book/causal-inference.html
- Implementation of path analysis and piecewise structural equation modelling to improve the interpretation of key performance indicators in team sports: An example in professional rugby union - https://www.tandfonline.com/doi/full/10.1080/02640414.2021.1943169?s=03&journalCode=rjsp20#.YM-z2hohnOg.twitter
- A holistic analysis of collective behaviour and team performance in Australian Football via structural equation modelling - https://www.tandfonline.com/doi/full/10.1080/24733938.2022.2046286
- Estimating the causal effect of defensive formation on yards gained in run plays - https://operations.nfl.com/media/4199/bdb_kruchten.pdf
Player evaluation/rating
- cricWAR: A reproducible system for evaluating player performance in limited-overs cricket - https://www.sloansportsconference.com/research-papers/cricwar-a-reproducible-system-for-evaluating-player-performance-in-limited-overs-cricket
- Finding Your Feet: A Gaussian Process Model for Estimating the Abilities of Batsmen in Test Cricket - https://academic.oup.com/jrsssc/article/70/2/481/7033927?login=false
- A Bayesian Approach for Determining Player Abilities in Football - https://academic.oup.com/jrsssc/article/70/1/174/7033964#395473026
- Modelling player performance in basketball through mixed models - https://www.tandfonline.com/doi/abs/10.1080/24748668.2013.11868632
- Estimating an NBA player’s impact on his team’s chances of winning - https://www.degruyter.com/document/doi/10.1515/jqas-2015-0027/html?lang=en
- Deep Dive on Regularized Adjusted Plus Minus
- https://squared2020.com/2017/09/18/deep-dive-on-regularized-adjusted-plus-minus-i-introductory-example/
- https://squared2020.com/2017/09/18/deep-dive-on-regularized-adjusted-plus-minus-ii-basic-application-to-2017-nba-data-with-r/
- https://squared2020.com/2018/12/24/regularized-adjusted-plus-minus-part-iii-what-had-really-happened-was/
Shooting/Kicking/Passing ratings + xGoals
- Statistical modelling of goalkicking performance in the Australian Football League (BAYESIAN) - https://www.sciencedirect.com/science/article/pii/S1440244022001335
- Factors Affecting Set Shot Goal-kicking Performance in the Australian Football League - https://journals.sagepub.com/doi/full/10.1177/0031512518781265
- Upgrading Expected Goals - https://statsbomb.com/articles/soccer/upgrading-expected-goals/
EPV, VAEP, xThreat, Equity
Basketball
- Cervone, D., D’Amour, A., Bornn, L., & Goldsberry, K. (2016). A multiresolution stochastic process model for predicting basketball possession outcomes. Journal of the American Statistical Association, 111(514), 585–599.
- Expected Possession Value: An Evaluation Framework for Decision-Making, Strategy, and Execution in Basketball, Ivan C. Jutamulia, (B.S. Computer Science and Engineering) - http://dspace.mit.edu/bitstream/handle/1721.1/139205/Jutamulia-ivanj-meng-eecs-2021-thesis.pdf?sequence=1&isAllowed=y
AFL
- O’Shaughnessy, D. M. (2006). Possession versus position: strategic evaluation in AFL. Journal of sports science & medicine, 5(4), 533.
- ASSESSING PLAYER PERFORMANCE IN AUSTRALIAN FOOTBALL USING SPATIAL DATA - Karl Jackson - https://researchbank.swinburne.edu.au/file/248ec147-72d7-448c-a19d-49f01d90b12f/1/Karl%20Jackson%20Thesis.pdf
- Predicting and Understanding Australian Rules Football Using Markov Processes - https://link.springer.com/chapter/10.1007/978-3-030-99333-7_5
Soccer
- Possession Is The Puzzle Of Soccer Analytics. These Models Are Trying To Solve It - https://fivethirtyeight.com/features/possession-is-the-puzzle-of-soccer-analytics-these-models-are-trying-to-solve-it/
- Explaining Expected Threat - https://soccermatics.medium.com/explaining-expected-threat-cbc775d97935
- Uppsala: Expected possession value - https://uppsala.instructure.com/courses/28112/pages/8-expected-possession-value
- Introducing Expected Threat (xT) - https://karun.in/blog/expected-threat.html
- Valuing On-the-Ball Actions in Soccer: A Critical Comparison of xT and VAEP - https://tomdecroos.github.io/reports/xt_vs_vaep.pdf
- INTRODUCING A POSSESSION VALUE FRAMEWORK - https://www.statsperform.com/resource/introducing-a-possession-value-framework/
- A framework for the fine-grained evaluation of the instantaneous expected value of soccer possessions - https://link.springer.com/article/10.1007/s10994-021-05989-6
- Decroos, T., Bransen, L., Van Haaren, J., & Davis, J. (2019). Actions speak louder than goals: Valuing player actions in soccer. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 1851–1861).
- Link, D., Lang, S., & Seidenschwarz, P. (2016). Real time quantification of dangerousity in football using spatiotemporal tracking data. PLoS ONE, 11(12), e0168768.
- Rudd, S. (2011). A framework for tactical analysis and individual offensive production assessment in soccer using markov chains. In New England symposium on statistics in sports. http://nessis.org/nessis11/rudd.pdf.
- Spearman, W. (2018). Beyond expected goals. In Proceedings of the 12th MIT sloan sports analytics conference.
- Unpacking Ball Progression - https://statsbomb.com/articles/soccer/unpacking-ball-progression/
- Soccer as a Markov process: modelling and estimation of the zonal variation of team strengths - https://academic.oup.com/imaman/advance-article-abstract/doi/10.1093/imaman/dpab042/6512363
- An evaluation of characteristics of teams in association football by using a Markov process model - https://www.jstor.org/stable/4128133
- Guide to Expected Possession Value - https://abhiamishra.github.io/ggshakeR/articles/Guide_to_EPV.html
- Guide to Expected Threat - https://abhiamishra.github.io/ggshakeR/articles/Guide_to_Exp_Threat.html
NFL
- Yurko, R., Matano, F., Richardson, L. F., Granered, N., Pospisil, T., Pelechrinis, K., & Ventura, S.L. (2020). Going deep: models for continuous-time within-play valuation of game outcomes in American football with tracking data. Journal of Quantitative Analysis in Sports 1(ahead-of-print).
Rugby
- Integrating machine learning and decision support in tactical decision-making in rugby union - https://www.tandfonline.com/doi/full/10.1080/01605682.2020.1779624
- Development of an expected possession value model to analyse team attacking performances in rugby league - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8589207/
- The expected value of possession in professional rugby league match-play - https://pubmed.ncbi.nlm.nih.gov/26190116/
Action valuation
- Valuing actions intro: The principles of valuing actions - https://www.youtube.com/watch?v=xyyZLs_N1F0&ab_channel=FriendsofTracking
- Evaluating actions in football using machine learning - https://soccermatics.medium.com/evaluating-actions-in-football-using-machine-learning-69517e376e0c
- Valuing Player Actions in Counter-Strike: Global Offensive - https://arxiv.org/pdf/2011.01324v2.pdf
- Fitting your own football xG model - https://www.datofutbol.cl/xg-model/
- Space-Time VON CRAMM: Evaluating Decision-Making in Tennis with Variational generatiON of Complete Resolution Arcs via Mixture Modeling - https://arxiv.org/abs/2005.12853
Regression to the mean
- REGRESSION TO THE MEAN: AN EXAMPLE USING GOALKICKING - https://analysisofafl.netlify.app/models/2018-06-20-regression-to-the-mean/
Defensive valuation
- What Happened Next? Using Deep Learning to Value Defensive Actions in Football Event-Data - https://arxiv.org/pdf/2106.01786.pdf
- https://fivethirtyeight.com/features/a-better-way-to-evaluate-nba-defense/
- Counterpoints: Advanced Defensive Metrics for NBA Basketball - http://www.lukebornn.com/papers/franks_ssac_2015.pdf
- Using In-Game Shot Trajectories to Better Understand Defensive Impact in the NBA - https://arxiv.org/pdf/1905.00822.pdf
- The effect of team formation on defensive performance in Australian football - https://www.sciencedirect.com/science/article/abs/pii/S1440244021002358
- Making Offensive Play Predictable - Using a Graph Convolutional Network to Understand Defensive Performance in Soccer - https://o7dkx1gd2bwwexip1qwjpplu-wpengine.netdna-ssl.com/wp-content/uploads/2021/04/1617733444_PaulPowerOffensivePlaySoccerRPpaper-1.pdf
- Paul Power: neural networks for understanding defending - https://www.youtube.com/watch?v=d5NBm4CFygo&ab_channel=FriendsofTracking
Player tracking data
- Special Issue on Player Tracking Data in the National Football League (NFL) - https://www.degruyter.com/view/journals/jqas/16/2/jqas.16.issue-2.xml
- Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball - https://arxiv.org/pdf/1401.0942.pdf
- Characterizing the spatial structure of defensive skill in professional basketball - https://arxiv.org/pdf/1405.0231.pdf
- A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes - https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1141685
- The role of intrinsic dimension in high-resolution player tracking data—Insights in basketball - https://arxiv.org/pdf/2002.04148.pdf
- Automatic event detection in basketball using HMM with energy based defensive assignment - https://www.degruyter.com/document/doi/10.1515/jqas-2017-0126/html?lang=en
- Route identification in the National Football League - https://www.degruyter.com/document/doi/10.1515/jqas-2019-0047/html
- Conference talk: https://www.youtube.com/watch?v=rnAzURpLLbs&ab_channel=MarkGlickman
- Template matching route classification - https://www.degruyter.com/document/doi/10.1515/jqas-2019-0051/html
- Possession Sketches: Mapping NBA Strategies - http://www.lukebornn.com/papers/miller_ssac_2017.pdf
- Using Data To Determine Blitz Strategy - https://www.kaggle.com/code/dominicborsani/using-data-to-determine-blitz-strategy?s=03
- A method for evaluating player decision-making in the Australian Football League - https://www.researchgate.net/profile/Bart-Spencer/publication/335101736_A_method_for_evaluating_player_decision-making_in_the_Australian_Football_League/links/5d4f512792851cd046b26add/A-method-for-evaluating-player-decision-making-in-the-Australian-Football-League.pdf
- NFL Big Data Bowl - How many yards will an NFL player gain after receiving a handoff? - 1st place solution The Zoo - https://www.kaggle.com/c/nfl-big-data-bowl-2020/discussion/119400
- Routine Inspection: Measuring Playbooks for Corner Kicks - https://global-uploads.webflow.com/5f1af76ed86d6771ad48324b/606e51c17bf6c8ba83d69a01_LaurieShaw-CornerKicks-RPpaper.pdf
- https://www.youtube.com/watch?v=yfPC1O_g-I8&t=3002s&ab_channel=MarkGlickman
- Effects of collective tactical variables and predictors on the probability of scoring in elite netball - https://www.tandfonline.com/doi/full/10.1080/24748668.2023.2225274
- The influence of match phase and field position on collective team behaviour in Australian Rules football - https://www.tandfonline.com/doi/full/10.1080/02640414.2019.1586077
- Quantifying congestion with player tracking data in Australian football - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0272657
- Team numerical advantage in Australian rules football: A missing piece of the scoring puzzle? - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254591
- A copula-based multivariate hidden Markov model for modelling momentum in football - https://link.springer.com/article/10.1007/s10182-021-00395-8
- A STATISTICAL MODEL OF SERVE RETURN IMPACT PATTERNS IN PROFESSIONAL TENNIS - https://arxiv.org/pdf/2202.00583.pdf
- Is it worth the effort? Understanding and contextualizing physical metrics in soccer - https://arxiv.org/pdf/2204.02313.pdf
NMF
- Understanding Trends in the NBA: How NNMF Works - https://squared2020.com/2018/10/04/understanding-trends-in-the-nba-how-nnmf-works/
- Finding Patterns in Statsbomb Data: Non-Negative Matrix Factorization Applications - https://znstrider.github.io/2018-11-14-SBData-Non-Negative-Matrix-Factorization/
- A Bayesian marked spatial point processes model for basketball shot chart - https://www.degruyter.com/document/doi/10.1515/jqas-2019-0106/html
- Decomposing and Smoothing Soccer Spatial Tendencies - https://tonyelhabr.rbind.io/posts/decomposition-smoothing-soccer/
Pitch control
- Spearman - Quantifying Pitch Control: https://www.researchgate.net/publication/334849056_Quantifying_Pitch_Control
- Space and Control in Soccer - https://www.frontiersin.org/articles/10.3389/fspor.2021.676179/full
- Contextual movement models based on normalizing flows
Pass models/Completion models
- Frame by frame completion probability of an NFL pass - https://arxiv.org/pdf/2109.08051v1.pdf
- Unsupervised methods for identifying pass coverage among defensive backs with NFL player tracking data - https://www.degruyter.com/document/doi/10.1515/jqas-2020-0017/html
- Expected passes: Determining the difficulty of a pass in football (soccer) using spatio-temporal data - https://link.springer.com/content/pdf/10.1007/s10618-021-00810-3.pdf
- Extracting NFL tracking data from images to evaluate quarterbacks and pass defenses - https://www.degruyter.com/document/doi/10.1515/jqas-2019-0052/html?lang=en
- Quarterback evaluation in the national football league using tracking data - https://link.springer.com/article/10.1007/s10182-021-00406-8
- Passing and Pressure Metrics in Ice Hockey - https://www.semanticscholar.org/paper/4ea87ef8e84a461722b7381323ad6a93fd530362
Trajectory prediction (‘ghosting’)
- Basketball GAN: Sportingly Acceptable Trajectory Prediction - https://drive.google.com/file/d/1eZV5mIutg5aoiKqD3jSLUwerueoNFfzH/view
- Where will they go? predicting fine-grained adversarial multi-agent motion using conditional variational autoencoders.
- Bhostgusters: Realtime interactive play sketching with synthesized nba defenses.
- Generating multi-agent trajectories using programmatic weak supervision.
- A Graph Attention Based Approach for Trajectory Prediction in Multi-agent Sports Games - https://arxiv.org/pdf/2012.10531v1.pdf
- baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents - https://arxiv.org/pdf/2104.11980v2.pdf
- Simulating Defensive Trajectories in American Football for Predicting League Average Defensive Movements - https://www.frontiersin.org/articles/10.3389/fspor.2021.669845/full
- Inferring Player Location in Sports Matches: Multi-Agent Spatial Imputation from Limited Observations - https://arxiv.org/abs/2302.06569
- Multiagent off‑screen behavior prediction in football - https://www.nature.com/articles/s41598-022-12547-0
Training plan generation and optimisation
- Carey: Optimizing preseason training loads in Australian Football - https://journals.humankinetics.com/view/journals/ijspp/13/2/article-p194.xml
- Connor: Adaptive Athlete Training Plan Generation: An intelligent control systems approach - https://www.sciencedirect.com/science/article/pii/S1440244021004679
- Connor: Optimising Team Sport Training Plans With Grammatical Evolution - https://ieeexplore.ieee.org/document/8790369
Event stream analysis
- Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: an application to rugby union - https://arxiv.org/pdf/2010.15377v4.pdf
Subjective ratings
- Capturing the “expert’s eye”: Towards a better understanding and implementation of subjective performance evaluations in team sports - https://sportrxiv.org/index.php/server/preprint/view/6/20
Coaching/scouting
- Full Jose Mourinho Scouting Report on FC Barcelona from 2005/2006 - https://twitter.com/_DaliborPlavsic/status/1106722625470889984?s=19
E-Sports
- Examining the game-specific practice behaviors of professional and semi-professional esports players: A 52-week longitudinal study - https://www.sciencedirect.com/science/article/pii/S0747563222002436
Sport science
Technology validation
- Methods to assess validity of positioning systems in team sports: can we do better? - https://bmjopensem.bmj.com/content/bmjosem/9/1/e001496.full.pdf
Fitness-Fatigue models
- The Use of Fitness-Fatigue Models for Sport Performance Modelling: Conceptual Issues and Contributions from Machine-Learning - https://link.springer.com/article/10.1186/s40798-022-00426-x?utm_source=getftr&utm_medium=getftr&utm_campaign=getftr_pilot
- A Deep Learning Approach for Fatigue Prediction in Sports Using GPS Data and Rate of Perceived Exertion - https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9881489
Non-invasive monitoring
- The paradox of “invisible” monitoring: The less you do, the more you do! - https://hiitscience.com/the-paradox-of-invisible-monitoring-the-less-you-do-the-more-you-do/
Injuries
- Untangling the NFL Injury Web - https://www.footballoutsiders.com/stat-analysis/2018/untangling-nfl-injury-web
- Blood sample profile helps to injury forecasting in elite soccer players - https://link.springer.com/content/pdf/10.1007/s11332-022-00932-1.pdf
- Characteristics of Complex Systems in Sports Injury Rehabilitation: Examples and Implications for Practice - https://sportsmedicine-open.springeropen.com/articles/10.1186/s40798-021-00405-8
- Just How Confident Can We Be in Predicting Sports Injuries? A Systematic Review of the Methodological Conduct and Performance of Existing Musculoskeletal Injury Prediction Models in Sport - https://link-springer-com.ez.library.latrobe.edu.au/article/10.1007/s40279-022-01698-9
- Modeling time loss from sports-related injuries using random effects models: an illustration using soccer-related injury observations - https://www.degruyter.com/view/journals/jqas/ahead-of-print/article-10.1515-jqas-2019-0030/article-10.1515-jqas-2019-0030.xml
- Training Load and Injury Part 1: The Devil Is in the Detail—Challenges to Applying the Current Research in the Training Load and Injury Field - https://www.jospt.org/doi/full/10.2519/jospt.2020.9675
- Training Load and Injury Part 2: Questionable Research Practices Hijack the Truth and Mislead Well-Intentioned Clinicians - https://www.jospt.org/doi/full/10.2519/jospt.2020.9211
Running and wearables
- Feasibility and usability of GPS data in exploring associations between training load and running-related knee injuries in recreational runners - https://bmcsportsscimedrehabil.biomedcentral.com/articles/10.1186/s13102-022-00472-8
- Injuries in Baseball: How (Self-)Exciting? - https://sharpestats.com/mlb-injury-point-process/
- Towards a complex systems approach in sports injury research: simulating running-related injury development with agent-based modelling - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6579554/
- Association Between Temporal Spatial Parameters and Overuse Injury History in Runners: A Systematic Review and Meta-analysis - https://link.springer.com/article/10.1007/s40279-019-01207-5
- Athlete Monitoring in Professional Road Cycling Using Similarity Search on Time Series Data - https://link.springer.com/chapter/10.1007/978-3-031-27527-2_9
- The “impacts cause injury” hypothesis: Running in circles or making new strides? - https://www.sciencedirect.com/science/article/abs/pii/S0021929023002634
- Comparison of different measures to monitor week-to-week changes in training load in high school runners - https://journals.sagepub.com/doi/full/10.1177/1747954120970305
- Accelerometer-based prediction of running injury in National Collegiate Athletic Association track athletes - https://www.sciencedirect.com/science/article/pii/S0021929018302653
- A 2-Year Prospective Cohort Study of Overuse Running Injuries: The Runners and Injury Longitudinal Study (TRAILS) - https://journals.sagepub.com/doi/full/10.1177/0363546518773755
Movement variability
Datasets and competitions
- SportsDataVerse - https://sportsdataverse.org/
- Scaling up SoccerNet with multi-view spatial localization and re-identification - https://www.nature.com/articles/s41597-022-01469-1
- DeepSport Dataset: 300+ high-resolution professional basketball images with multiple annotations - https://www.kaggle.com/gabrielvanzandycke/deepsport-dataset
- Collection and Validation of Psychophysiological Data from Professional and Amateur Players: a Multimodal eSports Dataset - https://arxiv.org/pdf/2011.00958v2.pdf
- Comprehensive Dataset of Broadcast Soccer Videos - https://ieeexplore.ieee.org/document/8397046 & http://media.hust.edu.cn/dataset.htm
- MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions - https://arxiv.org/pdf/2105.07404v2.pdf
- TeamTrack: An Algorithm and Benchmark Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos - https://github.com/AtomScott/TeamTrack
- PERSIST: A Multimodal Dataset for the Prediction of Perceived Exertion during Resistance Training - https://www.mdpi.com/2306-5729/8/1/9
- Tennis - https://github.com/skoval/deuce
- NRL rugby - https://github.com/fredgj/rugby_scraper
- Rugby - https://github.com/walsh06/python_rugby
- ncaahoopR - https://github.com/lbenz730/ncaahoopR
- statsbomb - soccer - https://github.com/statsbomb/open-data
- NFL - big data bowl - https://github.com/nfl-football-ops/Big-Data-Bowl
- Harvard sports analytics - http://harvardsportsanalysis.org/
- NBA player tracking - https://github.com/PatrickChodowski/NBAr
- Multi-sport - https://github.com/meysubb/Sports_Data_Reference
- https://github.com/meysubb/Sports_Data_Reference/blob/master/R/Data.md
- Multi-sport - https://github.com/octonion?tab=repositories
- NBL - https://jaseziv.github.io/nblR/
- WNBL - https://github.com/jacquietran/wnblr
- NBA - https://www.kaggle.com/datasets/wyattowalsh/basketball
- https://twitter.com/DSamangy/status/1492206283214114817?t=by17xVuXOVQBr0-eacK7QQ&s=03
Competitions
- AO Data Slam 2023 - Predict the next serve - https://www.crowdanalytix.com/contests/ao-data-slam-2023--predict-the-next-serve
- NFL 1st and Future - Playing Surface Analytics - https://www.kaggle.com/competitions/nfl-playing-surface-analytics
- NFL 1st and Future - Impact Detection - https://www.kaggle.com/competitions/nfl-impact-detection
- 1st and Future - Player Contact Detection - https://www.kaggle.com/competitions/nfl-player-contact-detection/overview
Deep Learning
- Troubleshooting Deep Neural Networks - http://josh-tobin.com/assets/pdf/troubleshooting-deep-neural-networks-01-19.pdf
- An overview of gradient descent optimization algorithms - https://ruder.io/optimizing-gradient-descent/
- Andrej Karpathy - Neural Networks: Zero to Hero - https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
Reinforcement learning
- Deep Reinforcement Learning: Pong from Pixels - http://karpathy.github.io/2016/05/31/rl/
Graph neural networks
- A Gentle Introduction to Graph Neural Networks - https://distill.pub/2021/gnn-intro/
- Understanding Convolutions on Graphs - https://distill.pub/2021/understanding-gnns/
- An attempt at demystifying graph deep learning - https://ericmjl.github.io/essays-on-data-science/machine-learning/graph-nets/
Computer vision
- EECS 4422 Computer Vision - https://www.eecs.yorku.ca/~kosta/Courses/EECS4422/
Computer vision in sport
- Computer vision in sport papers - https://github.com/avijit9/awesome-computer-vision-in-sports
- SportLogiQ research - https://www.sportlogiq.com/publications/
Player detection
- Accelerating the creation of instance segmentation training sets through bounding box annotation - https://arxiv.org/pdf/2205.11563.pdf
- Multimodal and multiview distillation for real-time player detection on a football field
- Semi-Supervised Training to Improve Player and Ball Detection in Soccer - https://arxiv.org/abs/2204.06859
Player/Team ID
- Pose Guided Gated Fusion for Person Re-identification https://openaccess.thecvf.com/content_WACV_2020/papers/Bhuiyan_Pose_Guided_Gated_Fusion_for_Person_Re-identification_WACV_2020_paper.pdf
- Contrastive Learning for Sports Video: Unsupervised Player Classification - https://arxiv.org/pdf/2104.10068v2.pdf
- Associative embedding for team discrimination
Player tracking
- Automated repair of fragmented tracks with 1D CNNs
- Comparison of a computer vision system against three-dimensional motion capture for tracking football movements in a stadium environment - https://link.springer.com/article/10.1007/s12283-021-00365-y
- SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes - https://arxiv.org/abs/2304.05170
- Visualizing Skiers’ Trajectories in Monocular Videos - https://openaccess.thecvf.com/content/CVPR2023W/CVSports/papers/Dunnhofer_Visualizing_Skiers_Trajectories_in_Monocular_Videos_CVPRW_2023_paper.pdf
- Individual Locating of Soccer Players from a Single Moving View - https://www.mdpi.com/1424-8220/23/18/7938
- Extraction of Positional Player Data From Broadcast Soccer Videos - https://openaccess.thecvf.com/content/WACV2022/papers/Theiner_Extraction_of_Positional_Player_Data_From_Broadcast_Soccer_Videos_WACV_2022_paper.pdf
Action detection/recognition
- Group Activity Detection from Trajectory and Video Data in Soccer https://arxiv.org/pdf/2004.10299.pdf
- Actor-Transformers for Group Activity Recognition https://arxiv.org/pdf/2003.12737.pdf
- Sport action mining: Dribbling recognition in soccer - https://link.springer.com/article/10.1007/s11042-021-11784-1
- Pose is all you need: the pose only group activity recognition system (POGARS) - https://link.springer.com/article/10.1007/s00138-022-01346-2
Sport camera calibration
- SoccerNet Camera Calibration - https://www.soccer-net.org/tasks/camera-calibration
- Optimizing Through Learned Errors for Accurate Sports Field Registration https://openaccess.thecvf.com/content_WACV_2020/papers/Jiang_Optimizing_Through_Learned_Errors_for_Accurate_Sports_Field_Registration_WACV_2020_paper.pdf
- Fast Camera Calibration for the Analysis of Sport Sequences - https://www.dirk-farin.net/publications/data/Farin2005d_slides.pdf
- End-to-End Camera Calibration for Broadcast Videos
- Sports Field Recognition Using Deep Multi-task Learning - https://www.jstage.jst.go.jp/article/ipsjjip/29/0/29_328/_pdf
- Evaluating Soccer Player: from Live Camera to Deep Reinforcement Learning - https://arxiv.org/pdf/2101.05388.pdf
- BirdsPyView - https://github.com/rjtavares/BirdsPyView
- TVCalib: Camera Calibration for Sports Field Registration in Soccer - https://mm4spa.github.io/tvcalib/ and https://arxiv.org/pdf/2207.11709.pdf
- Self-Supervised Shape Alignment for Sports Field Registration - https://openaccess.thecvf.com/content/WACV2022/papers/Shi_Self-Supervised_Shape_Alignment_for_Sports_Field_Registration_WACV_2022_paper.pdf
- Individual Locating of Soccer Players from a Single Moving View - https://www.mdpi.com/1424-8220/23/18/7938
- Extraction of Positional Player Data From Broadcast Soccer Videos - https://openaccess.thecvf.com/content/WACV2022/papers/Theiner_Extraction_of_Positional_Player_Data_From_Broadcast_Soccer_Videos_WACV_2022_paper.pdf
- KaliCalib: A Framework for Basketball Court Registration - https://arxiv.org/pdf/2209.07795.pdf
- Sports Field Registration via Keypoints-aware Label Condition - https://cgv.cs.nthu.edu.tw/KpSFR_data/KpSFR_paper.pdf
Telestration
- Assessing the Efficacy of Video Telestration in Aiding Memory Recall Among Elite Professional Football Players - https://journals.iupui.edu/index.php/sij/article/view/26317/24440
- The use and perceived value of telestration tools in elite football - https://www.tandfonline.com/doi/abs/10.1080/24748668.2020.1753965?journalCode=rpan20
Vision transformers
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) - https://openreview.net/pdf?id=YicbFdNTTy
Image feature matching/Homography
- DFM: A Performance Baseline for Deep Feature Matching https://github.com/ufukefe/DFM
- Perceptual Loss for Robust Unsupervised Homography Estimation https://arxiv.org/pdf/2104.10011.pdf
- LoFTR: Detector-Free Local Feature Matching with Transformers https://zju3dv.github.io/loftr/
- Reprojecting the Perseverance landing footage onto satellite imagery - https://matthewearl.github.io/2021/03/06/mars2020-reproject/
- Smooth Globally Warp Locally: Video Stabilization using Homography Fields - https://cs.adelaide.edu.au/~tjchin/lib/exe/fetch.php?media=papers:fields_preprint.pdf
Multi-object tracking
- MAT: Motion-aware multi-object tracking - https://www.sciencedirect.com/science/article/pii/S0925231221019627
Sensors and Signal processing
GPS
Kalman filter
- How a Kalman filter works, in pictures - https://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/
Sensor fusion
- Estimating Orientation Using Inertial Sensor Fusion and MPU-9250 - https://au.mathworks.com/help/fusion/ug/Estimating-Orientation-Using-Inertial-Sensor-Fusion-and-MPU-9250.html
- IMU and GPS Fusion for Inertial Navigation - https://au.mathworks.com/help/fusion/ug/imu-and-gps-fusion-for-inertial-navigation.html
Signal dimensionality reduction
- PCA of waveforms and functional PCA: A primer for biomechanics - https://www.sciencedirect.com/science/article/pii/S0021929020305303
- https://github.com/johnwarmenhoven/PCA-FPCA
- https://www.st-andrews.ac.uk/~wjh/dataview/tutorials/principal%20component%20analysis.html
- https://www.psych.mcgill.ca/misc/fda/examples.html
Statistics, data science, and modelling
- The Lost Art of Mathematical Modelling - https://arxiv.org/pdf/2301.08559.pdf
- Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning (Sebastian Raschka) - https://arxiv.org/pdf/1811.12808.pdf
- https://discourse.datamethods.org/
- Harrell - Statistical Problems to Document and to Avoid - https://discourse.datamethods.org/t/author-checklist/3407
- Reference Collection to push back against “Common Statistical Myths” - https://discourse.datamethods.org/t/reference-collection-to-push-back-against-common-statistical-myths/1787
- Publish your raw data and your speculations, then let other people do the analysis: track and field edition - https://statmodeling.stat.columbia.edu/2017/08/21/publish-raw-data-speculations-let-people-analysis-track-field-edition/
Packages:
Probability distributions & Data generating processes
- Probabilistic Building Blocks - https://betanalpha.github.io/assets/case_studies/probability_densities.html
- (What’s the Probabilistic Story) Modeling Glory? - https://betanalpha.github.io/assets/case_studies/generative_modeling.html
‘Significance’ and testing
- Abandon Statistical Significance - http://www.stat.columbia.edu/~gelman/research/unpublished/amstat.draft2.pdf
- Scientists rise up against statistical significance - https://www.nature.com/articles/d41586-019-00857-9
Meaningful change & effect sizes
- The Minimal Clinically Important Difference Changes Greatly Based on the Different Calculation Methods - https://journals.sagepub.com/doi/full/10.1177/03635465231152484?s=03
- Caldwell, A., & Vigotsky, A. D. (2020). A case against default effect sizes in sport and exercise science. PeerJ, 8, e10314. - https://peerj.com/articles/10314/#p-1
- Standardized or simple effect size: What should be reported? - https://bpspsychub.onlinelibrary.wiley.com/doi/pdf/10.1348/000712608X377117
- Why I don’t like standardised effect sizes - https://janhove.github.io/reporting/2015/02/05/standardised-vs-unstandardised-es
MBI
- Systematic review of the use of “magnitude-based inference” in sports science and medicine - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235318#pone.0235318.ref018
- Gelman on MBI - https://statmodeling.stat.columbia.edu/2018/11/15/the-state-of-the-art/
- Can we trust “Magnitude-based inference”? - https://www.tandfonline.com/doi/full/10.1080/02640414.2018.1516004
- How to Interpret Changes in an Athletic Performance Test, Will G Hopkins - http://www.sportsci.org/jour/04/wghtests.htm
- Design and analysis of research on sport performance enhancement - https://journals.lww.com/acsm-msse/Fulltext/1999/03000/Design_and_analysis_of_research_on_sport.18.aspx
- Progressive Statistics for Studies in Sports Medicine and Exercise Science - https://journals.lww.com/acsm-msse/Fulltext/2009/01000/Progressive_Statistics_for_Studies_in_Sports.2.aspx
- Making Meaningful Inferences About Magnitudes - https://research.tees.ac.uk/ws/files/5918054/58195.pdf
Statistics in sport science
- Current Research and Statistical Practices in Sport Science and a Need for Change
Power and sample size
- The tyranny of power: is there a better way to calculate sample size? - https://www.bmj.com/content/339/bmj.b3985
- Sample Size Planning for Statistical Power and Accuracy in Parameter Estimation
- This review examines recent advances in sample size planning, not only from the perspective of an individual researcher, but also with regard to the goal of developing cumulative knowledge. Psychologists have traditionally thought of sample size planning in terms of power analysis. Although we review recent advances in power analysis, our main focus is the desirability of achieving accurate parameter estimates, either instead of or in addition to obtaining sufficient power. Accuracy in parameter estimation (AIPE) has taken on increasing importance in light of recent emphasis on effect size estimation and formation of confidence intervals. The review provides an overview of the logic behind sample size planning for AIPE and summarizes recent advances in implementing this approach in designs commonly used in psychological research.
- Power, precision, and sample size estimation in sport and exercise science research - https://www.tandfonline.com/doi/pdf/10.1080/02640414.2020.1776002
- Two Sample-Size Practices That I Don’t Recommend - http://homepage.divms.uiowa.edu/~rlenth/Power/2badHabits.pdf
Sample size calculations for clinical prediction models
- R package: https://cran.r-project.org/web/packages/pmsampsize/index.html
- Continuous variables: https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.7993
- Binary/Time to event: https://onlinelibrary.wiley.com/doi/full/10.1002/sim.7992
Decision making
- Decision making in health and medicine - https://www.researchgate.net/profile/Paul-Glasziou/publication/37621420_Decision_Making_in_Health_and_Medicine_Integrating_Evidence_and_Values/links/00b49518af3c2b7add000000/Decision-Making-in-Health-and-Medicine-Integrating-Evidence-and-Values.pdf
Courses
- Statistical Rethinking (2022 Edition) - https://github.com/rmcelreath/stat_rethinking_2022
- Statistical Rethinking with brms, ggplot2, and the tidyverse - https://bookdown.org/ajkurz/Statistical_Rethinking_recoded/
- Harrell - Biostatistics - http://hbiostat.org/bbr/?s=03
- Course material for Stat 451: Introduction to Machine Learning and Statistical Pattern Classification - https://github.com/rasbt/stat451-machine-learning-fs21
Causal inference
- Reducing bias through directed acyclic graphs - https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-8-70
- Examples of solid causal inferences from purely observational data - https://discourse.datamethods.org/t/examples-of-solid-causal-inferences-from-purely-observational-data/1686
- The Effect: An Introduction to Research Design and Causality
- Jennifer Hill - Causal inference that capitalizes on machine learning and statistics: opportunities and challenges
- Causal design patterns for data analysts - https://emilyriederer.netlify.app/post/causal-design-patterns/?s=03
- Preventing churn like a bandit - https://medium.com/bigdatarepublic/preventing-churn-like-a-bandit-49b7c51b4929
- For effective treatment of churn, don’t predict churn - https://medium.com/bigdatarepublic/for-effective-treatment-of-churn-dont-predict-churn-58328967ec4f
Statistical tests vs. Linear models
- Common statistical tests are linear models (or: how to teach stats) - https://lindeloev.github.io/tests-as-linear/#1_the_simplicity_underlying_common_tests
GAMs
- https://m-clark.github.io/documents.html
- https://www.fromthebottomoftheheap.net/2018/04/21/fitting-gams-with-brms/
- Hierarchical generalized additive models in ecology: an introduction with mgcv - https://peerj.com/articles/6876/
Interpretable machine learning
- https://christophm.github.io/interpretable-ml-book/
Bayesian
- Bayesian workflow, Gelman - https://arxiv.org/pdf/2011.01808.pdf
- brms: An R Package for Bayesian Multilevel Models using Stan - https://cran.r-project.org/web/packages/brms/vignettes/brms_overview.pdf
- Advanced Bayesian Multilevel Modeling with the R Package brms - https://cran.r-project.org/web/packages/brms/vignettes/brms_multilevel.pdf
- Practical Bayes Part I - https://m-clark.github.io/posts/2021-02-28-practical-bayes-part-i/
- A visual introduction to Gaussian Belief Propagation - https://gaussianbp.github.io/
- Bayesian statistics meets sports: a comprehensive review - https://www.degruyter.com/document/doi/10.1515/jqas-2018-0106/html?lang=en
- BOOK: Bayesian Modeling and Computation in Python - https://bayesiancomputationbook.com/welcome.html
Mixed effect models
- Notes on mixed models - https://docs.google.com/document/d/1pxABPqUGUR1tCQvS-7KNt0mWK_CeoP4fXBhD7dhW0Wk/edit?usp=sharing
- Fitting linear mixed models in R - http://staff.pubhealth.ku.dk/~jufo/courses/rm2018/nlmePackage.pdf
- Fitting Linear Mixed-Effects Models Using lme4 - https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf
- Getting started with the glmmTMB package - https://cran.r-project.org/web/packages/glmmTMB/vignettes/glmmTMB.pdf
- Plotting partial pooling in mixed-effects models https://www.tjmahr.com/plotting-partial-pooling-in-mixed-effects-models/?s=03
- https://web.stanford.edu/class/psych252/section/Mixed_models_tutorial.html
- Introduction to Linear Mixed Models - https://ourcodingclub.github.io/2017/03/15/mixed-models.html
- M-Clark: Mixed Models - https://m-clark.github.io/mixed-models-with-R/random_intercepts.html
- https://www.stat.cmu.edu/~hseltman/309/Book/chapter15.pdf
- A Beginner’s Introduction to Mixed Effects Models - https://meghan.rbind.io/blog/2022-06-28-a-beginner-s-guide-to-mixed-effects-models/
- A brief introduction to mixed effects modelling and multi-model inference in ecology - https://peerj.com/articles/4794/
- A very basic tutorial for performing linear mixed effects analyses - https://jontalle.web.engr.illinois.edu/MISC/lme4/bw_LME_tutorial.pdf
- Random effects and penalized splines are the same thing - https://www.tjmahr.com/random-effects-penalized-splines-same-thing/
- Elements of Applied Biostatistics: Chapter 16 Models with random factors – linear mixed models - https://www.middleprofessor.com/files/applied-biostatistics_bookdown/_book/lmm.html
- Hierarchical Modeling - https://betanalpha.github.io/assets/case_studies/hierarchical_modeling.html
Covariance structures & temporal models
- Temporal analysis of variation in random effects - https://stats.stackexchange.com/questions/19911/temporal-analysis-of-variation-in-random-effects
- Dealing with temporal autocorrelation - https://www.flutterbys.com.au/stats/tut/tut8.3b.html
- Guidelines for Selecting the Covariance Structure in Mixed Model Analysis - https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/198-30.pdf
- Modelling covariance structure in the analysis of repeated measures data - https://faculty.washington.edu/heagerty/Courses/VA-longitudinal/private/Littell-StatMed2000.pdf
- Covariance structures with glmmTMB - https://cran.r-project.org/web/packages/glmmTMB/vignettes/covstruct.html
- fitting mixed models with (temporal) correlations in R - https://bbolker.github.io/mixedmodels-misc/notes/corr_braindump.html
- R Code for Repeated Measures - https://dnett.github.io/S510/24RepeatedMeasuresR.pdf
Sport specific
- The Utility of Mixed Models in Sport Science: A Call for Further Adoption in Longitudinal Data Sets - https://journals.humankinetics.com/view/journals/ijspp/aop/article-10.1123-ijspp.2021-0496/article-10.1123-ijspp.2021-0496.xml
- Multilevel data collection and analysis for weight training (with R code) - https://statmodeling.stat.columbia.edu/2018/09/22/38708/
Time series
- MultiVariate (Dynamic) Generalized Additive Models - https://nicholasjclark.github.io/mvgam/index.html
Ordinal models
- Ordinal Regression - https://betanalpha.github.io/assets/case_studies/ordinal_regression.html
- Mixed effect ordinal models - https://drizopoulos.github.io/GLMMadaptive/articles/Ordinal_Mixed_Models.html
Zero inflated data
- A guide to modeling outcomes that have lots of zeros with Bayesian hurdle lognormal and hurdle Gaussian regression models - https://www.andrewheiss.com/blog/2022/05/09/hurdle-lognormal-gaussian-brms/#3-hurdle-lognormal-model
- Hurdle Models - https://m-clark.github.io/models-by-example/hurdle.html
- Getting Started with Hurdle Models - https://data.library.virginia.edu/getting-started-with-hurdle-models/
Stein’s Paradox
- Baseball paper - Efron and Morris - http://statweb.stanford.edu/~ckirby/brad/other/Article1977.pdf
- https://solomonkurz.netlify.com/post/stein-s-paradox-and-what-partial-pooling-can-do-for-you/
- The weirdest paradox in statistics (and machine learning) - https://www.youtube.com/watch?v=cUqoHQDinCM&ab_channel=Mathemaniac
- James-Stein estimator + bias-variance tradeoff
Gaussian processes
- Gaussian Processes for Machine Learning - http://www.gaussianprocess.org/gpml/chapters/RW.pdf
- Robust Gaussian Process Modeling - https://betanalpha.github.io/assets/case_studies/gaussian_processes.html
Clinical prediction models
Developing and reporting models
- https://www.ncbi.nlm.nih.gov/pubmed/25560730
- https://www.ncbi.nlm.nih.gov/pubmed/22397945
- https://www.ncbi.nlm.nih.gov/pubmed/22397946
- https://www.ncbi.nlm.nih.gov/pubmed/29741602
- https://www.ncbi.nlm.nih.gov/pubmed/27362778
- https://www.ncbi.nlm.nih.gov/pubmed/23393430
- https://www.ncbi.nlm.nih.gov/pubmed/20010215
- https://www.ncbi.nlm.nih.gov/pubmed/24898551
Evaluation
- In Machine Learning Predictions for Health Care the Confusion Matrix is a Matrix of Confusion - https://www.fharrell.com/post/mlconfusion/
Data viz
- Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm - https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128
- Principles of Effective Data Visualization - https://www.sciencedirect.com/science/article/pii/S2666389920301896#fig1
- Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing - https://dl.acm.org/doi/pdf/10.1145/3025453.3025912
Dimensionality reduction
- Ten quick tips for effective dimensionality reduction - https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006907
- Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization - https://www.nature.com/articles/s42003-022-03628-x
- https://yliapis.github.io/Non-Negative-Matrix-Factorization/
- Learning the parts of objects by non-negative matrix factorization - https://www.nature.com/articles/44565
- https://goldinlocks.github.io/Non-negative-matrix-factorization/
- https://blog.acolyer.org/2019/02/18/the-why-and-how-of-nonnegative-matrix-factorization/
- Exactly Uncorrelated Sparse Principal Component Analysis - https://www.tandfonline.com/doi/abs/10.1080/10618600.2023.2232843?af=R&journalCode=ucgs20
- Is there any good reason to use PCA instead of EFA? Also, can PCA be a substitute for factor analysis? - https://stats.stackexchange.com/questions/123063/is-there-any-good-reason-to-use-pca-instead-of-efa-also-can-pca-be-a-substitut
Variable selection
- mixOmics: An R package for ‘omics feature selection and multiple data integration - https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005752
Synthetic data generation
- https://cran.r-project.org/web/packages/synthpop/index.html
- https://pypi.org/project/synthcity/
- https://docs.sdv.dev/sdv/
Conferences and presentations
Sloan Sports Analytics Research Papers
KDD-Sports Analytics
http://large-scale-sports-analytics.org
Euro-KDD Sports analytics
https://dtai.cs.kuleuven.be/events/MLSA19/links.php
CVPR-sports
http://www.vap.aau.dk/cvsports/
Videos
https://www.youtube.com/watch?v=WjFdD7PDGw0&t=9s&index=2&list=WL - Imitation Learning Tutorial ICML 2018 - Tutorial session at the International Conference on Machine Learning (ICML 2018) - Yisong Yue (Caltech) & Hoang M. Le (Caltech). This is a high-level talk about the machine learning techniques that people are using to train AI sports players like the ‘Ghosting’ video we watched in class.
https://www.youtube.com/watch?v=VkhPT2cPGLA&index=4&list=PLRPywWPWMCkoTF6yQQsI5Mes95ystQbXU&t=2248s - Lecture: Machine Learning in Sports by Sam Robertson - Good overview lecture on machine learning applications in sports.
https://www.youtube.com/watch?v=YBY9viGTdU0&index=2&list=PLRPywWPWMCkoTF6yQQsI5Mes95ystQbXU&t=388s - 2015 NESSIS - Talk by Sam Robertson (Western Bulldogs) - “A method to assess the influence of individual player performance distribution on match outcome in team sports” presented by Sam Robertson at the 2015 New England Symposium on Statistics in Sports, held on Sept 26, 2015, at the Harvard University
https://www.youtube.com/watch?v=O0rKs6P0rnY&index=5&list=PLRPywWPWMCkoTF6yQQsI5Mes95ystQbXU&t=62s - Statistical Models for Sport in R – Stephanie Kovalchik (Tennis Australia) - A hands-on tutorial and walkthrough on doing sports analytics in R.
https://www.youtube.com/watch?v=djD-yL3vWNQ - 2017 NESSIS - Talk by Ronald Yurko - “NFLWAR: A reproducible method for offensive player evaluation in football” presented by Ronald Yurko at the 2017 New England Symposium on Statistics in Sports, held on Sept 23, 2017, at the Harvard University Science Center.
https://www.youtube.com/watch?v=RN2FLKoKC50 - 2017 NESSIS - Talk by Nathan Sandholtz - “Replaying the NBA: Using Markov Decision Processes to test decision-making from the 2015-2016 regular season” presented by Nathan Sandholtz at the 2017 New England Symposium on Statistics in Sports, held on Sept 23, 2017, at the Harvard
https://www.youtube.com/user/42analytics/videos - Sloan sports analytics conference presentations - Library of many past sports analytics presentations.
https://www.anziam.org.au/MathSport+Proceedings - MathSport Proceedings - ANZIAM MathSport has placed conference proceedings online to make the papers available to researchers everywhere.
Books
Applied Predictive Modeling - by Max Kuhn and Kjell Johnson
http://appliedpredictivemodeling.com/
This text is intended for a broad audience as both an introduction to predictive models as well as a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics.
An Introduction to Statistical Learning with Applications in R - Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
https://www-bcf.usc.edu/~gareth/ISL/
This book provides an introduction to statistical learning methods. It is aimed for upper level undergraduate students, masters students and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real life settings, and should be a valuable resource for a practicing data scientist.
Computer Age Statistical Inference
https://web.stanford.edu/~hastie/CASI_files/PDF/casi.pdf
The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and in influence. ‘Big data’, ‘data science’, and ‘machine learning’ have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? This book takes us on an exhilarating journey through the revolution in data analysis following the introduction of electronic computation in the 1950s. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. The book ends with speculation on the future direction of statistics and data science.
Programming
R
- R Programming - https://www.coursera.org/learn/r-programming
- Statistical Inference via Data Science: A ModernDive into R and the Tidyverse - https://moderndive.com/
- https://github.com/uc-r/Intro-R
Python
- How to create a Python package in 2022 - https://mathspp.com/blog/how-to-create-a-python-package-in-2022