Although causal inference has proven valuable for estimating effect sizes in fields such as physics, medical studies, and economics, it is rarely used in sports science. Targeted Maximum Likelihood Estimation (TMLE) is a modern method for causal inference. It is forgiving of misspecification of the causal model and uses machine-learning methods to improve the estimation of effect sizes. We demonstrate the advantage of TMLE in sports science by comparing its effect-size estimates with those of a Generalized Linear Model (GLM). In this study, we introduce TMLE and provide a roadmap for causal inference. We then apply the roadmap and both methods in a simulation study and a case study that investigate the influence of substitutions on the physical performance of the entire soccer team (i.e., the effect size of substitutions on total physical performance). To this end, we construct a causal model, a misspecified causal model, a simulation dataset, and an observed tracking dataset of individual players from 302 elite soccer matches. The simulation results show that TMLE outperforms GLM in estimating the effect size of substitutions on total physical performance. Furthermore, TMLE is the more robust method against model misspecification in both the simulation and the tracking dataset. However, in the tracking dataset, substitutes were found to increase the physical performance of the entire soccer team regardless of the method used.
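The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: the data-generating process, the coefficients, the sample size, and all function names below are assumptions chosen for illustration. The sketch uses plain regressions where TMLE would normally use machine-learning estimators, and it deliberately misspecifies the outcome model (omitting a quadratic confounder term) while keeping the treatment model correct, to show how the targeting step can correct an estimate that a linear GLM gets wrong.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, a, n_iter=25):
    """Unregularised logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (a - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def simulate(n, true_ate, rng):
    """Hypothetical data: confounder W drives treatment A and outcome Y via W^2."""
    w = rng.uniform(-2.0, 2.0, size=n)
    a = rng.binomial(1, expit(0.75 * w**2 - 1.0))
    y = true_ate * a + 2.0 * w**2 + rng.normal(size=n)
    return w, a, y


def glm_and_tmle(w, a, y):
    """Return (GLM estimate, TMLE estimate) of the average treatment effect."""
    n = len(y)
    ones = np.ones(n)

    # Step 1: initial outcome model, misspecified on purpose (linear in W, no W^2).
    X = np.column_stack([ones, a, w])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    glm_ate = coef[1]  # the Gaussian GLM reads the effect off the coefficient of A
    q1 = np.column_stack([ones, ones, w]) @ coef          # predicted Y if A = 1
    q0 = np.column_stack([ones, np.zeros(n), w]) @ coef   # predicted Y if A = 0
    qa = np.where(a == 1, q1, q0)

    # Step 2: treatment (propensity) model, correctly specified with W^2.
    G = np.column_stack([ones, w, w**2])
    g = np.clip(expit(G @ fit_logistic(G, a)), 0.01, 0.99)

    # Step 3: targeting step with the clever covariate H (linear fluctuation).
    h = a / g - (1 - a) / (1 - g)
    eps = np.sum(h * (y - qa)) / np.sum(h**2)

    # Step 4: update both counterfactual predictions and average the difference.
    q1_star = q1 + eps / g
    q0_star = q0 - eps / (1 - g)
    tmle_ate = np.mean(q1_star - q0_star)
    return glm_ate, tmle_ate


TRUE_ATE = 1.0  # assumed effect size in this toy example
rng = np.random.default_rng(0)
w, a, y = simulate(20_000, TRUE_ATE, rng)
glm_ate, tmle_ate = glm_and_tmle(w, a, y)
print(f"true ATE: {TRUE_ATE:.2f}  GLM: {glm_ate:.2f}  TMLE: {tmle_ate:.2f}")
```

In this toy setting the GLM estimate absorbs the omitted confounding and lands far from the assumed effect, while the targeted estimate stays close to it because the treatment model is correct. If both models were misspecified, neither estimator would be expected to recover the effect.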
Physical activity is crucial in human life, whether in everyday activities or elite sports. It is important to maintain or improve physical performance, which depends on various factors such as the amount of physical activity and the capability and capacity of the individual. In daily life, being physically active is important for maintaining good health; intense exercise is not necessary, as simple daily activities contribute enough. In sports, it is essential to balance capacity, workload, and recovery to prevent performance decline or injury.
With the introduction of wearable technology, it has become easier to monitor and analyse physical activity and performance data in daily life and sports. However, extracting personalised insights and predictions from the vast and complex data available is still a challenge.
The study identified four main problems in data analytics related to physical activity and performance: limited personalised prediction due to data constraints, vast data complexity, need for sensitive performance measures, overly simplified models, and missing influential variables. We proposed and investigated potential solutions for each issue. These solutions involve leveraging personalised data from wearables, combining sensitive performance measures with various machine learning algorithms, incorporating causal modelling, and addressing the absence of influential variables in the data.
Personalised data, machine learning, sensitive performance measures, advanced statistics, and causal modelling can help bridge the data analytics gap in understanding physical activity and performance. The research findings pave the way for more informed interventions and provide a foundation for future studies to further reduce this gap.