SAFETYLIT WEEKLY UPDATE

We compile citations and summaries of about 400 new articles every week.

Journal Article

Citation

Bullock GS, Ward P, Collins GS, Hughes T, Impellizzeri F. Sports Med. Open 2024; 10(1): e84.

Copyright

(Copyright © 2024, Holtzbrinck Springer Nature Publishing Group)

DOI

10.1186/s40798-024-00745-1

PMID

39068259

PMCID

PMC11283439

Abstract

We recently read the article titled "Machine Learning for Understanding and Predicting Injuries in Football" in Sports Medicine - Open [1]. Given that injury prediction is an emerging topic within sport, the increasing interest in, and excitement about, complex machine learning algorithms in this space is a cause for concern when fundamental principles of prediction model development are not followed. As such, we feel the need to intervene and highlight methodological and conceptual inaccuracies.

The models presented in this paper were deemed by the authors to be "quite sound" [1]. However, this is not the case, as recently highlighted in a systematic review in Sports Medicine [2]. All of these models were included in that systematic review and, after evaluation with the established Prediction Model Risk of Bias Assessment Tool (PROBAST) [3], were rated as being at high or unclear risk of bias [2].

The authors detail that "the use of machine learning has great potential to unearth new insights into the workload and injury relationship" [1]. Prediction models may use both causal and non-causal predictors to estimate the risk of a future outcome [4, 5]. Consequently, it is inappropriate to use the included predictors to infer causal relationships between individual predictors and the outcome [6, 7]. Further, the authors state that Shapley values, local interpretable model-agnostic explanations (LIME), and partial dependence plots can be used to assist in interpreting cause-effect relationships with machine learning models [8]. These tools assess associations between predictors and outcomes and, regardless of how these methods are labelled, the popular adage "correlation is not causation" still holds [8]. Importantly, these methods remain exploratory, provide post hoc explanations (rationalisation), and require confirmatory studies.
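To make the distinction concrete, the following is a minimal sketch in Python (scikit-learn; the data-generating process, variable names, and effect sizes are illustrative assumptions, not drawn from the paper under discussion). A model that uses workload as its only predictor shows a clear partial dependence between workload and predicted injury risk even though, by construction, workload has no causal effect on injury; the association arises solely because workload proxies an unmeasured common cause.

    # A minimal sketch: partial dependence reflects association, not causation.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import partial_dependence

    rng = np.random.default_rng(0)
    n = 2000
    fitness = rng.normal(size=n)                        # unmeasured common cause
    workload = fitness + rng.normal(scale=0.5, size=n)  # correlated with fitness
    # By construction, injury risk depends on fitness only, NOT on workload.
    p_injury = 1 / (1 + np.exp(1.5 + 1.0 * fitness))
    injury = rng.binomial(1, p_injury)

    # Fit a model that sees only workload.
    X = workload.reshape(-1, 1)
    model = GradientBoostingClassifier(random_state=0).fit(X, injury)

    # The partial dependence of predicted risk on workload is clearly
    # decreasing -- a real association the model exploits for prediction --
    # yet manipulating workload would not change risk in this simulation.
    pd_result = partial_dependence(model, X, features=[0], kind="average")
    print(pd_result["average"][0])

The partial dependence curve faithfully describes what the model has learned, which is precisely why it cannot, on its own, license a causal reading.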

Such incorrect interpretations of clinical prediction models are of particular concern because they can lead practitioners to attempt to change injury risk by intervening on or manipulating predictor variables under the false assumption of a causal relationship; not only are these strategies likely to be ineffective, they also have potentially harmful consequences for the athlete [4, 5, 9].

While the authors promote balancing dataset outcomes through over- and under-sampling [1], this is highly discouraged, as 'balancing' datasets alters the outcome prevalence and biases the model towards overestimating risk [10, 11]. Balancing data without appropriate recalibration can inappropriately impact risk prediction and, ultimately, decision-making [11]. The authors also encourage creating classification models. Classification models are not recommended because they take clinical and performance decision-making away from the model's users [11]. Classification models do not allow for situational context and assume that all situations and individuals share the same risk threshold. Prediction models should instead be developed and reported as a probability, or at least a risk score, to allow users to interpret the output and make their own decisions [11].
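The effect of 'balancing' on risk estimates is straightforward to demonstrate. The following is a minimal sketch in Python (scikit-learn, synthetic data; the prevalence and coefficients are illustrative assumptions): a model fitted to the data as observed produces a mean predicted risk close to the true outcome prevalence, whereas the same model fitted after oversampling the minority class to a 50:50 ratio predicts risks near 50%, systematically overestimating risk unless it is subsequently recalibrated.

    # A minimal sketch: oversampling inflates predicted risk.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 10000
    X = rng.normal(size=(n, 3))
    logit = -3.0 + X @ np.array([0.8, -0.5, 0.3])   # ~5-6% outcome prevalence
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    # Fitted on the data as observed, mean predicted risk tracks prevalence.
    m_raw = LogisticRegression().fit(X, y)
    print("prevalence:          ", y.mean())
    print("mean risk (original):", m_raw.predict_proba(X)[:, 1].mean())

    # 'Balance' the outcome by oversampling injured cases to a 50:50 ratio.
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    idx = np.concatenate([rng.choice(pos, size=len(neg), replace=True), neg])
    m_bal = LogisticRegression().fit(X[idx], y[idx])

    # Mean predicted risk is now near 0.5: risk is systematically
    # overestimated unless the model is recalibrated afterwards.
    print("mean risk (balanced):", m_bal.predict_proba(X)[:, 1].mean())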

The authors report that area under the curve (a form of discrimination), accuracy, sensitivity,...
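For reference, a minimal sketch of the metrics named above in Python (scikit-learn; the labels, predicted probabilities, and the 0.5 threshold are illustrative assumptions). AUC summarises discrimination and is computed from the predicted probabilities directly, whereas accuracy and sensitivity depend on first dichotomising at a chosen classification threshold.

    # A minimal sketch of the metrics named above.
    import numpy as np
    from sklearn.metrics import accuracy_score, recall_score, roc_auc_score

    y_true = np.array([0, 0, 0, 0, 1, 0, 1, 0, 0, 1])
    y_prob = np.array([0.10, 0.20, 0.15, 0.30, 0.70,
                       0.40, 0.55, 0.05, 0.25, 0.35])

    # AUC (a form of discrimination) uses the probabilities directly.
    print("AUC:", roc_auc_score(y_true, y_prob))

    # Accuracy and sensitivity require dichotomising at a threshold,
    # which discards the probability information discussed above.
    y_pred = (y_prob >= 0.5).astype(int)
    print("accuracy:   ", accuracy_score(y_true, y_pred))
    print("sensitivity:", recall_score(y_true, y_pred))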

Keywords: American football


Language: en
