SAFETYLIT WEEKLY UPDATE

Journal Article

Citation

McCart JA, Berndt DJ, Jarman J, Finch DK, Luther SL. J. Am. Med. Inform. Assoc. 2013; 20(5): 906-914.

Affiliation

Consortium for Healthcare Informatics Research (CHIR) and the HSR&D/RR&D Center of Excellence: Maximizing Rehabilitation Outcomes, James A Haley Veterans' Hospital, Tampa, Florida, USA.

Copyright

(Copyright © 2013, American Medical Informatics Association, Publisher BMJ Publishing Group)

DOI

10.1136/amiajnl-2012-001334

PMID

23242765

Abstract

OBJECTIVE: To determine how well statistical text mining (STM) models can identify falls within clinical text associated with an ambulatory encounter.

MATERIALS AND METHODS: 2241 patients were selected with a fall-related ICD-9-CM E-code or matched injury diagnosis code while being treated as an outpatient at one of four sites within the Veterans Health Administration. All clinical documents within a 48-h window of the recorded E-code or injury diagnosis code for each patient were obtained (n=26,010; 611 distinct document titles) and annotated for falls. Logistic regression, support vector machine, and cost-sensitive support vector machine (SVM-cost) models were trained on a stratified sample of 70% of documents from one location (dataset A(train)) and then applied to the remaining unseen documents (datasets A(test)-D).

RESULTS: All three STM models obtained area under the receiver operating characteristic curve (AUC) scores above 0.950 on the four test datasets (A(test)-D). The SVM-cost model obtained the highest AUC scores, ranging from 0.953 to 0.978. The SVM-cost model also achieved F-measure values ranging from 0.745 to 0.853, sensitivity from 0.890 to 0.931, and specificity from 0.877 to 0.944.

DISCUSSION: The STM models performed well across a large heterogeneous collection of document titles. In addition, the models also generalized across other sites, including a traditionally bilingual site that had distinctly different grammatical patterns.

CONCLUSIONS: The results of this study suggest STM-based models have the potential to improve surveillance of falls. Furthermore, the encouraging evidence shown here that STM is a robust technique for mining clinical documents bodes well for other surveillance-related topics.
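
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, F-measure, and AUC can be computed for a binary fall / no-fall classifier. It is not the authors' code. The function names, the toy labels, and the toy scores are all hypothetical, and the F-measure is assumed here to be the balanced F1 score, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # fall documents predicted as falls
    fp: int  # non-fall documents predicted as falls
    tn: int  # non-fall documents predicted as non-falls
    fn: int  # fall documents predicted as non-falls


def count_outcomes(y_true, y_pred):
    """Tally the four confusion-matrix cells from 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def sensitivity(c):
    """Fraction of true fall documents that were detected (recall)."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def specificity(c):
    """Fraction of non-fall documents correctly left unflagged."""
    return c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0


def f_measure(c):
    """Harmonic mean of precision and recall (F1; an assumption)."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = sensitivity(c)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive outranks
    a randomly chosen negative; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = fall, 0 = no fall).
toy_truth = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_counts = count_outcomes(toy_truth, toy_pred)
```

The pairwise AUC above is quadratic in the number of documents, which is fine for a small illustration; on a collection the size of the one described in the abstract, a rank-based implementation from a statistics library would be the more practical choice.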


Language: en
