
Journal Article

Citation

Lehto MR, Marucci-Wellman HR, Corns H. Inj. Prev. 2009; 15(4): 259-265.

Affiliation

School of Industrial Engineering, Purdue University, 1287 Grissom Hall, West Lafayette, IN 47907, USA. lehto@purdue.edu

Copyright

(Copyright © 2009, BMJ Publishing Group)

DOI

10.1136/ip.2008.021337

PMID

19652000

Abstract

This study compared two Bayesian methods (Fuzzy and Naïve) for classifying injury narratives in large administrative databases into event cause groups. A dataset of 14 000 narratives was randomly extracted from claims filed with a workers' compensation insurance provider. Two expert coders assigned one-digit and two-digit Bureau of Labor Statistics (BLS) Occupational Injury and Illness Classification event codes to each narrative. The narratives were separated into a training set of 11 000 cases and a prediction set of 3000 cases. The training set was used to develop two Bayesian classifiers that assigned BLS codes to narratives. Each model was then evaluated on the prediction set. Both models performed well and tended to predict one-digit BLS codes more accurately than two-digit codes. For one-digit and two-digit codes respectively, the Fuzzy method had an overall sensitivity of 78% and 64%, a specificity of 93% and 95%, and a positive predictive value (PPV) of 78% and 65%. The Naïve method showed similar accuracy: a sensitivity of 80% and 70%, a specificity of 96% and 97%, and a PPV of 80% and 70%. For large administrative databases, Bayesian methods show significant promise as a means of classifying injury narratives into cause groups. Overall, Naïve Bayes provided slightly more accurate predictions than Fuzzy Bayes.
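The workflow described in the abstract (train on coded narratives, predict codes for held-out narratives, then report sensitivity, specificity, and PPV) can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only a textbook Naïve Bayes variant with add-one smoothing, omits the Fuzzy method, and uses invented narratives with placeholder labels ("fall", "struck") rather than real BLS event codes. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(narratives, codes):
    """Count words per code. Returns (code counts, per-code word counts, vocabulary)."""
    code_counts = Counter(codes)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, code in zip(narratives, codes):
        words = text.lower().split()
        word_counts[code].update(words)
        vocab.update(words)
    return code_counts, word_counts, vocab


def predict(model, text):
    """Return the code with the highest log posterior under word independence."""
    code_counts, word_counts, vocab = model
    total = sum(code_counts.values())
    best_code, best_score = None, -math.inf
    for code in sorted(code_counts):
        score = math.log(code_counts[code] / total)  # log prior
        denom = sum(word_counts[code].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                # add-one (Laplace) smoothing, an assumption of this sketch
                score += math.log((word_counts[code][word] + 1) / denom)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


def class_metrics(true_codes, predicted_codes, code):
    """One-vs-rest sensitivity, specificity, and PPV for a single code."""
    pairs = list(zip(true_codes, predicted_codes))
    tp = sum(t == code and p == code for t, p in pairs)
    fn = sum(t == code and p != code for t, p in pairs)
    fp = sum(t != code and p == code for t, p in pairs)
    tn = sum(t != code and p != code for t, p in pairs)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Invented toy data; the labels are placeholders, not BLS event codes.
train_texts = [
    "slipped on wet floor and fell",
    "fell from ladder",
    "struck by falling box",
    "hit by forklift",
]
train_codes = ["fall", "fall", "struck", "struck"]
model = train_naive_bayes(train_texts, train_codes)

test_texts = ["fell on icy floor", "hit by box"]
test_codes = ["fall", "struck"]
predicted = [predict(model, t) for t in test_texts]
print(predicted)
print(class_metrics(test_codes, predicted, "fall"))
```

In a multi-class setting such as this one, the metrics are computed per code by treating that code as positive and all other codes as negative; how the study aggregated them into overall figures is not stated in the abstract.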


Language: en
