SAFETYLIT WEEKLY UPDATE


Journal Article

Citation

Nanda G, Grattan KM, Chu MT, Davis LK, Lehto MR. J. Saf. Res. 2016; 57: 71-82.

Affiliation

School of Industrial Engineering, Purdue University, 315 N. Grant Street, West Lafayette, IN 47907-2023, USA. Electronic address: lehto@purdue.edu.

Copyright

(Copyright © 2016, U.S. National Safety Council, Publisher Elsevier Publishing)

DOI

10.1016/j.jsr.2016.03.001

PMID

27178082

Abstract

INTRODUCTION: Studies on autocoding injury data have found that machine learning algorithms perform well for categories that occur frequently but often struggle with rare categories. Therefore, manual coding, although resource-intensive, cannot be eliminated. We propose a Bayesian decision support system that autocodes a large portion of the data, filters cases for manual review, and assists human coders by presenting them with the top k prediction choices and a confusion matrix of predictions from Bayesian models.

METHOD: We studied the prediction performance of Single-Word (SW) and Two-Word-Sequence (TW) Naïve Bayes models on a sample of data from the 2011 Survey of Occupational Injuries and Illnesses (SOII). We used the agreement between the predictions of the SW and TW models, together with various prediction strength thresholds, to autocode cases and to filter cases for manual review. We also studied the sensitivity of the top k predictions of the SW model, the TW model, and the SW-TW combination, and then compared the accuracy of the codes manually assigned to the SOII data with that of the proposed system.
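The routing idea in the METHOD paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: the add-one smoothing, the use of the weaker of the two posteriors as the "prediction strength", the default threshold, and all names (`NaiveBayes`, `route_case`, the toy codes `FALL` and `CUT`) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def single_words(text):
    """Single-Word (SW) features: lowercase whitespace tokens."""
    return text.lower().split()


def two_word_sequences(text):
    """Two-Word-Sequence (TW) features: adjacent token pairs."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumption)."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, narratives, codes):
        for text, code in zip(narratives, codes):
            self.class_counts[code] += 1
            for feat in self.featurize(text):
                self.feature_counts[code][feat] += 1
                self.vocab.add(feat)
        return self

    def predict_proba(self, text):
        """Return {code: posterior probability} for one narrative."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab) or 1
        feats = [f for f in self.featurize(text) if f in self.vocab]
        log_scores = {}
        for code, n_docs in self.class_counts.items():
            denom = sum(self.feature_counts[code].values()) + vocab_size
            score = math.log(n_docs / total_docs)
            for feat in feats:
                score += math.log((self.feature_counts[code][feat] + 1) / denom)
            log_scores[code] = score
        # Normalize in log space to avoid underflow on long narratives.
        peak = max(log_scores.values())
        unnorm = {c: math.exp(s - peak) for c, s in log_scores.items()}
        total = sum(unnorm.values())
        return {c: u / total for c, u in unnorm.items()}


def route_case(text, sw_model, tw_model, threshold=0.9):
    """Autocode when both models agree and the weaker posterior clears
    the threshold; otherwise flag the case for manual review.
    (The 'weaker posterior' rule is a hypothetical choice.)"""
    sw = sw_model.predict_proba(text)
    tw = tw_model.predict_proba(text)
    sw_code = max(sw, key=sw.get)
    tw_code = max(tw, key=tw.get)
    strength = min(sw[sw_code], tw[tw_code])
    if sw_code == tw_code and strength >= threshold:
        return ("autocode", sw_code)
    return ("manual_review", None)


# Toy training data (invented; not SOII records).
narratives = [
    "worker fell from ladder",
    "fell from ladder while painting",
    "cut finger with knife",
    "cut hand with knife blade",
]
codes = ["FALL", "FALL", "CUT", "CUT"]
sw_model = NaiveBayes(single_words).fit(narratives, codes)
tw_model = NaiveBayes(two_word_sequences).fit(narratives, codes)
```

Raising `threshold` sends more cases to manual review in exchange for higher expected accuracy on the autocoded remainder, which is the trade-off the abstract describes.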

RESULTS: The accuracy of the proposed system, assuming well-trained coders reviewing a subset of only 26% of cases flagged for review, was estimated to be comparable (86.5%) to the accuracy of the original coding of the data set (range: 73%-86.8%). Overall, the TW model had higher sensitivity than the SW model, and the accuracy of the prediction results increased when the two models agreed, and for higher prediction strength thresholds. The sensitivity of the top five predictions was 93%.
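To make the top k figure concrete, the sketch below computes the share of cases whose true code appears among a model's k highest-ranked predictions. This overall reading of "sensitivity of the top k predictions" is an assumption; a per-category version would restrict the calculation to cases of each category. The function names and the toy posteriors are hypothetical.

```python
def top_k_codes(posteriors, k):
    """Return the k codes with the highest posterior probability."""
    return sorted(posteriors, key=posteriors.get, reverse=True)[:k]


def top_k_sensitivity(all_posteriors, true_codes, k):
    """Fraction of cases whose true code is among the top k predictions."""
    hits = sum(
        true in top_k_codes(posteriors, k)
        for posteriors, true in zip(all_posteriors, true_codes)
    )
    return hits / len(true_codes)


# Invented posteriors for three cases whose true code is "B".
all_posteriors = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.5, "B": 0.1, "C": 0.4},
    {"A": 0.2, "B": 0.7, "C": 0.1},
]
true_codes = ["B", "B", "B"]
```

Sensitivity can only stay the same or rise as k grows, which is why showing a coder several ranked choices can recover cases that the single best prediction misses.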

CONCLUSIONS: The proposed system seems promising for coding injury data as it offers comparable accuracy and less manual coding.

PRACTICAL APPLICATIONS: Accurate and timely coded occupational injury data is useful for surveillance as well as prevention activities that aim to make workplaces safer.

Copyright © 2016 Elsevier Ltd and National Safety Council. All rights reserved.


Language: en
