SAFETYLIT WEEKLY UPDATE


Journal Article

Citation

Rao AR, Garai S, Dey S, Peng H. SN Comput. Sci. 2021; 2(6).

Copyright

(Copyright © 2021, Holtzbrinck Springer Nature Publishing Group)

DOI

10.1007/s42979-021-00871-7

PMID

unavailable

Abstract

With calls for increasing transparency, governments are releasing greater amounts of data in multiple domains including finance, education, and healthcare. We focus on healthcare due to its economic importance worldwide. The efficient exploratory analysis of healthcare data constitutes a significant challenge. Key concerns in public health include the quick identification and analysis of trends and the detection of outliers. This allows policies to be rapidly adapted to changing circumstances. We present an efficient outlier detection technique, termed PIKS (Pruned iterative-k means searchlight), which combines an iterative k-means algorithm with a pruned searchlight-based scan. We apply this technique to identify outliers in two publicly available healthcare datasets from the New York Statewide Planning and Research Cooperative System, and California's Office of Statewide Health Planning and Development. We provide a comparison of our technique with three other existing outlier detection techniques, consisting of auto-encoders, isolation forests, and feature bagging. We identified outliers in conditions including suicide rates, immunity disorders, social admissions, cardiomyopathies, and pregnancy in the third trimester. We demonstrate that the PIKS technique produces results consistent with other techniques such as the auto-encoder. However, the auto-encoder needs to be trained, which requires several parameters to be tuned. In comparison, the PIKS technique has far fewer parameters to tune. This makes it advantageous for fast, "out-of-the-box" data exploration. The PIKS technique is scalable and can readily ingest new datasets. Hence, it can provide valuable, up-to-date insights to citizens, patients, and policy-makers. We have made our code open source, and with the availability of open data, other researchers can easily reproduce and extend our work. This will help promote a deeper understanding of healthcare policies and public health issues.
© 2021, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
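
Illustrative sketch (not part of the published abstract): the abstract names an iterative k-means step but gives no algorithmic detail, so the following is only a generic iterative k-means outlier filter written under stated assumptions. It is not the authors' PIKS code, and the pruned searchlight scan is not reproduced. The function names, the farthest-first seeding, the `min_size` rule for flagging tiny clusters, and the toy data are all hypothetical choices made for this example.

```python
import math


def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans_labels(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: mean of each cluster; keep the old center if it is empty.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def iterative_kmeans_outliers(points, k=2, min_size=1, max_rounds=10):
    """Cluster, flag members of clusters with <= min_size points, drop them,
    and re-cluster until no tiny cluster remains. Returns original indices."""
    remaining = list(range(len(points)))
    outliers = []
    for _ in range(max_rounds):
        if len(remaining) <= k:
            break
        labels = kmeans_labels([points[i] for i in remaining], k)
        sizes = [labels.count(c) for c in range(k)]
        flagged = [i for i, lab in zip(remaining, labels) if sizes[lab] <= min_size]
        if not flagged:
            break
        outliers.extend(flagged)
        remaining = [i for i in remaining if i not in flagged]
    return sorted(outliers)


# Hypothetical toy data, not drawn from the New York or California datasets.
data = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (50, 0)]
print(iterative_kmeans_outliers(data))
```

On this toy input the isolated point at index 6 forms a singleton cluster in the first round and is flagged. The second round splits the remaining six points into two clusters of three, so the loop stops. Real data would need a principled choice of `k` and `min_size`, which this sketch does not attempt.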


Language: en

Keywords

Trend analysis; Machine learning; Big data analytics; Exploratory data analysis; Open healthcare data; Outlier detection; Policy making; Unsupervised clustering
