SAFETYLIT WEEKLY UPDATE

Journal Article

Citation

Sawangarreerak S, Thanathamathee P. Information (Basel) 2020; 11(11): e519.

Copyright

(Copyright © 2020, MDPI: Multidisciplinary Digital Publications Institute)

DOI

10.3390/info11110519

PMID

unavailable

Abstract

In this work, we propose a combined sampling technique to improve the classification of imbalanced university student depression data. In our experiments, combining random oversampling with Tomek links undersampling produced a relatively balanced depression dataset without losing significant information. Random oversampling was first applied to the minority class to balance the number of samples between the classes. The Tomek links technique was then used for undersampling, removing depression data considered less relevant and noisy. The relatively balanced dataset was classified with a random forest. The overall accuracy in predicting adolescent depression data was 94.17%, outperforming the individual sampling techniques. Moreover, the proposed method was tested on a second dataset to assess its external validity; the predictive accuracy on that dataset was 93.33%.
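The two resampling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the Euclidean distance, and the choice to drop both members of each Tomek link are assumptions made for illustration only. In practice, libraries such as imbalanced-learn (RandomOverSampler, TomekLinks) and scikit-learn (RandomForestClassifier) cover these steps and the final classifier.

```python
import math
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def tomek_link_indices(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    def nearest(i):
        # Brute-force nearest neighbour; ties go to the lowest index.
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def oversample_then_tomek(X, y, seed=0):
    """Oversample first, then remove both members of every Tomek link (an assumed variant)."""
    X_os, y_os = random_oversample(X, y, seed)
    drop = {k for pair in tomek_link_indices(X_os, y_os) for k in pair}
    keep = [k for k in range(len(X_os)) if k not in drop]
    return [X_os[k] for k in keep], [y_os[k] for k in keep]


if __name__ == "__main__":
    # Hypothetical toy data: label 0 is the majority class, label 1 the minority.
    X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (2.2, 2.2), (5.0, 5.0)]
    y = [0, 0, 0, 0, 1, 1]
    X_res, y_res = oversample_then_tomek(X, y, seed=0)
    print("before:", Counter(y))
    print("after: ", Counter(y_res))
```

Because random oversampling creates exact duplicates, a duplicated minority point has its own copy as nearest neighbour and so cannot form a Tomek link in this sketch; only borderline points that were not duplicated can be removed.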


Language: en

Keywords

depression prediction; feature selection; imbalanced data; Patient Health Questionnaire-9 (PHQ-9); sampling techniques
