SAFETYLIT WEEKLY UPDATE

We compile citations and summaries of about 400 new articles every week.

Journal Article

Citation

Seliverstov Y, Seliverstov S, Malygin I, Korolev O. Transp. Res. Proc. 2020; 50: 626-635.

Copyright

(Copyright © 2020, Elsevier Publications)

DOI

10.1016/j.trpro.2020.10.074

PMID

unavailable

Abstract

The paper addresses the task of analyzing traffic safety in the Northwestern Federal District based on reviews published on the Web. To accomplish the task, the authors developed a system for automatic review classification based on a sentiment classifier. They analyzed open-source libraries for data mining, developed a web crawler in Python 3 using the Scrapy framework, and collected reviews. They also considered methods of text vectorization and lemmatization and their application in the Scikit-Learn library: Bag-of-Words, N-gram, CountVectorizer, and TF-IDF Vectorizer. For classification, the authors used the naïve Bayes algorithm and a linear classifier model with stochastic gradient descent optimization. A database of tagged Twitter reviews was used as the training set. The classifier was trained using cross-validation and ShuffleSplit strategies. The authors tested and compared the classification results of the different classifiers and selected the best model through validation. The developed system was applied to analyze the quality of roads in the Northwestern Federal District. Based on the outcome, the roads were color-coded to illustrate the results of the research.
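
As an illustration of the bag-of-words, n-gram, and naïve Bayes steps named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not use Scikit-Learn. The helper `ngrams`, the class `NaiveBayes`, and the four toy reviews are hypothetical and invented for this example. Lemmatization, TF-IDF weighting, the stochastic gradient descent classifier, and cross-validation are omitted.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return word n-grams for n = 1..n_max."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial naive Bayes over n-gram counts with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(n_docs / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in ngrams(text):
                if feat in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy reviews (assumption: NOT the tagged Twitter data used in the paper).
train_texts = [
    "smooth new asphalt and clear markings",
    "great road smooth and safe",
    "deep potholes and broken asphalt",
    "dangerous road terrible potholes",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction_bad = model.predict("terrible potholes everywhere")
prediction_good = model.predict("smooth and safe")
```

In a Scikit-Learn setting, the counting step would typically be handled by a vectorizer such as `CountVectorizer` or `TfidfVectorizer`, and the model would be evaluated on held-out splits rather than on hand-picked sentences.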


Language: en

Keywords

automatic text mining; intelligent transportation systems; linear classifier; machine learning; n-gram; naïve Bayes algorithm; sentiment analysis; text classification; tf-idf; web crawlers
