Aspect Based Sentiment Analysis Model for Hotel Services in Amharic Language Using Machine Learning Techniques
Publisher
ASTU
Abstract
In recent years, the World Wide Web (WWW) has become a large source of user-generated
content and opinionated data, shared through social media such as YouTube, Facebook, and
organizational websites. Opinion mining focuses on sentiment, which may be positive, negative, or
neutral. In aspect-based sentiment analysis, particular aspects are extracted and the sentiment
polarity of each aspect is determined. Sentiment analysis is carried out at one of three
levels: sentence, document, or aspect (feature). Among the three levels, aspect-level
sentiment analysis is detailed and complex, but it is better suited to meeting the needs of customers and
organizations. Comments written by customers have a large influence on the success of a
hotel. However, it is difficult for a hotel to manually analyze a large volume of
comments to determine whether a customer is satisfied. In this study, to alleviate this
problem, an aspect-based sentiment analysis model built with machine learning
techniques is proposed. Four machine learning classification approaches are
used: Naive Bayes, Logistic Regression, Support Vector Machine, and Gradient
Boosting. These were used to build and evaluate the sentiment (polarity) model with
the extracted features, based on a dataset of 2124 instances collected from the social media pages
and websites of 10 hotels and annotated by an Amharic linguistic expert. The cosine similarity technique
is also used to extract aspects (features) of the hotels. The cross-validation method is known
for giving a less biased, less optimistic performance estimate than a simple train/test split.
The dataset is partitioned into k equal-sized groups, or folds; each fold is treated in turn as a
validation set, while the remaining k-1 folds are used as the training set to fit the model. Under 10-fold cross-validation, the Gradient Boosting model with the CountVectorizer+unigram+trigram feature,
with an accuracy of 92.8% and an F1-score of 95%, showed the best performance compared
to the SVM, NB, and LR models. Aspect-based SA has become an active research area,
so this research proposes aspect-based sentiment
analysis to predict the aspect (such as food, ambiance, drink, or price) and polarity (positive, negative, or neutral)
from data gathered from Facebook pages and websites. Due to time constraints, a stemmer was not
applied, so we recommend that other researchers apply one.
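
The k-fold partitioning described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the study's implementation: the function name k_fold_splits is hypothetical, the indices are not shuffled (a real experiment would normally shuffle or stratify first), and because 2124 is not divisible by 10 the folds here are only near-equal in size, an assumption about how the remainder would be handled.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds.

    Hypothetical helper for illustration; not taken from the study.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        # The current fold is held out for validation ...
        validation = indices[start:start + size]
        # ... and the remaining k-1 folds form the training set.
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# 2124 is the dataset size reported in the abstract; k = 10 as in the study.
splits = list(k_fold_splits(2124, 10))
fold_sizes = [len(validation) for _, validation in splits]
```

In each of the 10 rounds a model would be fitted on the training indices and scored on the validation indices, and the reported metric would be the average over the rounds.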
