Comparative Evaluation of Unigram and Bigram Models for Sentiment Classification of Hotel Reviews
DOI:
https://doi.org/10.32628/IJSRST2613378
Keywords:
Sentiment Analysis, N-Gram Models, Text Classification, Machine Learning, Hotel Reviews, Natural Language Processing
Abstract
This paper presents a comparative analysis of N-gram-based sentiment classification models applied to large-scale hotel review data. Three feature extraction methods (unigram, bigram, and combined unigram-bigram representations) were used to classify sentiment polarity into negative, neutral, and positive classes. Six machine learning classifiers (Logistic Regression, Support Vector Machine (SVM), Naive Bayes, Ridge, Stochastic Gradient Descent (SGD), and Passive Aggressive) were evaluated on each of the three baseline models (M1-M3). The experiments show that the combined unigram-bigram model (M3) performs best: M3-UniBigram-SVM achieved an accuracy of 0.8538, a precision of 0.8298, and a recall of 0.8538. Analysis of the confusion matrices revealed that positive sentiment was classified most accurately, while neutral sentiment was the most difficult class, owing to its semantic ambiguity. The findings support the hypothesis that hybrid N-gram representations can substantially improve sentiment classification performance compared with unigram-only or bigram-only models. The resulting baseline models provide a foundation for developing more advanced hybrid sentiment classification models.
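To make the three feature representations concrete, the following is a minimal sketch of how unigram (M1), bigram (M2), and combined unigram-bigram (M3) features could be extracted from a single review. It is an illustration only, not the authors' pipeline: the tokenizer, the function names (`tokenize`, `ngrams`, `extract_features`), the mode labels, and the sample review are all assumptions introduced here.

```python
from collections import Counter

# Punctuation stripped from token edges; a simplifying assumption, not the paper's preprocessing.
PUNCT = ".,!?;:\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip edge punctuation."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text, mode):
    """Count n-gram features for one review.

    mode is a hypothetical label: "unigram" (M1), "bigram" (M2),
    or "unibigram" (M3, unigrams and bigrams together).
    """
    orders = {"unigram": (1,), "bigram": (2,), "unibigram": (1, 2)}
    if mode not in orders:
        raise ValueError(f"unknown mode: {mode}")
    tokens = tokenize(text)
    features = Counter()
    for n in orders[mode]:
        features.update(ngrams(tokens, n))
    return features

# Hypothetical sample review, not taken from the paper's dataset.
review = "The room was not clean."
m1 = extract_features(review, "unigram")
m2 = extract_features(review, "bigram")
m3 = extract_features(review, "unibigram")
```

In this sketch the unigram features contain "not" and "clean" as unrelated tokens, whereas the bigram features contain "not clean" as a single unit. The combined representation keeps both, which is one plausible reason a hybrid model could outperform either single representation.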
License
Copyright (c) 2026 International Journal of Scientific Research in Science and Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.
https://creativecommons.org/licenses/by/4.0