Abstract
Sentiment classification has been studied extensively and applied in many fields, and each sentiment analysis model (or method) has its own advantages and disadvantages. Opinion classification therefore remains an important field of research. In this study, we propose a Valence-Totaling Model for Vietnamese (VTMfV), a new model for Vietnamese sentiment classification, to classify Vietnamese documents. First, we built a new Vietnamese sentiment dictionary containing sentiment-bearing Vietnamese words: negative, positive and neutral. The dictionary was created using the Jaccard Measure (JM), a similarity measure between two words (or two vectors), and we call it “VSD_JM”. JM has been used in many studies of English sentiment classification; however, it has not yet been used in any study of Vietnamese sentiment classification. This work shows that JM can be applied to research on Vietnamese sentiment analysis. VTMfV then uses VSD_JM to classify Vietnamese documents, and it handles all kinds of Vietnamese sentences. Finally, we used VTMfV to classify 30,000 Vietnamese documents, comprising 15,000 positive and 15,000 negative documents, and achieved an accuracy of 63.9% on this Vietnamese testing data set. VTMfV does not depend on a specific domain. It also does not depend on a training data set, and it has no training stage. Based on these results, VTMfV can be applied in different fields of Vietnamese natural language processing. In addition, VTMfV can be applied to many other languages such as Spanish, Korean, etc. It can also be applied to sentiment classification of big data sets in Vietnamese and can classify millions of Vietnamese documents.
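As a rough illustration of the two ideas named in the abstract, the sketch below computes a Jaccard measure between two sets and classifies a tokenized document by totaling the valences of its words. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how words are represented when JM is computed, so the context sets, the `word_valence` formula, the toy dictionary values and all function names here are hypothetical.

```python
def jaccard(a, b):
    """Jaccard measure |A ∩ B| / |A ∪ B| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_valence(contexts, pos_contexts, neg_contexts):
    """Hypothetical valence: similarity to positive seed contexts minus
    similarity to negative seed contexts (an assumption, not from the paper)."""
    return jaccard(contexts, pos_contexts) - jaccard(contexts, neg_contexts)


def classify(tokens, dictionary):
    """Total the valences of all tokens; words missing from the dictionary
    contribute 0. The sign of the total decides the class."""
    total = sum(dictionary.get(token, 0.0) for token in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Toy dictionary with made-up valences ("tốt" = good, "xấu" = bad).
toy_dictionary = {"tốt": 0.8, "xấu": -0.7}

label = classify(["phim", "này", "tốt"], toy_dictionary)
```

Because classification is only a dictionary lookup and a sum, a sketch like this needs no training stage, which matches the property claimed for VTMfV in the abstract.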
Cite this article
Phu, V.N., Chau, V.T.N., Tran, V.T.N. et al. A Valence-Totaling Model for Vietnamese sentiment classification. Evolving Systems 10, 453–499 (2019). https://doi.org/10.1007/s12530-017-9187-7
DOI: https://doi.org/10.1007/s12530-017-9187-7