Abstract
This article reports on a modification of the user-kNN algorithm that measures the similarity between users based on the similarity of their text reviews instead of their ratings. We investigate the performance of text semantic similarity measures and evaluate our text-based user-kNN approach by comparing it with a range of ratings-based approaches in a ratings prediction task, using datasets from two different domains: movies from RottenTomatoes and Audio CDs from Amazon Products. Our results show that the text-based user-kNN algorithm performs significantly better than the ratings-based approaches in terms of accuracy, measured using RMSE.
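The general idea can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a plain bag-of-words cosine similarity for the semantic similarity measures studied in the paper, and it assumes that two users are compared by averaging the similarity of their reviews over co-reviewed items and that predictions use the common mean-centred weighted average of user-kNN. All function names, the parameter `k`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased token counts; a stand-in for a semantic text representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def user_similarity(reviews_u, reviews_v):
    """Assumption: average review similarity over items both users reviewed."""
    shared = reviews_u.keys() & reviews_v.keys()
    if not shared:
        return 0.0
    total = sum(
        cosine(bag_of_words(reviews_u[i]), bag_of_words(reviews_v[i]))
        for i in shared
    )
    return total / len(shared)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def predict(user, item, ratings, reviews, k=2):
    """Mean-centred user-kNN prediction with text-based neighbour weights."""
    candidates = []
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        sim = user_similarity(reviews[user], reviews[other])
        if sim > 0:
            candidates.append((sim, other))
    neighbours = sorted(candidates, reverse=True)[:k]
    user_mean = mean(ratings[user].values())
    if not neighbours:
        return user_mean  # fall back to the user's own average rating
    num = sum(
        sim * (ratings[v][item] - mean(ratings[v].values()))
        for sim, v in neighbours
    )
    den = sum(sim for sim, _ in neighbours)
    return user_mean + num / den


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


# Toy data (hypothetical): user -> item -> review text / rating.
reviews = {
    "alice": {"m1": "great acting and a moving story", "m2": "dull plot"},
    "bob": {"m1": "great acting and a moving story", "m2": "dull plot",
            "m3": "solid sequel"},
    "carol": {"m1": "boring and far too long", "m2": "loved every minute",
              "m3": "a letdown"},
}
ratings = {
    "alice": {"m1": 5, "m2": 2},
    "bob": {"m1": 5, "m2": 2, "m3": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}

prediction = predict("alice", "m3", ratings, reviews, k=2)
print(f"predicted rating for alice on m3: {prediction:.2f}")
```

In this toy data alice's reviews match bob's far more closely than carol's, so bob's rating dominates the prediction. Replacing `cosine` with a semantic measure would leave the rest of the sketch unchanged.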
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Terzi, M., Rowe, M., Ferrario, MA., Whittle, J. (2014). Text-Based User-kNN: Measuring User Similarity Based on Text Reviews. In: Dimitrova, V., Kuflik, T., Chin, D., Ricci, F., Dolog, P., Houben, GJ. (eds) User Modeling, Adaptation, and Personalization. UMAP 2014. Lecture Notes in Computer Science, vol 8538. Springer, Cham. https://doi.org/10.1007/978-3-319-08786-3_17
DOI: https://doi.org/10.1007/978-3-319-08786-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08785-6
Online ISBN: 978-3-319-08786-3
eBook Packages: Computer Science, Computer Science (R0)