×

Robust recommendation via social network enhanced matrix completion. (English) Zbl 07763170

Summary: Robust product recommendation enables internet platforms to boost their business. However, in practice, the user-product rating matrix often has many missing entries. Social network information generates new insights about user behaviors. To fully use such information, we develop a novel approach, called MCNet, that combines the random dot product graph model and low-rank matrix completion to recover missing entries in a user-product rating matrix. Our algorithm improves the accuracy and the efficiency of recovering the incomplete matrices. We study the asymptotic properties of the estimator. Furthermore, we perform extensive simulations and show that MCNet outperforms existing approaches, especially when the data have small signals. Moreover, MCNet yields robust estimation under misspecified models. We apply MCNet and its competitors to predict the missing entries in the user-product rating matrices of the Yelp and Douban movie platforms. The results show that, in general, MCNet gives the smallest testing errors among the comparative methods.

MSC:

62-XX Statistics

Software:

softImpute
Full Text: DOI

References:

[1] Abernethy, J., Bach, F., Evgeniou, T. and Vert, J.-P. (2009). A new approach to collaborative filtering: Operator estimation with spectral regularization. Journal of Machine Learning Research 10, 803-826. · Zbl 1235.68122
[2] Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econo-metrica 81, 1203-1227. · Zbl 1274.62403
[3] Athreya, A., Priebe, C. E., Tang, M., Lyzinski, V., Marchette, D. J. and Sussman, D. L. (2018). Statistical inference on random dot product graphs: A survey. The Journal of Machine Learning Research 18, 1-92. · Zbl 1338.62061
[4] Athreya, A., Priebe, C. E., Tang, M., Lyzinski, V., Marchette, D. J. and Sussman, D. L. (2016). A limit theorem for scaled eigenvectors of random dot product graphs. Sankhya A 78, 1-18. · Zbl 1338.62061
[5] Bi, X., Qu, A., Wang, J. and Shen, X.(2017). A group-specific recommender system. Journal of the American Statistical Association 112, 1344-1353.
[6] Cai, J. F., Candès, E. J. and Shen, Z. (2010). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization 20, 1956-1982. · Zbl 1201.90155
[7] Candes, E. J. and Plan, Y. (2010). Matrix completion with noise. Proceedings of the IEEE 98, 925-936.
[8] Chiang, K.-Y., Hsieh, C.-J. and Dhillon, I. S. (2015). Matrix completion with noisy side informa-tion. In Proceedings of the 28th International Conference on Neural Information Processing Systems, 3447-3455.
[9] Dai, B., Wang, J., Shen, X. and Qu, A. (2019). Smooth neighborhood recommender systems. Journal of Machine Learning Research 20, 1-24. · Zbl 1483.68292
[10] Diallo, A. O., Diop, A. and Dupuy, J. F. (2017). Asymptotic properties of the maximum-likelihood estimator in zero-inflated binomial regression. Communications in Statistics-Theory and Methods 46, 9930-9948. · Zbl 1462.62460
[11] Elsener, A. and van de Geer, S. (2018). Robust low-rank matrix estimation. The Annals of Statistics 46, 3481-3509. · Zbl 1412.62068
[12] Fan, J., Gong, W. and Zhu, Z. (2019). Generalized high-dimensional trace regression via nuclear norm regularization. Journal of Econometrics 212, 177-202. · Zbl 1452.62536
[13] Fan, J., Wang, W. and Zhu, Z. (2017). A shrinkage principle for heavy-tailed data: High-dimensional robust low-rank matrix recovery. The Annals of Statistics 49, 1239-1266. · Zbl 1479.62034
[14] Feuerverger, A., He, Y. and Khatri, S. (2012). Statistical significance of the Netflix challenge. Statistical Science 27, 202-231. · Zbl 1330.62090
[15] Hall, D. B. (2000). Zero-inflated Poisson and binomial regression with random effects: A case study. Biometrics 56, 1030-1039. · Zbl 1060.62535
[16] Kang, Z., Peng, C. and Cheng, Q. (2016). Top-N recommender system via matrix completion. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, 179-185.
[17] Kim, E., Lee, M. and Oh, S. (2015). Elastic-net regularization of singular values for robust subspace learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 915-923.
[18] Klopp, O. (2014). Noisy low-rank matrix completion with general sampling distribution. Bernoulli 20, 282-303. · Zbl 1400.62115
[19] Koltchinskii, V., Lounici, K. and Tsybakov, A. B. (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. The Annals of Statistics 39, 2302-2329. · Zbl 1231.62097
[20] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. The Annals of Statistics 43, 215-237. · Zbl 1308.62041
[21] Li, H., Chen, N. and Li, L. (2012). Error analysis for matrix elastic-net regularization algorithms. IEEE Transactions on Neural Networks and Learning Systems 23, 737-748.
[22] Liu, X. and Aberer, K. (2013). SoCo: A social network aided context-aware recommender system. In Proceedings of the 22nd International Conference on World Wide Web, 781-802.
[23] Liu, J. and Wu, C. (2017). Deep learning based recommendation: A survey. International Con-ference on Information Science and Applications, 451-458.
[24] Lu, L. and Peng, X. (2013). Spectra of edge-independent random graphs. Electronic Journal of Combinatorics 20, P27. · Zbl 1295.05211
[25] Lyzinski, V., Sussman, D. L., Tang, M., Athreya, A. and Priebe, C. E. (2014). Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding. Electronic Journal of Statistics 8, 2905-2922. · Zbl 1308.62131
[26] Lyzinski, V., Tang, M., Athreya, A., Park, Y. and Priebe, C. E. (2021). Community detection and classification in hierarchical stochastic blockmodels. IEEE Transactions on Network Science and Engineering 4, 13-26.
[27] Ma, H., Zhou, D., Liu, C., Lyu, M. R. and King, I. (2011). Recommender systems with social regularization. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, 287-296.
[28] Mao, X., Chen, S. X. and Wong, R. K. (2019). Matrix completion with covariate information. Journal of the American Statistical Association 114, 198-210. · Zbl 1478.62122
[29] Mao, X., Wong, R. K. and Chen, S. X. (2021). Matrix completion under low-rank missing mechanism. Statistica Sinica 31, 2005-2030. · Zbl 1524.62260
[30] Mazumder, R., Hastie, T. and Tibshirani, R. (2010). Spectral regularization algorithms for learning large incomplete matrices. Journal of Machine Learning Research 11, 2287-2322. · Zbl 1242.68237
[31] Negahban, S. and Wainwright, M. J. (2011). Estimation of (near) low-rank matrices with noise and high-dimensional scaling. The Annals of Statistics 39, 1069-1097. · Zbl 1216.62090
[32] Oliveira, R. I. (2009). Concentration of the adjacency matrix and of the laplacian in random graphs with independent edges. arXiv preprint arXiv:0911.0600.
[33] Rao, N., Yu, H. F., Ravikumar, P. and Dhillon, I. S. (2015). Collaborative filtering with graph information: Consistency and scalable methods. In Proceedings of the 28th International Conference on Neural Information Processing Systems, 2107-2115.
[34] Recht, B. (2011). A Simpler Approach to Matrix Completion. Journal of Machine Learning Research 12, 3413-3430. · Zbl 1280.68141
[35] Recht, B., Fazel, M. and Parrilo, P. A. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review 52, 471-501. · Zbl 1198.90321
[36] Robin, G., Wai, H.-T., Josse, J., Klopp, O. and Moulines,É. (2018). Low-rank interaction with sparse additive effects model for large data frames. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, 5501-5511.
[37] Rubin, D. B. (1976). Inference and missing data. Biometrika 63, 581-592. · Zbl 0344.62034
[38] Shi, Y., Larson, M. and Hanjalic, A. (2014). Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges. ACM Computing Surveys (CSUR) 47, 1-45.
[39] Srebro, N., Rennie, J. and Jaakkola, T. S. (2005). Maximum-margin matrix factorization. In Proceedings of the 17th International Conference on Neural Information Processing Sys-tems, 1329-1336.
[40] Sun, T and Zhang, C. H. (2012). Calibrated elastic regularization in matrix completion. In Pro-ceedings of the 25th International Conference on Neural Information Processing Systems, 863-871.
[41] Sussman, D. L., Tang, M., Fishkind, D. E. and Priebe, C. E. (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. Journal of the American Statistical Association 107, 1119-1128. · Zbl 1443.62188
[42] Sussman, D. L., Tang, M. and Priebe, C. E. (2013). Consistent latent position estimation and vertex classification for random dot product graphs. IEEE Transactions on Pattern Anal-ysis and Machine Intelligence 36, 48-57.
[43] Tang, M., Athreya, A., Sussman, D. L., Lyzinski, V., Park, Y. and Priebe, C. E. (2017). A semiparametric two-sample hypothesis testing problem for random graphs. Journal of Computational and Graphical Statistics 26, 344-354.
[44] Tang, M. and Priebe, C. E. (2018). Limit theorems for eigenvectors of the normalized laplacian for random graphs. The Annals of Statistics 46, 2360-2415. · Zbl 1408.62120
[45] Wang, H., Wang, N. and Yeung, D. Y. (2015). Collaborative deep learning for recommender systems. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1235-1244.
[46] Xu, M., Jin, R. and Zhou, Z.-H. (2013). Speedup matrix completion with side information: application to multi-label learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, 2301-2309.
[47] Yelp (2019). Yelp dataset challenge @MISC. http://www.yelp.com/dataset_challenge.
[48] Yu, L., Pan, R. and Li, Z. (2011). Adaptive social similarities for recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems, 257-260.
[49] Yu, H. F., Rao, N. and Dhillon, I. S. (2016). Temporal regularized matrix factorization for high-dimensional time series prediction. In Proceedings of the 30th International Conference on Neural Information Processing Systems, 847-855.
[50] Yu, X., Li, T., Ying, N. and Jing, B.-Y. (2021). Collaborative filtering with awareness of social network. Journal of Business & Economic Statistics. DOI: 10.1080/07350015. 2021.1954527. · Zbl 07928282 · doi:10.1080/07350015.2021.1954527
[51] Zhang, S., Yao, L., Sun, A. and Tay, Y. (2019). Deep learning based recommender system: A survey and new perspectives. ACM Computing Surveys (CSUR) 52, 1-38.
[52] Zheng, J., Liu, J., Shi, C., Zhuang, F., Li, J. and Wu, B. (2017). Recommendation in heteroge-neous information network via dual similarity regularization. International Journal of Data Science and Analytics 3, 35-48.
[53] Zhu, Y., Shen, X. and Ye, C.(2016). Personalized prediction and sparsity pursuit in latent factor models. Journal of the American Statistical Association 111, 241-252.
[54] Zou, H. and Hastie, T.(2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67, 301-320. · Zbl 1069.62054
[55] E-mail: jingxuan.wang@ucsf.edu
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.