Abstract
Understanding ambiguous or multi-faceted search queries is essential for information retrieval. Identifying the major aspects or senses of a query can be viewed as query intent detection, where the intents are represented as a number of clusters. The challenge in this task is how to generate intent candidates and group them semantically. This paper explores the capability of lexical statistics and embedding methods. First, a novel term expansion algorithm is designed to sketch all possible intent candidates. Second, an efficient query intent generation model is proposed, which learns latent representations of the intent candidates via embedding-based methods. The vectorized intent candidates are then clustered, and the resulting clusters are taken as the query intents. Experimental results on the NTCIR-12 IMine-2 corpus show that the query intent generation model based on phrase embedding significantly outperforms state-of-the-art clustering algorithms in query intent detection.
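The embed-then-cluster idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the embedding model or the clustering algorithm, so everything here is an assumption made for illustration: the toy 2-d vectors in `WORD_VECS`, the averaging in `phrase_vector`, the greedy threshold rule in `cluster`, and the threshold value itself are hypothetical and are not the paper's method.

```python
import math

# Hypothetical 2-d word vectors (assumption); a real system would learn
# embeddings from a corpus rather than hard-code them.
WORD_VECS = {
    "apple": (0.5, 0.5),
    "iphone": (0.9, 0.1),
    "price": (0.8, 0.2),
    "pie": (0.1, 0.9),
    "recipe": (0.2, 0.8),
}


def phrase_vector(phrase):
    """Represent a phrase as the average of its known word vectors."""
    vecs = [WORD_VECS[w] for w in phrase.split() if w in WORD_VECS]
    if not vecs:
        raise ValueError(f"no known words in phrase: {phrase!r}")
    dim = len(vecs[0])
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(dim))


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster(candidates, threshold=0.9):
    """Greedy single-pass clustering (assumed stand-in for the paper's step).

    Each candidate joins the first cluster whose seed vector is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, member_phrases)
    for phrase in candidates:
        vec = phrase_vector(phrase)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(phrase)
                break
        else:
            clusters.append((vec, [phrase]))
    return [members for _, members in clusters]


# Hypothetical intent candidates for the ambiguous query "apple".
candidates = ["apple iphone price", "apple iphone", "apple pie recipe", "apple pie"]
intents = cluster(candidates)
```

With these toy vectors, the product-related candidates and the food-related candidates end up in separate groups, each group standing for one query intent.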
Notes
- 1. The overview of the NTCIR-12 IMine-2 Task is available at http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings12/pdf/ntcir/OVERVIEW/01-NTCIR12-OV-IMINE-YamamotoT.pdf.
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gu, J., Feng, C., Gao, X., Wang, Y., Huang, H. (2016). Query Intent Detection Based on Clustering of Phrase Embedding. In: Li, Y., Xiang, G., Lin, H., Wang, M. (eds) Social Media Processing. SMP 2016. Communications in Computer and Information Science, vol 669. Springer, Singapore. https://doi.org/10.1007/978-981-10-2993-6_9
DOI: https://doi.org/10.1007/978-981-10-2993-6_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2992-9
Online ISBN: 978-981-10-2993-6
eBook Packages: Computer Science, Computer Science (R0)