Query-doc relation mining based on user search behavior. (Chinese. English summary) Zbl 1313.68150

Summary: The relationship between queries and docs is a valuable type of information that search engines hope to obtain. An exact correlation analysis between queries and docs not only helps rank search results, but also builds a bridge between queries and docs that allows information to be transferred between related queries and docs, which benefits a deep understanding of queries and a series of applications. This paper presents a query-doc relation mining algorithm based on user search behavior. First, we collect and analyze users’ search log data to build a bipartite graph between queries and docs. Next, we model the bipartite data with a Markov random walk model and mine the click-through data and session data from the bipartite graph. Finally, we obtain docs that the user did not click in the click-through data and predict the implied relationship between queries and docs. The algorithm can also be used to uncover potential relationships between queries. Based on the theoretical foundation described above, we construct a complete log data mining system. In a large number of comparative experiments, the system shows outstanding performance in many respects, such as increasing relevance up to 71.23%, which indicates that the theory and algorithms proposed in this paper can effectively solve the problem of mining implicit relationships between queries and docs. Our approach provides a good basis for increasing the recall of search results, optimizing query recommendation, and clustering retrieved results.
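
The summary does not give the details of the random walk, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical click log of (query, doc, click count) triples, uses click counts as edge weights, and ignores session data. Walking query → doc → query → doc assigns probability to docs the user never clicked for the starting query, and the intermediate query distribution hints at related queries. All function names and the sample data are illustrative.

```python
from collections import defaultdict


def normalize(weights):
    """Turn a dict of positive counts into a probability distribution."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def build_transitions(click_log):
    """Build query->doc and doc->query transition probabilities.

    click_log is an iterable of (query, doc, click_count) triples;
    this format is an assumption made for the sketch.
    """
    q2d = defaultdict(lambda: defaultdict(float))
    d2q = defaultdict(lambda: defaultdict(float))
    for query, doc, count in click_log:
        q2d[query][doc] += count
        d2q[doc][query] += count
    q2d = {q: normalize(docs) for q, docs in q2d.items()}
    d2q = {d: normalize(queries) for d, queries in d2q.items()}
    return q2d, d2q


def step(dist, transitions):
    """Advance a probability distribution by one hop across the bipartite graph."""
    out = defaultdict(float)
    for node, prob in dist.items():
        for nxt, t in transitions[node].items():
            out[nxt] += prob * t
    return dict(out)


def walk_from_query(query, q2d, d2q, rounds=1):
    """Return (doc distribution, query distribution) after the walk.

    The walk starts with one query -> doc hop; each round then adds
    a doc -> query hop followed by a query -> doc hop.
    """
    queries = {query: 1.0}
    docs = step(queries, q2d)
    for _ in range(rounds):
        queries = step(docs, d2q)
        docs = step(queries, q2d)
    return docs, queries


# Toy click log (fabricated purely for illustration).
click_log = [
    ("cheap flights", "docA", 2),
    ("cheap flights", "docB", 2),
    ("low cost airline", "docB", 1),
    ("low cost airline", "docC", 3),
]

q2d, d2q = build_transitions(click_log)
docs, queries = walk_from_query("cheap flights", q2d, d2q, rounds=1)

# Docs reached by the walk that were never clicked for the starting query.
clicked = set(q2d["cheap flights"])
implied = {d: p for d, p in docs.items() if d not in clicked}
```

In this toy log, "docC" was never clicked for "cheap flights", yet the walk reaches it through the shared doc "docB" and the query "low cost airline", which is the kind of implied query-doc and query-query relation the summary describes.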

MSC:

68T05 Learning and adaptive systems in artificial intelligence
05C81 Random walks on graphs
68P20 Information storage and retrieval of data
68R10 Graph theory (including graph drawing) in computer science
68U35 Computing methodologies for information systems (hypertext navigation, interfaces, decision support, etc.)