Abstract—Searchers using microblog search often have difficulty expressing their needs and struggle to refine queries by adding or removing terms, which leads to an unsatisfying experience. In addition, microblog content is limited in size and contains only a few words, which causes a serious term-mismatch problem in microblog search.
We propose a probabilistic model for query reformulation based on click-through data, which helps users automatically complete the expression of their search needs. We assume that good queries issued by earlier searchers can be used to improve poor queries from later users, either by expanding the query with additional meaningful terms or by fuzzing trivial query terms. The key idea is to discover, from large-scale click-through data, terms that capture users' click needs and terms that carry little meaning. Our results indicate that the reformulated query better describes users' real functional needs. In the experiments, we reduce query drift and noise in click-through data by using crowd-sourced relevance feedback and smoothing methods. Retrieval with the reformulated query performs much better than retrieval with the original query alone, with the largest gains on low-frequency and long-tail queries.
Index Terms—Query reformulation, query intent, query understanding, microblog retrieval.
Wei Pang and Junping Du are with Beijing University of Posts and Telecommunications, China (e-mail: pangweitf@163.com, junpingdu@126.com).
Cite: Wei Pang and Junping Du, "Query Expansion and Query Fuzzy with Large-Scale Click-through Data for Microblog Retrieval," International Journal of Machine Learning and Computing, vol. 9, no. 3, pp. 279-287, 2019.