The Opposite of Smoothing: A Language Model Approach to Ranking Query-Specific Document Clusters

O. Kurland; E. Krikon

doi:10.1613/jair.3327

Published: Jul 29, 2011

Abstract

Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means of improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also utilizes information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high percentage of relevant documents. Furthermore, using the model to produce a document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed. The performance also compares favorably with that of a state-of-the-art pseudo-feedback-based retrieval method.

Issue

Vol. 41 (2011)

Section

Articles
