Objective: A common measure of Internet search engine effectiveness is its ability to find documents that a user perceives as 'relevant'. This study sought to test whether user provided relevance ratings for documents retrieved by an Internet search engine correlate with the decision outcome after use of a search engine. Design: 227 university students were asked to answer four randomly assigned consumer health questions, then to conduct an Internet search on one of two randomly assigned search engines of different performance, and to again answer the question. Measurements: Participants were asked to provide a relevance score for each document retrieved as well as a pre and post search answer to each question. Results: User relevance rankings had little or no predictive power. Relevance rankings were unable to predict whether the user of a search engine could correctly answer a question after search and could not differentiate between two search engines with statistically different performance in the hands of users. Only when users had strong prior knowledge of the questions, and the decision task was of low complexity, did relevance appear to have modest predictive power. Conclusions: User provided relevance rankings taken in isolation seem to be of limited to no value when designing a search engine that will be used in a general-purpose setting. Relevance rankings may have a place in situations in which experts provide rankings, and decision tasks are of complexity commensurate with the abilities of the raters. A more natural metric of search engine performance may be a user's ability to accurately complete a task, as this removes the inherent subjectivity of relevance rankings, and provides a direct and repeatable outcome measure which directly correlates with the performance of the search technology in the hands of users.
|Number of pages||4|
|Journal||Journal of the American Medical Informatics Association|
|Publication status||Published - Jul 2008|