Introduction to Information Retrieval
Relevance Feedback & Query Expansion

Take-away today
- Interactive relevance feedback: improve initial retrieval results by telling the IR system which docs are relevant / nonrelevant
- Best-known relevance feedback method: Rocchio feedback
- Query expansion: improve retrieval results by adding synonyms / related terms to the query
- Sources for related terms: manual thesauri, automatic thesauri, query logs

Overview
❶ Motivation
❷ Relevance feedback: Basics
❸ Relevance feedback: Details
❹ Query expansion

Motivation

How can we improve recall in search?
- Main topic today: two ways of improving recall: relevance feedback and query expansion
- As an example, consider the query q: [aircraft] . . .
- . . . and a document d containing "plane", but not containing "aircraft"
- A simple IR system will not return d for q.
- Even if d is the most relevant document for q!
- We want to change this: return relevant documents even if there is no term match with the (original) query.

Recall
- Loose definition of recall in this lecture: "increasing the number of relevant documents returned to the user"
- This may actually decrease recall on some measures, e.g., when expanding "jaguar" with "panthera" . . .
- . . . which eliminates some relevant documents, but increases the number of relevant documents returned on the top pages.

Options for improving recall
- Local: do a "local", on-demand analysis for a user query
  - Main local method: relevance feedback (Part 1)
- Global: do a global analysis once (e.g., of the collection) to produce a thesaurus
  - Use the thesaurus for query expansion (Part 2)

Google examples for query expansion
- One that works well: ˜flights -flight
- One that doesn't work so well: ˜hospitals -hospital

Relevance feedback: Basics

Relevance feedback: Basic idea
- The user issues a (short, simple) query.
- The search engine returns a set of documents.
- The user marks some docs as relevant, some as nonrelevant.
- The search engine computes a new representation of the information need. Hope: better than the initial query.
- The search engine runs the new query and returns new results.
- The new results have (hopefully) better recall.

Relevance feedback
- We can iterate this: several rounds of relevance feedback.
- We will use the term "ad hoc retrieval" to refer to regular retrieval without relevance feedback.
- We will now look at three different examples of relevance feedback that highlight different aspects of the process.
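Before the three examples, the loop just described can be made concrete in code. The slides name Rocchio feedback as the best-known method (its weighting formula is covered later in the deck), so the following is only a minimal Python sketch of one feedback round over sparse term-weight vectors; the parameter values alpha, beta, gamma, the clipping of negative weights, and the toy documents are illustrative assumptions, not taken from the lecture.

```python
from collections import Counter
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse term->weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """One feedback round: move the query vector toward the centroid of the
    relevant docs and away from the centroid of the nonrelevant docs.
    Parameter values are assumed for illustration; negative weights are clipped."""
    new_q = Counter({t: alpha * w for t, w in query.items()})
    for d in relevant:
        for t, w in d.items():
            new_q[t] += beta * w / len(relevant)
    for d in nonrelevant:
        for t, w in d.items():
            new_q[t] -= gamma * w / len(nonrelevant)
    return {t: w for t, w in new_q.items() if w > 0}

# Toy collection with made-up term weights (e.g. raw term frequencies).
docs = {
    "d1": {"plane": 2.0, "airline": 1.0},
    "d2": {"aircraft": 1.0, "engine": 1.0},
    "d3": {"jaguar": 2.0, "car": 1.0},
}
q0 = {"aircraft": 1.0}  # initial query
q1 = rocchio(q0, relevant=[docs["d1"], docs["d2"]], nonrelevant=[docs["d3"]])
for name, d in sorted(docs.items(), key=lambda kv: cosine(q1, kv[1]), reverse=True):
    print(name, round(cosine(q1, d), 3))
```

In this toy run, the document containing "plane" but not "aircraft" becomes retrievable after one round of feedback, which is exactly the term-mismatch problem raised in the motivation slide.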
Relevance feedback: Example 1 (image search)
[Screenshots in the original slides: results for the initial query; user feedback: select what is relevant; results after relevance feedback]

Example 2: Vector space example: query "canine" (Source: Fernando Díaz)
[Figures in the original slides: similarity of docs to the query "canine"; user feedback: select relevant documents; results after relevance feedback]

Example 3: A real (non-image) example
Initial query: new space satellite applications
Results for initial query (r = rank, followed by the retrieval score):

   r  score
+  1  0.539  NASA Hasn't Scrapped Imaging Spectrometer
+  2  0.533  NASA Scratches Environment Gear From Satellite Plan
   3  0.528  Science Panel Backs NASA Satellite Plan, But Urges Launches of Smaller Probes
   4  0.526  A NASA Satellite Project Accomplishes Incredible Feat: Staying Within Budget
   5  0.525  Scientist Who Exposed Global Warming Proposes Satellites for Climate Research
   6  0.524  Report Provides Support for the Critics Of Using Big Satellites to Study Climate
   7  0.516  Arianespace Receives Satellite Launch Pact From Telesat Canada
+  8  0.509  Telecommunications Tale of Two Companies

The user then marks relevant documents with "+".
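Query expansion, the second recall technique named in the take-away slide, needs no user feedback: related terms from a manual thesaurus, an automatic thesaurus, or query logs are simply added to the query. Below is a minimal, hypothetical sketch; the thesaurus entries and the expand_query helper are invented for illustration, not part of the lecture.

```python
# Minimal sketch of thesaurus-based query expansion.
# The thesaurus below is a hypothetical stand-in for a manual thesaurus,
# an automatically derived one, or query-log-based related terms.
THESAURUS = {
    "aircraft": ["plane", "airplane"],
    "flights": ["airfares", "airlines"],
    "hospital": ["clinic", "medical center"],
}

def expand_query(terms, thesaurus=THESAURUS, max_per_term=2):
    """Append up to max_per_term related terms for each original query term."""
    expanded = list(terms)
    for t in terms:
        expanded.extend(thesaurus.get(t, [])[:max_per_term])
    return expanded

print(expand_query(["aircraft"]))   # ['aircraft', 'plane', 'airplane']
print(expand_query(["hospital"]))   # ['hospital', 'clinic', 'medical center']
```

As the ˜hospitals example above suggests, expansion can shift the query's meaning and hurt precision, which is why expanded terms are commonly given lower weight than the original query terms.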