Evaluation & Result Summaries

Overview
❶ Recap
❷ Unranked evaluation
❸ Ranked evaluation
❹ Evaluation benchmarks
❺ Result summaries

Pivot normalization (figure; source: Lilian Lee)

Heuristics for finding the top k even faster
• Document-at-a-time processing
  • We complete computation of the query-document similarity score of document d_i before starting to compute the query-document similarity score of d_i+1.
  • Requires a consistent ordering of documents in the postings lists.
• Term-at-a-time processing
  • We complete processing the postings list of query term t_i before starting to process the postings list of t_i+1.
  • Requires an accumulator for each document "still in the running".
• The most effective heuristics switch back and forth between term-at-a-time and document-at-a-time processing.

Use min heap for selecting top k out of N
• Use a binary min heap.
• A binary min heap is a binary tree in which each node's value is less than the values of its children.
• It takes O(N log k) operations to construct the k-heap containing the k largest values (where N is the number of documents).
• Essentially linear in N for small k and large N.

Binary min heap (figure)

Selecting k top scoring documents in O(N log k)
• Goal: keep the k top documents seen so far.
• Use a binary min heap.
• To process a new document d′ with score s′:
  • Get the current minimum h_m of the heap (in O(1)).
  • If s′ ≤ h_m, skip to the next document.
  • If s′ > h_m, heap-delete-root (in O(log k)).
  • Heap-add ⟨d′, s′⟩ (in O(1)).
  • Reheapify (in O(log k)).
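As a concrete illustration of this procedure, here is a short Python sketch using the standard-library heapq module (a binary min-heap). It is not from the slides; the document IDs and scores in the example are invented.

```python
import heapq

def top_k_documents(scored_docs, k):
    """Select the k highest-scoring documents from an iterable of
    (doc_id, score) pairs in O(N log k): the heap always holds the
    k best documents seen so far, with the smallest score at the root."""
    heap = []  # entries are (score, doc_id); heapq keeps the minimum at heap[0]
    for doc_id, score in scored_docs:
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))      # fill the heap, O(log k)
        elif score > heap[0][0]:                       # compare s' to current minimum h_m, O(1)
            heapq.heapreplace(heap, (score, doc_id))   # delete root, add new entry, reheapify, O(log k)
        # else: s' <= h_m, skip to the next document
    return sorted(heap, reverse=True)                  # top k in descending score order

# Example with made-up scores:
docs = [("d1", 0.9), ("d2", 0.1), ("d3", 0.7), ("d4", 0.4), ("d5", 0.8)]
print(top_k_documents(docs, k=3))   # [(0.9, 'd1'), (0.8, 'd5'), (0.7, 'd3')]
```

Documents whose score does not beat the current heap minimum are discarded without any heap operation, which is why the cost stays essentially linear in N for small k.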
Tiered index (figure)

Outline
❶ Recap
❷ Unranked evaluation
❸ Ranked evaluation
❹ Evaluation benchmarks
❺ Result summaries

Measures for a search engine
• How fast does it index?
  • e.g., number of bytes per hour
• How fast does it search?
  • e.g., latency as a function of queries per second
• What is the cost per query?
  • in dollars

Measures for a search engine
• All of the preceding criteria are measurable: we can quantify speed / size / money.
• However, the key measure for a search engine is user happiness.
• What is user happiness? Factors include:
  • Speed of response
  • Size of index
  • Uncluttered UI
  • Most important: relevance
  • (Actually, maybe even more important: it's free.)
• Note that none of these is sufficient: blindingly fast, but useless answers won't make a user happy.
• How can we quantify user happiness?

Who is the user?
• Who is the user we are trying to make happy?
• Web search engine: searcher. Success: the searcher finds what she was looking for. Measure: rate of return to this search engine.
• Web search engine: advertiser. Success: the searcher clicks on an ad. Measure: clickthrough rate.
• Ecommerce: buyer. Success: the buyer buys something. Measures: time to purchase, fraction of "conversions" of searchers to buyers.
• Ecommerce: seller. Success: the seller sells something. Measure: profit per item sold.
• Enterprise: CEO. Success: employees are more productive (because of effective search). Measure: profit of the company.

Most common definition of user happiness: relevance
• User happiness is equated with the relevance of search results to the query.
• But how do you measure relevance?
• The standard methodology in information retrieval consists of three elements:
  • A benchmark document collection
  • A benchmark suite of queries
  • An assessment of the relevance of each query-document pair

Relevance: query vs. information need
• Relevance to what?
• First take: relevance to the query.
• "Relevance to the query" is very problematic.
• Information need i: "I am looking for information on whether drinking red wine is more effective at reducing your risk of heart attacks than white wine."
• This is an information need, not a query.
• Query q: red wine white wine heart attack
• Consider document d′: "At the heart of his speech was an attack on the wine industry lobby for downplaying the role of red and white wine in drunk driving."
• d′ is an excellent match for query q . . .
• . . . but d′ is not relevant to the information need i.

Relevance: query vs. information need
• User happiness can only be measured by relevance to an information need, not by relevance to queries.
• Our terminology is sloppy in these slides and in IIR: we talk about query-document relevance judgments even though we mean information-need-document relevance judgments.

Precision and recall
• Precision (P) is the fraction of retrieved documents that are relevant.
• Recall (R) is the fraction of relevant documents that are retrieved.
• P = TP / (TP + FP)
• R = TP / (TP + FN)

Precision/recall tradeoff
• You can increase recall by returning more docs.
• Recall is a non-decreasing function of the number of docs retrieved.
• A system that returns all docs has 100% recall.
• The converse is also true (usually): it is easy to get high precision for very low recall.
• Suppose the document with the largest score is relevant. How can we maximize precision?
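To see the definitions and the tradeoff in action, here is a small Python sketch (not from the slides; the ranked list and relevance judgments are invented) that computes precision and recall at several cutoffs k of a ranked result list.

```python
def precision_recall_at_k(ranking, relevant, k):
    """Precision and recall after retrieving the top k documents:
    P = TP / (TP + FP), R = TP / (TP + FN)."""
    retrieved = ranking[:k]
    tp = sum(1 for d in retrieved if d in relevant)  # retrieved and relevant
    precision = tp / k                               # k = TP + FP (all retrieved)
    recall = tp / len(relevant)                      # |relevant| = TP + FN
    return precision, recall

# Hypothetical ranked result list and relevance judgments
ranking = ["d3", "d7", "d1", "d9", "d2", "d8", "d4", "d6", "d5", "d10"]
relevant = {"d3", "d1", "d2", "d5"}

for k in (1, 3, 5, 10):
    p, r = precision_recall_at_k(ranking, relevant, k)
    print(f"k={k:2d}  P={p:.2f}  R={r:.2f}")
# As k grows, recall never decreases (here it reaches 1.0 at k=10),
# while precision tends to fall: the precision/recall tradeoff.
```

Retrieving only the top-ranked document (k = 1) also answers the closing question above: if that single document is relevant, precision is 1.0, while recall stays low.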