Friday Jun 13, 2008

Highlighting search results in Minion

I've just posted a new piece of Minion documentation about how search results highlighting works.

It's kind of complicated, but then again getting the highlighting that you want is kind of complicated. The short version is: if you have a set of query terms and a document containing (some of) those terms that you want to highlight, then:


  1. Tell the passage retrieval API what fields you want to highlight and how to treat the passages in that field.

  2. Use the passage retrieval algorithm to find a set of passages.

  3. Pull out the highlighted passages and display them.
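
To make the three steps concrete, here is a minimal Python sketch of the general idea. It is not Minion's API (Minion is a Java engine, and its real classes are described in the documentation): the names `Passage`, `find_passages`, `highlight`, and the `field_config` dictionary are all made up for illustration, and matching here is plain case-insensitive term lookup.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    field: str   # name of the field the passage came from
    start: int   # character offset of the passage within the field text
    text: str    # the passage text itself
    hits: list   # (start, end) offsets of matched terms, relative to `text`


def find_passages(doc, query_terms, field_config):
    """Steps 1 and 2: `field_config` maps each field to highlight to the
    number of context characters to keep on each side of a hit
    (a hypothetical stand-in for per-field passage settings)."""
    terms = {t.lower() for t in query_terms}
    passages = []
    for field, context in field_config.items():
        text = doc.get(field, "")
        for m in re.finditer(r"\w+", text):
            if m.group().lower() in terms:
                start = max(0, m.start() - context)
                end = min(len(text), m.end() + context)
                hit = (m.start() - start, m.end() - start)
                passages.append(Passage(field, start, text[start:end], [hit]))
    return passages


def highlight(passage, open_tag="<b>", close_tag="</b>"):
    """Step 3: wrap each hit in the passage with the given tags."""
    out, pos = [], 0
    for s, e in passage.hits:
        out.append(passage.text[pos:s])
        out.append(open_tag + passage.text[s:e] + close_tag)
        pos = e
    out.append(passage.text[pos:])
    return "".join(out)


doc = {"subject": "Search results",
       "body": "Minion can highlight search results."}
for p in find_passages(doc, ["highlight"], {"body": 10}):
    print(p.field, "->", highlight(p))
```

Only the fields named in `field_config` are searched, so the subject line above is ignored even though the document has one.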

Using the passage retrieval algorithm to find a set of passages has some handy side effects. For example, it easily handles finding morphological variations of the query terms.
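
As a rough illustration of what matching morphological variations means, the Python sketch below compares terms after stripping a few common English suffixes. This is an assumption-laden toy, not how Minion does it: `crude_stem` and `matches_query` are hypothetical helpers, and a real engine would use a proper stemmer or morphological analyzer.

```python
def crude_stem(word):
    """Strip one common suffix, keeping at least three characters.
    A deliberately naive stand-in for real morphological analysis."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def matches_query(token, query_terms):
    """True if the token shares a crude stem with any query term."""
    stems = {crude_stem(t) for t in query_terms}
    return crude_stem(token) in stems


for token in ("highlighting", "highlighted", "Highlights", "search"):
    print(token, matches_query(token, ["highlight"]))
```

With something like this in the matching step, a query for "highlight" would also mark "highlighting" and "highlighted" in the displayed passage.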

A major improvement in this version over previous versions is that the process of figuring out how to build a passage of a particular size (e.g., you want to display a 500 character passage from the body of an email message) is a lot more robust.
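
To show why building a passage of a given size takes some care, here is a small Python sketch of one way to do it: center a window on the hit, clamp it to the text, then shrink it inward so it doesn't start or end in the middle of a word. The function `build_passage` and its parameters are hypothetical, and Minion's actual approach may differ.

```python
def build_passage(text, hit_start, hit_end, size):
    """Return (start, end) offsets of a window of at most `size`
    characters that contains the hit text[hit_start:hit_end]."""
    hit_len = hit_end - hit_start
    if hit_len >= size:
        # The hit alone fills the budget; return it whole.
        return hit_start, hit_end

    # Split the spare characters roughly evenly around the hit.
    extra = size - hit_len
    start = max(0, hit_start - extra // 2)
    end = min(len(text), start + size)
    # If we ran off the end of the text, give the slack back to the left.
    start = max(0, end - size)

    # Shrink inward to word boundaries, never cutting into the hit.
    while 0 < start < hit_start and not text[start - 1].isspace():
        start += 1
    while hit_end < end < len(text) and not text[end].isspace():
        end -= 1
    return start, end


text = "The quick brown fox jumps over the lazy dog"
s, e = build_passage(text, 16, 19, 15)   # the hit is "fox"
print(repr(text[s:e]))
```

For the email example you would pass `size=500`; the edge cases (hits near the start or end of the field, hits longer than the budget) are where the fiddly parts live.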

About

This is Stephen Green's blog. It's about the theory and practice of text search engines, with occasional forays into recommendation and other technologies that can use a good text search engine. Steve is the PI of the Information Retrieval and Machine Learning project in Oracle Labs.
