Tuesday May 18, 2010

Pattern for Defining Fields in Minion

While writing various applications that use Minion, I've settled into the habit of using an enum to declare (and define) the fields I'm using in the search engine. It is a convenient and concise way to combine the declaration of each field's attributes with a mechanism for always getting the field name right, and it is pretty simple.

I create an enum with one value per field, using the syntax that lets you pass arguments to a constructor to set the field's name, type, and attributes. Whenever I need to refer to a field, I call an accessor method on the enum value to get its field name, so the name is spelled out in exactly one place. As a bonus, I can throw in a static defineFields method that simply iterates over all the values in the enum, defining each one in the search engine with the attributes specified. See the example below.

    import java.util.EnumSet;

    // Minion classes; the package name com.sun.labs.minion is assumed here.
    import com.sun.labs.minion.FieldInfo;
    import com.sun.labs.minion.SearchEngine;
    import com.sun.labs.minion.SearchEngineException;

    /**
     * An enumeration of the fields available in the index. These values should
     * always be used when referencing fields.
     */
    public enum IndexFields {

        /** Email address is saved and searchable */
        EMAIL ("email",
               FieldInfo.Type.STRING,
               EnumSet.of(FieldInfo.Attribute.INDEXED,
                          FieldInfo.Attribute.TOKENIZED,
                          FieldInfo.Attribute.SAVED)),

        /** ID is saved only */
        PERSON_ID ("person-id",
                   FieldInfo.Type.STRING,
                   EnumSet.of(FieldInfo.Attribute.SAVED)),

        /** Tags are saved and searchable, but not broken into tokens.
            They are vectored for use in document similarity */
        TAG ("tag",
             FieldInfo.Type.STRING,
             EnumSet.of(FieldInfo.Attribute.INDEXED,
                        FieldInfo.Attribute.VECTORED,
                        FieldInfo.Attribute.SAVED)),

        /** Bio is the "body" of the document, indexed, tokenized, and vectored */
        BIO ("bio",
             FieldInfo.Type.NONE,
             EnumSet.of(FieldInfo.Attribute.INDEXED,
                        FieldInfo.Attribute.TOKENIZED,
                        FieldInfo.Attribute.VECTORED));

        /*
         * Each enumerated value will have these three fields
         */
        private final String fieldName;
        private final EnumSet<FieldInfo.Attribute> attrs;
        private final FieldInfo.Type type;

        /**
         * The constructor to create the instances defined
         * above
         */
        IndexFields(String fieldName,
                    FieldInfo.Type type,
                    EnumSet<FieldInfo.Attribute> attrs) {
            this.fieldName = fieldName;
            this.attrs = attrs;
            this.type = type;
        }

        /*
         * Public methods to get the field properties:
         */
        public String getFieldName() {
            return fieldName;
        }

        public EnumSet<FieldInfo.Attribute> getAttributes() {
            return attrs;
        }

        public FieldInfo.Type getType() {
            return type;
        }

        /** Returns the field name, so a value can be used directly in string contexts */
        @Override
        public String toString() {
            return fieldName;
        }

        /**
         * Defines the fields enumerated in this enum in
         * the provided search engine
         */
        public static void defineFields(SearchEngine engine)
                throws SearchEngineException {
            for (IndexFields i : IndexFields.values()) {
                engine.defineField(new FieldInfo(
                        i.getFieldName(), i.getAttributes(), i.getType()));
            }
        }
    }

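
To show how calling code might lean on this pattern, here is a minimal stand-alone sketch. It does not use Minion at all: `Attr`, `StubEngine`, `isSearchable`, and `equalsQuery` are hypothetical stand-ins invented for illustration, and the query syntax is not Minion's. The point is only that the field name is typed once, in the enum, and every other reference goes through the accessor.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone sketch; none of these types are Minion classes. */
public class FieldEnumUsage {

    /** Hypothetical stand-in for FieldInfo.Attribute. */
    enum Attr { INDEXED, TOKENIZED, VECTORED, SAVED }

    /** Trimmed-down copy of the enum pattern, with two fields. */
    enum Fields {
        EMAIL("email", EnumSet.of(Attr.INDEXED, Attr.TOKENIZED, Attr.SAVED)),
        PERSON_ID("person-id", EnumSet.of(Attr.SAVED));

        private final String fieldName;
        private final EnumSet<Attr> attrs;

        Fields(String fieldName, EnumSet<Attr> attrs) {
            this.fieldName = fieldName;
            this.attrs = attrs;
        }

        String getFieldName() { return fieldName; }

        EnumSet<Attr> getAttributes() { return attrs; }

        /** Registers every enum value with the stub engine. */
        static void defineFields(StubEngine engine) {
            for (Fields f : values()) {
                engine.defineField(f.getFieldName(), f.getAttributes());
            }
        }
    }

    /** Hypothetical stand-in for a search engine; it only records definitions. */
    static class StubEngine {
        final Map<String, EnumSet<Attr>> defined = new LinkedHashMap<>();

        void defineField(String name, EnumSet<Attr> attrs) {
            defined.put(name, attrs);
        }

        /** A field counts as searchable here if it was defined as INDEXED. */
        boolean isSearchable(String name) {
            EnumSet<Attr> attrs = defined.get(name);
            return attrs != null && attrs.contains(Attr.INDEXED);
        }
    }

    /** Builds a field-restricted query string; the syntax is illustrative only. */
    static String equalsQuery(Fields field, String value) {
        return field.getFieldName() + " = \"" + value + "\"";
    }

    public static void main(String[] args) {
        StubEngine engine = new StubEngine();
        Fields.defineFields(engine);

        // Every reference to a field name goes through the enum accessor.
        System.out.println(engine.defined.keySet());
        System.out.println(equalsQuery(Fields.EMAIL, "a@example.com"));
        System.out.println(engine.isSearchable(Fields.PERSON_ID.getFieldName()));
    }
}
```

Because the string "person-id" appears only in the enum constructor call, renaming the field is a one-line change, and a misspelled reference such as `Fields.PERSN_ID` fails at compile time rather than silently matching nothing at query time.
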
About

Jeff Alexander is a member of the Information Retrieval and Machine Learning group in Oracle Labs.
