Indexing an SQL database (or part of one)

Via Glen Newton: LuSql, a "command line Java application that can be used to build a Lucene index from an arbitrary SQL query of a JDBC-accessible SQL database".

I've often thought about writing something similar for Minion, but Glen appears to have done a remarkably thorough job for Lucene. A 40-page user manual? That's impressive. It's Apache-licensed, so I suppose I could port it to use Minion as the indexing engine...
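
A port like that would mostly hinge on keeping the code that reads rows separate from the code that writes the index. Here is a minimal sketch of that idea, and only a sketch: none of these names come from LuSql or Minion. `RecordSource`, `IndexSink`, and `pump` are hypothetical, and the in-memory classes stand in for a JDBC query and a real indexing engine.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class SourceSinkSketch {

    /** Hypothetical source: yields records as column-name -> value maps. */
    interface RecordSource extends Iterable<Map<String, String>> {}

    /** Hypothetical sink: receives one record at a time. */
    interface IndexSink {
        void add(Map<String, String> record);
    }

    /** Stand-in for a JDBC query: serves rows from a list. */
    static class InMemorySource implements RecordSource {
        private final List<Map<String, String>> rows;

        InMemorySource(List<Map<String, String>> rows) {
            this.rows = rows;
        }

        @Override
        public Iterator<Map<String, String>> iterator() {
            return rows.iterator();
        }
    }

    /** Stand-in for an indexing engine: just remembers what it was given. */
    static class CollectingSink implements IndexSink {
        final List<Map<String, String>> received = new ArrayList<>();

        @Override
        public void add(Map<String, String> record) {
            received.add(record);
        }
    }

    /** Copies every record from the source to the sink; returns the count. */
    static int pump(RecordSource source, IndexSink sink) {
        int count = 0;
        for (Map<String, String> record : source) {
            sink.add(record);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        RecordSource source = new InMemorySource(List.of(
                Map.of("id", "1", "title", "Indexing an SQL database"),
                Map.of("id", "2", "title", "Another row")));
        CollectingSink sink = new CollectingSink();
        System.out.println("records indexed: " + pump(source, sink));
    }
}
```

With that split, swapping the indexing engine means writing another `IndexSink`; `pump` and the source stay untouched.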

Comments:

Hi Stephen,

The next version of LuSql has been re-architected and has plugins for both the source and the sink, as opposed to having a hard-wired JDBC source and Lucene sink, as it is now. One of the sinks I will be creating for the release is a Minion sink, so JDBC-to-Minion will be possible (other sinks: Berkeley DB, JDBC, Terrier, XML, text; other sources: SPARQL endpoint, Lucene, Minion, Terrier, Berkeley DB).

This next version will be out in 3-4 weeks (updating the docs will take more time than finishing the software!! :-) ) and I will let you know when it is released.

-Glen

Posted by Glen Newton on March 09, 2009 at 03:37 AM EDT #

About

This is Stephen Green's blog. It's about the theory and practice of text search engines, with occasional forays into recommendation and other technologies that can use a good text search engine. Steve is the PI of the Information Retrieval and Machine Learning project in Oracle Labs.
