Office 2007 files in SES 10.1.8.4

Following on from my previous article, here's another handy use for a document service.

SES 10.1.8.4 cannot currently handle Office 2007 file formats (.docx, .pptx and .xlsx), but these can be handled by the filters in the latest versions of Oracle Text. The next release of SES will also use these new filters, and thus have full support for Office 2007. Meanwhile, if we happen to have an Oracle 11.1.0.7 database installation on the same machine as SES, we can use the 11.1.0.7 filters to do the filtering within SES.

We can do this because document services have access to the original pre-filtered binary documents. So even though the built-in SES filters will have failed or refused to index the documents, we can pick them up in the document service, filter them using the external filter executable from the 11g installation, and feed the resulting HTML stream back to SES for indexing.
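To make that flow concrete, here is a rough sketch in Python. It is not the code from the document service itself, and it doesn't use the SES plug-in API at all. It only illustrates the idea of writing the original bytes to a temporary file, running an external filter on it, and reading back the HTML. The function names are made up for illustration, and the assumption that the filter is invoked as "command, input path, output path" is mine; check the documentation for the real filter executable before relying on that argument order.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Sequence

# The Office 2007 formats mentioned above.
OFFICE_2007_EXTENSIONS = {".docx", ".pptx", ".xlsx"}


def needs_external_filter(filename: str) -> bool:
    """Return True if the file name has an Office 2007 extension."""
    return Path(filename).suffix.lower() in OFFICE_2007_EXTENSIONS


def filter_to_html(original: bytes, filename: str,
                   filter_command: Sequence[str]) -> bytes:
    """Run an external filter over the original document bytes.

    filter_command is the executable plus any leading arguments.
    ASSUMPTION: the filter takes the input path and then the output
    path as its last two arguments. The real filter may differ.
    """
    suffix = Path(filename).suffix
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir) / ("input" + suffix)
        target = Path(workdir) / "output.html"
        # Write the pre-filtered binary document where the filter can read it.
        source.write_bytes(original)
        # check=True raises CalledProcessError if the filter exits non-zero.
        subprocess.run([*filter_command, str(source), str(target)], check=True)
        # Whatever the filter wrote is what gets handed back for indexing.
        return target.read_bytes()
```

A caller would test needs_external_filter first and only hand the document to filter_to_html when it returns True, leaving everything else to the built-in filters.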

The document service to do this can be found here: OfficexFilter.zip. Unzip it and check the readme.txt file for installation instructions.

Please note that although Oracle 11.1.0.6 is downloadable from Oracle.com, you need the 11.1.0.7 patchset to get the new filters, and that patchset is available from Metalink. Also, I'm not trying to guess what the licensing implications are here. You would need to discuss that with your Oracle Sales Representative, or someone else who deals with that sort of thing (which isn't me!).