Monday Jan 14, 2013

ODI - Integrating PDF using iText

Integrating PDF form data with ODI is as easy as integrating any other data source. I followed the code template that Suraj Bang published for OWB a few years back and converted it to ODI, with a few changes for the newer version of iText and to work with ODI. The LKM PDF to SQL uses ODI's File technology as a source and loads to any ANSI SQL store. It reads all of the form fields from a PDF form and supports the File technology's String and Numeric datatypes. The fields have to be defined by position, so the column names in the File datastore can be MYCOL_1, MYCOL_2, MYCOL_3 and so on, with whatever prefix you desire. As mentioned, the LKM uses the iText Java API, so the iText JAR should be downloaded and copied into the normal ODI userlib directory. I tested with version 5.3.5 of iText.
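To make the positional naming concrete, here is a minimal sketch of how a column such as MYCOL_3 could be paired with the third form field. It is not the LKM's actual code. The function name map_columns_to_fields is hypothetical, and the sketch assumes the form field values have already been read from the PDF into a list in form order and that positions are 1-based.

```python
import re


def map_columns_to_fields(column_names, field_values):
    """Pair each datastore column with the form field at the position
    given by the column's numeric suffix (assumed to be 1-based)."""
    row = {}
    for col in column_names:
        # The suffix after the last underscore is treated as the position.
        match = re.search(r"_(\d+)$", col)
        if match is None:
            raise ValueError("column %r has no positional suffix" % col)
        position = int(match.group(1))
        if not 1 <= position <= len(field_values):
            raise ValueError("column %r points past the last form field" % col)
        row[col] = field_values[position - 1]
    return row


# Hypothetical values standing in for fields read from a PDF form.
fields = ["Jane", "Doe", "2"]
row = map_columns_to_fields(["MYCOL_1", "MYCOL_3"], fields)
```

The prefix is ignored in this sketch, so any prefix works as long as the suffix is the field position. Only the columns defined on the datastore are picked up.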

The LKM PDF to SQL is here. I manually defined a File datastore with column names (remembering to use the position as the suffix) and types. The delimiter and similar settings are not used since the source is a PDF, but I set them anyway. The example I used was processing W4 PDF data, the same as in Suraj's post, and I was able to process all of the PDFs in the same manner. Kudos to Suraj for the Jython, with a few little changes from me to use the latest iText.
