Tuesday Aug 06, 2013

ODI - Accelerator Launchpad, getting Groovy

To organize your groovy scripts and package up utilities there are various approaches to take. This one will exploit a little known fact in the groovy studio support. The code executed has some contextual variables available that you may be aware of - such as the odiInstance variable which represents a handle to the repository in the SDK. There is also another contextual variable which is 'this', this happens to be the groovy shell reference, which is very useful to execute other scripts from. For example the command below if executed will execute the groovy script from the filesystem from my script when executed. 

this.evaluate(new File("d:\\accelerators\odi_create_model.groovy"))

 This is a useful approach if you want to execute your scripts from a launchpad, or quick start/accelerator control. The image below shows a launchpad with buttons for various activities, one of which is the create model accelerator groovy script I created (this was blogged here).

When the '1. Create Model' button is pressed, the groovy script to execute the create model groovy is executed. This can do whatever you want, below is what my groovy script does to create a model;

Each button in the accelerator launchpad is defined as below, the script representing the action is evaluated when the button is pressed, its quite straightforward; 

  1.         button(action: action(name: '1. Create Model', closure: {
  2.           gshell.evaluate(new File("d:\\accelerators\odi_create_model.groovy"))
  3.         })

 You can quite easily invoke your companies specific accelerator routines with this approach.

Here is the full script...

  1. import groovy.swing.SwingBuilder
  2. import java.awt.FlowLayout as FL
  3. import javax.swing.BoxLayout as BXL

  4. def startLaunchPad() {
  5.   def s = new SwingBuilder()
  6.   s.setVariable('myDialog-properties',[:]) 
  7.   def vars = s.variables 
  8.   def dial = s.dialog(title:'ODI Accelerator Launchpad',id:'myDialog',modal:true) { 
  9.   def gshell = this
  10.   panel() {
  11.     flowLayout(alignment:FL.RIGHT)
  12.     label(icon:imageIcon(url:new java.net.URL('file:///d|/accelerators/images/odi_launchpad.jpg')))

  13.     panel() {
  14.         boxLayout(axis:BXL.Y_AXIS)
  15.         panel(alignmentX:0f) {
  16.             flowLayout(alignment:FL.CENTER)
  17.         button(action: action(name: '1. Create Model', closure: {
  18.           gshell.evaluate(new File("d:\\ accelerators\odi_create_model.groovy"))
  19.         }))
  20.         }
  21.         panel(alignmentX:0f) {
  22.             flowLayout(alignment:FL.CENTER)
  23.         button(action: action(name: '2.         ..........', closure: {
  24.           gshell.evaluate(new File("d:\\ accelerators\anotherscript.groovy"))
  25.         }))
  26.         }
  27.         panel(alignmentX:0f) {
  28.             flowLayout(alignment:FL.CENTER)
  29.         button(action: action(name: '3.         ..........', closure: {
  30.           gshell.evaluate(new File("d:\\ accelerators\yetanotherone.groovy"))
  31.         }))
  32.         }
  33.         panel(alignmentX:0f) {
  34.             flowLayout(alignment:FL.LEFT)
  35.             button('OK',preferredSize:[80,24],
  36.                    actionPerformed:{
  37.                        vars.dialogResult = 'OK' 
  38.                        dispose()
  39.             })
  40.             button('Cancel',preferredSize:[80,24],
  41.                    actionPerformed:{
  42.                        vars.dialogResult = 'cancel'
  43.                        dispose()
  44.             })
  45.         }
  46.     }
  47. }  }
  48.   dial.pack()
  49.   dial.show()
  50.   return vars
  51. }

  52. vars= startLaunchPad()

The SwingBuilder framework is quite simple and lets you build up these kind of wrapper interfaces very simply.

Sunday Sep 13, 2009

How to Define Multi Record Format Files in ODI

The posts in this series assume that you have some level of familiarity with ODI. The concepts of Datastore, Model and Logical Schema are used here assuming that you understand them in the context of ODI. If you need more details on these elements, please refer to the ODI Tutorial for a quick introduction, or to the complete ODI documentation for more details.

It is not unusual to have to load files containing various record formats. For example a company might store orders and order lines using distinct record formats in the same flat file or you might have one single file containing a header, some records and a footer.
In this post, we'll use the following source file as an example:


1,101,2009/09/01,Computer Parts
2,234,101,Motherboard, Asus P6T,239.99
2,235,101,CPU,Intel Celeron 430,40
1,102,2009/09/02,Computer Parts
2,301,102,CPU,AMD Phenom II X4,170
1,103,2009/09/05,Printers
2,401,103,Inkjet Printer,Canon iP4600,69.99
2,402,103,Inkjet Printer,Epson WF30,39.99
2,403,103,Inkjet Printer,HP Deskjet D2660,49.99

As we can see the Order and Order Lines records have different formats (one has 4 fields, the other has 6 fields), they could also have a different field separator.

Identifying the Record Codes

The first step in order to handle such a file in ODI is to identify a record code, this record code should be unique for a particular record type. In our example the record code will be used by ODI to identify if the record is an Order or an Order Line. All the Order records should have the same record code, this also applies to the Order Lines records.
In our example the first field indicates the record code:
- 1 for Orders records.
- 2 for Order Lines records.

Define the Datastores

We assume that you have already created a Model using a Logical Schema that points to the directory containing your source file.

We will start by defining a datastore for the Order records.

Right-click on the File model and select Insert Datastore.
In the Definition tab, enter a name and specify the flat file resource name in the Resource Name field. 

Order_Datastore.jpg

 

 

In the Files tab, specify your flat file settings (delimiter, field separator etc.)

Refer to the ODI documentation for additional information regarding how to define a flat file datastore.

 

 

In our example the Order records have 4 fields:
- RECORD_TYPE
- ORDER_ID
- ORDER_DATE
- ORDER_TYPE

Go to the columns tab and add those 4 columns to your datastore.

Now specify the Record Code in the 'Rec. Code' field of the RECORD_CODE column.

 

 

Order_Datastore_Columns.jpg

Click OK.

 

 

In the Models view, right-click on the datastore and select View Data to display the file content and make sure it is defined correctly. 

 

Order_Datastore_Data.jpg

The data is filtered based on the record code value, we only see the Order records.

 

We will now apply the same approach to the Order Lines record.

Right-click on the File model and select Insert Datastore to add a second datastore for the Order Lines record.

In the Definition tab, enter a name and specify the flat file resource name in the Resource Name field. We are pointing this datastore to the same file we used for the Order records. 

 

Order_Line_Datastore.jpg

In the Files tab, specify your flat file settings (delimiter, field separator etc.).
Refer to the ODI documentation for additional information regarding how to define a flat file datastore.

 

 

In our example the Order Lines records have 6 fields:
- RECORD_TYPE
- LORDER_ID
- ORDER_ID
- LINE_ORDER_TYPE
- ITEM
- PRICE

Go to the columns tab and add those 6 columns to your datastore.

Now specify the Record Code in the 'Rec. Code' field of the RECORD_CODE column.

 

Order_Line_Datastore_Columns.jpg

Click OK.

 

Right-click on the datastore and select View Data to display the file content and make sure it is defined correctly.

 

Order_Line_Datastore_Data.jpg

The data is filtered based on the record code value, we only see the Order Lines records.

You can now use those 2 datastores in your interfaces.

 

All Screenshots were taken using version 10.1.3.5 of ODI. Actual icons and graphical representations may vary with other versions of ODI.

Wednesday Sep 09, 2009

Generating Sample Data with ODI: A Case Study For Knowledge Modules and User Functions

Looking for Data Integration at OpenWorld 2009? Look no further: all you need is here!

The posts in this series assume that you have some level of familiarity with ODI. The concepts of Interface, Model, Knowledge Module and User Function are used here assuming that you understand them in the context of ODI. If you need more details on these elements, please refer to the ODI Tutorial for a quick introduction, or to the complete ODI documentation for detailed information.


We've all been there: we start coding, waiting for a set of sample data to be available. We move along with the code... and the data is not available. Or we need to build a small (or not so small) data set quickly. Sure, we all have sample databases left and right for that purpose. But recently I was looking for a decent size data set for some tests (more than the traditional 30 sample records) and could not put my hands on what I needed. What the heck: why not have ODI build this for me?

The techniques that we will leveraged for this are the following:

  • Creation of a temporary interface to create the sample table (See this previous post for details on how to create a temporary interface)
  • Creation of a new knowledge module to generate enough records in the new table
  • Creation of ODI User Functions to simplify the generation of random values

 


All the objects mentioned in this article can be downloaded. Save

this XML file
if you want to import in your repository a project that already contains all the objects (IKM and User functions). Click
here
if you want to download a file that will let you import the different objects individually. You will have to unzip the file before importing the objects in the later case.

The samples provided here have all been designed for an Oracle database, but can be modified and adapted for other technologies.

Today we will discuss the different elements that allow us to generate the sample data set. In future posts, we will dissect the Knowledge Modules and User Functions to see what technological choices were made based on the different challenges that had to be solved.

1. THE INTERFACE

For more details on how to create a temporary interface, you can refer to this post. For our example, we will create a new table in an existing schema. When you create your temporary interface, remember to set the following elements:

  • Select of your staging area ( In the Definition tab of the interface)
  • Name your target table
  • Select the location of your target table (work schema / data schema)
  • Name the Columns, and set their individual data type and length

 


Interface Definition Tab

For our example, we will use a fairly simple table structure:
TABLE NAME:

SAMPLER

COLUMNS:
SAMPLER_ID number(3)
SAMPLER_NAME varchar2(30)
SAMPLER_PROMOTION varchar2(10)
SAMPLER_PRICE number(10,2)
SAMPLER_RELEASE_DATE date

 

Sampler Interface Table Creation

2. USER FUNCTIONS

The Oracle database comes with a package called DBMS_RANDOM. Other random generators can be used (DBMS_CRYPTO for instance has random generation functions as well). These functions take more or less parameters, and if we realize after creating dozens of mappings that using the "other" package would have been better... we would be in a lot of trouble. Creating user functions will allow us to:

  • Have a naming convention that is simplified
  • Limit the number of parameters
  • Limit the complexity of the code
  • Later maintain the code independently of our interfaces, in a centralized location: if we decide to change the code entirely, we will make modifications in one single place - no matter how often we use that function.

 


For our example, we will have 5 ODI user functions in ODI (again, these can be downloaded

here
):

  • RandomDecimal(Min, Max): generates a random value (with decimals) between the Min and Max values
  • RandomNumber(Min, Max): generates a random value (without decimals) between the Min and Max values
  • RandomBool(): generate a 0 or a 1
  • RandomDate(MinDate, MaxDate): returns a date between MinDate and MaxDate (make sure MinDate and MaxDate are valid dates for Oracle)
  • RandomString(Format, Min, Max): generates a random string with a minimum of Min characters and a maximum of Max characters. Valid formats are:
    • 'u', 'U' - returning string in uppercase alpha characters
    • 'l', 'L' - returning string in lowercase alpha characters
    • 'a', 'A' - returning string in mixed case alpha characters
    • 'x', 'X' - returning string in uppercase alpha-numeric characters
    • 'p', 'P' - returning string in any printable characters.

 


SamplerUserFunctions.PNG

We can either use these functions as is or as part of a more complex code logic, such as a case...when statement.

For our example, we will build the following mappings:

Column Mapping
SAMPLER_ID RandomNumber(1,999)
SAMPLER_NAME RandomString('A', 1, 30)
SAMPLER_PROMOTION case when RandomBool()=0 then 'FALSE'
else 'TRUE'
end
SAMPLER_PRICE RandomDecimal(1,10000)
SAMPLER_RELEASE_DATE RandomDate('01-JAN-2000', sysdate)


In ODI, the mappings will look like this:

Sampler Interface Mappings

3. THE KNOWLEDGE MODULE

Since we do not have any source table in this interface, we only need an IKM. The IKM provided will this example needs to be imported in your project.

Because the purpose of this KM is to generate sample data, it will have a few options where the default values will be different from the usual KMs:

  • TRUNCATE defaults to 'YES': we assume here that if you re-run the interface, you want to create a new sample. If you only want to add more records to an existing table, simply set this option to 'NO' in your interface.
  • CREATE_TABLE defaults to 'YES': we assume that the table to be loaded does not exist yet. You can turn that option to 'NO' if there is no need to create the table.
  • THOUSANDS_OF_RECORDS: set this to any value between 1 and 1,000 to generate between 1,000 and 1,000,000 records

 

Sampler IKM

Once you have set the values for your KM, you can run the interface and let it generate the random data set.

With the above configuration, and using a standard laptop (dual core 1.86GHz processor and 2 Gb of RAM) equipped with Oracle XE my statistics were as follows:

10,000 records generated in 5 seconds
100,000 records generated in 24 to 35 seconds (about 30 seconds on average)
1,000,000 records generated in 211 to 235 seconds (about 4 minutes on average)

Note that the machine was not dedicated to this process and was running other processes.

Statistics are available in the Operator interface.

Sampler Stats

To review the data loaded by ODI in your target table, simply reverse-engineer this table in a model, then right-click on the table and select View Data to see what was generated!

SamplerData.PNG

4. EXPANDING TO OTHER TECHNOLOGIES

One question: why did I stop here and did not try to make this work for other technologies? Well, it turns out that ODI is really meant to move and transform data. As long as I have at least ONE table with random data in any one of my databases, it is now faster to just create a regular ODI interface and move the data across... The design will take less than a minute. The data transfer should not take much time either. Who would try to spend more time coding when the solution is that simple?

But if you want to make this work for other databases, here are your entry points:

  • Duplicate the KM and modify it to use SQL that would work on these other databases
  • Update the user functions to make sure that they use the appropriate functions for the given databases
  • Use the same logic to create your interface

Enjoy!

 


All Screenshots were taken using version 10.1.3.5 of ODI. Actual icons and graphical representations may vary with other versions of ODI.

Data Integration Showcased at OpenWorld 2009

Thursday Sep 03, 2009

Designing and Loading Your Own Staging Tables

The posts in this series assume that you have some level of familiarity with ODI. The concepts of Interface, Model and Knowledge Module are used here assuming that you understand them in the context of ODI. If you need more details on these elements, please refer to the ODI Tutorial for a quick introduction, or to the complete ODI documentation for detailed information.

 

When ODI moves data from your source systems to your target system, it automates the management (creation, loading, deletion) of all the necessary staging tables. The Knowledge Modules control which tables are created when and where, so that the developers do not have to worry about these mechanics.
There are cases though where the logic of your integration processes calls for the creation of your own staging tables, independently of what ODI will create. You may want to pre-aggregate some data before applying further logic. You may want to break a very complex interface into a set on interfaces that will be easier to develop and maintain. Over time, I have found that creating these intermediate tables very valuable.
The problem is: they are not part of the data model.
One of the ODI features that is often overlooked is the ability to create tables directly from the designer tool.

We will look into the creation of the table from 2 different angles: using ODI Diagrams or using temporary interfaces. We will look into the pros and cons of each approach.Note that these techniques can be used beyond staging tables and leveraged in the design of your overall solutions.


1. ODI Diagrams

ODI Diagrams can be found under the Models section in the Designer interface.

DiagramNew.PNG

Creating Diagrams will have many benefits from a design perspective:
- You will be able to retrieve and convert table structures from other technologies, ODI will automatically convert the data types as required;
- Once you have designed your new tables, ODI can generate a DDL that is context independent (See this previous post for more details on contexts);
- If you derive the structure of the new tables from existing models, ODI will offer the automatic creation of interfaces for you - using the base tables as sources and the new tables as targets.

Let's now see how to create a diagram and get started.

1.1 Creation of an ODI Diagram

In your models, you will find the entry point for Diagrams directly under the Model name itself. Simply right-click on the Diagrams entry and select Insert Diagram to create your first Diagram.

Insert Diagram

When you create a diagram, you can either model all your tables one column at a time, or start from existing structures. This is the feature that I like best, as it will save me a lot of time in all subsequent operations. To try this out, simply drag and drop a table from another model (if possible even from a different technology) directly in the model. ODI will first ask you if you really want to create this table in the new model. Then it will immediately match the data types with the new technology, and display the resulting table so that you can make any modification as required: you may want different datatypes, different column names, you may want to add and remove columns, or even rename the table itself. You can either perform these operations as you are copying the structure, or later on when you will edit your diagram.

Diagram Table Name

Diagram Table Columns

Once you validate the definition of the table, ODI will display the table in the diagram and add it to your model tree. Note that at this point in time, the table is not created in the database.

Not all tables will necessarily have a matching structure in other systems. You may have to create them directly from the diagram. You can then take advantage of the toolbar to create not only the table and its columns, but also to define the relationship between the different tables.

Diagram.PNG


1.2 Leveraging the ODI Diagram

Once the Diagram has been created, several options become available:
- Generation of the DDLs to create the tables in the actual databases
- Generation of Interfaces using the initial tables as sources and the new tables as target (and vice-versa).
All these operations are possible from the Models menu (right-click on the model to see these operations).

Diagram Menus

The Generate DDL menu will allow you to create the tables you have designed along with all the necessary constraints. Furthermore, you will be able to run the DDL in all contexts with absolutely no code modification - hence guarantying consistency across environments .

The creation of interfaces will be as easy. But interfaces will require knowledge modules, and to truly take advantage of this feature, you want to make sure that the knowledge modules of your choice (and only these!) are imported in a project. The trick with "only these" is that you want the knowledge modules to be selected for you automatically. When ODI will generate the interfaces, if there is only one choice for the LKM, and only one choice for the IKM, maybe a CKM if needed - then you will not even have to open the interfaces before you run them (except for a sanity check maybe!). Do not hesitate and modify the default values for the KMs you have selected as, again, that will save you a tremendous amount of time in terms of reviews.

Once the KMs are set, go back to your model, and select the menu entry Generate Interfaces IN to generate the interfaces that will load your new tables (or Generate Interfaces OUT to generate interfaces that will extract data from your new tables). Select the project where you have imported the appropriate KMs as the destination for the interfaces, and wait for the automatic generation to happen. If you have renamed columns, they are properly mapped to the original column. If a table was created from several source tables, all sources are part of the interfaces. You only have to worry about new columns, relationships between source tables that ODI would not know of, and transformations if you need any.

Over time, I've used this feature a lot when I am given a set of files that are supposed to represent tables in a real case. I reverse engineer the files - then use the files definitions to quickly design the equivalent structures in the database that I need to load... and run the DDLs and interfaces all in one shot. A very nice way to save hours of development time!

1.3 When to use this approach

The real benefit of this approach is in the massive generation of DDLs and interfaces. If you only need to load a couple of tables, the benefit is not as obvious. But as the number of tables to create and interfaces to generate grows, so does the value of Diagrams. I have recently worked on a case where in less than an hour, I had two dozen tables designed, created, and loaded with data when initially all I had was a series of flat files...

2. TEMPORARY INTERFACES

There will be cases where you need to create just a few set of staging tables and you do not necessarily want to go through the creation of models and DDLs. For these cases, ODI will let you create your table directly from with your interfaces.

2.1 Creating the Interface

When you create a new table in an interface, you will first have to indicate where the table will be created. To do so, select "staging area different from target" in the definition tab of your interface, and select the schema where the table will be created.

New Interface Staging

2.2 Designing the table

The first step will be to name the table, and define where it will be created (in the data schema or in the work schema. These schemas are defined in the Topology interface. The work schema is typically your staging area). To name the table, click on the "Untitled" caption in the target side of you interface, and enter a table name in the properties window. It is in that same properties window that you will select the data or work schema.

Interface - Name Table

You can then add columns to this table. If you add a datastore on the source side of your interface, you can add the columns one by one (simply drag and drop the columns from the source datastore to the target datastore. Or you can right-click on any source column to add it to the target table. Or you can right-click on the table name and add all columns at once by selecting the menu "add to target". When the columns are added, ODI converts the data types to match those of the target technology, and pre-maps the target columns to the source columns.

Of course, you can always edit the column information later on in the Properties window.

Interface Add Columns

You can also right-click under the existing columns on the target side to manually add new columns. In this case, you will have to specify the column name, data type and size.

Interface Add New Column


2.3 Creating and loading the table

Once you have created the table as needed, you can use it as any other target table in your interface and design your transformations as usual. The one thing to remember will be to set the "CREATE_TARGET_TABLE" option to "YES" in the IKM of this interface to make sure that the table is created at execution time.

2.4 Using the staging table as a source

Once this table has been created and loaded, all you have to do to use it as a source is to drag and drop the Interface as a source in the next interface (you read that right: drag and drop the interface directly from the project tree, as if it were a table!). No need to actually reverse engineer the table and add it to your model.

This does not mean that you cannot reverse engineer the table and add it to your model. In many cases, the table will actually belong to the model in the end and definitely belongs with your metadata.

2.5 When to use this approach

This second approach is more attractive for "one-off" creation of staging tables. It is a very quick and easy way to integrate intermediary steps in your data integration logic, but does not have the scale effect of the diagram approach.

CONCLUSION

As we have seen here, ODI offer multiple techniques to create new tables very easily directly from the Graphical Interface. Depending on the techniques used to create the tables, ODI will offer many shortcuts to make your development initiatives even faster: either the generation of DDL and interfaces based on the old and new structures, or the ability to create a table and use it immediately without ever having to deal with the metadata...

Enjoy!


All Screenshots were taken using version 10.1.3.5 of ODI. Actual icons and graphical representations may vary with other versions of ODI.


Looking for Data Integration at OpenWorld 2009? Were are here!

For more information about ODI, click here.

About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
2
3
5
6
7
8
9
10
12
13
14
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today