Thursday Sep 12, 2013

Stream Relational Transactions into Big Data Systems

Are you one of the organizations adopting ‘big data systems’ to manage and analyze a class of data typically referred to as big data? If so, you may know that big data includes data that can be structured, semi-structured, or unstructured, each of which originates from a variety of sources. Another characterization of big data is in terms of the data's volume, velocity, and veracity. Because of its promise to help harness the data deluge we are faced with, the adoption of big data solutions is becoming quite pervasive. In this blog post I’d like to discuss how to leverage Oracle GoldenGate’s real-time replication for big data systems.

The term 'big data systems' is an umbrella term for a wide variety of technologies, each of which is used for a specific purpose. Broadly speaking, big data technologies address batch, transactional, and real-time processing requirements. Choosing the appropriate big data technology depends heavily on the use case being addressed.

While gaining business intelligence from transactional data continues to be a dominant factor in the decision-making process, businesses have realized that gaining intelligence from the other forms of data they have been collecting will enable them to achieve a more complete view, address additional business objectives, and make better decisions. The following table illustrates examples of industry verticals, the forms of data involved, and the objective the business attempts to achieve using those other forms of data.

Industry   | Data                                      | Objective
---------- | ----------------------------------------- | ----------------------------------------------
Healthcare | Practitioner’s notes, machine statistics  | Best practices and reduced hospitalization
Retail     | Weblogs, click streams                    | Micro-segmentation recommendations
Banking    | Weblogs, fraud reports                    | Fraud detection, risk analysis
Utilities  | Smart meter readings, call center data    | Real-time and predictive utilization analysis

Role of transactional data

When using other forms of data for analytics, better contextual intelligence is obtained when the analysis is combined with transactional data. Low-latency transactional data in particular brings value to dynamically changing operations that day-old data cannot deliver. In most organizations, the vast majority of applications' transactional data is captured in relational databases. To ensure an efficient supply of transactional data for big data analytics, there are several requirements that the data integration solution should address:

  • Reliable change data capture and delivery mechanism
  • Minimal resource consumption when extracting data from the relational data source
  • Secure data delivery
  • Ability to customize data delivery
  • Support for heterogeneous database sources
  • Easy to install, configure, and maintain

A solution that can reliably stream database transactions to the desired target ensures that effort is spent on data analysis rather than data acquisition. And when the solution is non-intrusive and has minimal impact on the source database, it reduces the need for additional resources and changes on the source database.

Oracle GoldenGate is a time-tested and proven product for real-time, heterogeneous relational database replication. Oracle GoldenGate addresses the challenges listed above and is widely used by organizations for mission-critical data replication among relational databases. Furthermore, GoldenGate moves transactional data in real time to support timely operational business intelligence needs.

Oracle GoldenGate Integration Options for Big Data Analytics

A variety of integration options are available with the Oracle GoldenGate product that facilitate delivering transactions from relational databases into non-relational targets.

Oracle GoldenGate provides pre-built adapters which integrate with Flat Files and Messaging Systems. Please refer to the Oracle GoldenGate for Java - Administration Guide and the Oracle GoldenGate for Flat Files - Administration Guide for more information.

Oracle GoldenGate also provides Java APIs and a framework for developing custom integrations to Java-enabled targets. Using this capability, custom adapters or handlers can be developed to address specific requirements. In this blog post I’d like to focus on the Oracle GoldenGate Java APIs for developing custom integrations to big data systems.

As we mentioned earlier, 'big data systems' is an umbrella term describing a wide variety of technologies, each used for a specific purpose. Among the various big data systems, Hadoop and its suite of technologies are widely adopted by organizations for processing big data. The diagram below illustrates a general high-level architecture for integrating with Hadoop.

[Diagram: the Oracle GoldenGate Pump process running a custom adapter - the Pump is configured through a Pump parameter file, the adapter through an adapter properties file, and the adapter delivers transactions into Hadoop]

You can implement a custom adapter or handler for the big data system using Oracle GoldenGate's Java API. The custom adapter is deployed as an integral part of the Oracle GoldenGate Pump process. The Pump and the custom adapter are configured through the Pump parameter file and the custom adapter's properties file, respectively. Depending on the requirements, the properties for the custom adapter need to be determined and implemented.

The Pump process will execute the adapter in its address space. The Pump reads the Trail File created by the Oracle GoldenGate Capture process and passes the transactions to the adapter. Based on the configuration, the adapter will write the transactions into Hadoop.
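To make this concrete, below is a minimal sketch of the Hadoop-facing side of such an adapter. It is not the GoldenGate adapter API itself: the class name, the handleOperation entry point, and the HDFS path layout are illustrative assumptions, and in a real adapter that method would be wired to the GoldenGate for Java handler callbacks that deliver each operation from the trail. Only the Hadoop FileSystem calls are standard.

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  /**
   * Illustrative handler body: appends one delimited record per captured
   * operation to an HDFS file named after the source table. The entry point
   * (handleOperation) is a stand-in for the GoldenGate for Java callback
   * that would deliver each operation read from the trail.
   */
  public class HdfsDeliveryHandler {

      private final FileSystem fs;
      private final String baseDir;   // e.g. "/user/ogg/trail" (assumed layout)

      public HdfsDeliveryHandler(String baseDir) throws IOException {
          this.fs = FileSystem.get(new Configuration()); // picks up core-site.xml/hdfs-site.xml
          this.baseDir = baseDir;
      }

      /** Hypothetical per-operation callback: table name plus a delimited record. */
      public void handleOperation(String tableName, String delimitedRecord) throws IOException {
          Path target = new Path(baseDir, tableName + ".dsv");
          // Append if the file exists, otherwise create it (append must be enabled on the cluster).
          FSDataOutputStream out = fs.exists(target) ? fs.append(target) : fs.create(target);
          try {
              out.writeBytes(delimitedRecord + "\n");
          } finally {
              out.close();
          }
      }

      public void shutdown() throws IOException {
          fs.close();
      }
  }

In practice you would buffer and batch writes per transaction rather than open and close a stream per operation, but the shape of the integration is the same: the Pump hands the adapter each operation, and the adapter decides how to lay it out in Hadoop.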

Enabling big data systems to coexist with relational systems helps organizations better serve customers and improve their decision-making capabilities. Oracle GoldenGate, which has an excellent record of empowering IT across many aspects of data management, provides the capability to integrate with big data systems. In upcoming blog posts, we will discuss in depth the implementation and configuration of integrating Oracle GoldenGate with Hadoop technologies.

Friday May 31, 2013

Improving Customer Experience for Segment of One Using Big Data

Customer experience has been one of the top focus areas for CIOs in recent years. A key requirement for improving customer experience is understanding the customer: their past and current interactions with the company, their preferences, demographic information, and so on. This capability helps the organization tailor its services or products for different customer segments to maximize their satisfaction. This is not a new concept. However, there have been two parallel changes in how we approach and execute on this strategy.

The first is the big data phenomenon, which brought the ability to obtain a much deeper understanding of customers, especially by bringing in social data. As the Forbes article "Six Tips for Turning Big Data into Great Customer Experiences" mentions, big data has especially transformed online marketing. With the large volume and different types of data now available, companies can run more sophisticated analysis in a more granular way, which leads to the second change: the size of customer segments. It is shrinking down to one, where each individual customer is offered a personalized experience based on their individual needs and preferences. This brings more relevance into day-to-day interactions with customers and takes customer satisfaction and loyalty to a level that was not possible before.

One of the key technology requirements for improving customer experience at such a granular level is obtaining a complete and up-to-date view of the customer. That requires integrating data across disparate systems, and doing so in a timely manner. The data integration solution should move and transform large data volumes stored in heterogeneous systems in geographically dispersed locations. Moving data with very low latency to the customer data repository or data warehouse enables companies to have relevant and actionable insight for each customer. Instead of relying on yesterday's data, which may no longer be pertinent, the solution should analyze the latest information and turn it into a deeper understanding of that customer. With that knowledge the company can formulate real opportunities to drive higher customer satisfaction.

Real-time data integration is a key enabling technology for real-time analytics. Oracle GoldenGate's real-time data integration technology has been used by many leading organizations to get the most out of their big data and build closer relationships with customers. One good example in the telecommunications industry is MegaFon, Russia's top provider of mobile internet solutions. The company deployed Oracle GoldenGate 11g to capture billions of monthly transactions from eight regional billing systems. The data was integrated and centralized onto Oracle Database 11g and distributed to business-critical subsystems. The unified and up-to-date view of customers enabled more sophisticated analysis of mobile usage information and facilitated more targeted customer marketing. As a result, the company increased revenue generated from its current customer base. Many other telecommunications industry leaders, including DIRECTV, BT, TataSky, SK Telecom, and Ufone, have improved customer experience by leveraging real-time data integration.

Telecommunications is not the only industry where a single view of the customer drives more personalized interaction. Woori Bank implemented Oracle Exadata and Oracle GoldenGate. In the past, it had been difficult to revise and incorporate changes to marketing campaigns in real time because the bank was working with the previous day’s data. Now, users can immediately access and analyze transactions for specific trends in the data mart access layer and adjust campaigns and strategies accordingly. Woori Bank can also send tailored offers to customers.

This is just one example of how real-time data integration can transform business operations and the way a company interacts with its customers. I would like to invite you to learn more about data integration facilitating improved customer experience by reviewing our free resources here and following us on Facebook, Twitter, YouTube, and LinkedIn.

Image courtesy of jscreationzs at FreeDigitalPhotos.net

Thursday Apr 11, 2013

Why Real Time?

Continuing on the five key data integration requirements topic, this time we focus on real-time data for decision making. 

[Read More]

Friday Mar 15, 2013

Pervasive Access to Any Data

In my previous blog, I shared with you the five key data integration requirements, which can be summarized as: integrating any data from any source, stored on premise or in the cloud, with maximum performance and availability, to achieve 24/7 access to timely and trusted information. Today, I want to focus on the requirement for integrating “any data”.

We all feel the impact of huge growth in the amount of raw data collected on a daily basis. And big data is a popular topic of information technology these days. Highly complex, large volumes of data bring opportunities and challenges to IT and business audiences. The opportunities, as discussed in McKinsey’s report, are vast, and companies are ready to tap into big data to differentiate and innovate in today’s competitive world.

One of the key challenges of big data is managing unstructured data, which is estimated to be 80% of enterprise data. Structured and unstructured data must coexist and be used in conjunction with each other in order to gain maximum insight. This means organizations must collect, organize, and analyze data from sensors, conversations, e-commerce websites, social networks, and many other sources.

Big data also changes the perspective into information management. It changes the question from “How do you look at your data?” to “How do you look at the data that is relevant to you?” This shift in perspective has huge implications in terms of information-management best practices and technologies applied. Data integration technologies now need to support unstructured and semi-structured data, in addition to structured transactional data, to be able to support a complete picture of the enterprise that will drive higher efficiencies, productivity and innovation.

Oracle addresses big data requirements with a complete solution.

In addition to Oracle Big Data Appliance for acquiring and organizing big data, Oracle offers Oracle Big Data Connectors that enable an integrated data set for analysis. Big Data Connectors is a software suite that integrates big data across the enterprise. Oracle Data Integrator offers an Application Adapter for Hadoop, which is part of the Big Data Connectors, and allows organizations to build Hadoop metadata within Oracle Data Integrator, load data into Hadoop, transform data within Hadoop, and load data directly into Oracle Database using Oracle Loader for Hadoop. Oracle Data Integrator has the necessary capabilities for integrating structured, semi-structured, and unstructured data to support organizations with transforming any type of data into real value.

If you would like to learn more about how to use Oracle’s data integration offering for your big data initiatives take a look at our resources on Bridging the Big Data Divide with Oracle Data Integration.

Thursday Mar 07, 2013

5 New Data Integration Requirements for Today’s Data-Driven Organizations

How are the latest technology trends affecting data integration requirements? Read about the 5 key data integration requirements and Oracle's Data Integration product strategy in meeting these requirements.[Read More]

Monday Feb 25, 2013

Connecting Velocity to Value: Introducing Oracle Fast Data

To understand fast data, one must first look at one of the most compelling new breakthroughs in data management: big data. Big data solutions address the challenge today’s businesses face in managing the increasing volume, velocity, and variety of all data - not just data within the organization but also data about it. Much of the buzz around big data has centered on Hadoop and NoSQL technologies, but little has been said about velocity. Velocity is about the speed at which data is generated, and in many cases the economic value of that data diminishes just as fast. As a result, companies need to process large volumes of data in real time and make decisions more rapidly to create value from highly perishable, high-volume data in business operations.

This is where fast data comes in. Fast data solutions help manage the velocity (and scale) of any type of data and any type of event to enable precise action for real-time results.

Fast data solutions draw on multiple technologies, and some of the concepts, such as complex event processing and business activity monitoring, have been in use in areas such as the financial services industry for years. But often, the pieces were used in isolation - a complex event processing engine as a standalone application to apply predefined business rules to filter data, for example. When these concepts are tied to analytics, capabilities expand to allow improved real-time insights. By tying these strands together, companies can filter/correlate, move/transform, analyze, and finally act on information from big data sources quickly and efficiently, enabling both real-time analysis and further business intelligence work once the information is stored.

Oracle’s Fast Data solutions offer multiple technologies that work hand in hand to create value out of high-velocity, high-volume data. They are designed to optimize efficiency and scale for processing high-volume events and transactions.

[Read More]

Friday Dec 28, 2012

ODI - Reverse Engineering Hive Tables

ODI can reverse-engineer Hive tables via the standard reverse-engineering process and also via an RKM for tables defined in Hive. This makes it very easy to capture table designs from Hive in ODI for integration. To illustrate, I will use the MovieLens data set, a common data set used in Hadoop training.

I have defined two tables in Hive for movies and their ratings, as shown below; one file has fields delimited with '|' and the other is tab-delimited.

  1. create table movies (movie_id int, movie_name string, release_date string, vid_release_date string,imdb_url string) row format delimited fields terminated by '|';
  2. create table movie_ratings (user_id string, movie_id string, rating float, tmstmp string) row format delimited fields terminated by '\t';

For this example I have loaded the Hive tables manually from my local filesystem (into Hive/HDFS) using the following LOAD DATA Hive commands and the MovieLens data set mentioned earlier:

  1. load data local inpath '/home/oracle/data/u.item' OVERWRITE INTO TABLE movies;
  2. load data local inpath '/home/oracle/data/u.data' OVERWRITE INTO TABLE movie_ratings;

The data in the u.item file looks like the following, with a '|' delimiter:

  • 1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
  • 2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
  • 3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0

In ODI I can define my Hive data server and logical schema; here is the JDBC connection for my Hive database (I just used the default):
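Since the connection details are in a screenshot, here is a rough equivalent in plain JDBC of what that default Hive data server amounts to. The host, port, driver class, and URL are assumptions based on a default HiveServer1 setup of that era (port 10000 with the org.apache.hadoop.hive.jdbc.HiveDriver driver); adjust them for your environment, and note that HiveServer2 uses a different driver and a jdbc:hive2:// URL.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class HiveConnectionCheck {
      public static void main(String[] args) throws Exception {
          // Assumed defaults: HiveServer1 on localhost:10000, database 'default'.
          Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
          Connection conn = DriverManager.getConnection(
                  "jdbc:hive://localhost:10000/default", "", "");
          Statement stmt = conn.createStatement();
          // Should list the movies and movie_ratings tables created earlier.
          ResultSet rs = stmt.executeQuery("show tables");
          while (rs.next()) {
              System.out.println(rs.getString(1));
          }
          rs.close();
          stmt.close();
          conn.close();
      }
  }

The same URL and driver class are what you supply in the ODI data server definition.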

I can then define my model and perform a selective reverse using standard ODI functionality, below I am reversing just the movies table and the movie ratings table;


After the reverse engineering is complete, the tables appear in the model in the tree, and the data can be inspected just like regular datastores:

From here we see the data in the regular data view;

The ODI RKM for Hive performs logging that is useful for debugging if you hit issues with the reverse engineering. This is a very basic example of how some of the capabilities hang together. ODI can also be used to design the load of the file into Hive, transformations within Hive, and subsequent loads into Oracle using Oracle Loader for Hadoop, and so on.

Monday Oct 15, 2012

Bridging the Gap in Cloud, Big Data, and Real-time

With all the buzz around big data and cloud computing, it is easy to overlook one of your most precious commodities - your data. Today’s businesses cannot stand still when it comes to data. Market success now depends on speed, volume, complexity, and keeping pace with the latest data integration breakthroughs. Are you up to speed with big data, cloud integration, and real-time analytics? Join us in this three-part blog series where we’ll look at each component in more detail. Meet us online on October 24th, where we’ll take your questions about the issues you are facing in this brave new world of integration.

Let’s start first with Cloud.

What happens with your data when you decide to implement a private cloud architecture? Or a public cloud? Data integration solutions play a vital role in migrating data simply, efficiently, and reliably to the cloud; they are a necessary ingredient of any platform-as-a-service strategy because they support cloud deployments with data-layer application integration between on-premise and cloud environments of all kinds. For private cloud architectures, consolidating your databases and data stores is an important step toward receiving the full benefits of cloud computing. Private cloud integration requires bidirectional replication between heterogeneous systems so you can perform data consolidation without interrupting your business operations. In addition, bulk load and transformation of data into and out of your private cloud is a crucial step for companies moving to a private cloud. There is also a need to manage data services as part of SOA/BPM solutions that enable agile application delivery and help build shared data services for the organization.

But what about the public cloud? If you have moved your data to a public cloud application, you may also need to connect your on-premise enterprise systems and the cloud environment by moving data in bulk or as real-time transactions across geographies.

For both public and private cloud architectures, Oracle offers a complete and extensible set of integration options that span not only data integration but also service and process integration, security, and management. For those companies investing in Oracle Cloud, you can move your data through Oracle SOA Suite using REST APIs to Oracle Messaging Cloud Service - a new service that lets applications deployed in Oracle Cloud securely and reliably communicate over Java Message Service. As an example of loading and transforming data into other public clouds, Oracle Data Integrator supports a knowledge module for Salesforce.com - now available on AppExchange. Other third-party knowledge modules are being developed by customers and partners every day.

To learn more about how to leverage Oracle’s Data Integration products for Cloud, join us live: Data Integration Breakthroughs Webcast on October 24th 10 AM PST.

Friday Sep 21, 2012

Tackling Big Data Analytics with Oracle Data Integrator

 By Mike Eisterer

 The term big data draws a lot of attention, but behind the hype there's a simple story. For decades, companies have been making business decisions based on transactional data stored in relational databases. Beyond that critical data, however, is a potential treasure trove of less structured data: weblogs, social media, email, sensors, and documents that can be mined for useful information.

Companies are facing emerging technologies, increasing data volumes, numerous data varieties, and the processing power needed to efficiently analyze data that changes with high velocity.

 

 

Oracle offers the broadest and most integrated portfolio of products to help you acquire and organize these diverse data sources and analyze them alongside your existing data to find new insights and capitalize on hidden relationships.

 

Oracle Data Integrator Enterprise Edition (ODI) is critical to any enterprise big data strategy. ODI and the Oracle Data Connectors provide native access to Hadoop, leveraging technologies such as MapReduce, HDFS, and Hive. Together with ODI’s metadata-driven approach to extracting, loading, and transforming data, companies can now integrate their existing data with big data technologies and deliver timely and trusted data to their analytic and decision-support platforms. In this session, you’ll learn about ODI and Oracle Big Data Connectors and how, coupled together, they provide the critical integration with multiple big data platforms.

 

Tackling Big Data Analytics with Oracle Data Integrator

October 1, 2012 12:15 PM at MOSCONE WEST – 3005

For other data integration sessions at OpenWorld, please check our Focus-On document

If you are not able to attend OpenWorld, please check out our latest resources for Data Integration.

Monday Aug 20, 2012

Hadoop - Invoke Map Reduce

Carrying on from the previous post on Hadoop and HDFS with ODI packages, this is another commonly needed task: how to execute existing MapReduce code from ODI. I will show this using ODI Packages and the Open Tool framework.

The Hadoop JobConf SDK is the class needed for initiating jobs, whether local or remote - so the ODI agent could, for example, be hosted on a system other than the Hadoop cluster and simply fire jobs off to it. Some useful posts, such as this older one on executing MapReduce jobs from Java (following the reply from Thomas Jungblut in that post), helped me get up to speed.

Where better to start than the WordCount example (see a version of it here, with both mapper and reducer as inner classes)? Let’s see how this can be invoked from an ODI package. The HadoopRunJob below is a tool I added via the Open Tool framework; it basically wraps the JobConf SDK, with the parameters defined in ODI.

You can see some of the parameters below: I define the various class names I need, plus various other parameters including the Hadoop job name, and I can also specify the job tracker to fire the job on (for a client-server style architecture). The input path and output path are also defined as parameters. The first tool in the package calls the copy-file-to-HDFS tool - this is just to demonstrate copying the files needed by the WordCount program into HDFS ready for it to run.
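For readers who cannot see the screenshots, here is a rough, self-contained sketch of the kind of job those parameters end up driving: the classic WordCount written against the v1 (org.apache.hadoop.mapred) API and submitted via JobConf and JobClient. This is not the HadoopRunJob tool's source; the cluster addresses are hypothetical and the inner-class layout simply follows the well-known WordCount example.

  import java.io.IOException;
  import java.util.Iterator;
  import java.util.StringTokenizer;

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  public class WordCount {

      /** Classic v1-API mapper: emit (word, 1) for each token in the line. */
      public static class Map extends MapReduceBase
              implements Mapper<LongWritable, Text, Text, IntWritable> {
          private static final IntWritable ONE = new IntWritable(1);
          private final Text word = new Text();

          public void map(LongWritable key, Text value,
                          OutputCollector<Text, IntWritable> output, Reporter reporter)
                  throws IOException {
              StringTokenizer tok = new StringTokenizer(value.toString());
              while (tok.hasMoreTokens()) {
                  word.set(tok.nextToken());
                  output.collect(word, ONE);
              }
          }
      }

      /** Classic v1-API reducer: sum the counts per word. */
      public static class Reduce extends MapReduceBase
              implements Reducer<Text, IntWritable, Text, IntWritable> {
          public void reduce(Text key, Iterator<IntWritable> values,
                             OutputCollector<Text, IntWritable> output, Reporter reporter)
                  throws IOException {
              int sum = 0;
              while (values.hasNext()) {
                  sum += values.next().get();
              }
              output.collect(key, new IntWritable(sum));
          }
      }

      public static void main(String[] args) throws Exception {
          // These settings correspond to the parameters the ODI tool exposes.
          JobConf conf = new JobConf(WordCount.class);
          conf.setJobName("odi-wordcount");

          conf.setOutputKeyClass(Text.class);
          conf.setOutputValueClass(IntWritable.class);
          conf.setMapperClass(Map.class);
          conf.setReducerClass(Reduce.class);

          // Hypothetical remote cluster addresses - point the job at a job tracker
          // instead of running locally; omit these to use the local configuration.
          conf.set("mapred.job.tracker", "hadoop-cluster:8021");
          conf.set("fs.default.name", "hdfs://hadoop-cluster:8020");

          FileInputFormat.setInputPaths(conf, new Path(args[0]));   // input dir in HDFS
          FileOutputFormat.setOutputPath(conf, new Path(args[1]));  // must not already exist

          JobClient.runJob(conf);   // blocks until the job completes; throws on failure
      }
  }

The point to notice is that everything exposed as a tool parameter - job name, mapper and reducer class names, input and output paths, job tracker - maps directly onto JobConf settings, which is what makes wrapping it as an Open Tool straightforward. Note also that JobClient.runJob fails if the output directory already exists, which is exactly the error described in the next paragraph.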

Nice and simple, and it shields a lot of the complexity behind some simple tools. The JAR file containing WordCount needed to be available to the ODI Agent (or Studio, since I invoked it with the local agent); that was it. When the package is executed, just like normal, the agent processes the code and executes the steps. If I run the package above it will successfully copy the files to HDFS and perform the word count. On a second execution of the package an error will be reported because the output directory already exists, as below.

I left the example like this to illustrate that we can then extend the package design with conditional branching to handle errors after a flow, just like the following:

Here, after executing the word count, the status is checked and you can conditionally branch on success or failure - just like any other ODI package. I used the beep just for demonstration.

The HadoopRunJob tool used above was built using the v1 MapReduce SDK; with MR2/YARN this will change again - these kinds of changes hammer home the need for better tooling to abstract common concepts for users to exploit.

You can see from these posts that we can very easily provide useful tooling on top of the basic mechanics of Hadoop and HDFS, along with the power of generating MapReduce jobs from interface designs, which you can see in the OBEs here.

Wednesday Aug 01, 2012

ODI 11g - Hadoop integration self study

There is a self-study available at the link below which is a great introduction to the Hadoop-related integration available in ODI 11.1.1.6 (see the earlier blog here). Thanks to the curriculum development group for creating this material. You can see from the study how ODI was extended to support integration into and out of the Hadoop ecosystem.

https://apex.oracle.com/pls/apex/f?p=44785:24:0::NO:24:P24_CONTENT_ID,P24_PREV_PAGE:6130,29

The paper here, titled 'High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database', describes the raw capabilities of Oracle Loader for Hadoop and Oracle Direct Connector for HDFS. These capabilities are encapsulated in the HDFS File/Hive to Oracle KM, so the different loading options described in the paper are modeled as capabilities of the Knowledge Module. Another great illustration of the capabilities of KMs.

Much more to come in this space... 

Monday Jul 30, 2012

Four Ways to Wrestle a Big Data Elephant

He’s large. He’s fast. He’s strong. And very, very hungry! Meet the big data elephant. Perhaps you have seen him stalking the corners of your data warehouse looking for some untapped data to devour? Or some unstructured weblogs to weigh in on? To wrestle the elephant into working for you rather than against you, we need data integration. But not just any kind - we need newer styles of data integration that are ready for these evolving data management challenges. I've put together four key requirements below, with some pointers to industry experts in each category. Hopefully this is useful. And good luck with that 8 and ¼ tons of data!

Four Ways to Wrestle a Big Data Elephant

  • Leverage existing tools and skill-sets
  • Quality first
  • Remember real-time
  • Integrate the platform

Leverage existing tools and skill-sets

While Hadoop technologies are cool to say, and can seem to add an impressive ‘buzz’ to your LinkedIn/Twitter profiles, a word of caution: not every big data technology may actually be necessary. The trend now is that tools are becoming integrated in such a way that designing ETL and developing MapReduce can be done in a single design environment. Data integration tools are evolving to support new forms of connectivity to sources in NoSQL and HDFS, as opposed to keeping these two worlds separate - something I referred to recently in my blog on Bridging Two Worlds: Big Data and Enterprise Data.

A single solution allows you to address not only the complexities of mapping, accessing, and loading big data but also of correlating it with your enterprise data - and this correlation may require integrating across mixed application environments. The correlation is key to taking full advantage of big data and requires a single unified tool that can straddle both environments.

Quality First

Secondly, big data comes from many different types of sources and in many different forms. How can anyone be sure of the quality of that data? And yes, data stewardship best practices still apply. In the big data scenario, data quality is important because of the multitude of data sources. Multiple data sources make it difficult to trust the underlying data. Being able to quickly and easily identify and resolve data discrepancies, missing values, and so on in an automated fashion is beneficial to the applications and systems that use this information.

Remember real-time

I covered this very subject in last week’s blog, Is Big Data Just Super Sexy Batch. No, it’s not. But at the same time, it would be an overstatement to say that big data addresses all of our real-time needs. [The cheetah still runs faster than the elephant… although I still wouldn’t want to try to outrun an elephant!] Tools such as Oracle GoldenGate, and techniques such as real-time replication and change data capture, don’t simply disappear with big data. In fact, the opposite happens: they become even more crucial as our implementations cross over between the unstructured and structured worlds, where both performance and low latency become increasingly paramount as volumes rise and velocity requirements grow.

Integrate the platform

Taking all the miscellaneous technologies around big data – which are new to many organizations - and making them each work with one another is challenging. Making them work together in a production-grade environment is even more daunting. Integrated systems can help an organization radically simplify their big data architectures by integrating the necessary hardware and software components to provide fast and cost-efficient access, and mapping, to NoSQL and HDFS.

Combined hardware and software systems can be optimized for redundancy with mirrored disks, optimized for high availability with hot-swappable power, and optimized for scale by adding new racks with more memory and processing power. Take it one step further and you can use these same systems to build out more elastic capacity to meet the flexibility requirements big data demands.

To learn more about Oracle Data Integration products, see our website or to follow more conversations like this one join me on twitter @dainsworld.

Sunday Jul 22, 2012

Is Big Data just Super Sexy Batch?

One of the key expectations we have for big data and our information architecture is to yield faster, better and more insightful analytics. That appeal of processing so much information quickly is why the Hadoop technologies may have originally been invented. But is it nothing more than a super sexy batch? Yes – on sexy. But there’s definitely an important real-time element involved. Read the rest of the article to see more on our take on the intersection of batch, real-time, big data, and business analytics. [Read More]

Monday Apr 02, 2012

Why Oracle Data Integrator for Big Data?

Big Data is everywhere these days - but what exactly is it? It’s data that comes from a multitude of sources – not only structured data, but unstructured data as well.  The sheer volume of data is mindboggling – here are a few examples of big data: climate information collected from sensors, social media information, digital pictures, log files, online video files, medical records or online transaction records.  These are just a few examples of what constitutes big data.   Embedded in big data is tremendous value and being able to manipulate, load, transform and analyze big data is key to enhancing productivity and competitiveness. 

The value of big data lies in its propensity for greater in-depth analysis and data segmentation -- in turn giving companies detailed information on product performance, customer preferences and inventory.  Furthermore, by being able to store and create more data in digital form, “big data can unlock significant value by making information transparent and usable at much higher frequency." (McKinsey Global Institute, May 2011)

Oracle's flagship product for bulk data movement and transformation, Oracle Data Integrator, is a critical component of Oracle’s Big Data strategy. ODI provides automation, bulk loading, validation, and transformation capabilities for big data while minimizing the complexities of using Hadoop. Specifically, the advantages of ODI in a big data scenario come from pre-built Knowledge Modules that drive processing in Hadoop. These leverage the graphical UI to load and unload data from Hadoop, perform data validations, and create mapping expressions for transformations. The Knowledge Modules provide a key jump-start and eliminate a significant amount of Hadoop development.

Using Oracle Data Integrator together with Oracle Big Data Connectors, you can not only simplify the complexities of mapping, accessing, and loading big data (via NoSQL or HDFS) but also correlate it with your enterprise data - a correlation that may require integrating across heterogeneous and standards-based environments, connecting to Oracle Exadata, or sourcing via a big data platform such as Oracle Big Data Appliance.

To learn more about Oracle Data Integration and Big Data, download our resource kit to see the latest whitepapers, webinars, downloads, and more, or go to our website at www.oracle.com/bigdata.

Wednesday Mar 28, 2012

New Feature in ODI 11.1.1.6: ODI for Big Data

By Ananth Tirupattur

Starting with Oracle Data Integrator 11.1.1.6.0, ODI offers a solution for processing Big Data. This post provides an overview of this feature.

Before getting into the details of ODI for Big Data and with all the buzz around Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data.

[Read More]
About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products, including Oracle Data Integrator, Oracle GoldenGate, and Oracle Enterprise Data Quality.
