Monday Oct 12, 2015

Featuring Big Data Sessions at Oracle OpenWorld 2015

Oracle OpenWorld is only a few days away now, and Big Data will be front and center again this year! Many of our Oracle Data Integration sessions will speak to your Big Data needs. We hope you will come and meet us to hear how Oracle Data Integration helps everyone by simplifying access to Big Data and introducing real-time capabilities.

I would recommend attending the following two key sessions on Oracle Data Integration with Big Data:

  • Enabling Real-Time Data Integration with Big Data [CON9724]
    In this session, Chai Pydimukkala from the Oracle Data Integration Product Management team will discuss GoldenGate's offering for big data environments. Joining Chai, Janardh Bantupalli from LinkedIn will present a solution that uses Oracle GoldenGate for Big Data to optimize LinkedIn's data warehousing environment and achieve operational insights at lower cost.
  • Oracle Data Integration Product Family: a Cornerstone for Big Data [CON9609]
    In this session, Alex Kotopoulis from the Oracle Data Integration Product Management team and Mark Rittman, Chief Technical Officer at Rittman Mead, will describe how our Data Integration platform uses a metadata-based approach to hide the complexity of big data technologies such as Hive, Pig, and Spark, delivering a simplified and future-proofed investment in big data.

There are many more Big Data related sessions I’d also recommend attending; you can find the full list in the Focus on Data Integration mentioned below.

In addition, we will be running several Hands-on Labs covering Oracle Big Data Preparation Cloud Service, Oracle Data Integrator, and Oracle GoldenGate. Space is limited and they usually fill up quickly, so make sure to register!

Please also come visit us at our various demo pods in Moscone South:

  • Oracle Big Data Preparation Cloud Service: Get Your Big Data Ready to Use
    Workstation ID: SBD-022 / Venue: Moscone South, Upper Right, Big Data Showcase
  • Oracle Big Data Preparation Cloud Service
    Workstation ID: SPI-023 / Venue: Moscone South, Oracle Cloud Platform and Infrastructure Showcase
  • Oracle Data Integrator Enterprise Edition and Big Data Option: High-Performance Data Integration
    Workstation ID: SLM-022 / Venue: Moscone South, Lower Left, Middleware
  • Oracle GoldenGate: Real-Time Data Integration for Heterogeneous and Big Data Environments
    Workstation ID: SLM-035 / Venue: Moscone South, Lower Left, Middleware
  • Tame Big Data with Oracle Data Integration
    Workstation ID: SBD-023 / Venue: Moscone South, Upper Right, Big Data Showcase

We hope you will join a few! Don’t forget to view the Focus on Data Integration for a full review of Data Integration sessions during OpenWorld. See you there!

Thursday Oct 08, 2015

Featuring Big Data Preparation Cloud Service and other Cloud Data Integration Sessions at Oracle OpenWorld 2015

Oracle OpenWorld is almost upon us! We are excited to be sharing with you some previews of what will be seen and discussed in just a few weeks in San Francisco!

One of the highlights is Oracle’s new cloud-based data preparation solution, Oracle Big Data Preparation Cloud Service, also known as BDP.  This new service will revolutionize the process of importing, preparing, and publishing your complex business data, allowing you to spend more time analyzing data rather than preparing it for analysis. Users are guided through the process with intuitive, recommendation-driven interfaces. The system also provides various ways to automate and operationalize the entire data preparation pipeline, via the built-in scheduler or via a rich set of RESTful APIs.
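
To make the automation point concrete, here is a minimal sketch of how a preparation pipeline might be kicked off and monitored over REST. This is illustrative only: the host, endpoint paths, parameter names, and credentials are hypothetical placeholders, not the documented BDP API.

```python
import requests

# Hypothetical BDP endpoint and credentials; the real service's API
# paths and authentication scheme may differ.
BASE_URL = "https://bdp.example.oraclecloud.com/api/v1"
AUTH = ("my_user", "my_password")

# Kick off a previously defined preparation transform against a new file.
resp = requests.post(
    f"{BASE_URL}/jobs",
    auth=AUTH,
    json={"transformName": "clean_customer_feed",
          "sourceFile": "landing/customers_2015_10.csv"},
)
resp.raise_for_status()
job_id = resp.json()["jobId"]

# Poll the job before publishing its output downstream.
status = requests.get(f"{BASE_URL}/jobs/{job_id}", auth=AUTH).json()["status"]
print(f"Job {job_id} is {status}")
```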

During OpenWorld, Oracle’s Luis Rivas, alongside Blue Cloud Innovations’ Vinay Kumar and Pythian’s Alex Gorbachev, will discuss and demonstrate how big data promises many game-changing capabilities if tackled efficiently! You will discover how Oracle Big Data Preparation Cloud Service takes “noisy” data from a broad variety of sources, in many different formats, both structured and unstructured, and uses a sophisticated and unique blend of machine learning and natural language processing, based on a vast set of linked open reference data, to ingest, prepare, enrich, and publish it into useful data streams, ready for further discovery, analysis, and reporting. Don’t miss it:

CON9615 Solving the “Dirty Secret” of Big Data with Oracle Big Data Preparation Cloud Service

Tuesday, Oct 27, 5:15 p.m. | Moscone South—310

Curious to find out more about BDP before the conference? Take a look here and view a short video: Chalk Talk: Oracle Big Data Preparation Cloud Service!

Since we are on the topic of Data Integration and the Cloud, I will also take a quick moment to remind everyone that Oracle Data Integrator (ODI) integrates with the Oracle Storage Cloud Service, for example. But that’s not all – here is a view into the Data Integration sessions that relate to the Cloud, in chronological order:

CON3506 Into the Cloud and Back with Oracle Data Integrator 12c

Monday, Oct 26, 5:15 p.m. | Moscone West—2022


CON9614 Oracle Data Integration Solutions: the Foundation for Cloud Integration

Wednesday, Oct 28, 11:00 a.m. | Moscone South—274


CON9717 Accelerate Cloud Onboarding Using Oracle GoldenGate Cloud Service

Wednesday, Oct 28, 3:00 p.m. | Moscone West—2022


CON9595 Cloud Data Quality: Lessons Learned from Oracle’s Journey to the Sales Cloud

Thursday, Oct 29, 12:00 p.m. | Moscone West—2022


CON9612 Oracle Enterprise Metadata Management and the Cloud

Thursday, Oct 29, 1:15 p.m. | Marriott Marquis—Salon 4/5/6

We hope you will join a few! Don’t forget to view the Focus on Data Integration for a full review of Data Integration sessions during OpenWorld. See you there!

Thursday Jul 02, 2015

Chalk Talk Video: How to Raise Trust and Transparency in Big Data with Oracle Metadata Management

Some fun new videos are available; we call the series ‘Chalk Talk’!

The first in the series that we will share with you around Oracle Data Integration speaks to raising trust and transparency within big data. Crucial big data projects often fail due to a lack of trust in the data. Data is not always transparent, and governing it can become a costly overhead. Oracle Metadata Management assists in the governance and trust of all data within the enterprise, both Oracle and third-party.

View this video to learn more: Chalk Talk: How to Raise Trust and Transparency in Big Data.

For additional information on Oracle Metadata Management, visit the OEMM homepage.

Wednesday Apr 15, 2015

Data Governance for Migration and Consolidation

By Martin Boyd, Senior Director of Product Management

How would you integrate millions of part, customer, and supplier records from multiple acquisitions into a single JD Edwards instance?  This was the question facing National Oilwell Varco (NOV), a leading worldwide provider of components used in the oil and gas industry.  If they could not find an answer, many operating synergies would be lost, but they knew from experience that simply “moving and mapping” the data from the legacy systems into JDE was not sufficient, as the data was anything but standardized.

This was the problem described yesterday in a session at the Collaborate Conference in Las Vegas.  The presenters were Melissa Haught of NOV and Deepak Gupta of KPIT, their systems integrator. Together they walked through an excellent discussion of the problem and the solution they have developed:

The Problem:  It is first important to recognize that the data to be integrated from many and various legacy systems had been created over time with different standards by different people according to their different needs. Thus, saying it lacked standardization would be an understatement.  So how do you “govern” data that is so diverse?  How do you apply standards to it months or years after it has been created? 

The Solution:  The answer is that there is no single answer, and certainly no “magic button” that will solve the problem for you.  Instead, in the case of NOV, a small team of dedicated data stewards, or specialists, works to reverse-engineer a set of standards from the data at hand.  In the case of product data, which is usually the most complex, NOV found they could actually infer rules to recognize, parse, and extract information from ‘smart’ part numbers, even from the part numbering schemes of acquired companies.  Once these rules are created for an entity or a category and built into their Oracle Enterprise Data Quality (EDQ) platform, the data is run through the DQ process and the results are examined.  Most often you will find problems, which suggest some rule refinements are required. The rule refinement and data quality processing steps run repeatedly until the result is as good as it can be.  The result is never 100% standardized and clean data, though; some data is always flagged into a “data dump” for future manual remediation.
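
To illustrate the rule-based approach in miniature (a generic sketch, not NOV’s actual EDQ rules), each acquired company’s numbering scheme becomes a pattern; matched records are standardized, and anything unmatched is flagged for manual remediation:

```python
import re

# Hypothetical numbering schemes reverse-engineered per acquired company,
# e.g. "ACME-1234-BRZ" -> product family 1234, material code BRZ.
RULES = [
    ("acme",   re.compile(r"^ACME-(?P<family>\d{4})-(?P<material>[A-Z]{3})$")),
    ("globex", re.compile(r"^GX(?P<family>\d{5})/(?P<material>[A-Z]{2})$")),
]

def standardize(part_number):
    """Return a standardized record, or None if no rule matches."""
    for source, pattern in RULES:
        match = pattern.match(part_number)
        if match:
            return {"source": source, **match.groupdict()}
    return None  # destined for the "data dump" and manual remediation

records = ["ACME-1234-BRZ", "GX98765/ST", "???-misc-part"]
clean = [r for r in records if standardize(r) is not None]
dump = [r for r in records if standardize(r) is None]
print(f"{len(clean)} standardized, {len(dump)} flagged for review")
```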

Lessons Learned:

  • Although technology is a key enabler, it is not the whole solution. Dedicated specialists are required to build the rules and improve them through successive iterations
  • A ‘user friendly’ data quality platform is essential so that it is approachable and intuitive for the data specialists who are not (nor should they be) programmers
  • A rapid iteration through testing and rules development is important to keep up project momentum.  In the case of NOV, specialists request rule changes, which are implemented by KPIT resources in India; in effect, changes are made and re-run overnight, which has worked very well

Technical Architecture:  Data is extracted from the legacy systems by Oracle Data Integrator (ODI), which also transforms the data into the right ‘shape’ for review in EDQ.  An Audit Team reviews these results for completeness and correctness, comparing the supplied data against the required data standards.  A secondary check is also performed using EDQ, which verifies that the data is in a valid format to be loaded into JDE.
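
The secondary check amounts to gating each record on the target’s format constraints before load. A toy version of that gate, with made-up JDE field rules standing in for the real EDQ checks, might look like this:

```python
# Hypothetical JDE field constraints; the real rules are implemented in EDQ.
JDE_RULES = {
    "item_number": {"max_len": 25, "required": True},
    "description": {"max_len": 60, "required": True},
    "unit_of_measure": {"max_len": 2, "required": False},
}

def valid_for_load(record):
    """True only if every field satisfies its required/length constraints."""
    for field, rule in JDE_RULES.items():
        value = record.get(field, "")
        if rule["required"] and not value:
            return False
        if len(value) > rule["max_len"]:
            return False
    return True

row = {"item_number": "1234-BRZ", "description": "Bronze bearing sleeve"}
print("load" if valid_for_load(row) else "reject")  # -> load
```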

The Benefit:  The benefit of having data that is “fit for purpose” in JDE is that NOV can mothball the legacy systems and use JDE as a complete and correct record for all kinds of purposes, from operational management to strategic sourcing.  The benefit of having a defined governance process is that it is repeatable: every time the process is run, the individuals and the governance team as a whole learn something from it, and they get better at executing it the next time around.  Because of this, NOV has already seen order-of-magnitude improvements in productivity as well as data quality, and is already looking for ways to expand the program into other areas.

All in all, Melissa and Deepak gave the audience great insight into how they are solving a complex integration problem and reminded us of what we should already know: "integrating" data is not simply moving it. To be of business value, the data must be 'fit for purpose', which often means that both the integration process and the data must be governed.

Monday Feb 16, 2015

The Data Governance Commandments

This is the second of our Data Governance Series. Read the first part here.

The Four Pillars of Data Governance

Our Data Governance Commandments are simple principles that can help your organization get its data story straight, and get more value from customer, performance or employee data.

Data governance is a wide-reaching discipline, but as in all walks of life, there are a handful of essential elements you need in place before you can start really enjoying the benefits of a good data governance strategy. These are the four key pillars of data governance:


The Data Steward

Data is like any other asset your business has: It needs to be properly managed and maintained to ensure it continues delivering the best results.

Enter the data steward: a role dedicated to managing, curating and monitoring the flow of data through your organization. This can be a dedicated individual managing data full-time, or just a role appended to an existing employee’s tasks.

But do you really need one? If you take your data seriously, then someone should certainly be taking on this role, even if they only do it part-time.


Data Management Processes

So what are these data stewards doing with your data exactly? That’s for you to decide, and it’s the quantity and quality of these processes that will determine just how successful your data governance program is.

Whatever cleansing, cleaning and data management processes you undertake, you need to make sure they’re linked to your organization’s key metrics. Data accuracy, accessibility, consistency and completeness all make fine starting metrics, but you should add to these based on your strategic goals.


The Data Warehouse and Metadata Management

No matter how ordered your data is, it still needs somewhere to go, so you need to make sure your data warehouse is up to the task and able to hold all your data in an organized fashion that complies with all your regulatory obligations.

But as data begins filling up your data warehouse, you’ll need to improve your level of data control and consider investing in a tool to better manage metadata: the data about other data. By managing metadata, you master the data itself, and can better anticipate data bottlenecks and discrepancies that could impact your data’s performance.

More importantly, metadata management allows you to better manage the flow of data—wherever it is going. You can manage and better control your data not just within the data warehouse or a business analytics tool, but across all systems, increasing transparency and minimizing security and compliance risks.

But even if you can control data across all your systems, you also need to ensure you have the analytics to put the data to use. Unless actionable insights are gleaned from your data, it’s just taking up space and gathering dust.

Best Practices

For your data governance to really deliver—and keep delivering—you need to follow best practices.

Stakeholders must be identified and held accountable, strategies must be in place to evolve your data workflows, and data KPIs must be measured and monitored. But that’s just the start. Data governance best practices are evolving rapidly, and only by keeping your finger on the pulse of the data industry can you prepare your governance strategy to succeed.

How Many Have You Got?

These four pillars are essential to holding up a great data governance strategy, and if you’re missing even one of them, you’re severely limiting the value and reliability of your data.

If you’re struggling to get all the pillars in place, you might want to read our short guide to data governance success.

Thursday Jan 22, 2015

OTN Virtual Technology Summit Data Integration Subtrack Features Big Data Integration and Governance

I am sure many of you have heard about the quarterly Oracle Technology Network (OTN) Virtual Technology Summits. They provide a hands-on learning experience with the latest offerings from Oracle by bringing together experts from our community and our product management teams.

The next OTN Virtual Technology Summit is scheduled for February 11th (9am-12:30pm PT) and will feature Oracle's big data integration and metadata management capabilities with hands-on lab content.

The Data Integration and Data Warehousing sub-track includes the following sessions and speakers:

Feb 11th 9:30am PT -- HOL: Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors

Oracle GoldenGate 12c is well known for its highly performant data replication between relational databases. With the GoldenGate Adaptors, the tool can now apply the source transactions to a Big Data target, such as HDFS. In this session, we'll explore the different options for utilizing Oracle GoldenGate 12c to perform real-time data replication from a relational source database into HDFS. The GoldenGate Adaptors will be used to load movie data from the source to HDFS for use by Hive. Next, we'll take the demo a step further and publish the source transactions to a Flume agent, allowing Flume to handle the final load into the targets.

Speaker: Michael Rainey, Oracle ACE, Principal Consultant, Rittman Mead
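
For a flavor of what this lab involves, the snippet below sketches the kind of adapter properties that point GoldenGate’s Java delivery at HDFS. Treat it as an illustration under assumed defaults: exact property names, values, and classpath entries vary by adapter release and cluster layout, so consult the documentation for your version.

```properties
# Illustrative GoldenGate big data adapter configuration (not a working file).
gg.handlerlist=hdfs
gg.handler.hdfs.type=hdfs
gg.handler.hdfs.rootFilePath=/user/ogg/movie_data
gg.handler.hdfs.format=delimitedtext

# Hadoop client libraries must be visible on the adapter classpath.
gg.classpath=/usr/lib/hadoop/client/*
```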

Feb 11th 10:30am PT -- HOL: Bringing Oracle Big Data SQL to Oracle Data Integration 12c Mappings

Oracle Big Data SQL extends Oracle SQL and Oracle Exadata SmartScan technology to Hadoop, giving developers the ability to execute Oracle SQL transformations against Apache Hive tables and extending the Oracle Database data dictionary to the Hive metastore. In this session we'll look at how Oracle Big Data SQL can be used to create ODI12c mappings against both Oracle Database and Hive tables, to combine customer data held in Oracle tables with incoming purchase activities stored on a Hadoop cluster. We'll look at the new transformation capabilities this gives you over Hadoop data, and how you can use ODI12c's Sqoop integration to copy the combined dataset back into the Hadoop environment.

Speaker: Mark Rittman, Oracle ACE Director, CTO and Co-Founder, Rittman Mead
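
To picture what such a mapping resolves to under the covers, here is a hedged sketch: an external table defined over a Hive table with the ORACLE_HIVE access driver, then an ordinary join between Oracle-resident customers and Hadoop-resident activity. The connection string and all table, column, and directory names are invented for the example, and the DDL assumes a Big Data SQL-enabled database.

```python
import cx_Oracle

# Placeholder credentials and service name.
conn = cx_Oracle.connect("scott/tiger@bigdatadb")
cur = conn.cursor()

# External table over a Hive table via the ORACLE_HIVE access driver.
cur.execute("""
    CREATE TABLE movie_activity_ext (
        cust_id  NUMBER,
        movie_id NUMBER,
        activity VARCHAR2(20))
    ORGANIZATION EXTERNAL (
        TYPE ORACLE_HIVE
        DEFAULT DIRECTORY default_dir
        ACCESS PARAMETERS (com.oracle.bigdata.tablename=moviedemo.movie_activity))
    REJECT LIMIT UNLIMITED""")

# Ordinary Oracle SQL now joins relational and Hadoop-resident data.
cur.execute("""
    SELECT c.cust_name, COUNT(*)
    FROM customers c JOIN movie_activity_ext a ON c.cust_id = a.cust_id
    GROUP BY c.cust_name""")
for name, activity_count in cur:
    print(name, activity_count)
```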

Feb 11th 11:30am PT -- An Introduction to Oracle Enterprise Metadata Manager

This session takes a deep technical dive into the recently released Oracle Enterprise Metadata Manager. You’ll see the standard features of data lineage, impact analysis, and version management applied across a myriad of Oracle and non-Oracle technologies to form a consistent metadata whole, including Oracle Database, Oracle Data Integrator, Oracle Business Intelligence, and Hadoop. This session will examine the Oracle Enterprise Metadata Manager "bridge" architecture and how it is similar to the ODI knowledge module. You will learn how to harvest individual sources of metadata, such as OBIEE, ODI, the Oracle Database, and Hadoop, and how to create OEMM configurations that combine multiple metadata stores into a single coherent metadata strategy.

Speaker: Stewart Bryson, Oracle ACE Director, Owner and Co-founder, Red Pill Analytics

I invite you to register now for this free event and enjoy this feast for big data integration and governance enthusiasts.

Americas -- February 11th/ 9am to 12:30pm PT- Register Now

Please note that the same OTN Virtual Technology Summit content will be presented again for EMEA and APAC. You can register via the links below.

EMEA – February 25th / 9am to 12:30pm GMT* - Register Now

APAC – March 4th / 9:30am-1:00pm IST* - Register Now

Join us and let us know how you like the data integration sessions in this quarter's OTN event.

Wednesday Dec 10, 2014

Oracle Enterprise Metadata Management is now available!

As a quick refresher, metadata management is essential for solving a wide variety of critical business and technical challenges: understanding how report figures are calculated, assessing the impact of changes to upstream data, providing reports in a business-friendly way in the browser, and enabling reporting on the entire metadata of an enterprise for analysis and improvement. Oracle Enterprise Metadata Management is built to solve all these pressing needs for customers in a lightweight browser-based interface. Today, we announce the availability of Oracle Enterprise Metadata Management as we continue to enhance this offering.

With Oracle Enterprise Metadata Management, you will find business glossary updates, user interface updates for a better experience, as well as improved and new metadata harvesting bridges, including Oracle SQL Developer Data Modeler, Microsoft SQL Server Integration Services, SAP Sybase PowerDesigner, Tableau, and more. There are also new dedicated web pages for tracing data lineage and impact! At a more granular level, you will also find new customizable action menus per repository object type for more personalization. For a full read on new features, please read here. Additionally, view here for the certification matrix details.

Download Oracle Enterprise Metadata Management!

Thursday Aug 15, 2013

Putting Data to Work Using Enterprise Data Quality

Data Quality is a hot new topic that may not get covered much on our data integration blog [we’re now changing that!]. Oracle Enterprise Data Quality is an essential element in the data integration portfolio. You can think of Oracle Enterprise Data Quality as your favorite Swiss Army multitool, one that can be applied to any number of data management and governance situations. These cases range from master data management, application integration, and system migration to BI and data warehousing, and finally governance, risk, and compliance solutions. Let’s look at a couple of examples for each case:

  • Master your Data. No matter the type of information you are mastering and syncing—product, customer, supplier, employee, site, citizen, or any other type—you need a data quality process for initial cleanup and load as well as for ongoing duplicate prevention and governance. A sophisticated MDM program will even verify data as it is entered into spoke systems, preventing data quality problems at the source. For some organizations, the idea of implementing a data hub is just too daunting. Implementing an enterprise data quality (EDQ) solution to clean up legacy systems and drive consistent standards won’t deliver as much benefit as a true MDM solution, but it’s a cheaper, faster alternative that can help in the short term and pave the way for a more comprehensive MDM strategy down the road.
  • Drive Application Innovations. Business applications are only as useful as the data they present. The more complex your application landscape, the more likely you are to need a data quality program to drive consistent data across all your systems. EDQ solutions can clean up legacy systems and put preventative verification and governance processes in place to ensure a steady stream of high-quality data for all kinds of systems, such as customer relationship management, human resources, product lifecycle management, and search (especially for solutions based on Oracle Endeca, which thrives on well-structured, well-standardized data).
  • Simplify Migrations. Merging disparate data from many sources requires a disciplined approach to profiling, discovery, standardization, match, merge, case management, and governance. EDQ solutions pave the way to a smooth implementation by providing the tools you need to make sure you maintain data quality as you migrate systems.
  • Improve Business Insights. As with other business applications, BI is a case where putting garbage data in means you’ll get garbage data out. EDQ solutions can be used to standardize and deduplicate the data being loaded into a data warehouse; see the sketch after this list.
  • Deliver on Compliance. Regulatory compliance of all kinds—including policies related to taxes, privacy, antiterror, and antimoney-laundering—require matching up data pulled from a variety of sources. With EDQ solutions, organizations can meet regulatory mandates with capabilities that support everything from simple deduplication of customer lists to matching data against government lists of suspected terrorists.
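
The warehouse-loading case is easy to picture in code. The sketch below is plain Python standing in for what EDQ does at enterprise scale: normalize incoming names, then collapse near-duplicates before they reach the warehouse.

```python
import re
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, strip punctuation, and collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", name)).strip().lower()

def is_duplicate(a, b, threshold=0.9):
    """Treat two names as duplicates above a similarity threshold."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

incoming = ["Acme Corp.", "ACME Corp", "Globex Inc", "Acme Corporation"]
loaded = []
for name in incoming:
    if not any(is_duplicate(name, kept) for kept in loaded):
        loaded.append(name)
print(loaded)  # ['Acme Corp.', 'Globex Inc', 'Acme Corporation']
```

Real EDQ match processes layer phonetic keys, reference-data enrichment, and case management on top of this basic idea.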

You can learn more about our Enterprise Data Quality multitool in our upcoming webcast! Unlike Swiss Army knives, it is guaranteed never to rust or stop you at an airport metal detector.

Watch Putting Data to Work Using Oracle Enterprise Data Quality Solutions on Tuesday, August 27 at 10:00 a.m. PT, and learn more about the Oracle Enterprise Data Quality suite of products.

Tuesday Oct 25, 2011

Enabling Data Governance with Oracle Enterprise Data Quality

Data Governance is one of those terms in the IT ecosystem with a pretty nebulous definition. It can mean different things to different people: depending on who you ask, it is as simple as extra monitoring of your business processes with some additional auditing of IT processes, or it is the point where data quality, data management, data policies, business process management, and risk management converge in the handling of data in an organization.  Data Quality is an important component of Data Governance (and, for that matter, of any IT initiative, be it Business Intelligence, Data Warehousing, or Master Data Management) and enables organizations to do a better job of governing their data in the following ways:

  • Brings clarity and transparency to data
  • Ensures high data quality for enterprise IT systems that need it
  • Provides data control and standardization

Oracle Enterprise Data Quality is a data quality solution that handles both customer and product data, with approaches geared to the unique characteristics of each data domain.  In addition, it plays an important role in helping organizations manage data quality as part of data governance projects by ensuring that poor-quality data is cleansed, standardized, and ready to meet compliance mandates, thereby making sure that "dirty data" does not cascade throughout the enterprise and infiltrate IT systems.  Join us for a webcast on this very topic on October 27, 2011 at 10 AM PT, and learn how Oracle Enterprise Data Quality can give you the tools you need to drive your Data Governance projects forward to completion.  Register today for this webcast.

For more information on Oracle Enterprise Data Quality visit us here.



