by Aaron Lazenby
Say goodbye to the standard deviation. With big data, you can get to know your customers without the guesswork, says Paul Sonderegger, big data strategist at Oracle.
“In the twentieth century, it was cost-prohibitive to look at your entire data set, so most companies would take a representative sample,” Sonderegger says. “Now, you can analyze the entire customer population, over a longer period of time. The advantage is a much more precise analysis, not just in terms of overall estimates but also getting detailed information on individual customers.”
Plus, says Sonderegger, company leaders can use big data to pivot quickly to make smart decisions about what their customers want. “Using this new abundance of data, and this new cost-effectiveness of computing and storage power, you can effectively reduce the time, cost, and effort of asking questions you just realized you care about, even if they require very large amounts of data,” he says.
Here, Profit talks to Sonderegger about why Oracle’s strategy for engineering hardware and software to work together makes it easier to experiment with data—and glean new insights. Sonderegger also shares some advice on what executives need to do now to maximize big data’s big benefits.
Profit: How is Oracle helping customers realize big data’s potential?
Sonderegger: Big data is not magic. Somewhere, a physical machine—or many machines working together—has to crunch your data. As the abundance of data grows exponentially, storing, processing, and moving that data on those physical machines must become more cost-effective overall. As a technology company, one of the ways we do that is by designing hardware and software to mutually improve each other’s capabilities, as Oracle did with Oracle Exadata machines.
For example, Oracle has created algorithmic caching behaviors that exploit the known hardware configuration of our machines. These algorithms keep data that you use a lot, or used most recently, right at hand. Data that has not been used for a while can be moved out of memory and stored on disk. And these algorithms will do all of this management on your behalf.
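Oracle’s actual caching algorithms are proprietary, but the general behavior described above—keep recently used data “right at hand” in memory, and demote data that hasn’t been touched for a while to disk—is a classic recency-based (LRU) policy. A minimal Python sketch of that idea, with a dictionary standing in for the disk tier:

```python
from collections import OrderedDict

class TieredCache:
    """Toy recency-based cache: hot items stay in the memory tier,
    least-recently-used items are demoted to a 'disk' tier."""

    def __init__(self, memory_slots=3):
        self.memory = OrderedDict()   # hot tier; most recently used is last
        self.disk = {}                # cold tier (stand-in for disk storage)
        self.memory_slots = memory_slots

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)        # mark as recently used
            return self.memory[key]
        if key in self.disk:                    # promote back into memory
            return self.put(key, self.disk.pop(key))
        raise KeyError(key)

    def put(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.memory_slots:
            old_key, old_value = self.memory.popitem(last=False)
            self.disk[old_key] = old_value      # demote least-recently-used item
        return value

cache = TieredCache(memory_slots=2)
cache.put("a", 1); cache.put("b", 2); cache.put("c", 3)
print("a" in cache.disk)   # True: "a" was least recently used, so it moved to disk
```

The management happens entirely inside `get` and `put`, which mirrors Sonderegger’s point: the algorithm does the tiering on the user’s behalf.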
Let’s say you’re a manufacturer who wants to do analysis using data that came from sensors on the test flight of a new plane. This data is stored in a NoSQL database, and now you want to compare that with data from past test flights, which is stored in a data warehouse. You can achieve this with Oracle Big Data SQL, which knows how to optimize a single query that will execute where the data lives in those different systems.
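Oracle Big Data SQL is a real product with its own engine, but the underlying idea—one logical query whose filtering is pushed down to each system that holds the data—can be sketched in plain Python. The store names and record layout here are hypothetical, chosen to match the test-flight example:

```python
# Hypothetical illustration of federated querying: one logical query,
# with filtering "pushed down" to each system that holds the data.

# Sensor readings from the new test flight (stand-in for a NoSQL database)
nosql_store = [
    {"flight": "TF-100", "sensor": "engine_temp", "value": 812},
    {"flight": "TF-100", "sensor": "vibration", "value": 3.1},
]

# Readings from past test flights (stand-in for a data warehouse)
warehouse = [
    {"flight": "TF-042", "sensor": "engine_temp", "value": 798},
    {"flight": "TF-077", "sensor": "engine_temp", "value": 805},
]

def push_down(source, predicate):
    """Each system filters its own rows; only matches cross the wire."""
    return [row for row in source if predicate(row)]

def federated_query(predicate):
    """A single logical query spanning both systems."""
    return push_down(nosql_store, predicate) + push_down(warehouse, predicate)

temps = federated_query(lambda r: r["sensor"] == "engine_temp")
print([r["value"] for r in temps])   # [812, 798, 805]
```

The point of the pattern is that the caller writes one query and never has to know which system each row came from.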
Expanding the footprint through engineered systems is basically the “iPhone-ization” of the data center: it treats storage, processing, analysis, and capacity as if they were parts of a single high-performance system.
Profit: What role does database as a service play in terms of performance?
Sonderegger: In many cases, large organizations have thousands of applications, each depending on a database underneath. Many of our customers have seen tremendous benefits from consolidating those databases onto Oracle Exadata machines. With Oracle Database 12c, those databases can run as pluggable databases inside a 12c container database, which lets them share memory and processing resources. By doing this, the potential for both cost savings and performance improvements goes up.
Also, database as a service allows workers in the organization who want to experiment with data, or create an algorithm, to spin up a new database or even a working copy of a production one very quickly and cost-effectively. This is great when data scientists are exploring existing data for new value. They don’t know yet if they’ll find new insights or ideas. But there’s only one way to find out.
Profit: How can executives leverage these technologies to get more out of their data?
Sonderegger: One of the most important things executives can do is create teams that look for the option value of existing data. Option value means using data already collected for some new, secondary purpose. Let’s say as part of making your inventory management process more efficient, you’ve collected data to optimize where you put things on the shelf or how many times you move them in the warehouse. You may be able to use that same data in a way you hadn’t intended, such as to improve your negotiating position with your supplier by buying larger lots less frequently.
The challenge is that the data was not designed to answer your question about how to influence your supplier to, say, repackage what they ship to you. This is where data warehouses working together with data reservoirs—which are [Apache] Hadoop clusters—become extremely powerful tools for creating new combinations of data, and changing the shape of data you’ve already collected, in order to use it for new purposes you didn’t originally envision.
Photography by Shutterstock