Polyglot persistence is a term you might have heard often in recent years in discussions of cloud computing. Although it sounds sophisticated, polyglot persistence simply means using different data stores to store and process data for the different functions of an application. For example, an e-commerce website that sells products online might use a NoSQL store for the session state of customers shopping on the site, while the payment system that captures credit card information persists it to a relational database like Oracle.
In a similar fashion, you can implement different services to use different data stores and avoid building a monolithic application where one database failure can bring the entire business down. The need for polyglot data stores isn’t just about high availability. It also comes from the scalability demands of an internet-scale application.
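The failure isolation described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `InMemoryStore`, `SessionService`, and `PaymentService` are hypothetical names, and the in-memory store is only a stand-in for a real NoSQL or relational database.

```python
class StoreUnavailable(Exception):
    """Raised by a stand-in store that has been marked as down."""


class InMemoryStore:
    """Hypothetical stand-in for a real data store (NoSQL or relational)."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self._data = {}

    def _check(self):
        if not self.available:
            raise StoreUnavailable(self.name)

    def put(self, key, value):
        self._check()
        self._data[key] = value

    def get(self, key):
        self._check()
        return self._data.get(key)


class SessionService:
    """Keeps shopping-cart session state in its own store."""

    def __init__(self, store):
        self._store = store

    def add_to_cart(self, session_id, item):
        cart = self._store.get(session_id) or []
        self._store.put(session_id, cart + [item])

    def cart(self, session_id):
        return self._store.get(session_id) or []


class PaymentService:
    """Keeps payment records in a separate store."""

    def __init__(self, store):
        self._store = store

    def record_payment(self, order_id, amount):
        self._store.put(order_id, {"amount": amount, "status": "captured"})

    def payment(self, order_id):
        return self._store.get(order_id)


# Each service owns its own data store.
session_store = InMemoryStore("session-nosql")
payment_store = InMemoryStore("payments-relational")
sessions = SessionService(session_store)
payments = PaymentService(payment_store)

sessions.add_to_cart("s1", "book")
payments.record_payment("o1", 25.0)

# Simulate an outage of the session store only.
session_store.available = False
try:
    sessions.cart("s1")
    session_outage_seen = False
except StoreUnavailable:
    session_outage_seen = True

# The payment service is unaffected because it does not share that store.
payments.record_payment("o2", 40.0)
```

In this sketch, the outage of the session store surfaces only in the session service, while payments continue to be recorded. With a single shared database, both services would have failed together.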
If you’ve worked as a data professional in the enterprise market in the last two decades, you’ve probably used polyglot persistence without ever knowing the term. Back in the on-premises days, an enterprise suite of applications typically used various relational databases, flat files on NFS storage, and other data storage technologies to store and process different types of data as it moved from source to target systems.
For example, the main transactional database for an organization could be running on an IBM mainframe, with an auxiliary application running on Microsoft SQL Server and a set of post-ETL files staged on a NAS box before being consumed by a downstream data warehouse running on Oracle Database. This example is a classic case of polyglot persistence from before the cloud era. It simply had no fancy name.
With the definition of polyglot persistence out of the way, let’s look at what it means in the era of cloud computing, where terms like microservices, serverless, big data, and elasticity are often thrown around as buzzwords. In the cloud, you deploy and run applications that aren’t hosted in your own data center. With that capacity comes the added agility to provision and consume data more rapidly. Traditional enterprises had the resources to build data centers and maintain them by hiring an army of system and network operations staff.
However, in the early 2000s, a lot of new companies started appearing. Although they started by building their own data centers, building and operating them did not contribute to their core business. Netflix is a case in point: it is a creator and distributor of content. Managing data centers is not its strength and doesn’t make business sense for it. A few years back, Netflix completely migrated from its own data centers to a cloud provider. Gradually, more companies saw the sense in this model, to the point where new startups began building their services with a cloud-first approach, which is now called cloud native.
With the advent of big data technology and internet-scale object storage, which resembles the file storage of the on-premises days, you suddenly had more choices for storing and processing data. The increased velocity and volume of data creation quickly overwhelmed the scaling capacity of relational databases, leading to a new breed of NoSQL databases: databases with a simple schemaless model that, at its most basic, has only a key and a matching value. These new computing paradigms led to an explosion in NoSQL database offerings, such as MongoDB, Cassandra, and Amazon DynamoDB.
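To make the key and value model concrete, here is a minimal Python sketch. It assumes a plain dictionary as a stand-in for a key-value NoSQL table, and the `put` and `get` helpers are hypothetical names, not the API of any product mentioned above. Values are stored as opaque JSON strings, so two records under similar keys can have different shapes without any schema change.

```python
import json

# Stand-in for a key-value NoSQL table: keys map to opaque serialized values.
kv_store = {}


def put(key, value):
    """Serialize the value and store it under the key; no schema is enforced."""
    kv_store[key] = json.dumps(value)


def get(key):
    """Return the deserialized value, or None if the key is absent."""
    raw = kv_store.get(key)
    return None if raw is None else json.loads(raw)


# Two session records with different fields live side by side.
put("session:42", {"cart": ["book", "lamp"]})
put("session:43", {"cart": [], "coupon": "WELCOME"})
```

The store never inspects the value, which is what makes the model easy to partition across many machines by key. The trade-off is that any query other than a lookup by key has to be handled by the application.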
Where does all this fit in with having multiple data stores? The same startups and scale-ups were building their applications in the cloud for an internet-scale audience. With internet scale comes a large volume of data, much of which isn’t polished enough to be consumed immediately by an application that can derive value from it. Relational databases weren’t made for these types of workloads.
So, the startups and scale-ups picked different hammers for different nails, because one size does not fit all. For example, the frontend of a ride-sharing app might use Kafka to stream requests into a Hadoop Hive store and a relational database to store customer and driver profiles, while a column data store is used for analytics. This example is a simplified view of a complex orchestration of services that interact with each other, each using a different data store to achieve a certain function. That’s what polyglot persistence is.
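The ride-sharing example can be sketched with standard-library stand-ins. Everything here is an assumption made for illustration: a `deque` plays the role of the Kafka stream, a plain list plays the Hive landing store, an in-memory SQLite database plays the relational profile database, and a dictionary of lists plays the column store. The function names and sample riders are hypothetical.

```python
import sqlite3
from collections import deque

# Stand-in for a streaming platform such as Kafka: an in-process queue.
ride_requests = deque()

# Stand-in for the Hive landing store: an append-only list of raw events.
landing_store = []

# Stand-in for the relational profile database: in-memory SQLite.
profiles = sqlite3.connect(":memory:")
profiles.execute("CREATE TABLE rider (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
profiles.executemany(
    "INSERT INTO rider (id, name) VALUES (?, ?)", [(1, "Ada"), (2, "Lin")]
)

# Stand-in for a column store: one list per column.
rides_by_column = {"rider_id": [], "distance_km": []}


def request_ride(rider_id, distance_km):
    """Frontend: publish a ride request to the stream."""
    ride_requests.append({"rider_id": rider_id, "distance_km": distance_km})


def drain_requests():
    """Consumer: move streamed events into the landing store."""
    while ride_requests:
        landing_store.append(ride_requests.popleft())


def load_columns():
    """Batch job: rebuild the columnar view from the landing store."""
    for column in rides_by_column:
        rides_by_column[column] = [event[column] for event in landing_store]


def rider_name(rider_id):
    """Profile lookup served by the relational store."""
    row = profiles.execute(
        "SELECT name FROM rider WHERE id = ?", (rider_id,)
    ).fetchone()
    return row[0] if row else None


def total_distance():
    """Analytics query that scans a single column."""
    return sum(rides_by_column["distance_km"])


request_ride(1, 4.5)
request_ride(2, 10.0)
request_ride(1, 2.5)
drain_requests()
load_columns()
```

Each function touches only the store suited to its access pattern: appends go to the stream, point lookups go to the relational table, and the aggregate scans one column. A real deployment would replace each stand-in with a client for the corresponding service.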
So, the startups and scale-ups need their multiple data stores, but you might ask why enterprises need them too. In the last decade, enterprise computing and internet-scale computing have increasingly overlapped, as enterprises look to increase profits and advertise traditional products through the internet. You commonly see ads for home loans in your Instagram or Facebook feed these days. To launch a niche product from a large company, you need the same approach as a startup to process and store the data. Enterprises ended up facing the same challenge as startups: minimizing capital expenditure on setting up data centers. Their new priority was to focus on their core product and find the means to consume massive amounts of data from a variety of sources. Storing and processing all this data meant that enterprises now needed to cater to an internet-scale audience.
Oracle has always been identified with enterprise applications, and Oracle Database has been the main data store for enterprise apps for over four decades. In 2018, aware that enterprise workloads were becoming increasingly intertwined with internet-scale computing, Oracle launched a Generation 2 Cloud offering called Oracle Cloud Infrastructure (OCI). OCI was built on lessons learned from the mistakes and shortcomings of other clouds.
OCI offers a range of data stores that make the discipline of polyglot persistence available to enterprises at internet scale, with the security and balanced performance that enterprise applications expect.
Figure 1: OCI data stores for polyglot persistence
OCI has multiple data services for building a cloud native application or running a monolithic application with a different persistent data store for each module of the app, including the following examples:
Oracle Cloud Infrastructure offers various data services to create cloud native polyglot applications or lift and shift enterprise applications to the cloud and refactor them to use different data stores for different functionalities.
The following example uses a wireframe diagram to show how you can use different OCI data storage offerings to build an e-commerce store on OCI using polyglot persistence. Each microservice of this online store uses a different storage option to persist its data.
Get started with OCI data services now. The Always Free tier includes the following services, which you can use for an unlimited time:
Two Oracle Autonomous Databases with powerful tools like Oracle Application Express (APEX) and Oracle SQL Developer
Two AMD Compute virtual machines
Up to four instances of Arm Ampere A1 Compute
Block, Object, and Archive Storage, load balancer, data egress, Monitoring, and Notifications
For more information on Oracle Cloud Infrastructure’s offerings, see the following resources: