Oracle Database 12c Release 2 has been available on Oracle
Cloud since Nov 4, 2016. Today, we have announced Oracle Database 12c Release 2 for
on-premises as well. Oracle Sharding is one of the marquee features of Database 12.2.
We, from the Oracle Sharding Product Development team will be publishing periodic blog posts on various topics on Sharding. The topics will include - Oracle Sharding benefits, capabilities, methods, data modeling and application requirements, high availability architecture, replication, deployment automation, direct and proxy routing, life cycle management, benchmarking results, monitoring, patching and many others that you will find interesting.
So, what is Oracle Sharding? It is a scalability and availability feature for custom-designed OLTP applications that enables
distribution and replication of data across a pool of discrete Oracle databases that share no hardware or software.
Each database in the pool is referred to as a shard. The pool of shards is presented to an application as a single
logical Oracle database (a sharded database or SDB).
Oracle sharding distributes data across shards using horizontal partitioning. Horizontal partitioning splits a database
table across shards so that each shard contains the table with the same columns but a different subset of rows.
The number of shards and the distribution of data across them are completely transparent to
database applications. SQL statements issued by an application do not refer to shards nor are they dependent on
the number of shards and their configuration.
OLTP applications must be explicitly designed for a sharded database architecture in order to realize the benefits of
scalability and availability. This is different from an HA architecture based upon Oracle Real Application Clusters
(Oracle RAC) where scalability and availability are achieved transparent to an application. Applications that use a
sharded database must have a well-defined data model and data distribution strategy (consistent hash, range, list or
composite) that primarily accesses data via a sharding key. Examples of a shard key includes customer_id,
account_no, country_id, etc. Oracle Sharding also supports data placement policies (rack and geo awareness) and
all deployment models: on-premises and public or hybrid clouds.
Transactions that require high performance must be single-shard transactions. For example, lookup and update of a customer’s billing record, lookup and update of a subscriber’s
documents etc. There is no communication or coordination between shards for high performance transactions. Multi-shard operations and non-sharding key access are also supported.
Such transactions include simple aggregations, reporting, etc. In return for these design considerations, applications that run on a sharded database architecture can achieve even
higher levels of scalability and availability. Performance scales linearly as shards are added to the pool because
each shard is completely independent from other shards. Each shard typically uses local storage, flash, and memory
offering customers a further opportunity to optimize performance at relatively low cost. The first release of Oracle
Sharding is designed to scale up to 1,000 shards. Isolation between shards also means that outages or poor
performance of one shard does not impact the availability or performance of transactions executing at other shards.
High Availability (HA) for individual shards is provided by automatic deployment of database replication. Simple,
one-way Data Guard physical replication with automatic database failover is the default configuration. Active Data
Guard (copies open read-only) or Oracle GoldenGate (bi-directional replication with all copies open read-write) may
also be automatically deployed. Shards may be replicated within and across data centers. Replication is data-center
and rack aware using data placement policies supported by Oracle Sharding. Optionally, Oracle RAC may be
manually configured to provide Shard HA.
Shards are front-ended by a set of replicated listeners called Shard Directors that act as routers. Oracle clients
(JDBC, OCI, and ODP.net) and the Oracle Universal Connection Pool (UCP) have been enhanced to recognize
shard keys specified in a connection string and to insure availability by controlling the maximum number of
connections allowed per shard. A shard routing cache in the connection layer (populated by the initial request to a
shard) is used to route requests directly to the shard where the data resides for optimal runtime performance. The
shard routing cache is automatically refreshed if there is any change made to the sharded database (e.g. automatic
rebalancing or add/delete of shards).
In this post, we have introduced you to Oracle Sharding at a high level. In the next post, we will look at the benefits of Oracle Sharding.