Introduction
Spring JDBC 6.1.2 introduces a new database Sharding feature. This blog walks you through this new capability, when and how to use it and provides a demo using Oracle Sharding.
Sharding is a database scaling technique based on the horizontal partitioning of data across multiple independent physical databases. Each physical database in such configuration is called a shard. A sharded table is partitioned across all shards. From an application perspective, a sharded database looks like a single database; the number of shards, and the distribution of data across those shards, are completely transparent to the application.
Benefits of Sharding
Quoting the documentation, sharding provides linear scalability, complete fault isolation, and global data distribution for the most demanding applications.
- The shared–nothing architecture eliminates performance bottlenecks and provides unlimited scalability.
- Single points of failure are eliminated because shards do not share resources such as software, CPU, memory, or storage devices. The failure or slow-down of one shard does not affect the performance or availability of other shards.
- Sharding enables Global Database where a single logical database could be distributed over multiple geographies. This makes it possible to satisfy data privacy regulatory requirements (Data Sovereignty) as well as allows to store particular data close to its consumers (Data Proximity).
Database query and DML requests are routed to the shards in two main ways, depending on whether a sharding key is supplied with the request or not. These two routing methods are called direct routing and proxy routing.
- Direct Routing: you can connect directly to the shards to execute queries and DML by providing a sharding key with the database request. Direct routing is the preferred way of accessing shards to achieve better performance, among other benefits.
- Proxy Routing: Queries that need data from multiple shards, and queries that do not specify a sharding key, cannot be routed directly by the application. Those queries require a proxy to route requests between the application and the shards. Proxy routing is handled by the shard catalog query coordinator.
APIs description
Spring JDBC version 6.1.2 introduces a ShardingKeyDataSourceAdapter (adapter for a target DataSource) that is designed to apply sharding keys, if specified, to every standard getConnection() call, returning a direct connection to the shard corresponding to the specified sharding key value. All other methods are simply delegated to the corresponding methods of the target DataSource. This proxy datasource takes a ShardingKeyProvider object as an attribute and uses it to obtain the sharding keys.
The ShardingKeyProvider is an interface that defines two methods : getShardingKey and getSuperShardingKey. These methods are used by the ShardingKeyDataSourceAdapter to retrieve the sharding key relevant to the current context in order to obtain a direct shard connection.
In order to use direct shard queries with Spring you have to:
- Implement a ShardingKeyProvider
- Create the DataSource using the ShardingKeyDataSourceAdapter
The ShardingKeyDataSourceAdapter and ShardingKeyProvider can be used both with Spring’s JDBCTemplate and JPA.
Note that the ShardingKetDataSourceAdapter can only be used with a DataSource that implements the createConnectionBuilder method introduced in JDK 9. The Oracle libraries below implement this method, they can be downloaded here.
- Version 23c supports Oracle Database versions 23c, 21c and 19c
- JDBC Driver : ojdbc11
- UCP : ucp11
- Version 21c supports Oracle Database versions 21c and 19c
- JDBC Driver : ojdbc11
- UCP : ucp11
Implementing a ShardingKeyProvider
The following sample uses a HTTP request attribute to build the sharding key. When getShardingKey is called, the provider will read the HTTP request attribute, build a ShardingKey object and return it, if the attribute is not present or is null, this method will return null. In this example there is no super sharding key, the getSuperShardingKey method was not implemented and will return null.
@Component
public class NumericHttpRequestAttributeShardingKeyProvider
implements ShardingKeyProvider {
@Autowired
HttpServletRequest httpRequest;
@Override
public ShardingKey getShardingKey() throws SQLException {
Object httpRequestAttribute = httpRequest.getAttribute("attribute name");
if (httpRequestAttribute != null) {
int value = Integer.parseInt(httpRequestAttribute.toString());
return new OracleShardingKeyBuilderImpl()
.subkey(value, JDBCType.NUMERIC).build();
} else {
return null;
}
}
}
Declaring the DataSource bean using ShardingKeyDataSourceAdapter
The DataSource configuration uses a UCP connection pool and the ShardingKeyDataSourceAdapter sets the ShardingKeyProvider to a instance of NumericHttpRequestAttributeShardingKeyProvider.
@Configuration
public class DataSourceConfiguration {
@Autowired
NumericHttpRequestAttributeShardingKeyProvider requestProvider;
@Bean
DataSource getDataSource() throws SQLException {
PoolDataSource poolDataSource = new PoolDataSourceImpl();
poolDataSource.setURL(URL);
poolDataSource.setUser(USERNAME);
poolDataSource.setPassword(PASSWORD);
poolDataSource.
setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
poolDataSource.setInitialPoolSize(0);
poolDataSource.setMinPoolSize(0);
poolDataSource.setMaxPoolSize(MAX_POOL_SIZE);
ShardingKeyDataSourceAdapter dataSourceAdapter =
new ShardingKeyDataSourceAdapter(poolDataSource, requestProvider);
return dataSourceAdapter;
}
}
Demo using Oracle Sharding
A sample application performing direct shard queries using Spring and Oracle JDBC can be found in this GitHub project. This application implements a REST API that manages user notes and has been developped in two versions: one using Spring’s JDBCTemaplate and the other using Spring data JPA.
These are the endpoints implemented in the sample application : (cf. api description).
| GET | /users | Gets all users the currect user has access to | Multi-shard |
| GET | /users/all/notes | Gets all notes for all users | Multi-shard |
| POST | /users/{userid}/notes | Adds a note for a user | Single-shard |
| GET | /users/{userid}/notes | Gets all notes for a user | Single-shard |
| GET | /users/{userid}/notes/{noteid} | Gets a note for a user | Single-shard |
| PUT | /users/{userid}/notes/{noteid} | Updates a note for a user | Single-shard |
| DELETE | /users/{userid}/notes/{noteid} | Deletes a note | Single-shard |
Deploying the sharded database
To deploy the sharded database you can either use the Oracle Database Operator for Kubernetes or follow the instructions in the documentation. The data used in this application is stored in two sharded tables “users” and “notes” belonging to a single sharding group, the sharding key is the “user_id” column.
CREATE SHARDED TABLE users (
user_id NUMBER PRIMARY KEY,
name VARCHAR2(100),
password VARCHAR2(255),
role VARCHAR2(5),
CONSTRAINT roleCheck CHECK (role IN ('USER', 'ADMIN')))
TABLESPACE SET ts1 PARTITION BY CONSISTENT HASH (user_id);
CREATE SHARDED TABLE notes (
note_id RAW(16),
user_id NUMBER NOT NULL,
title VARCHAR2(255),
content CLOB,
CONSTRAINT notePK PRIMARY KEY (note_id, user_id),
CONSTRAINT UserFK FOREIGN KEY (user_id) REFERENCES users(user_id))
PARTITION BY REFERENCE (UserFK);
To create a schema in an SDB, you must issue DDL commands on the shard catalog database, which validates the DDLs and executes them locally first. Therefore, the shard catalog database contains local copies of all of the objects that exist in the sharded database, and serves as the master copy of the SDB schema. If the catalog validation and execution of DDLs are successful, the DDLs are automatically propagated to all of the shards and applied in the order in which they were issued on the shard catalog.
Detailed instructions on how to create the sharded database used by the application can be found here.
Database access
This application performs both single shard and multi-shard queries, and uses two different datasources: a direct-shard data source and a catalog data source. The direct-shard data source is used to perform single shard operations and the catalog data source is used to perform both single-shard and multi-shard operations. The direct-shard data source uses the new sharding features added in version 6.1.2 of Spring. The ShardingKeyProvider uses the path variable “userid” to set the sharding key.
The CatalogDataSource uses routing by proxy and will be used in the following cases:
- When the query involves several users (multi-shard). Ex.: listing all users or listing all notes for all users.
- When the query involves a sing user (single-shard) and the userId (sharding key) is unknown. Ex.: getting a user by username.
The DirectShardDataSource uses direct-shard routing and will be used when a query involves a single user and the userId (sharding key) is known. Ex.: Getting all notes for a user
Using Spring’s JDBCTemplate
When using Spring’s JDBCTemplate the application configures two different JDBCTemplate and TransactionManager beans. The first is used to perform queries through the catalog database using the catalog data source and the other one is used to connect directly to the database where the data is stored using the direct-shard data source.
Both JDBCTemplate beans are injected in the service layer and the appropriate been is using according to the operation to perform. If the sharding key (userId in this case) is known and the query involves a single user, the JDBCTemplate instance using the direct-shard data source is used (cf. method getNotesForUser). The image below shows a simplified sequence diagram representing a direct-shard query execution.
@Override public List<Note> getNotesForUser(Long userId) {
String query = "SELECT * FROM notes n WHERE n.user_id = ?";
return directShardJdbcTemplate.query(query, noteRowMapper, userId);
}
If the sharding key is unknown or if the query involves several users the JDBCTemplate instance using the catalog data source will be used (cf. method getAllNotes below). The image below shows a simplified sequence diagram representing a catalog query execution.
@Override
public List<Note> getAllNotes() {
return catalogJdbcTemplate.query("SELECT * FROM notes n", noteRowMapper);
}
Using JPA
Similarly to JDBCTemplate, when using JPA two data sources will also be needed. Each data source will have its own EntityManagerFactory, TransactionManager and Repositories. In this demo the repositories are exactly the same for both data sources. The package com.oracle.jdbc.sample.springsharding.dataaccess contains these repositories, its child packages catalog and directshard contain child repositories for each data source.The annotation EnableJPARepository allows to make the link between the package, the EntityManagerFactory and the TransactionManager.
@EnableJpaRepositories(
basePackages = "com.oracle.jdbc.sample.springsharding.dataaccess.catalog",
entityManagerFactoryRef = "entityManagerFactory",
transactionManagerRef = "transactionManager"
)
@EnableJpaRepositories(
basePackages = "com.oracle.jdbc.sample.springsharding.dataaccess.gsm",
entityManagerFactoryRef = "gsmEntityManagerFactory",
transactionManagerRef = "gsmTransactionManager"
)
As for JDBCTemplate, both repositories are injected in the service layer and the different repositories are used according to the operation to perform.
@Override
public List<Note> getNotesForUser(Long userId) {
return gsmNoteRepository.findByUserId(userId);
}
@Override
public List<Note> getAllNotes() {
return catalogNoteRepository.findAll();
}
Conclusion
Sharding is a well-known database scaling solution. To make sharding responsive applications must make direct-shard queries. The new sharding feature in Spring JDBC makes it easier to perform such queries in Spring. All you need is to do is to implement a ShardingKeyProvider and use the ShardingKeyDataSourceAdapter to define that this ShardingKeyProvider should be used with your data source.
References
- JDBC landing page
- #javaOracleDB

