Acknowledgements: Dave Rubin, Senior Director of NoSQL Database Development, Oracle
I like things that help me be productive: fast online search, faster devices, and faster access to my order status and delivery estimates. It is no surprise that 53% of mobile users will abandon a website if it takes more than 3 seconds to load, or that an additional two seconds of loading time can increase a site's bounce rate by 103%.
In this blog, I'll discuss the role the database plays in building a fast, snappy, and responsive application, and how Oracle NoSQL Database Cloud Service helps you achieve predictable low latency and deliver consistent performance.
What eats away at website response time?
A key factor in the overall user experience is the average response time to render each section of a page after a user action. When building an application, there is always some persistent data that needs to be read, written, updated, or deleted based on the page context and the user's interactions. Ideally, all the data required to process an event would be available in a single place; typically, this is not the case. In many cases, an event will request data from more than one persistent data store before the page can be rendered.
When considering response times, it’s useful to review the following “Numbers Every Programmer Should Know” from Google’s Jeff Dean (only a subset of the entire list is shown below):
Latency Comparison Numbers

L1 cache reference: 0.5 ns
L2 cache reference: 7 ns
Mutex lock/unlock: 100 ns
Main memory reference: 100 ns
Random 4K read from SSD: 150 μs
Round trip within same datacenter: 0.5 ms
Disk seek: 10 ms
Read 1 MB sequentially from network: 10 ms
Read 1 MB sequentially from disk: 30 ms
Let’s see how these numbers apply to a simple real-world scenario. For example, when a user searches for a product on an online retail site from a mobile device and taps a control (clicks a button, asks for related products, and so on), a sequence of calls is made behind the scenes:
1. Device to load balancer
2. Load balancer to HTTP server
3. HTTP server to session manager REST endpoint
4. Authenticate and authorize the request (may require a database call if the session is not in cache)
5. Session manager to a search REST endpoint
6. Run a regex search over the product catalog and return the first 25 matches
7. Serialize the results into JSON if not already serialized
8. Retrieve the current in-stock inventory for all 25 matches:
   - Call the database REST endpoint
   - Prepare a query or API call for the data store using HTTP parameters and possibly session-level parameters
   - Call the database API to retrieve the data
   - Convert the result to JSON (product ID, description, available inventory)
   - Return the JSON to the caller
9. Return the JSON payload to the session manager
10. Serialize and return the final JSON payload to the device
Let’s also consider that you have deployed the back end in a single data center in Europe. Using Jeff Dean's numbers as a reference, some rough estimates for our code path, and assuming that a single call to the data store is needed for this particular action, a user in San Francisco would encounter the following minimum latency (response time for tapping a control):
Latency response times

California to Amsterdam round trip: 150 ms
Load balancer to HTTP server: 0.5 ms
HTTP server to session manager REST endpoint: 0.3 ms
Session manager authN + authZ: 0.5 ms
Session manager to search REST endpoint: 10 ms
REST endpoint to data store (execute query, disk seek + transfer): 15 ms
Request processing in the REST server: 1 ms
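To make the arithmetic explicit, here is a minimal sketch that sums the estimates from the table above (these are rough budget numbers, not measurements):

```python
# Rough latency budget (in milliseconds) for one user action,
# using the estimates from the table above -- not measurements.
latency_ms = {
    "California to Amsterdam round trip": 150.0,
    "load balancer to HTTP server": 0.5,
    "HTTP server to session manager REST endpoint": 0.3,
    "session manager authN + authZ": 0.5,
    "session manager to search REST endpoint": 10.0,
    "REST endpoint to data store (query, seek + transfer)": 15.0,
    "request processing in the REST server": 1.0,
}

total = sum(latency_ms.values())
print(f"minimum latency: {total:.1f} ms")  # minimum latency: 177.3 ms
```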
A few assumptions are made in the above sequence of operations:
It assumes that this user action results in a single database call
It assumes zero contention for resources (network, database, persistent storage)
It assumes that a single REST endpoint must be contacted
It does not account for the overhead for things like contacting a session cache, re-authentication if necessary, authorization of the request, serialization and de-serialization of the payload, etc.
As you can see, at the very best, with many overheads not accounted for, the minimum latency for this user operation would be roughly 177 ms. In many instances, rendering a page is not a simple matter of a single database call. It can result in tens to a few hundred calls to the database layer to read small fragments of data, sometimes to personalize the user experience, and other times just to collect data from disparate sources.
Factoring response time for real world applications
Now let’s consider a more realistic scenario, in which the application's pages are customized for an individual user. Rendering such a personalized page entails anywhere from 50 to 200 database lookups to read and synthesize many small fragments of data. In a typical personalized page, the application must gather the following types of data fragments:
Product recommendations (product name, description, price, user rating, comments, and thumbnail)
Currently trending items (product name, description, price, user rating, comments, and thumbnail)
Current sale items (product name, description, price, user rating, comments, and thumbnail)
Your order history (order id, order date, product name, description, price, quantity ordered)
Your current shopping cart contents (product name, description, price, thumbnail, quantity)
Your browsing history (product name, description, price, user rating, comments, and thumbnail)
Any preferences you may have set for how your pages get rendered (filtering rules, layout)
For the sake of simplicity, we'll use an average of 100 database operations to render a page. We'll also factor in some non-deterministic overhead, which includes checking authorization for the operation, other server-side processing that occurs in response to this user action, and any resource contention and database locking. This gives us the following, more realistic view of our overheads:
Latency response times

California to Amsterdam round trip: 150 ms
Load balancer to HTTP server: 0.5 ms
HTTP server to session manager REST endpoint: 0.5 ms
100 database I/Os (100 * 15 ms): 1,500 ms
Resource contention, locking: 500 ms
Authorization: 200 ms
Request processing in the REST server: 1 ms
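A quick sanity check on that figure, summing the same estimates (again, a rough model rather than measurements):

```python
# Back-of-the-envelope model for rendering one personalized page,
# using the estimates from the table above -- not measurements.
db_lookups = 100        # average database operations per page
per_lookup_ms = 15.0    # data-store round trip per lookup

latency_ms = (
    150.0               # California to Amsterdam round trip
    + 0.5               # load balancer to HTTP server
    + 0.5               # HTTP server to session manager REST endpoint
    + db_lookups * per_lookup_ms  # sequential database I/O
    + 500.0             # resource contention, locking
    + 200.0             # authorization
    + 1.0               # request processing in the REST server
)
print(f"estimated latency: {latency_ms:.0f} ms")  # estimated latency: 2352 ms
```

Note that the database I/O term alone dominates everything else combined, which is why per-operation database latency matters so much at this scale.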
The minimum latency for this single user operation now stands at roughly 2,352 milliseconds, which puts it over the typical upper bound of two seconds for page rendering. Furthermore, any additional server-side overhead that becomes necessary in the future risks pushing latencies even further beyond acceptable levels.
How Oracle NoSQL Database Cloud Service provides predictable performance
The Oracle NoSQL Database Cloud service delivers predictable single digit millisecond latencies at the 95th percentile. The following graphic shows an application running against the Oracle NoSQL Database Cloud service in the east coast region of the US (Ashburn, VA). The backend is a simple servlet, running in Apache Tomcat 9.0.41, incorporating a workload generator, a rate limiter, and a measurement framework that takes continuous samples and measures the latency of reads and writes at the min, max, average, and 95th percentiles. The workload generator creates 1k records with a key size of 20 bytes. The rate limiter throttles the workload generator, using the purchased read and write throughput values (reads/sec, writes/sec). Note the following:
Cost – In this run, the purchased throughput is 500 read units, 300 write units, and 100 GB of storage, for a total cost of $47.42/month ($37.62 for writes, $3.20 for reads, and $6.60 for storage).
Read latency – The red dashed line represents the 95th percentile read latency and the scale for this is on the right hand side of the graph in milliseconds. Read latency for this workload is consistently under 5 milliseconds.
Write latency – The black dashed line represents the 95th percentile write latency and shows anywhere from 4 milliseconds to a maximum of 9 milliseconds.
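The measurement harness described above is easy to approximate in outline. The toy sketch below (hypothetical names and synthetic latencies, not Oracle's actual servlet code) pairs a token-bucket rate limiter with a nearest-rank percentile calculation over simulated read latencies:

```python
import math
import random

def percentile(samples, pct):
    """Nearest-rank percentile, e.g. pct=95 for the 95th percentile."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))
    return ordered[rank - 1]

class RateLimiter:
    """Token bucket: permits `rate` operations per second on average."""
    def __init__(self, rate):
        self.rate = rate
        self.tokens = float(rate)
        self.last = 0.0

    def try_acquire(self, now):
        # Refill tokens for the elapsed time, capped at one second's worth.
        self.tokens = min(float(self.rate),
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Offer 1000 ops/sec against a 500 reads/sec budget; record only the
# operations the limiter admits. Latencies are synthetic stand-ins.
limiter = RateLimiter(rate=500)
read_latencies = []
clock = 0.0
for _ in range(2000):
    clock += 0.001  # one attempt per millisecond
    if limiter.try_acquire(clock):
        read_latencies.append(random.uniform(1.0, 5.0))

print(f"admitted {len(read_latencies)} reads, "
      f"p95 latency {percentile(read_latencies, 95):.2f} ms")
```

The same pattern works for writes: one limiter per purchased unit type, with separate sample buffers for read and write latencies.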
Oracle NoSQL Database is available in 22 regions worldwide, so finding a region within a short speed-of-light distance of your app is as simple as choosing the Oracle OCI region for your NoSQL Database table.
How Oracle NoSQL Database Cloud Service provides predictable latency
Many architectural and implementation factors contribute to the ability of Oracle NoSQL Database to deliver predictable latency. For brevity, here are just a few:
Shared nothing with replicas for load balancing – At its core, the Oracle NoSQL Database is architected as a shared nothing, auto-sharded key/value database. Each shard in the Oracle NoSQL Database cloud service contains three replicas that are able to distribute the read load via a load balancing dispatcher. Strongly consistent reads are dispatched to a dynamically elected leader node where all writes for the shard are also routed. Should a leader node fail, an election is held to choose a new leader for the shard.
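As a rough illustration of that dispatching model (a toy sketch with made-up node names, not the actual dispatcher), strongly consistent reads and all writes go to the shard's leader, while eventually consistent reads fan out across the replicas:

```python
import itertools

class Shard:
    """Toy model of one shard: an elected leader plus two replicas.
    Node names are invented for illustration."""
    def __init__(self, leader, replicas):
        self.leader = leader
        self.nodes = [leader] + list(replicas)
        self._round_robin = itertools.cycle(self.nodes)

    def route_read(self, consistency):
        # Strongly consistent reads must observe the latest write, so
        # they go to the leader; eventual reads spread the load across
        # every node in the shard.
        if consistency == "absolute":
            return self.leader
        return next(self._round_robin)

    def route_write(self):
        # All writes for the shard are routed to the leader.
        return self.leader

shard = Shard(leader="rn1", replicas=["rn2", "rn3"])
print(shard.route_write())           # rn1
print(shard.route_read("absolute"))  # rn1
print(sorted({shard.route_read("eventual") for _ in range(6)}))
# ['rn1', 'rn2', 'rn3']
```

On leader failure, the real system re-elects a leader for the shard; this sketch omits elections entirely.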
Optimized get/put paths – Given Oracle NoSQL Database’s core key/value architecture, writing and reading records by primary key are highly optimized. Oracle NoSQL Database takes great care in its usage of DRAM on each of the servers in the database cluster, paying special attention to how the upper nodes of the index B-trees are cached.
No arbitrarily complex operations – Unlike a traditional relational database, Oracle NoSQL Database limits all operations, whether API based or declarative query based, to be as simple as possible. Distributed joins, ad-hoc sorting that needs server-side memory or persistent storage, and non-deterministic transactions are all forbidden, as they would ultimately make it impossible to provide predictable latency to all clients of the service. Further, every operation is resource bounded, and longer-running operations always return stateless iterators to API callers, limiting the resource consumption of every call and freeing up resources for other clients of the cloud service.
Short, shard-local ACID transactions – No long-running, distributed, or interactive transactions are allowed in Oracle NoSQL Database. The API supports transaction demarcation at the call boundary, with multi-object transactions expressed via collections in the API call. Further, multi-object transactions are limited to objects that reside on the same physical shard, eliminating expensive two-phase transaction coordination.
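The shard-locality rule can be sketched as a simple guard (hypothetical helper names; real shard assignment uses the table's shard key and a stable partition map, not this illustrative hash):

```python
def shard_of(shard_key, num_shards=3):
    # Toy hash-sharding for illustration only; a real deployment maps
    # the table's shard key through a stable partition map.
    return sum(shard_key.encode()) % num_shards

def write_multiple(ops):
    """Commit a multi-object transaction only when every operation
    targets the same shard, avoiding two-phase commit entirely."""
    shards = {shard_of(key) for key, _value in ops}
    if len(shards) != 1:
        raise ValueError("all operations in a transaction must share a shard key")
    # ...apply the ops atomically on the single owning shard...
    return len(ops)

# Every row keyed by the same shard key ("cart#42") -> allowed.
print(write_multiple([("cart#42", {"item": "mug"}),
                      ("cart#42", {"item": "pen"})]))  # 2
```

A batch that mixes shard keys owned by different shards is rejected up front, so the server never pays for cross-shard coordination.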
Novel query run-time – The Oracle NoSQL Database cloud service contains some novel features in its query runtime that have been designed to meet the needs of app developers while strictly maintaining predictable latency. Features like stateless order-by operations and ad-hoc sorting via query run-time fragments embedded in the client are just two examples of the approaches the Oracle NoSQL Database Cloud service takes to maintain predictable latency for all clients of the cloud service.
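The idea behind client-side ordering can be illustrated with a small sketch (a toy model, not the actual query engine): each shard streams its matching rows already sorted by the sort key, and the client lazily merges the streams, so no server ever buffers the full result set.

```python
import heapq

# Per-shard result streams, each already sorted by the sort key
# (product name). The data is invented for illustration.
shard_results = [
    [("apple", 3), ("pear", 9)],     # shard 1, sorted
    [("banana", 5), ("quince", 1)],  # shard 2, sorted
    [("cherry", 7)],                 # shard 3, sorted
]

# heapq.merge consumes the streams lazily, yielding a globally
# sorted sequence without materializing it on any server.
merged = heapq.merge(*shard_results, key=lambda row: row[0])
print([name for name, _qty in merged])
# ['apple', 'banana', 'cherry', 'pear', 'quince']
```

The memory cost on the client is one buffered row per shard, regardless of how large the overall result set is.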
Get your Oracle NoSQL Database Cloud Service free today
Oracle NoSQL Database Cloud Service is built on Oracle Gen2 Cloud with the latest server, storage, and networking technologies, ensuring fast and scalable performance. Even if your workloads are unpredictable, it is designed to maintain single-digit-millisecond response times, ensuring a great user experience. You can read my other blog, ‘3 simple steps to get you started in building scalable applications on a fully managed NoSQL Database cloud service that never expires’.
Kiran Makarla is a dynamic business development and go-to-market leader, serving as the outbound product manager for both Oracle Blockchain Platform and Oracle NoSQL Database Cloud Services. With over 20 years of experience spanning roles from lead engineer and architect to product and marketing management, Kiran brings a wealth of insight to the business technology landscape.
Driven by a passion for building and marketing innovative solutions, Kiran excels at addressing customers’ unique challenges. He collaborates closely with senior technology executives to analyze industry trends, evaluate new product categories, and uncover market opportunities—all with a sharp focus on empowering developers, practitioners, and business users alike.
An avid reader and sought-after speaker, Kiran regularly shares his expertise at technology forums and major industry events both nationally and internationally.