Since almost all NoSQL databases have built-in high availability, it is easy to change hardware configuration which involves shutting down, replacing, or adding a server to a cluster while maintaining the continuous operations.

Seamless, online scalability is one of the key attributes of NoSQL systems which are sharded, shared nothing databases. Users can start with a small cluster (few servers) and then grow the cluster (by adding more hardware) as throughput and/or storage requirements grow. Similarly, users can also contract the cluster as throughput and/or storage requirements shrink due to decreased workloads. Hence, most NoSQL databases are considered to have horizontal scalability. Figure 1 shows horizontal scalability during different operational workloads.

Horizontal Scaling

Figure 1 – Horizontal Scaling for various workloads

 

Scalability is appealing to businesses for a variety of reasons.  In many cases, the peak workload is not known in advance of the initial setup of a NoSQL database cluster. Scalability enables the organization to deploy enough capacity to meet the expected workload without unnecessary over-expenditure on the hardware initially. As the workload increases, more hardware can be added to meet the peak demand. As the workload decreases, the hardware can be reclaimed and re-purposed for other applications. The cluster can grow and shrink as the workload changes with zero downtime.

Most NoSQL systems claim linear or near-linear scalability as the cluster size grows. Linear scalability simply means that when more hardware is added to the cluster, it is expected to deliver the performance that is roughly proportional to its capacities. In other words, if the number of servers in the NoSQL cluster is doubled, it is expected to handle roughly double the workload (throughput and storage) compared to the original configuration, if the additional servers have the same processing and storage capabilities.

Since almost all NoSQL databases have built-in high availability, it is easy to replace hardware incrementally which involves shutting down a machine, removing it from the cluster, bringing in a new server, and adding it to the cluster. Everything continues to work without downtime or interruption of the application.  

From a capital expenditure perspective, it can be more cost effective to purchase newer and “bigger” servers because hardware costs decline over time. Newer hardware has more processing and storage capability relative to older generations of hardware at the same price. Replacing older hardware with newer, more powerful and capable hardware gives NoSQL database vertical scalability.

From an operational cost perspective (OPEX) , a cluster with fewer servers is preferable to a cluster with a large number of servers, assuming both options deliver the same performance and storage capabilities. Hardware needs to be actively managed, needs space, cooling and consumes power. Smaller clusters can improve operational efficiency.

Over time, the combined effect of growing the number of servers in a cluster and/or replacing old servers with newer hardware will result in a cluster which has a mix of servers of varying processing and storage capabilities and capacity. It is common to find such “mixed” clusters in production NoSQL applications that have been upgraded over a period of time.

In the context of “mixed” cluster scenarios, it is important to choose a NoSQL solution that can leverage hardware with varying processing and storage capabilities in order to address the application requirements efficiently and effectively. As mentioned earlier, most NoSQL products claim horizontal scalability. But, does an administrator really want to maintain a large cluster for NoSQL product X if a smaller cluster running a different NoSQL product will do the job, or exceed the requirements? The example in Figure 2 shows that over time, and with increased storage and processing power, a 12-machine system could be replaced with just a 3-machine system.

Figure 2 – Replacing smaller servers with larger, more capable ones.

When evaluating a NoSQL product, it is wise to consider the horizontal as well as vertical scalability characteristics. Oracle NoSQL Database is designed for horizontal and vertical scalability in order to deliver the most cost-effective NoSQL solution for modern applications.

For every server deployed in Oracle NoSQL Database cluster, the administrator specifies the capacities (e.g., storage) of the server at the time of the initial setup.  Oracle NoSQL Database uses this information to automatically assign workloads such that each server process manages a subset of data that is roughly proportional to the storage capacities of that server.  A smaller server will be assigned less workload compared to a larger server.  Similarly, each process uses RAM based on the amount of physical memory available on the machine.  Figure 3 shows the Oracle NoSQL Database cluster with mixed hardware capacities.

Figure 3 – A Mixed cluster

Oracle NoSQL Database distributes the workload across the available hardware based on the capabilities of the server. More importantly, during the cluster creation and expansion/contraction operations, the system ensures that each replica for every shard (remember, Oracle NoSQL Database is a sharded, HA database) is distributed across different physical servers. This rule” ensures that the failure of any single server will never result in the loss of a complete shard.   Also, in steady-state operation, it is possible that servers might be shut down and restarted at various points in time.  Oracle NoSQL Database automatically monitors the health and workload on each server in the cluster and rebalances the workload across the servers in order to avoid hotspots”.  All of this happens automatically and without any administrative intervention, thus ensuring reliable and predictable performance and high availability over mixed clusters and cluster transitions.

Oracle NoSQL Database leverages hardware effectively in order to meet the performance and availability requirements of the most demanding and innovative applications..  When deciding which NoSQL solution to use, vertical scalability is just as important as horizontal scalability in order to ensure the lowest Total Cost of Ownership for the overall solution.