"There are only two hard things in
Computer Science: cache invalidation and naming things." - Phil Karlton
A typical Coherence application includes a tier of cache servers (JVMs dedicated to storing and managing data) and a tier of cache clients (JVMs that produce/consume data). This tiered approach is especially common with large web sites that use Coherence to manage sessions and/or content. With this type of architecture, caches that are accessed often can be configured as near caches. A near cache is a local in-memory copy of data that is stored/managed in cache servers. Coherence supports near caching in all native clients (Java, .NET, and C++).
The obvious advantage of a near cache is reduced latency for accessing entries that are commonly requested. Additionally, a near cache will reduce overall network traffic in several ways. Requests for cache entries that exist in the front of the near cache will be serviced by the client making the request, thus eliminating the need for a request to the cache server. If an entry does not exist in the front of the near cache, a single request will be made to retrieve that entry from the cache server, even if multiple threads on the client are requesting the same key. Another benefit of a near cache is that entries are stored in object form, meaning that clients don't have to spend extra CPU cycles on deserialization.
The aggregate of these benefits is increased scalability of a system. In order to determine if a near cache is appropriate, it is important to understand how near caching works internally. This knowledge is required in order to correctly configure near caches.
Invalidation is the biggest challenge for near caching. Storing entries in a client is easy, but detecting when changes are made to those entries in other clients is more difficult. Coherence provides several different strategies for near cache invalidation.
With the listener based strategies, Coherence will register map
listeners that receive events when the contents of the "back" cache are
modified. When these events are received by the client, the affected
entries will be removed from the front of the near cache. This ensures
that subsequent requests will force a read-through to the cache server
in order to fetch the latest version. Today Coherence supports two listener based strategies: PRESENT and ALL.
This strategy does not use map listeners to invalidate data, which means there is no way for the front of the near cache to know when entires are updated. This requires the configuration of an expiry on the front of the near cache. When an entry expires, this will force a read-through to the cache server to fetch the latest copy. If an expiry-based strategy is used, the invalidation strategy should be set to NONE.
The following is a description of each of the strategies in detail:
This invalidation strategy indicates that each near cache will register a listener for every entry present in the front of the near cache.
The near cache will only receive events for entries that it contains, thus greatly reducing the amount of network traffic generated by the cache servers.
This strategy results in increased latency for cache misses. This is because a map listener must be registered for a key before the entry is retrieved from the cache server. Since listener registrations are backed up (just like cache entries) the message flow looks like this:
The increased latency for the listener request will vary depending on the network. A good rule of thumb would be ~1ms of overhead for every cache miss. Another side effect is slightly higher memory consumption on the cache servers to maintain the key registrations. The exact overhead depends on the size of the keys. This is an area that we plan on optimizing in a future release.
This strategy works best for near caches that:
- Have access patterns where certain JVMs will access a small portion of the cache - for example, web applications that use sticky load balancing will mostly access the same subset of sessions stored in a cache
- Have back caches that won't expire or evict data often since there is a latency penalty for cache misses
The invalidation strategy of ALL indicates that each near cache will
register a single map listener to receive events on all updates
performed in the cache. This includes events for entries not present in
the near cache.
This strategy maintains the coherency of the near cache without the
extra latency associated with cache misses.
Every time an entry is updated on the storage tier, an event will be
delivered to each near cache. For clusters with a large client tier,
this can generate a large amount of network traffic. This strategy can
be especially problematic when bulk updates are preformed via:
- Bulk loading/seeding of the cache (i.e. populating the cache from a database)
- Clearing the cache
- Cache server failures which cause redistribution which cause mass evictions due to exceeding of high units
This strategy works best for near caches that:
- Contain a small amount of data (low hundreds of megabytes or less)
- Have access patterns that guarantee that a significant portion of the cache will be accessed by each client
When using the ALL strategy, it is important to avoid bulk updates or
deletes in order to limit the number of map events that are generated.
NONE invalidation strategy does not use map listeners; therefore
entries in the near cache are never invalidated. Since there is no invalidation, the front of the near cache must be configured with an expiry which will force entries to be removed and a read-through to the back tier.
The front of the near cache must be configured with an expiry. Here is an example:
<!-- front expiry -->
<expiry-delay>5m</expiry-delay> <!-- back expiry -->
An expiry in the backing map is not required; it is shown here for illustrative purposes in order to distinguish between front and back expiry.
This strategy is the most scalable since it does not require delivering
map events for every update made in the cache. It does not have to pay a
latency penalty for misses. Furthermore, this strategy works best for
caches that require a high rate of updates.
NONE near caches will require some tolerance for stale data. Note however that an expiry of as little as a few seconds can make this strategy a good compromise between low latency
access and scalable performance for the cluster.
This strategy works best for near caches that
- Have back caches containing large amounts (tens of gigabytes) of data
- Have large client tiers (many dozens or hundreds of JVMS)
- Have a requirement for bulk updates
If no invalidation strategy is selected, the AUTO strategy is the default. As of Coherence 3.7.1, this defaults to the ALL strategy. This is subject to change in future releases of Coherence; in fact this will be changing to PRESENT in the next release. Therefore, it is advised to always explicitly select an appropriate invalidation strategy for every near cache deployment.
Note that applications that are write-heavy may be better off without a near cache. This especially applies for the ALL and PRESENT strategies where every update to the cache will cause the propagation of map events.
In order to determine near cache effectiveness, look at the HitProbability attribute in the near cache MBean. This MBean is of type "Cache" and the ObjectName will contain "tier=front" which indicates that it is a near cache. As a rule of thumb, ensure that near caches are yielding a hit probability of at least 80%. Anything less may merit the use of NONE or the complete removal of the near cache.
In addition to selecting an appropriate invalidation strategy, there are a few other considerations to be made when using near caches.
The first has to do with mutable values placed into a cache. Consider the following sequence in a single client:
value = cache.get(key)
value = cache.get(key)
When this is performed against a distributed/partitioned cache, each thread gets its own deserialized copy of the cached item. Therefore Thread 2 won't see the modification made to property value X by Thread 1. However if this operation happens on a near cache, it is possible for threads to see mutations made by other threads to a cached value. Note that this is no different than using a ConcurrentHashMap or any other thread safe Map implementation in a single JVM. The major difference is the client behavior with a partitioned cache vs a near cache.
The recommended best practice is to use immutable value objects when using a near cache. If existing cache objects must be mutated, consider using an entry processor to modify the value.
The second concern has to do with cache configuration when using a proxy. Consider the following sample configuration:
When a cache server (storage enabled member) reads this configuration, it will skip the creation of the near cache (since this is not a client) and instead will create the back-scheme as defined by the configuration. When a storage disabled member reads this configuration it will do the opposite - it will create the front of the near cache and skip the creation of the back-scheme.
This presents an interesting scenario when proxy members are added into the cluster. The best practice for proxy members is to disable storage in order to preserve heap for handling client requests. This means that the proxy will create the front of the near cache, just like any other storage disabled member. This is not desirable because:
- The proxy will consume more memory and CPU for little benefit. The main function of the proxy is to forward binary entries to Extend clients. Therefore, the (deserialized) contents of the near cache are not being consumed in the proxy. This not only leads to more memory consumption, but also more CPU cycles for deserializing cache entries.
- If the proxy is serving more than one client, it is likely that the near cache will experience a lot of turnover, which results in more overhead for GC.
The cache configuration for a proxy member should never contain a near-scheme.
Near caching is a powerful tool of the Coherence in-memory data grid that - when used judiciously - can improve latency, reduce network traffic, and increase scalability for a Coherence application. The decision to use a near cache should always be made with the invalidation strategy as an explicit consideration.