Performance Implications of CNAME Chains vs Oracle ALIAS Record

Chris Baker
Principal of Threat Intelligence

The CNAME resource record was defined in RFC 1035 as “the canonical name for an alias.” It plays the role of a pointer, for example, the CNAME informs the requestor that is really this other name,

The CNAME record provides a “configure once” point of integration for third party platforms and services. A CNAME is often used as opposed to an A/AAAA record for the same reason developers often use variables in their code as opposed to hard coded values. The CNAME can easily be redefined by the third party or service provider without requiring the end user to make any changes.

A stipulation that prevents use of the CNAME at the apex is that no other records can exist at or alongside a CNAME. This specification is what prevents an end user from being able to place a CNAME at the apex of their zone due to the other records, which must be defined at the apex such as the Start of Authority (SOA).

ALIAS / ANAME – The way of the future 

The Oracle ALIAS record allows for CNAME-like functionality at the apex of a zone. The Oracle implementation of the ALIAS record at the apex uses private internal recursive resolvers to “unwind the CNAME chain.”

Consider, for example, a web application firewall, WAF, implementation which uses a CNAME to direct users to the WAF endpoint. The consumer of the service simply creates a CNAME to the endpoint provided. The initial mapping is the only thing which the consumer has control over. After implementing the service, we can dig deeper into the way the service is implemented in the DNS. Below we see the full CNAME chain.                          60      IN      CNAME            300     IN      CNAME                         120     IN      CNAME              3600    IN      CNAME  60      IN      A

In the example above, the WAF service is implemented via a CNAME record mapping to The service operator maps the vanity CNAME to a service name, This is a CNAME to another record in the zone which is ultimately a CNAME to a load balancer endpoint at a cloud provider.  

The Oracle ALIAS record is implemented in a way in which our internal resolver will constantly keep all of these records in cache. When a recursive resolver requests, we can hand back the A record for the cloud load balancer. This reduces variability from cache misses, network latency, packet loss, etc. Saying it reduces variability is one thing, quantifying it is another. 

ALIAS Testing 

To quantify the reduction in variability and potential performance gains from ALIAS/ANAME record implementation, we performed a number of tests using the RIPE Atlas network. The RIPE Atlas platform provides access to the internal resolvers used by a number of ISPs that are only accessible from their networks. It also allows us to run tests from the perspective of end users, providing insight into the last mile of a number of global networks. To select which networks would be included in testing, we took a one month sample of production traffic to our authoritative DNS platform and selected networks from the top twenty which also had appropriate RIPE Atlas probe density.  

Variables being considered: 

  • End User / Client – Testing from the perspective of end users is critical to understanding the nuance of internet performance.  
  • Recursive Resolver – Recursive resolver implementations have varying configurations. Some modify the TTL of records, some are operated as clusters with a large shared cache others have many individual caches, some perform prefetching of popular records, etc. 
  • Authoritative Resolvers – In the example above, there are three different namespaces being referenced. Each might be served by a different authoritative provider which might have varying proximity to the end user’s recursive resolver. 
  • Networks – The networks facilitating communication between all these components have different performance profiles from the last mile to well-connected internet exchanges 

Test 1: WAF Service Implementation 

A set of RIPE Atlas probes acting as clients configured their default local resolver to request two records. One record being the first in a CNAME chain for a WAF, the other being an ALIAS record for the same WAF service. As expected, the raw results contain a number of outliers in both test scenarios created by packet loss and last mile performance issues. 

For example: In the time series below, you can see some pretty serious outliers. 

A time series isn’t ideal for communicating what happened. As you can see above, it looks like “most” response times were less than 1000 ms. To better quantify, we look at a histogram of the results. 

The median response time for the WAF ALIAS record was 44.96 ms, whereas the median response time for the WAF CNAME Chain was 63.18 ms a difference of 18.22 ms. The boxplots below indicate that the median response time for the ALIAS record is aligned with the beginning of the 2nd quartile response times of the CNAME chain.  

Test 2: Cloud Load Balancer  

Test 1 focused on a CNAME chain with 5 links, whereas many implementations might have only a single CNAME. To test this scenario, the same population of probes requested one record, which was a CNAME, to a cloud load balancer and another record, which is an ALIAS, pointing to the same load balancer. 

Test 3: Counter Point 

The first two tests showed clear performance gains for the ALIAS ANAME implementation. We thought it was important to create an example of the opposite, an instance where the ALIAS record is slower to highlight some nuance. To accomplish this, we set up some tests in South Korea. South Korea is known for having well provisioned high-speed networks deployed within the country, but paths out of the country to the wider internet can be slower. 

For this test, the CNAME chain example can be resolved within South Korea. The clients, recursive resolvers, and authoritative providers all have a presence within the country. Resolving the ALIAS record requires the incountry resolver to issue queries to either Hong Kong or Tokyo, which takes much longer than resolving the CNAME chain in country. South Korea internally is well connected but the paths to Tokyo and Hong Kong require traversing undersea cables. This is why it is important to understand your customers use case and monitor performance.

The ANAME provides an option for infrastructure operators that are looking for CNAME at the apex of the zone functionality. The ANAME helps reduce variability in response times from the to recursive resolvers and clients by actively maintaining the CNAME chain in a local recursive cache. As Evan Hunt pointed out at the DNS OARC meeting in San Jose, as the ANAME standard is adopted, recursive resolvers may start to implement ANAME verification, potentially reducing some of the performance gains of the new record type. That being said, following Lord Kelvin’s advice “to measure is to know” … we will keep on measuring. 

For more detail check out our webinar.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha