Friday Feb 17, 2006

Linkability vs. Ajax

How can I bookmark/index/catalog/map/associate something interesting if the page's contents were assembled on the fly with Ajax? Ajax is a cake that I really want to eat, with more benefits than just snappy UIs, but is there a way to implement it without breaking linkability? Before facing that conundrum directly, let's reduce the surface area of the problem as much as we can.

Why is Ajax so nice? More bang for bandwidth, snappy UIs, and processing offloaded to the client. Do any non-Ajax techniques address these things without compromising linkability? Yes. Clean, lean (X)HTML saves bandwidth and speeds things up. What else? Minding your server's Last-Modified response headers for GET requests lets clients revalidate their cached copies instead of downloading them again, which reduces transfers and makes things snappier. If sluggish, bandwidth-hogging web apps are pushing us toward Ajax, then these things alleviate some of that pressure.
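To make the header point concrete, here is a minimal sketch of the server-side decision behind a conditional GET. The function name conditional_get and its arguments are made up for illustration, and it assumes the resource's modification time is a timezone-aware datetime; a real server or framework will usually handle this for you.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_get(last_modified: datetime,
                    if_modified_since: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Hypothetical helper: pick 200 or 304 for a GET request.

    last_modified is assumed to be timezone-aware. if_modified_since is the
    raw If-Modified-Since request header, or None if the client sent none.
    """
    # Always advertise when the resource last changed, in HTTP date format.
    headers = {"Last-Modified": formatdate(last_modified.timestamp(), usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers  # client copy is current; send no body

    return 200, headers  # send the full resource
```

On a repeat visit the browser echoes the Last-Modified value back as If-Modified-Since, and the 304 response carries no body, which is where the bandwidth saving comes from.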

Trying a different angle of attack: when does Ajax not break linkability? Some use cases don't, for example a login function. If logging in happens seamlessly over Ajax, there's no inherent breakage between URL and content. So there are clearly plenty of Ajax uses that don't cause linkability problems.

So a significant chunk of the problem's surface area is whittled away, leaving the central problem: Ajax that fetches new resources onto the current page breaks linkability.

I don't think there's a single solution to the problem, but one thing that springs to mind is flip-flopping the container/content paradigm. Rather than Ajax loading a new content resource onto the existing page, a plain-old link-click loads the new content resource, and Ajax loads the container, i.e. navigation lists, company logo, etc., all of which are cached by the client.

Wednesday Dec 14, 2005

On the Importance of URL Canonicity

When designing a linking scheme between their pages, web developers often fail to take into account the importance of URL canonicity. For example, when implementing a tabbed navigation design, it's tempting to switch the tab based on a querystring such as page.jsp?tab=1. Similarly, a CMS might use a URL like showArticle.jsp?article=123456. Also, most webservers accept two versions of a directory index URL, for example /foo/ and /foo/index.jsp. Finally, it's common for a website to exist at both and

All of these URLs create canonicity issues, or ambiguity, regarding where exactly a piece of content lives. You might ask why there should be ambiguity when a given URL always gets you the same piece of content. In other words, if both /foo/ and /foo/index.jsp return the same content, what's the big deal? Well, for starters, there's the extra performance hit on your webserver because the browser cached one version and not the other. In fact, any caching system is impacted by non-canonical URLs.

Another problem with non-canonical URLs is in web analytics. Two versions of a URL might really be the same page, but a web analytics package doesn't know this unless you hard-code a special rule about it somewhere. The fact is that, canonical or not, the world treats URLs as canonical, and so broken behavior results when they are not.

But what about querystring-driven URL schemes? Isn't showArticle.jsp?article=123456 always the same? Strictly speaking, yes, so why should querystrings be bad for canonicity? When there are multiple variables inside a querystring, order generally doesn't matter. foo.jsp?cid=123&uid=456 is the same as foo.jsp?uid=456&cid=123, so in this sense canonicity is broken. But even if you take pains to ensure querystring values are always ordered consistently, the world at large doesn't know this. Querystring-driven URLs are treated as ambiguous. For example, Google isn't as quick to index a querystring-driven URL as it would be to index a URL that looks canonical.
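If you do want to enforce a consistent parameter order on your own side, one possible approach is to sort the parameters before emitting or comparing URLs. The sketch below uses Python's standard urllib.parse; the function name canonical_query_order is made up for illustration, and sorting alphabetically is just one arbitrary but stable choice.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_query_order(url: str) -> str:
    """Hypothetical helper: return url with its querystring parameters sorted."""
    parts = urlsplit(url)
    # Split the querystring into (name, value) pairs, keeping empty values,
    # then sort so that every ordering of the same pairs yields one URL.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


# Both orderings from the example above collapse to the same string.
a = canonical_query_order("foo.jsp?cid=123&uid=456")
b = canonical_query_order("foo.jsp?uid=456&cid=123")
```

This only helps systems you control. As noted, the rest of the world has no way to know the ordering is guaranteed.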

Fortunately, options exist to make URLs more canonical. Most webservers have URL rewriting capability that can redirect to a given URL from the corresponding URL, or vice versa. And most webservers can be similarly configured with regard to the /foo/ vs. /foo/index.html issue.
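For the directory index case, the rule a webserver rewrite would apply can be sketched in a few lines. This is only an illustration of the logic, not any particular server's configuration syntax; the function name redirect_target and the list of index file names are assumptions.

```python
from typing import Optional

# Assumed set of index file names; adjust to whatever your server serves.
INDEX_NAMES = ("index.jsp", "index.html")


def redirect_target(path: str) -> Optional[str]:
    """Hypothetical helper: return the canonical path to redirect to.

    Returns None when the requested path is already canonical.
    """
    head, sep, last = path.rpartition("/")
    if sep and last in INDEX_NAMES:
        # /foo/index.jsp should redirect to /foo/
        return head + "/"
    return None
```

A permanent (301) redirect to the returned path tells browsers, caches, and crawlers which of the two URLs is the real one.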

For web applications where pages are dynamically assembled, it's possible to use path info instead of querystrings. Consider the following URLs:

Here, "articles" serves the same role as "showArticle.jsp". Correspondingly, "?article=123456" and "/123456.jsp" serve as the pointer to the content piece. The implementation of this is beyond the scope of this post; suffice it to say that the capability is built into most web application environments.
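To give a rough idea of what the path info approach involves, here is a minimal sketch that pulls the article number out of a path instead of a querystring. The exact path layout (/articles/ followed by digits and .jsp) and the function name article_id_from_path are assumptions for illustration; your web application environment will have its own mechanism for this.

```python
import re
from typing import Optional

# Assumed layout: /articles/<digits>.jsp
ARTICLE_PATH = re.compile(r"^/articles/(\d+)\.jsp$")


def article_id_from_path(path: str) -> Optional[str]:
    """Hypothetical helper: extract the article id from a path-style URL.

    Returns None if the path does not match the assumed layout.
    """
    match = ARTICLE_PATH.match(path)
    return match.group(1) if match else None
```

The id is the same value the querystring version carried in its article parameter. Only its position in the URL has changed.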

Finally, here are a couple of related links:


My name is Greg Reimer and I'm a web technologist for the Sun.COM web design team.

