cr.opensolaris.org gets an ATOM feed
By user12618941 on Mar 24, 2008
For the past couple of weeks, I have been working late at night and on weekends to add an ATOM feed (i.e. a blog feed) to cr.opensolaris.org, so that as people post new code reviews, they are automatically discovered and published. Stephen has been heckling me to do this work for more than a year. This weekend I managed to finish it, despite the incredibly nice weather in the Bay Area: I was stuck inside with a nasty cold.
As an aside, I'm looking for help with cr.opensolaris.org. This is a great opportunity for someone to step up and help out with an important part of our community infrastructure. Send me mail.
You can check out the results of my hacking on cr.opensolaris.org. Or you can subscribe to the feed. If you want to opt out of publishing a review, you can create a file called "opt-out" in the same directory as your webrev's index.html file. If you'd like to opt out for all of your reviews, create a file called "opt-out" in your home directory instead.
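For the curious, the opt-out rule described above boils down to two file-existence checks. This is only a minimal sketch, not the code running on cr.opensolaris.org: the function name `is_opted_out` and its two parameters are made up for illustration, and only the "opt-out" file name comes from the description above.

```python
import os

OPT_OUT_NAME = "opt-out"  # file name as described in the post


def is_opted_out(webrev_dir, home_dir):
    """Return True if the review in webrev_dir should be kept out of the feed.

    Hypothetical helper: webrev_dir is the directory holding the webrev's
    index.html, home_dir is the owning user's home directory.
    """
    # Per-review opt-out: the file sits next to the webrev's index.html.
    if os.path.exists(os.path.join(webrev_dir, OPT_OUT_NAME)):
        return True
    # Blanket opt-out: the file sits in the user's home directory.
    return os.path.exists(os.path.join(home_dir, OPT_OUT_NAME))
```

Checking the per-review file first is just a convenience; either file alone is enough to skip the review.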
This was an interesting learning experience for me, since I had to learn a lot about ATOM in the process. I also learned the XSLT language along the way, and how to process HTML using Python. All in all, I'd say this project took about 20 hours of effort, and resulted in about 500 lines of Python code. The most difficult problems to solve were:
- I wanted the feed to include some meaningful information about the code review. If you subscribe to the feed using your favorite reader, you'll see that a portion of the "index.html" file from each webrev is included. This is done using a somewhat tricky piece of Python code. In retrospect, using XSL for this might have been a better choice, although I've found that people have a tendency to introduce non-standard HTML artifacts into their webrev index.html files, and I don't know how well XSL would cope with that.
- ATOM has some rules about generating unique and lasting IDs for things; this is the content of the <id> element in the ATOM specification. I found a lot of valuable information on dive-into-mark. For cr.opensolaris.org, this was complicated by the fact that the user might log in and move their code review around, or might copy one review over another. In the end, I solved this by remembering the <id> value in a dot-file which rides along with the code review. A cron job roves around the filesystem looking for new reviews, and adds the special dot-file. By storing the original <id> value, and looking at the modification time of the index.html file, I can correctly compute both the <id> and <updated> fields for each entry. If a user deletes a code review, the dot-file will go away with it.
- Once I had an ATOM feed, I needed to transform it back into HTML for display on the home page. The only problem was that there aren't a lot of good examples of this on the web: many of the ATOM-to-HTML conversions only work with ATOM 0.3, not the 1.0 specification, and I didn't know the first thing about XPath or XSL. In the end, I only needed 25 lines or so of XSLT code.
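To make the <id> bookkeeping from the second item above a bit more concrete, here is a rough Python sketch of the idea. It is not the actual cr.opensolaris.org code: the dot-file name `.atom-id`, the function names, and the choice of a `urn:uuid:` identifier are all assumptions made for illustration (the real site may well build its IDs differently).

```python
import os
import uuid
from datetime import datetime, timezone

ID_FILE = ".atom-id"  # hypothetical dot-file name; the real name is not given


def entry_id(review_dir):
    """Return the stable <id> for a review, minting one on first sight."""
    path = os.path.join(review_dir, ID_FILE)
    if os.path.exists(path):
        # The review has been seen before: reuse the remembered ID, even if
        # the directory has since been moved or renamed.
        with open(path) as f:
            return f.read().strip()
    # New review: mint an ID and store it alongside the review.
    # Assumption: a random urn:uuid is used as the identifier.
    new_id = uuid.uuid4().urn
    with open(path, "w") as f:
        f.write(new_id + "\n")
    return new_id


def entry_updated(review_dir):
    """Return the <updated> timestamp, taken from index.html's modtime."""
    mtime = os.path.getmtime(os.path.join(review_dir, "index.html"))
    stamp = datetime.fromtimestamp(mtime, tz=timezone.utc)
    # ATOM 1.0 wants RFC 3339 timestamps, e.g. 2008-03-24T05:00:00Z.
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Because the ID lives in a file inside the review directory, it travels with the review when the directory is moved, and disappears when the review is deleted.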
I think of the current implementation as a "1.0"; it'll probably last us pretty well for a while. One thing I'd like to research for a future revision is actually placing the entries into a lightweight blog engine, and letting it do the rest of the work. Using an excellent list from Social Desire, I took a quick look at Blosxom, Flatpress, Nanoblogger, and some others.