Wednesday Oct 07, 2009

Last day at Sun

I really enjoyed my tenure at Sun, but I am leaving to work at NASA on MCT. I will have a new blog once I get some time to get things going. In the meantime, you can find me (Chris Webster) through LinkedIn.

Monday Jun 01, 2009

Community One presentation on zembly architecture

Girish and I presented a session at CommunityOne, "What you need to know about creating and running a scalable web site", based on our experience with zembly. Here are the slides. CommunityOne slides are available here.

Tuesday Apr 28, 2009

The Agile Development Process used to create Zembly

I recently saw a blog about agile development where the author was doing a production deployment with every commit that successfully passed the unit tests, so I thought I would share the development process we used to create Zembly.


Each planning cycle covers a single three week development cycle (sprints for those familiar with the lingo). Planning is done both top down and bottom up; that is, features and tasks come from both strategic features and infrastructure related requirements. The input from everyone on the extended team is a categorized set of tasks and features in our bug tracking system (Jira). The planning meeting serves to further categorize this list of tasks into committed and target features (committed features are expected to be delivered before the end of the sprint, whereas target features will be delivered if there is time).

While the above sounds straightforward, there are several things to be aware of:

  • The issues must be clear and should have an accurate time estimate of the work. Having one line descriptions with no time estimate makes it difficult to determine the priority and importance of the task. The time estimate makes it possible to determine the load on each person (tools can help a lot at this point, specifically with dependencies and total time).
  • Make sure to include things like blogging, demos, presentations. Not including these will result in leftover tasks at the end of the sprint.
  • Bugs or time for bug fixing also need to be included. What we have done is to work with higher level tasks (something like bug fixing) where each engineer can commit time for bug fixing and determine the bugs to fix. Trying to prioritize the various bugs proved too time consuming, but considering the time impact is important.
  • Try to avoid adding additional work during the planning meeting. This meeting will become extremely long with feature debates anyway, so trying to specify a feature well enough to get a time estimate will be difficult.
  • Input to the planning is open to everyone, but the prioritization done during the planning meeting should be open to a few. This will make reaching consensus faster and also reduce the time everyone spends in meetings. We introduced a sprint lead rotating role (responsible for representing the engineering team during planning and update meetings) so that everyone would get to attend and participate in a planning meeting, but not all at once.  
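
The point about time estimates and per-person load can be sketched in a few lines of Python. This is only an illustration: the issue records, field names, and the 90 hour capacity (15 working days at 6 focused hours) are assumptions, not an export format from Jira.

```python
from collections import defaultdict

# Hypothetical planning export: one record per issue with an owner and an
# estimate in hours. Keys and field names are made up for illustration.
ISSUES = [
    {"key": "Z-101", "owner": "alice", "estimate_hours": 40, "committed": True},
    {"key": "Z-102", "owner": "alice", "estimate_hours": 60, "committed": False},
    {"key": "Z-103", "owner": "bob", "estimate_hours": 30, "committed": True},
    {"key": "Z-104", "owner": "bob", "estimate_hours": 8, "committed": True},  # e.g. a demo
]


def load_per_person(issues, committed_only=False):
    """Sum the estimated hours per owner."""
    totals = defaultdict(int)
    for issue in issues:
        if committed_only and not issue["committed"]:
            continue
        totals[issue["owner"]] += issue["estimate_hours"]
    return dict(totals)


def overloaded(issues, capacity_hours):
    """Return the owners whose total estimate exceeds the sprint capacity."""
    load = load_per_person(issues)
    return sorted(owner for owner, hours in load.items() if hours > capacity_hours)


if __name__ == "__main__":
    print(load_per_person(ISSUES))
    # Assumed capacity: 15 working days at 6 focused hours per day.
    print(overloaded(ISSUES, capacity_hours=90))
```

An issue without an estimate cannot take part in this sum at all, which is why the one line descriptions were such a problem.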


The development process works as follows:

  1. A developer completes a unit of work which is production ready (the assumption is that tests are being developed in conjunction with code) and commits the change to the trunk. Code reviews are incorporated into the development cycle, mostly informal (peer review of change sets), but we have formal code reviews for large and complicated changes.
  2. A continuous build system (Hudson) detects when changes occur and runs a checkout, build, and test cycle. If the build or tests fail, a mail is sent to the development team. If the build is successful, the binary is published for deployment to our continuous integration server.
  3. The continuous integration server picks up the last successful build (up to once an hour), deploys it, then runs a series of functional tests. The functional tests exercise the zembly platform APIs and ensure the integration is working. The functional tests serve as black box tests, whereas the unit tests are white box tests. If the deployment or functional tests fail, a mail is sent to the development team for evaluation. If the functional tests are successful, another set of UI tests is executed to verify the basic functionality of the user interface across multiple browsers (you get the picture if the tests fail).
  4. In addition to the automated checking above, there is a nightly build which measures code coverage and also runs FindBugs for Java code. The JavaScript code is run through JSLint on each build, which helps us detect problems early.
  5. A set of performance tests run nightly and detect performance variations across builds. The key here is to look for trends and specifically verifying performance optimizations are really working. One of the biggest challenges is to make the tests repeatable (if you can avoid network effects that will save you lots of headaches).
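
The gating in steps 2 and 3 (each stage runs only if the previous one passed, and a failure results in a mail) can be sketched as follows. This is a toy model, not how Hudson is configured; the stage names and the `notify` callback are placeholders.

```python
def run_pipeline(stages, notify):
    """Run stages in order; stop at the first failure and notify the team.

    `stages` is a list of (name, callable) pairs where each callable returns
    True on success. Returns the names of the stages that passed.
    """
    passed = []
    for name, stage in stages:
        if not stage():
            notify(f"{name} failed after: {', '.join(passed) or 'nothing'}")
            break
        passed.append(name)
    return passed


if __name__ == "__main__":
    stages = [
        ("build", lambda: True),
        ("unit tests", lambda: True),
        ("deploy to integration server", lambda: True),
        ("functional tests", lambda: False),  # simulated failure
        ("UI tests", lambda: True),           # never reached
    ]
    print(run_pipeline(stages, notify=print))
```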

There are a few interesting things that we found work well:

  • Breaking the build should be a big deal. Everyone will do it, but promptly fixing the problem is essential. The larger and more distributed the team, the more expensive it is to break the build, as updating and then having to figure out why the build is broken is a pain. Establish a rule of waiting until the continuous build is successful before leaving. Also, peer pressure is effective.
  • Establish a culture of writing tests; imposing this or trying to write tests after the fact is difficult.
  • Make sure commit messages are understandable. A commit message describing the change lives with the code and may be referenced long after the change commits, so describing the change as well as the potential impacts on other code helps both future developers and people testing the code (we used a QE impact section describing what areas would be impacted).

Deployment and Codeline management

We split a sprint into a number of deployment units (we have tried several variations between daily and weekly), but this really depends on how fast the codeline can be vetted and stabilized. This time has increased as the SLA our users have grown to expect has increased. Each deployment unit has a release manager, who is responsible for ensuring the codeline is branched (more below), and that the bits are tested, stabilized, and pushed to production before the next release manager takes over. The release manager is announced and also gets to wear a special badge (the NASA style vest was not available).

Here are more details on what happens:

  • There are three codelines all the time:
    • trunk - this is never closed, but for larger changes it is better to commit them right after the staging codeline is created. This allows more testing on developer machines and reduces the risk of stopping testing immediately after the staging deployment.
    • staging - this is created from the specific trunk revision that is currently running in the staging environment. Bugs which are detected in the staging environment that are serious enough to prevent a production deployment are fixed in this branch.
    • production - represents the code which is currently running in production. This branch is used if there are blocker issues in production. Changes in this branch are emergency changes and keeping the production environment running is the highest priority.
  • There is an automated deployment which deploys the latest good bits (these are the bits which pass the test described in the development section) to the staging environment (a small replica of what we run in production) as well as runs the automated tests. The staging environment automated tests are the same as those that run continuously; however, these run in a horizontally scaled environment which has different characteristics.
  • The release manager's job begins after this deployment: ensuring the automated test suites pass and that the codelines described above are up to date, which essentially means moving the staging codeline to production and copying the specific trunk revision to the staging codeline (copy and move are used in the Subversion sense). The release manager also does a manual sanity check on the bits to look for things that are difficult to detect in UI tests (alignment issues, for example).
  • At this point the build is ready for testing. This can be done by a formal QE team and also perhaps through developer brown bag sessions (we use both). The developer brown bag session allows the team to try the software as a group and build or create things, so in addition to finding bugs, things like usability issues will surface.
  • If there are any defects the release manager will determine if the defects require immediate fixes and if so will ensure the fix gets committed to the right codeline and the changes get pushed to staging.
  • Once everything is working as expected, the build is given a go and deployed to production and the cycle starts again. 
The release manager role lets most of the team focus on the code and also provides a way for everyone to get a chance to be the release manager. This is a difficult job, and walking in the shoes of the release manager helps everyone think about the development process.
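
The codeline shuffle the release manager performs at the start of each deployment unit can be modeled in a few lines. This only tracks which trunk revision each codeline was created from; the revision numbers are invented, and the real work is done with Subversion copy and move operations.

```python
def start_deployment_unit(codelines, trunk_revision):
    """Return the codelines after the release manager's branch update.

    `codelines` maps a codeline name to the trunk revision it was created
    from. The trunk itself is never closed, so it is not tracked here.
    """
    return {
        "production": codelines["staging"],  # move: staging becomes production
        "staging": trunk_revision,           # copy: the revision deployed to staging
    }


if __name__ == "__main__":
    before = {"production": 1200, "staging": 1250}  # hypothetical revisions
    print(start_deployment_unit(before, trunk_revision=1310))
```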

Saturday Feb 14, 2009

zembly update

We just finished pushing an update to the zembly production site. If you haven't been there in a while, it is a good time to check it out, as we have made major and minor improvements to the user experience and we would like to know what you think.

Saturday Feb 07, 2009

MySQL 5.1 v 5.0

I recently had an opportunity to compare the performance of MySQL 5.1 v. 5.0. The daily performance test suite for zembly simulates the various loads that we see in our production environment, so our experiment was to change the JDBC connection pool in glassfish to point to an instance of MySQL 5.1 and run the tests.  Without any code changes, we saw a better than 25% improvement. Nice work guys!

We also use the MySQL Enterprise Monitor in both our testing and production environments. The 2.0 version has a query analyzer which gives you the query statistics rather than just telling you there are slow queries. This functionality uses the MySQL proxy to aggregate the query information. There is a performance impact for this (we noticed it during our performance testing scenarios), but we did discover some queries which were running too often, so it proved useful.

Wednesday Dec 24, 2008

zembly book is almost ready for release

The zembly book is now available from Amazon (preorder). I am still waiting to see it in print, but I am hoping this will be useful to people interested in zembly directly or in the technologies permeating the social application development space.

Monday Sep 01, 2008

Safari Rough Cut release of zembly book

I have been working on a book describing how to use zembly to create the social web (think wiki for code targeted to social networks and other web 2.0 platforms like a blog), which was released via Safari Rough Cuts. There is still time to influence the book's content, so please use the feedback mechanism.

Friday Jun 06, 2008


I have been working hard on zembly, which began "perpetual beta" today. At zembly, we are trying to provide a service to easily create social network applications that run on social networking sites like Facebook and Meebo, as well as hosting widgets.

One of our focal points is making it easy to invoke RESTful services. There are several difficult (not necessarily in terms of technology) issues to deal with when consuming RESTful services, including API key management and, perhaps one of the most challenging, finding and determining relevant data providers.

We have also tried to remove the distinction of code and deployment space by developing and "deploying" applications in the browser. This is a similar model to how blogs or wikis have taken over what in the past was the tedious process of editing and uploading html and other content.

If you are interested in taking a look at zembly, please visit the site. This is currently in private beta but there are some invitations available :)  

Wednesday Apr 09, 2008

Interested in an Internship

The team I am working on has several openings for interns. We are looking for people who want to build Facebook, MySpace, iPhone, and Meebo applications. If this sounds fun to you, please take a look at the job posting.



Sunday Mar 30, 2008

Working with the Facebook data API

I just finished working with the Facebook Data API to see how it compared with other storage services like Amazon S3. According to the documentation there is no intention to charge for normal usage of this service, so if you are building a facebook application it is worth considering this API. 

The Data API requires referencing objects by identity. All objects have intrinsic identity which is returned whenever an object is created. This can be referenced from FQL as '_id'. The Data API also supports the ability to reference an object using a user defined key (referenced in the documentation as a hash key). Just as in object oriented programming, objects can have properties which must be one of the following set of types (integer, string (255 char max), and text storage as a blob).

I used this API to allow associating tags with a user. I started by defining the data model using the DataStoreAdmin link on the application page. Using this tool, you can quickly create object and association types as well as execute FQL queries. I created two object types, user (containing the user id) and tag (containing the tag value), as well as the tagged association (a symmetric association between user and tag).

When a user is tagged the following operations are performed:

  • Create a tag object by invoking data.setHashValue, passing in the tag string as both the hash key and the value property. This service returns the object id and allows the object to be referenced via hash key in addition to the object id. In this data model, invoking setHashValue is idempotent and thus this can be called every time a user is tagged.
  • Create a user object by invoking data.setHashValue, passing in the user id as both the hash key and the id property.
  • Create an association using the data.setAssociation service to link the user object and the tag object. The ids returned by setHashValue should be used to identify the two objects.

When the tags for a user are displayed, the tagged association can be queried, using fql.query, to determine the set of tags applied to the user. The query for extracting the tags would roughly be:

SELECT value FROM app.tag WHERE _id IN
   (SELECT tag FROM app.tagged WHERE user = tagged_user)
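
To make the flow concrete, here is a small in-memory stand-in for the data model written in Python. It does not call Facebook; the class and method names are my own, and it only mimics the behavior described above (idempotent creation by hash key, an association between user and tag, and a lookup equivalent to the query).

```python
class TagStore:
    """In-memory stand-in for the user/tag/tagged data model (not the real API)."""

    def __init__(self):
        self._next_id = 1
        self._ids = {}        # (object type, hash key) -> object id
        self._values = {}     # object id -> stored property value
        self._tagged = set()  # (user object id, tag object id) pairs

    def set_hash_value(self, obj_type, key, value):
        """Create the object if the key is new; always return its object id."""
        obj_id = self._ids.get((obj_type, key))
        if obj_id is None:
            obj_id = self._next_id
            self._next_id += 1
            self._ids[(obj_type, key)] = obj_id
        self._values[obj_id] = value
        return obj_id

    def tag_user(self, user_id, tag):
        """Perform the three steps from the list above."""
        tag_obj = self.set_hash_value("tag", tag, tag)
        user_obj = self.set_hash_value("user", user_id, user_id)
        self._tagged.add((user_obj, tag_obj))

    def tags_for(self, user_id):
        """Equivalent of the nested query: tag values associated with a user."""
        user_obj = self._ids.get(("user", user_id))
        return sorted(self._values[t] for u, t in self._tagged if u == user_obj)


if __name__ == "__main__":
    store = TagStore()
    store.tag_user("1001", "java")
    store.tag_user("1001", "java")  # repeated tagging creates nothing new
    store.tag_user("1001", "rest")
    print(store.tags_for("1001"))
```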

The storage mechanism makes this data model pretty straightforward. This hash key mechanism can store properties of non scalar data such as JSON. This supports the ability to use the getHashKey method to retrieve more than a single property. There is also an interesting feature of hashkey, data.incHashKey, which allows atomically modifying an integer property value.

 The example above could be extended to support a tag cloud, by counting how many instances of the tag have been added. One way to achieve this is by adding a count object with a single count integer property. The count object type would have an association with both the user and tag objects to have a unique count for each user. The count would be obtained through a query over both associations. Another implementation technique would be to use a synthetic hash key and a unique tag object for each user. The tag object would be extended with the count property. The tag object's hash key would combine the user id and the tag name. The count property would be incremented each time the user is tagged.
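
The second technique (a synthetic hash key plus an atomic increment) might look like this when simulated locally. The `user_id:tag` key format is an assumption on my part, and the lock stands in for the atomicity the service would provide.

```python
import threading


def tag_key(user_id, tag):
    """Synthetic hash key combining the user id and the tag name (assumed format)."""
    return f"{user_id}:{tag}"


class HashCounters:
    """Local stand-in for integer properties that can be incremented atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def inc(self, key, delta=1):
        """Atomically add `delta` to the counter for `key` and return the new value."""
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + delta
            return self._counts[key]

    def tag_cloud(self, user_id):
        """Return {tag: count} for one user by scanning that user's keys."""
        prefix = f"{user_id}:"
        with self._lock:
            return {
                key[len(prefix):]: count
                for key, count in self._counts.items()
                if key.startswith(prefix)
            }


if __name__ == "__main__":
    counters = HashCounters()
    for tag in ["java", "rest", "java"]:
        counters.inc(tag_key("1001", tag))
    print(counters.tag_cloud("1001"))
```

The local scan over keys is a shortcut; against the real service the set of tags for a user would still have to come from a query.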


  • The API would be smaller if only hash keys were required for each object. The object id is somewhat difficult to work with and typically domain data provides unique identifiers.
  • The batch API is required for any significant use of this API, as a single logical operation will translate to multiple service invocations. It would be nice to have more support directly in the data API.
  • A lot of common use cases could be supported by extending the increment hash key capability to any object property. This could be done by passing the last version id and the current update and rejecting the change if an intervening change has occurred. In relational databases, this is implemented using an optimistic locking technique such as a version column.
  • This technology is worth evaluating if you are working with a viral facebook application.
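
The version based update suggested in the third point is the usual optimistic locking pattern. A minimal local sketch, with names of my own choosing and no claim about how Facebook would implement it:

```python
class VersionConflict(Exception):
    """Raised when an update is based on a stale version."""


class VersionedObject:
    """An object whose properties can only be updated from the latest version."""

    def __init__(self, **props):
        self.props = dict(props)
        self.version = 0

    def update(self, expected_version, **changes):
        """Apply `changes` only if nobody else has updated the object since
        the caller read `expected_version`; return the new version."""
        if expected_version != self.version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )
        self.props.update(changes)
        self.version += 1
        return self.version


if __name__ == "__main__":
    obj = VersionedObject(count=0)
    seen = obj.version                      # two clients read version 0
    obj.update(seen, count=1)               # the first update wins
    try:
        obj.update(seen, count=5)           # the second is rejected
    except VersionConflict as err:
        print("retry needed:", err)
```

A rejected caller would re-read the object and retry, which is how a version column is typically used in a relational database.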


Saturday Jan 05, 2008

Interesting Site for a collection of web links

One of my colleagues sent me a link to the 50 most popular web design posts. This has some really cool and interesting Web 2.0 design topics, worth a look. 

Wednesday Jan 02, 2008

Dapper Camp

Dapper is an interesting software as a service website that serves as a transformation engine for the content of a website. The content can be converted to a variety of different formats including RSS, XML and JSON. One of the common use cases for Dapper is to select some subset of a website's content, which is certainly interesting. However, a more interesting use case for "Web Service" type applications is the ability to create an external API for web sites which do not have APIs defined. These APIs are constructed using the Dapper interface, which allows publishing a RESTful service that can then be consumed by server side mashup services. There is a free Dapper camp in San Francisco in February where you can learn more about this technology.

In addition to the dapper camp, a contest to build a NetBeans plugin is underway. If you are considering enhancing the NetBeans REST or other web technology support this provides an interesting incentive.

Sunday Aug 12, 2007

Free Open Source Desktop project management tool

I saw a link to an interesting free and open source desktop project management tool called OpenProj. The site says the tool is available on Windows, Mac, and Linux and is a compatible replacement for Microsoft Project.

Friday Aug 10, 2007

Working with Facebook Web APIs

I have recently been working with the Facebook APIs. I wanted to see how to incorporate Facebook data in a widget. By widget I am referring to reusable UI behavior which is driven by a set of services available via HTTP. In contrast to the typical Facebook enabled web application, I wanted to investigate the feasibility of using Facebook services as part of a mashup. Specifically, I wanted to ensure that it is possible to interact with Facebook services without full page control (iframe embedding).

Here is what I did:

  1. From my Facebook account, I added the developer application. The developer application lets you create new Facebook applications. Creating a Facebook application generates a public and private key set which is required for invoking the APIs.
  2. To start calling the APIs to access Facebook profile information, authentication must be performed. The Facebook platform authentication mechanism is similar to the OpenID mechanism, whereby the Facebook platform actually performs the authentication and provides user information to the application. Since the target use case is to embed the widget within a mashup, the reusable widget would not know the semantics of the page, so the callback mechanism for web applications (after authentication on Facebook, Facebook redirects the browser to the provided URL) is not really appropriate. Facebook also provides a desktop authentication mechanism where a token can be generated; once the user is authenticated (perhaps by launching a dialog or another browser window), the token is activated. The application can then obtain a session lease where the user id, the session id, the expiration time, and a session secret are provided. The widget can use this approach to generate the token before launching a Facebook login. One problem with this approach is that the exact order of obtain token, user authentication, get session must be preserved. If getSession is called before the user actually authenticates, then the token is invalidated and getSession returns an error. This is a normal challenge with distributed authentication. The message signing is similar to other web APIs such as Flickr, so that will be second nature to most people.
  3. Following this, I used the Facebook Query Language to start extracting data. This allows flexible queries on different data sets (there are some limitations: the query must contain an index and no joins are allowed).
Overall, the developer experience was positive. Developing a web application is very easy given the authentication callback mechanism, where a widget is more challenging.
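
The ordering problem in step 2 is easier to see in a simulation. The following Python fake is not the Facebook API: the class, the token format, and the session values are all invented, and it only reproduces the behavior described above, where asking for the session before the user logs in invalidates the token.

```python
class TokenError(Exception):
    """Raised when a token is unknown, not yet activated, or already used."""


class FakeAuthServer:
    """Local simulation of the token, login, session ordering (not the real API)."""

    def __init__(self):
        self._tokens = {}  # token -> "pending" or "authenticated"
        self._counter = 0

    def create_token(self):
        self._counter += 1
        token = f"token-{self._counter}"
        self._tokens[token] = "pending"
        return token

    def user_login(self, token):
        """Called when the user authenticates in the dialog or browser window."""
        if token not in self._tokens:
            raise TokenError("unknown or invalidated token")
        self._tokens[token] = "authenticated"

    def get_session(self, token):
        """Exchange an activated token for a session; a token is single use."""
        state = self._tokens.pop(token, None)  # the token is consumed either way
        if state != "authenticated":
            raise TokenError("token was not activated and is now invalid")
        # Placeholder values standing in for the session lease fields.
        return {"uid": "1001", "session_key": "abc", "expires": 3600, "secret": "s3cret"}


if __name__ == "__main__":
    server = FakeAuthServer()
    token = server.create_token()
    try:
        server.get_session(token)      # too early: the user has not logged in
    except TokenError as err:
        print("error:", err)
    token = server.create_token()      # the widget has to start over
    server.user_login(token)
    print(server.get_session(token)["uid"])
```

In a widget, this means the call to get the session has to wait for some signal that the login window has completed, rather than firing on a timer.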

Wednesday Aug 08, 2007

JavaScript code coverage tool

I came across this link to a tool for measuring JavaScript code coverage. I haven't yet had a chance to check it out, but it sounds interesting.


