Wednesday Sep 24, 2008

implementing ManagedObject or Serializable, and updating those objects

If you've looked at the Darkstar APIs, you're familiar with the notion of a Managed Object. These are the objects that get persisted for you in the data store. Implement the ManagedObject interface, and we'll take care of making that object durable, handle contention around it, etc. Real nice. The only requirement is that these objects must also implement Serializable, since we use Java serialization to actually store your objects.

A common pattern that you'll see in the Darkstar APIs is the ability to use a class that must implement Serializable but may optionally implement ManagedObject. What may not be obvious is why this pattern is used or what effect there is on your application code when you make one of these two implementation choices. Since I hear questions about this pretty regularly, I thought I'd try to lay out what to my mind are the two key issues involved here.

To make this a little more clear, consider the following example. The TaskManager can be used to schedule tasks:

  Task myTask = new MyTask();
  AppContext.getTaskManager().scheduleTask(myTask);

In this case, myTask is an instance of MyTask, which in turn must implement the Task interface. All implementations of Task must also implement Serializable:

  public static class MyTask implements Task, Serializable { /* ... */ }

According to the javadocs, implementations of Task may also choose to implement ManagedObject. Several other classes (like ChannelManager) use this same pattern. Implementing ManagedObject won't change the methods that must be implemented, but it does have a few effects. Specifically, there are two big differences from the Application code's point of view.

The first, and perhaps more obvious, difference has to do with handling object removal, and this is why I usually suggest that Task be implemented as above. If you only implement Serializable, then Darkstar will take care of managing this object in the data store until the task runs. This also means that we'll take care of removing the object once the task has finished running. So, you as an application developer can schedule this instance and then pretty much forget about it. If, however, you said

  public static class MyTask implements Task, ManagedObject, Serializable { /* ... */ }

then Darkstar assumes that the application is already managing this object. We also assume, therefore, that the application code will take care of removing the object. In other words, even after the task has committed, the task object will remain in the data store until your application code explicitly removes the task instance.

Sometimes this is exactly what you want. If, for instance, you have some shared object that maintains state and it's convenient to schedule this object occasionally to do some work based on that state, you wouldn't want Darkstar to remove the object when the task completes. In this case, making your class implement ManagedObject means that you can keep a ManagedReference to it, and work with that object as long as you like. It also means, however, that if you don't keep a reference to the object, it will continue to be persisted and you'll have no way to remove it.

So, to summarize, if you implement Serializable only, we'll take care of managing the object and removing the object when it's no longer needed. If you also implement ManagedObject, then make sure that you have a ManagedReference (or name binding) to the object, so that you can explicitly remove the object when you no longer need it. In other words:

  public static class MyTask implements Task, Serializable { /* ... */ }
  public static class MyMOTask implements Task, ManagedObject, Serializable { /* ... */ }
  // ...
      // this is ok:
      AppContext.getTaskManager().scheduleTask(new MyTask());
      // this is not ok, since the object will never get removed:
      AppContext.getTaskManager().scheduleTask(new MyMOTask());

Cool so far?

I said that there are really two issues to consider here. The first is how objects are managed and removed. The second has to do with managing updates.

Recall that to get an object from the DataManager you can either do a get() or a getForUpdate(). Even after you get() an object, you can always call markForUpdate(). As the docs say, the point of "marking" an object for update is to tell Darkstar that you're going to be modifying the state of the object. This can have some significant performance implications; more on this in a minute.

Now, if you look at the docs, you'll see that application code can only "mark" an instance of ManagedObject. It cannot "mark" an object if that object only implements the Serializable interface. Fair enough, since the DataManager interface is used to manage implementations of ManagedObject. In other words, you can't ask the DataManager to manage an object that doesn't implement ManagedObject.
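
In code, marking looks something like this (a minimal sketch; MyMOTask is the class from the examples here, and the setter is hypothetical):

  import com.sun.sgs.app.AppContext;
  // ...
      // somewhere inside a transactional task, with a reference in hand
      void updateValue(MyMOTask task, int newValue) {
          // tell Darkstar up front that this object's state will change
          AppContext.getDataManager().markForUpdate(task);
          task.setSomeChangingValue(newValue);  // hypothetical setter
      }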

What does this mean for the examples we've been looking at? Essentially, it means that any time you schedule a Task (or register a ChannelListener or ClientSessionListener, etc.), Darkstar is effectively wrapping your object in something that does implement ManagedObject so that the object can be persisted in the data store. We'll take care of creating and removing these wrappers as needed, and these wrappers in turn will either have a direct reference to the object (if it only implements Serializable) or a ManagedReference (if the object implements ManagedObject). What Darkstar won't do for you is "mark" this wrapper for update.
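
Conceptually, that wrapper looks something like the following. To be clear, this is purely illustrative; the real wrapper classes are internal to the individual Services:

  // purely illustrative -- not the actual internal implementation
  class TaskWrapper implements ManagedObject, Serializable {
      private final Task directRef;            // set if the task is only Serializable
      private final ManagedReference taskRef;  // set if the task is a ManagedObject
      // ...
  }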

So, going back to the original example, if your implementation of Task only implements the Serializable interface, you will never be able to "mark" your object for update. This is because the actual ManagedObject that you would have to "mark" is a wrapper object that is internal to the TaskManager, ChannelManager, or some other system component. This is why you'll see language in the javadocs suggesting that if your Task (or ChannelListener, etc.) has mutable state, you should always implement ManagedObject so that you can "mark" your object if you are changing its state. In other words:

  // state will never change, so this never needs to be marked for update
  public static class MyTask implements Task, Serializable {
      private final int someValue;
      /* ... */
  }
  // state can change, so you'll want to mark the object when modifying it
  public static class MyMOTask implements Task, ManagedObject, Serializable {
      private int someChangingValue;
      /* ... */
  }

"But wait," you may be thinking, "I thought we didn't have to mark our objects for update and Darkstar would just magically take care of things." Yes, that's true. Darkstar does implement modification checking. That is, at the end of a transaction, we can (and by default, do) check every accessed object to see if it's been modified. So, strictly speaking, you don't need to mark an object each time you modify it. That said, there are some very good reasons why you should mark all updates, and therefore why you should follow the pattern I've illustrated above. Let me try to convince you.

Like the docs say, marking for update is all about performance. Whenever an object is modified in one transaction, it may cause conflict with other transactions. The sooner that Darkstar knows about an update, the sooner it can start tracking this possible conflict, or decide that the calling transaction will have to be aborted and re-tried. If you mark your updates, then this will help optimize your Darkstar application; if you don't mark updates, then we have to wait until the transaction is ready to commit, and only then can we figure out what modifications happened and therefore decide if there was conflict. At the simplest level, this may mean that your transaction does a lot of work that may be aborted, but that could have been avoided if we'd known up-front about the update. There are lots of other neat tricks that we already do or are looking at doing with this information, but these are all optimizations that won't happen unless you mark your updates.

There's another performance issue here, however, and it has to do with how we do the modification checking. Remember way back at the start of this entry I said that all objects must implement Serializable because this is how we persist your data? Well, since we're using Java serialization anyway, this is also how we check for modifications. That is, for every object that hasn't been marked by the end of a transaction, we do partial serialization of the object to see if its state has been modified. So, even if you don't update an object, you'll still pay a slight serialization cost. It's not terrible, but it can definitely add up, especially if you access a lot of objects or have objects with large amounts of data.

What can you do about this? Well, if you're marking all of your updates correctly, you can actually turn off this modification checking. Obviously you need to be pretty careful about doing this. If you are actually modifying some objects that you don't mark correctly, you won't end up committing your changes. So, before you even consider turning off modification detection, you should see if your code is marking updates correctly. Luckily, there's a logger that will tell you just this. In your logging properties, set:

  com.sun.sgs.impl.service.data.DataServiceImpl.detect.modifications.level = ALL

This will continue to run with modification detection enabled, and will tell you whenever a modification was detected for an object that wasn't marked for update. Obviously a compile-time tool for complete code coverage would be better, but hey, we're working as fast as we can :) For the time being, if you set this logger and run your code through vigorous tests (you do have good test coverage for your game, right?) you should be able to see which updates, if any, you're missing.

If you're feeling confident that you're correctly marking all updates, then you can try turning off modification detection altogether. This is done with a property setting:

  com.sun.sgs.impl.service.data.DataServiceImpl.detect.modifications=false

This has the effect of only serializing and committing the objects that your application code has actually marked for update. Obviously the overall performance impact will vary depending on the game, but in practice we've found that this can make a noticeable difference. Try looking at profiling data and see for yourself if this helps with the runtime of your code.

Coming back to the original topic of this post, you won't be able to turn off modification checking unless you mark your updates correctly, and you can't do this unless your objects implement ManagedObject. So, if you have an object with mutable state (i.e., something where you'll be changing the value of some member variable), you really should be implementing ManagedObject and you should get in the habit of marking those objects each time they're updated. If your object's state doesn't change, then you may not need to implement ManagedObject. You'll still have to figure out whether you need to keep the object around for an arbitrary amount of time, or keep ManagedReferences to it. If not, then it's probably easier to implement only Serializable and let Darkstar take care of managing and removing the object.

Like I said, the choice of whether or not to use ManagedObject may not be immediately obvious just by looking at the javadocs. There is some serious subtlety here, but ultimately I think the few rules I've sketched out should make it fairly simple to figure out which approach to take. I hope this makes it a little easier when you're writing your games, and when you're trying to figure out how to use this pattern!

Saturday Aug 23, 2008

performance analysis: profiling

A lot of folks are taking Darkstar for a spin, which is exciting. It means, however, that we're getting more questions about how to understand what's going on in the system, why a given application doesn't perform as expected, etc. This isn't surprising: Darkstar is still a work in progress, and it represents a somewhat new programming model, so as a developer you really need good tools to help you build great games. We're still building up these tools, but while some are on the wish-list, others are available today.

Reading the forums you see some discussion about profiling, or specific numbers getting cited. I've posted a couple times about the basics, and how to use profiling to learn about what's going on in the system, but this is a little scattered. I thought I'd collect some of those details here, and give a quick round-up of what's available now, how you use it, and where this work is going. Remember this is very much a work in progress, so we really want feedback about what's useful, what's confusing, and what you'd like to see added!

If you've played with profiling or runtime-debugging facilities in other systems, what we've got in Darkstar should look pretty familiar. Basically, it collects data about what's happening in the system, and provides that data as a stream of reports. If you're writing a Service, you can register to provide data into this stream, and any developer can write any number of listeners to consume this stream of reports which then choose how to aggregate or represent the output. Real easy.

Recall that Darkstar is an event-driven system. Rather than starting threads or invoking long-running, monolithic tasks, the model is many short tasks that run in response to some event. In the case of Application code these tasks are always run in a transaction, though there is also support for non-transactional tasks. These tasks get run through the system schedulers, which decide how and when to run each task. For each task that gets run through a scheduler, a new Profile Report is generated. This represents a collection of details about what happened in the scope of that task, and makes it easy to correlate events and actions (e.g., timeout and the number of objects accessed in the data store).

If you want to start playing with all of this, where should you start? Probably the best place is in the com.sun.sgs.profile package, where you'll find ProfileReport and ProfileListener. The former is the actual structure that is provided for each task run through the system. The latter is the interface that anyone can implement to consume these reports. I won't go through all the details in this entry, but hopefully the basics of the reporting structure are pretty clear (if not, we want to know about it!). In addition to basics about task type, length, delay, failure causes, etc. there are also more general-purpose components like counters and operations that let listeners track specific kinds of data-points, like bytes read by the data store, tasks scheduled, etc.
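
To give a flavor of how simple a consumer can be, here's a sketch of a listener that flags long-running tasks (the accessor names are from the current API as I remember it, so double-check the javadocs; note also that listeners loaded by the kernel need a specific constructor, which I've omitted here):

  import com.sun.sgs.profile.ProfileListener;
  import com.sun.sgs.profile.ProfileReport;
  import java.beans.PropertyChangeEvent;

  public class SlowTaskListener implements ProfileListener {
      public void report(ProfileReport report) {
          // flag any task that took more than 50ms to run
          if (report.getRunningTime() > 50)
              System.out.println(report.getTask().getBaseTaskType() +
                                 " ran for " + report.getRunningTime() + "ms");
      }
      public void propertyChange(PropertyChangeEvent event) { /* unused */ }
      public void shutdown() { /* nothing to clean up */ }
  }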

If you're impatient (well, ok, most of us fall into this category), and just want to see some data in action, skip these interfaces and go right to the com.sun.sgs.impl.profile.listener package. Here you'll find a collection of ProfileListeners that we've written to get folks started. For the most part, we've written these as we've found the need to watch for certain details or to answer specific questions; we're pretty sure these are useful debugging tools. You can include any number of these listeners in your running system, though of course you do have to be careful about the performance overhead. For the most part these are very light-weight, but YMMV.

So, how do you use these? There are two properties you need to set. The first defines the overall profiling system, and the second decides which listeners are actually used. The property that defines the entire profiling system is:

  com.sun.sgs.impl.kernel.profile.level

In the 0.9.6 stack all profiling is off by default, so you need to set this variable to on. In the latest trunk (which I definitely recommend you grab if you're actually playing with performance analysis) a minimal amount of profiling is always running, but you can increase the amount of data being gathered by setting the property to medium or max (by default, it's running at min).

Once you have the profiler running, the next step is to define which listeners are included. You do this with a second property on startup:

  com.sun.sgs.impl.kernel.profile.listeners

In both the 0.9.6 and the current codebase, this variable is a colon-separated list of fully qualified classes that implement the ProfileListener interface. So for instance, if you set

  com.sun.sgs.impl.kernel.profile.listeners=com.sun.sgs.impl.profile.listener.SnapshotProfileListener

then the system will start up with the SnapshotProfileListener running. This is a pretty simple class that collects data over some period of time. By default it reports on the last 10 seconds worth of tasks, detailing how many tasks ran versus how many succeeded, how long on average the scheduler queue was, and the number of threads being used by the schedulers. The output is displayed on port 43007 (by default), so once the system is running, just telnet to this port and you'll see an update every 10 seconds. This is not a particularly complicated or detailed listener, but it gives you a quick, high-level view of how much work the system is doing, and whether it's keeping up or falling behind.
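
Putting the two properties together (using the trunk's level values), your properties file and a quick look at the output would be something like:

  com.sun.sgs.impl.kernel.profile.level=max
  com.sun.sgs.impl.kernel.profile.listeners=com.sun.sgs.impl.profile.listener.SnapshotProfileListener

  $ telnet localhost 43007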

Take a look at the other listeners in that package. They all have class javadocs that explain what they do and how they can be configured. Some output on sockets, some via loggers, and others to standard out. Collectively, they can give you some good insight into what's happening in the system. Obviously these are only a starting-point, so we'd love you to suggest other features, or write your own listeners to contribute to the community. Hopefully these are useful to you in understanding how you can push Darkstar and your Application code, and when things aren't working, why that is.

I haven't talked about the other end of this system, namely, how you get data into the profiling stream. If you're writing a Service and want to include some specific detail, take a look at the com.sun.sgs.profile.ProfileRegistrar interface. This is available in the registry that you get when your Service is constructed. Through this you can register to report different kinds of events and data. Note that we don't currently expose any of these interfaces to Application code, so you can't include details directly from your game logic, but that's on the list of features to add and (detect a theme here?) we'd definitely like to hear thoughts about how to make this most useful.

This brings me to the final issue here: where we're headed. Like I said at the start of this entry, this is still a work in progress. We've got some basic features that have proven to be pretty useful in understanding application/system behavior, but we're really just getting started.

In addition to exposing interfaces to Application code, we also plan to add hooks to capture events from some of the Services that aren't reporting anything right now (especially the Session and Channel services). These will be like the details that the Data Service currently reports. You'll be able to see which calls were made (like creating a channel or sending a direct message), how many calls were made, how many bytes were sent/received, etc.

An effort that Jane is working on right now is making the whole system more dynamic. Right now you have to set the profiling level and the set of listeners on startup, and while you can set the level globally, you can't easily tune a specific set of data (e.g., turn off all reporting except from a specific Service). She's also working on exposing some of these controls through a more general JMX-driven management interface, which should make it much easier to run and manage a Darkstar stack and control how profiling happens. The ability to set levels is only the first piece of this work.

At the same time David and I have been working on a new piece of the system to track and manage conflict. While the work is more generally designed to drive the transaction and scheduling systems, one result is that ProfileReports will now include details about what objects were accessed, and on failure, what the likely source of conflict was. This isn't in the trunk yet, but hopefully it will be soon. In the meantime, check out the contention-rev branch for the latest bits. Once we get this committed, I'll post more details about this project, and get a wiki setup to track progress and feedback.

I know this is a pretty quick, high-level overview of how profiling works, but hopefully it's useful. Remember, the goal of these features was to provide a light-weight, fairly simple set of interfaces for tracking the kinds of details that give you insight into what's happening. Like everything else in Darkstar, this is a layered system, so we've got a lot of the core collecting and reporting, but we're still building up to some of the higher-level tools and the right ways to aggregate across a cluster. We've got some good ideas about this work, but none of us has had the spare cycles to make any progress. If this sounds interesting to you please let me know, and I'll point you in the right direction.

Happy profiling!

Tuesday Jul 01, 2008

writing services - example code

There were several questions about the status of the code in my last posting. There were also (I noted as I read back through the entry) a few typos. So, I've written up the example as compilable/runnable code, complete with some (somewhat) clean documentation. Better still, it's under a Creative Commons Public Domain license, so feel free to use it any way that you like!

Get it here.

Sunday Jun 29, 2008

writing services

Lately there's been a lot of discussion on the forums about how to write Services for Project Darkstar. Specifically, there seems to be some confusion about some pretty fundamental issues around transactions and how to actually participate in this model. Part of this confusion is undoubtedly due to the lack of tutorials (although there are lots of good examples and javadocs available to get folks started). So, I thought I'd spend a little time laying out some basics of Services, and how you go about writing them. Note that this example is written against the 0.9.6 APIs, and while I haven't actually tested/compiled all the code snippets, I'm pretty sure they work (he says hopefully).

Before I begin, a warning: you really shouldn't be writing Services. Or, rather, you usually shouldn't be writing them. The whole point of the Darkstar project is to make it easy to write server-side logic for games by hiding the individual nodes in the cluster, handling all the threading, doing persistence for you, etc. When you write an application, you can ignore all the "hard stuff," but that's because there are Services in the system supporting you. In some cases you will need to get into the lower levels of the system, but it really should be avoided, because, well, writing this kind of code is hard. You will be exposed to multi-threaded code. You will have to deal with failure. You will have to model your own way of working between all the nodes on a cluster. You will have to understand the transaction model and how to handle things like 2-phase commits and aborts. So, my advice is that you don't treat the Service APIs as "just another API to use when writing Darkstar games." You have been warned :)

Like I said above, the reason for Services is because we need a place to do much of the hard stuff. Perhaps most importantly, we need a clear layer that sees both the transactional and the non-transactional nature of the system. Services fill this role. They are an abstraction that supports the application and ties the cluster together. They are designed to be pluggable, so that you can swap in and out different implementations. There are a handful of "standard" Services (meaning that they will always be available in the system). Beyond that, you can write any number of additional Services that you need.

An application, of course, doesn't see these Services directly. This is by design, to provide some isolation boundary (think user-land versus kernel code in an operating system). In this way, Services can define pretty complex interfaces that any other Service can take advantage of, without exposing this to the application. What the application sees is a set of Managers. These are effectively the bridge between Services and applications, and are used to provide whatever subset of the Service API makes sense, pre- or post-process application inputs and outputs, etc. Some Services don't even have Managers, and instead just define an interface for other Services to use. But, maybe I'm getting a little ahead of myself here.

Here's the basic model: when an instance of the Darkstar stack starts up, some core components are created (we'll get to this in a minute). After that, all of the standard Services are loaded and initialized. Once these are in place, any custom Services are started. If you're writing a Service, it will almost always be a custom Service (as in the example below), which means that you will be able to take advantage of all the other Services in the system. First all of the Services are constructed, and then once all Services have successfully been created, they are told that the system is ready. Remember that this happens in each instance of the stack (i.e., on each node), so while an application is only initialized once, your Service will get created on each node.

Let's take a simple example. To start, all Services must implement the com.sun.sgs.service.Service interface, and must implement a constructor with a specific set of parameters:

  import com.sun.sgs.kernel.ComponentRegistry;
  import com.sun.sgs.service.Service;
  import com.sun.sgs.service.TransactionProxy;
  import java.util.Properties;

  public class MyService implements Service {
      public MyService(Properties properties,
                       ComponentRegistry registry,
                       TransactionProxy txnProxy) {
          // ...
      }
  }

You can look at the javadocs for more detail, but basically the properties are all properties associated with the application, the registry gives you access to core components, and the transaction proxy gives you access to individual transactions. More on all of this in a minute. You should treat this constructor as any other Java programming language constructor: it's a chance to do any initial setup you need to do. Once you return from your constructor, other Services may call you, so plan accordingly. Note that this is also your chance to decide that there's something wrong with the setup of the system, and throw a RuntimeException. If this happens, startup will fail and the node will shut down. This constructor is not invoked in a transaction, so you can spend as much time as you want initializing. Just remember that you can't invoke the AppContext, many other Services, etc. without setting up a transaction. Again, more on this in a minute.

In addition to a specific constructor, you need to implement a couple other methods:

      public String getName() {
          // ..
      }

      public void ready() throws Exception {
          // ..
      }

      public boolean shutdown() {
          // ...
      }

The getName() method is just an identifier for your Service, typically the fully-qualified class name or something similar that will be unique and easy to use in identifying your Service. The ready() method is called on all Services in turn once all the Services have been constructed. It's basically a notification that the system is finished setting up, and your final chance to bail if there are any last-minute problems getting set up. The shutdown() method, unsurprisingly, is called when the local node is shutting down. You can take as long as you need to shut down, but if for any reason you can't finish shutting down, then you can return false.
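
For a Service as simple as ours, these methods can be close to trivial (a minimal sketch):

      public String getName() {
          return MyService.class.getName();
      }

      public void ready() throws Exception {
          // nothing to verify for this simple Service
      }

      public boolean shutdown() {
          return true;  // no resources to release, so we always succeed
      }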

The only thing left is to get your Service started. To do this, you use the com.sun.sgs.services property, which is a colon-separated list of Services to include on startup. Make sure your Service implementation is in your classpath, and then specify the fully qualified class name to the property, either in your application's property file or on the command-line:

  com.sun.sgs.services=MyService

That's it. You've got a Service implemented, set up, and running in the stack. Any other Service can resolve it, and use the functionality that it exports. Of course, our Service doesn't do much at this point. Let's work on that.

Suppose you are building some infrastructure around Darkstar, including a web interface where players can login to chat, post to forums, etc. It would be nice to allow players who are in-game know when friends are logged into the web site, since then they could chat back and forth, invite the friends to login to the game, etc. There are lots of ways to accomplish this, but in the spirit of trying to come up with an example that shows most aspects of writing a Service, let's assume that you want to do this by calling out to the web site to get the player's status. The model we'll assume here is that there's a known URL you can query that will return a boolean representing the status. Pretty simplistic, but not too unreasonable, eh? (work with me here...)

First off, your Service will need to know where to go to make this query. Since you're already getting properties as an input, this is a good place to define the server end-point:

  public static final String URL_PROPERTY = "MyService.baseURL";
  private final String baseURL;

  public MyService(Properties properties, /*...*/) {
      baseURL = properties.getProperty(URL_PROPERTY);
      if (baseURL == null)
          throw new NullPointerException("Base URL must be specified");
  }

Now it's just a matter of providing a method so that other Services can query your Service:

  import java.net.URL;
  // ....
      public boolean isLoggedIn(String userName) {
          try {
              URL queryURL = new URL(baseURL + userName);
              int result = queryURL.openStream().read();
              return (result == 1);
          }
          catch (Exception e) {
              return false;
          }
      }

Easy, right? Now, note that none of what we've done so far has been transactional. This means that the call to isLoggedIn() can take as long as we want, and nothing will time out. Of course, this means that we can't call this method from a transaction, and it also means that we have to be pretty careful calling this method, since it could block something else that needs to run in a timely manner. So, while we've got some nice basic logic that other Services may be able to use, we don't have something we can export to the application.

This gets to the core of perhaps the hardest problem with writing code at this layer: you are working on the boundary between transactional and non-transactional code, and often have to switch between these two worlds. The key is to be careful in documenting your methods, and keeping track of what state you're in at any given time. It can twist your mind around, but once you get into the zen of how this works, it's a lot of fun (where I define "fun" to mean "fun to crazy people like me who like hurting their brains on occasion"). In case you're wondering, no, this isn't specifically an artifact of how we do things in Darkstar. Pretty much any transaction-driven system has this layer, and it's always difficult to work here.

So, what's the trick to writing good code at this level? You need to work asynchronously. If you look at how the standard Services are implemented, you'll see a lot of hand-off and Future-like interfaces. The isLoggedIn() method above is synchronous: it blocks until a result is available or an error occurs. With this in mind, let's add a new method that hands off control:

      public void isLoggedIn(String userName, StatusCallback sc) {
          // return immediately, queuing up the status query...more on
          // the details below
      }

      public interface StatusCallback {
          public void notifyLoggedIn(String name, boolean loggedIn);
      }

The new method is defined to return immediately, and takes a callback object that is called when a result is ready. Now we have a method that will take a small, bounded amount of time to run, and can therefore be called from within a transaction. Better still, this is something that can easily be exposed to application code, since the application can make this call and then wait to be notified with a result. The only thing left to do now is implement the query method (details, details..). Usually in Java this would be a place where you'd create a new Thread to do the work, and that would be fine here. But, there's another option that has some nice benefits. Darkstar is built on a task model, with core schedulers that schedule and run the tasks, report profiling data, etc. When you write a Service, you can use this core facility:

  import com.sun.sgs.kernel.TaskScheduler;
  // ...
      private final TaskScheduler taskScheduler;
      public MyService(Properties properties,
                       ComponentRegistry registry, /*...*/) {
          //...
          taskScheduler = registry.getComponent(TaskScheduler.class);
      }

This interface will give you a scheduleTask method that you can use to submit a task to run. The interface you use is KernelRunnable, which has a run() method as well as a method for identifying the type of task (which is really useful when you look at the profiling output). Just implement the run() method to call the original isLoggedIn() method, and then invoke the callback when it's finished. If you look at the scheduler methods, you'll see that they take an Identity as well as a task. This is the owner of the task, or the entity who is actually doing the work (for all the gory details, check out my last blog entry).

The easiest thing to use here is the identity of the calling task, which can be fetched from the TransactionProxy provided to the constructor (yes, I know, you can get the current identity even when you're not in a transaction...it's weird, and something of an historical artifact of the system but we're unlikely to fix it by changing the name now). Putting it all together:

      public void isLoggedIn(final String userName, final StatusCallback sc) {
          try {
              taskScheduler.scheduleTask(new KernelRunnable() {
                      public String getBaseTaskType() {
                          return "MyService.loggedInQueryTask";
                      }
                      public void run() throws Exception {
                          boolean loggedIn = isLoggedIn(userName);
                          doNotify(userName, loggedIn, sc);
                      }
                  }, txnProxy.getCurrentOwner());
          } catch (Exception e) {
              // for simplicity, we'll just assume that if there's any trouble we'll
              // just report the identity as not logged in, but in a real system you
              // may want to handle this differently
              doNotify(userName, false, sc);
          }
      }

      private void doNotify(final String userName, final boolean loggedIn,
                            final StatusCallback sc) {
          sc.notifyLoggedIn(userName, loggedIn);
      }

Sweet. We now have a Service that does some setup when the node starts up, and provides two methods for querying the status of a user at a web site: one synchronous and the other asynchronous. The asynchronous one uses a call-back interface, so that the caller returns immediately and is later notified about the result. This hand-off is done using one of the core components of the system for scheduling tasks, so you'll get to collect profiling details about this task each time it runs. So, we're done, right?

Almost. In spite of everything we've done, we still haven't actually run any transactions. We do have a method that can be called within a transaction (although it doesn't need to be) because it returns immediately, but if we want to let application code call down into this method, we'll need a way to get "back into" a transaction to call back up to the application when the query finishes. In other words, when we're ready to do the notification, we want to do that in a new transaction.

The way to do this is by using the other scheduler. Just as there's a TaskScheduler for scheduling non-transactional tasks, there's also a TransactionScheduler that has similar methods, but runs its tasks within a transactional context. This is actually all it takes in Darkstar to start a new transaction. So, when your Service starts up, get the transactional scheduler the same way you got the non-transactional one:

  import com.sun.sgs.kernel.TransactionScheduler;
  // ...
      private final TransactionScheduler txnScheduler;
      // ..
        txnScheduler = registry.getComponent(TransactionScheduler.class);

Here's why the doNotify() was included in the example above. To get into a transaction, rather than calling the callback object directly, now you can use one of the scheduleTask methods on the transactional scheduler just like you did to run the non-transactional task. Your run() method is now running in the context of a new transaction, meaning that it can interact with application code, access the AppContext, etc. Now you have a method that can be called from a transaction, and will provide notification of the result in a new transaction. So, we're done, right?
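
For example, here's roughly what the updated doNotify() looks like (a sketch; handling for a rejected task is omitted):

      private void doNotify(final String userName, final boolean loggedIn,
                            final StatusCallback sc) {
          txnScheduler.scheduleTask(new KernelRunnable() {
                  public String getBaseTaskType() {
                      return "MyService.notifyTask";
                  }
                  public void run() throws Exception {
                      // now running in a new transaction, so the callback
                      // can safely use the AppContext, DataManager, etc.
                      sc.notifyLoggedIn(userName, loggedIn);
                  }
              }, txnProxy.getCurrentOwner());
      }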

Well, not exactly. There's one more piece of this that needs to be taken care of before you can expose this functionality to your application. Recall that one of the nice things about programming to transactional systems is that transactions can be aborted and re-tried, but as a developer all you ever see is the final, successful run. Failed transactions have no side-effects. Of course, the underlying infrastructure needs to support this model. In this case, that's our Service. Since the isLoggedIn() method will actually query a web server and then call back to application code, we really only want to do this operation if the calling transaction commits. This is much like the networking model for application code, where Session and Channel sends only actually happen if the calling transaction succeeds.

To support this model, our Service needs to add one extra layer of indirection. When the isLoggedIn() method is called, rather than actually scheduling the task, we want to delay until the current transaction commits. Then we want to schedule the tasks to run. This involves what's called participation in the transaction. By participating in the transaction, the Service will know when the various stages of a transaction happen, and can act accordingly. This also gives the Service a chance to abort the transaction if we get to the end and there's any trouble, but for this simple example we don't need to worry about that. Note that there are some utility classes for writing participants in the com.sun.sgs.impl.util package, but again, this example is small enough that we'll stick with the basic APIs.

You can participate as a durable or a non-durable participant. The former is something that actually stores data persistently, and needs to maintain consistency of the data (e.g., the DataService). The latter is something that may use durable Services to store data, but doesn't maintain data itself. In our current system we only allow one durable participant per transaction, so unless you're replacing the DataService implementation, you'll always be writing a non-durable participant:

  import com.sun.sgs.service.NonDurableTransactionParticipant;
  // ...
  public class MyService implements Service,
          NonDurableTransactionParticipant {
      // ...

Now that the Service implements the participant interface, it can, well, participate in transactions (we'll look at the implementation of this interface in a minute). To do so, it needs to get the current transaction and join it. You can join a single transaction as many times as you like, but as long as you call join() at least once, you'll start participating in the given transaction. In addition to joining the transaction we'll need to keep some state associated with each transaction, in this case the loggedIn queries that we want to make. This can be done any number of ways, so for this example we'll just use a map (again, the details of the map will get filled in a little later). Adding the code for joining a transaction and maintaining state, we end up with:

  import com.sun.sgs.service.Transaction;
  import java.util.concurrent.ConcurrentHashMap;
  // ...
      private final ConcurrentHashMap<Transaction, Object> txnMap =
          new ConcurrentHashMap<Transaction, Object>();
      // ..
      public void isLoggedIn(final String userName, final StatusCallback sc) {
          Transaction txn = txnProxy.getCurrentTransaction();
          Object o = txnMap.get(txn);
          if (o == null) {
              o = new Object();
              txnMap.put(txn, o);
              txn.join(this);
          }
          // ...
      }

Before we continue, there are (at least) two things to note here. First, while the system is multi-threaded (thus the concurrent map), a given transaction always runs in a single thread. This means that, within the context of work done for that transaction, you know there won't be any contention. That's why it's safe to add the value to the map as above, and why no extra synchronization is needed. Second, this method now assumes that a transaction will always be active when it's called. Otherwise the call to getCurrentTransaction() would throw an exception. In a full version of this Service you should catch that exception, and use it to signal that there's no transaction so you can just schedule the task directly. As an aside, note that the join() call can be done as often as you like, but as an optimization (and since we're already implicitly checking to see if we've joined the transaction by seeing if we're maintaining any state for that transaction yet), each given transaction is only joined once.
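
As a sketch, catching the no-transaction case might look like this (I believe TransactionNotActiveException is what gets thrown here, but verify against the javadocs):

      Transaction txn = null;
      try {
          txn = txnProxy.getCurrentTransaction();
      } catch (TransactionNotActiveException tnae) {
          // no transaction is active, so there's nothing to delay:
          // schedule the query task directly and return
      }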

Good so far? To review, the code above has set up state unique to each running transaction, and made sure to join each transaction so that our Service can act as a participant in any transaction where it's doing any work (obviously if the Service's method is never called, then it will never join the transaction, so it won't add any processing overhead to other transactions). Now, what about that value in that map? Well, what we want is some kind of Set to keep track of each of the queries that we'll be making. The question is, what goes in that Set?

We want to keep track of the queries we're planning to run, without actually running them until the transaction commits. We also want (for reasons that will become clear later) to be guaranteed that when it comes time to commit, we can run those tasks. Conveniently, the scheduler interfaces provide methods with the same inputs as the scheduleTask() methods, but for requesting reservations to run tasks. This means that we can make a non-binding reservation, and then decide later if we want to use it. Neat, huh? With this in mind, we can update the map:

  import com.sun.sgs.kernel.TaskReservation;
  import java.util.HashSet;
  // ...
      private final ConcurrentHashMap<Transaction, HashSet<TaskReservation>> txnMap =
          new ConcurrentHashMap<Transaction, HashSet<TaskReservation>>();
      // ...
      public void isLoggedIn(final String userName, final StatusCallback sc) {
          Transaction txn = txnProxy.getCurrentTransaction();
          HashSet<TaskReservation> set = txnMap.get(txn);
          if (set == null) {
              set = new HashSet<TaskReservation>();
              txnMap.put(txn, set);
              txn.join(this);
          }
          // ...

and update the code that previously was calling scheduleTask on the TaskScheduler:

          try {
              TaskReservation reservation =
                   taskScheduler.reserveTask(new KernelRunnable() {
                           // ... the same KernelRunnable as before ...
                       }, txnProxy.getCurrentOwner());
              set.add(reservation);
          }
          // ...

Note also that in the case of failure the doNotify() method will have to be updated to do this kind of delayed logic, but since the TransactionScheduler has the same reservation mechanism, this is easy (just do it only where an Exception is caught, and not in the notification from the running query itself...I'll leave this as an exercise for the reader...heh).

Ok. So, now we've updated our Service to participate in transactions, track state associated with any transaction it's participating in, and delay actually running any queries until the calling transaction commits. The last piece left is to actually implement the participation methods, and use the reservations:

      public boolean prepare(Transaction txn) throws Exception {
          return false;
      }

      public void commit(Transaction txn) {
          for (TaskReservation r : txnMap.remove(txn))
              r.use();
      }

      public void prepareAndCommit(Transaction txn) throws Exception {
          commit(txn);
      }

      public void abort(Transaction txn) {
          for (TaskReservation r : txnMap.remove(txn))
              r.cancel();
      }

      public String getTypeName() { return getName(); }

Ok, so what just happened? When a transaction commits, it actually uses a 2-phase commit protocol. First, all the participants are asked to prepare for the commit operation. This is your last chance to complain and cause the transaction to abort. Once you return from prepare(), you may not fail to commit your state. Returning false from prepare() means that you still need to be called for the other stages of the transaction.

Once all of the participants are prepared, then they are all called to actually commit their state. This is the point where we know that the transaction is going to succeed, and so now we can actually use those reservations and schedule the tasks that will do the queries. Remember earlier when I said we wanted to make sure that we can run the tasks later? This is why. Once we get to the commit phase, we're not allowed to fail, so we need these reservations to make sure that we proceed. Note that in practice prepareAndCommit() will never be called on your Services, since this is an optimization used in special cases, and typically only on the DataService.

If at any time during the running of the transaction, or during the prepare phase, some fatal error occurs, then the transaction will be aborted. If your Service has joined the transaction, then it will get notified. This means that the transaction is failing, and state needs to be rolled back. For our Service, this is just a matter of canceling the reservations. This ensures that for any transactions that don't commit, we don't ever schedule any queries or notifications to the caller about errors. Note that the final method is used to identify the participant in a way that's useful when looking at profiling data or other management interfaces.

One final thing to note here is that once prepare() or abort() has been called, the transaction is over. This means that you can't query for the current transaction state, or do anything that involves open transactions, including calling other Services. Our example Service hasn't actually made use of any other Service (you're likely to use the DataService at the very least), but had we done so, we couldn't invoke them at this point. Keep this in mind as you design your Services.

One other final thing to note is that I wanted a simple example of some pending operation so I used tasks. In practice, you may find it easier to use the TaskService which provides a much richer interface than the TaskManager. It is designed to handle delaying operations, persisting tasks to guarantee they run (with our Service, if the current node fails then the query operation is lost), etc.

This brings us to the final piece in all of this. One of the reasons for having Managers is to selectively decide what interfaces to expose only to low-level code, and what methods the application has access to. The final step in making our Service useful in supporting applications is writing a Manager. In our case, this Manager should be pretty simple, with just one method that can be called to make our query. Managers don't implement any specific interface, but they need a constructor that will accept the Service instance so that they can call through. You should look at the implementation of the standard Managers for details on the full pattern we use for separating interfaces and implementations, but for the sake of simplicity, here's a fully implemented Manager for our Service:

  public class MyManager {
      private final MyService backingService;
      public MyManager(MyService backingService) {
          this.backingService = backingService;
      }
      public void queryIsLoggedIn(String userName, StatusCallback sc) {
          backingService.isLoggedIn(userName, sc);
      }
  }

The last step is to make sure this Manager gets paired up with your Service and loaded on startup. Just like the Services, there's a property for including your Manager:

  com.sun.sgs.managers=MyManager

Wait. Is that it? Are we done? Really?

Yes. :)

From within your application code, you can now say:

    MyManager m = AppContext.getManager(MyManager.class);
    m.queryIsLoggedIn("seth", myCallbackInstance);

You know, it's almost too easy. (note: it's not too easy)

This was definitely not an exhaustive guide to Services. I didn't go into any detail on using other Services, some of the Service-level interfaces that aren't exposed to applications, etc. I didn't talk about node-local versus cluster-wide design, and how to use the Watchdog and Node Mapping Services. I didn't talk further about the details of Identity. I didn't get into the various design considerations for caching and working with external databases or other similar services. I figure this entry is already long enough (sorry about that), and those topics can wait for the next installment.

In spite of these omissions, I hope this was a useful introduction to some of the key concepts and details involved in writing your own Service. I hope you'll ask questions, and please, if you see an error in anything I've written, let me know! Most importantly, I hope you'll experiment, and let us know what you'd like to see added, or what utilities you think would help at this level. Finally, note that in the coming weeks, as we push our codebase and development activities into the open, we're planning on publishing a bunch of utility Services, so I hope you'll look at those as examples, and either suggest new Services or contribute your own (as some folks have already started doing...thanks!). It's my real hope that most people will never have to write Services, but that's only going to happen if there are enough pieces already in place that folks have the utility they need to really focus on game development. Thanks again to the whole community for your creativity and curiosity in the space, and please let me know how to help going forward!

Saturday Jun 14, 2008

authentication, or, the gaps where management should be

In a previous post I promised to write a bit about the profiling in Darkstar, and how to learn about what's going on in a running system. I'm still planning to do this. I've been slacking; sorry. I've also been rolling some thoughts around my head about a bunch of things I want to start bouncing off Darkstar users, and several of them are in this area. Any day now I'm going to get those things out of my brain and onto, umm, paper. Then you'll see them here.

Until then, I wanted to talk about an issue that's come up in several forms recently in the forums. Namely, how authentication works, how the client protocol affects input to authentication, and how this connects (or is supposed to connect) to external components. This is actually something that's been discussed in several threads before, but I don't think that all the issues are collected in any one place. So, here's my first attempt at trying to herd all the different discussions together. Please let me know what you think, or if you've got suggestions about how to proceed!

To start, here's some quick background. When we were designing the core components for Darkstar we knew that authentication would be important. So too is account management, billing, etc. All of these really fall into a larger topic of management outside the Darkstar stack itself. This is an important problem, but not one that we're trying to solve; third parties and existing systems already do this really well. So, we wanted a simple point for connecting with the management infrastructure without having to actually address the implementation since we were (well, still are) focusing on the core of our stack. We also wanted something flexible, so that any system could be used but would still work well with a Darkstar application.

Since I'm the "security guy" on the team, I took a crack at this problem, and what we ended up with is modeled on some of the more successful and useful systems I've played with. The idea is to have an abstract identifier for a given user which tags all the work they're doing, but separates out the enforcement, policy, and identity/account management from the application code. This makes it easy to plugin different management systems, and swap different authentication mechanisms, while application developers don't have to do anything different or adapt their code to the underlying infrastructure. More importantly, it provides a clear separation point for stuff that's part of the Darkstar stack and services that need to be separate. More on both these points in a little bit.

At the same time that we were thinking about these issues, we were thinking about a connected issue: the protocol between the client and the server. If you've looked at the current protocol at all, you know that we were trying to create something extremely simple so that any language or platform could be supported on the client-side. We also knew that we couldn't get all the right features into the protocol on our first try, and that different systems would require different protocol features, so we planned to support the same notion of a pluggable protocol stack that the EA release had. That is, you can swap in any protocol implementation you like, depending on what you need carried between the client and server. With this as a goal, we punted on any kind of flexible authentication mechanism in the default protocol and supplied space for a username and password only. Of course, we still haven't implemented the flexible protocol interfaces yet, but that's in the plan.

So, to support authentication today, the Darkstar stack has an IdentityAuthenticator interface. It gets called when a client connects to the system, and is provided an instance of IdentityCredentials. If authentication is successful, the authenticator returns an instance of Identity used to represent the connected client. That's it. This is the point that (in theory) ties together all the details I've been hand-waving about to this point. [Note that because there's only one possible protocol right now, there's only one possible type of credentials (name-password credentials), but when new protocols can be plugged in they could be designed to provide alternate types of credentials.]

The idea here is that, for a given infrastructure, an authenticator can be implemented. Swapping in or adding different authenticators (you can actually have any number of authenticators, ordered based on how you want to prioritize authentication mechanisms) won't affect the application code. Because these authenticators are not part of the application, they aren't invoked in a transaction. This means you can't get at the AppContext and its associated Managers, but you can interact with any services outside the system. This was an intentional design, since most developers I've talked with want to store things like passwords, account details, etc. in something like a MySQL instance that they can manage separately from the Darkstar datastore. Really, the datastore management is an application issue, so if you want to have account detail there, that is something that should be done in an application-specific way within your application code (or at the Service-level, as discussed in a bit).
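
To give a flavor of this, here's a sketch of a custom authenticator (the interface methods are from the current API, but the credential-type string and the external lookup are assumptions on my part, so check the javadocs and substitute your own account-database logic):

  import com.sun.sgs.auth.Identity;
  import com.sun.sgs.auth.IdentityAuthenticator;
  import com.sun.sgs.auth.IdentityCredentials;
  import com.sun.sgs.impl.auth.IdentityImpl;
  import com.sun.sgs.impl.auth.NamePasswordCredentials;
  import java.util.Properties;
  import javax.security.auth.login.LoginException;

  public class ExternalDbAuthenticator implements IdentityAuthenticator {
      public ExternalDbAuthenticator(Properties properties) {
          // non-transactional setup: e.g., create a connection pool
          // to your external account database (details omitted)
      }
      public String[] getSupportedCredentialTypes() {
          // the exact type string is defined by the credentials class
          return new String[] { "NameAndPasswordCredentials" };
      }
      public Identity authenticateIdentity(IdentityCredentials credentials)
          throws LoginException
      {
          NamePasswordCredentials npc = (NamePasswordCredentials) credentials;
          // hypothetical call out to MySQL, LDAP, or whatever you run
          if (! checkAccountDatabase(npc.getName(), npc.getPassword()))
              throw new LoginException("invalid login for " + npc.getName());
          return new IdentityImpl(npc.getName());
      }
      private boolean checkAccountDatabase(String name, char[] password) {
          return true;  // stand-in for the real external query
      }
  }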

Among other things, this makes it easy to think about having different kinds of clients that are authenticated in different ways. For instance, an admin tool that is really a Darkstar client could connect and be authenticated via a completely separate database or set of credentials. Conversely, users who can login to multiple games can easily be authenticated against a single set of credentials. You get a lot of flexibility here.

Ok. We all good so far? To re-cap, there's a simple notion of an authenticator that gets called when a client tries to login. It's given some credentials, and asked to validate them. This is the connection point to external management of identity, account detail, etc. This is abstracted from the details of the application and the protocol, although the protocol implementation dictates what kinds of credentials will be provided. If the client is successfully authenticated, then an Identity is provided that represents the client.

This is the point where some confusion tends to arise. What is this Identity, and who is supposed to use it? Why is it hidden from the application, and how can it be used to share your external data with the game itself?

Abstractly, we use notions of Identity throughout the system. All tasks and transactions are associated with an owning identity. For instance, when a message arrives from a client, the transaction run to handle that message is owned by the identity associated with the client that sent the message. We can look at the identity to make scheduling decisions, to order events and messages, and to provide detailed per-client profiling data. In the multi-node system, we use Identity as the key for doing load-balancing and clustering, assigning each Identity a home node, and then trying to do all work for that Identity on that node. The application itself has an Identity as well, which is what owns tasks that aren't running on behalf of anyone specific. So, this Identity abstraction, while pretty simple, is a key concept throughout the stack.

The Identity implementation serves another purpose, however, and that's to be a link back to the infrastructure that created it. The idea is that if you're managing (for instance) account details about a given player outside the stack, you can use the Identity as a way to get back to that detail, either by including that detail directly in the Identity instance, or by implementing the Identity returned from your authenticator with accessor methods that call out to your external system.
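
For instance, something like this sketch, where AccountRecord and AccountStore stand in for whatever your external infrastructure provides, and I'm assuming Identity's three methods (getName plus the two login-notification callbacks):

  public class AccountIdentity implements Identity {
      private final String name;
      public AccountIdentity(String name) { this.name = name; }
      public String getName() { return name; }
      public void notifyLoggedIn() { /* e.g., mark the account online */ }
      public void notifyLoggedOut() { /* e.g., mark the account offline */ }
      // the extra hook: get back to the external account system
      public AccountRecord getAccountRecord() {
          return AccountStore.lookup(name);   // hypothetical external call
      }
  }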

In order to keep all these things abstract from the application code, and to make sure that you don't do things like call an external service directly from your application code, the Identity itself is hidden from the application. If you want to access any custom features, this should be done via a Service (which can see the Identity) and a Manager (which is the bridge between your application code and the Service). This also acts as another abstraction point, so that once again the actual Identity implementation can change without affecting the Manager interface that the application code sees. In rough outline, the bridge looks something like the sketch below.
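
All of the Account* names here are hypothetical; note also that getting from a ClientSession back to an Identity inside the Service is exactly the gap I come back to at the end of this post:

  // the application code sees only the Manager...
  public class AccountManager {
      private final AccountService service;   // hypothetical Service
      public AccountManager(AccountService service) {
          this.service = service;
      }
      public String getAccountDetail(ClientSession session, String key) {
          // ...while the Service, which can see Identity, does the real
          // work of talking to the external infrastructure
          return service.getDetail(session, key);
      }
  }

Of course, we don't want everyone who writes an application to have to write this kind of detailed code, so why does this design make sense? Well, this finally brings me back to the title of this entry.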

When we designed all of this, we never intended that application developers would have to write any authentication code. I mean, obviously in the short-term everyone will have to, 'cause you need something that verifies clients. But in the long-term, it's our hope that some number of third-party developers and community groups will build authentication and account management software for Darkstar. This can be developed completely separately from any specific game implementation. It would mean that game developers can focus on their application logic, and then just use whatever authentication mechanism they want. The infrastructure developers, in turn, can write detailed connections between the stack and their infrastructure, and decide what to expose to game developers and how.

The catch here, of course, is that this is something of a boot-strapping problem. Until we have serious games, there won't be too much development on tools. And until there are good tools, it's hard to write good games. So, for the moment, there's some serious pain for the bleeding-edge game developers who are trying to actually use this component of our system. Sorry about that. I've worked with a lot of security systems, and I know how maddening it can be to do development without the kind of support they require. All I can say is that we're moving as quickly as we can. In the meantime, I'm more than happy to help support y'all in your development efforts, and of course if anyone wants to propose some community projects for generalized management in this space, I'd be glad to act as a contributor and advisor!

Is that the end of the story? Well, not really. If you've been paying attention, you may have noticed that I glossed over one key issue. Every task in our system has an associated Identity, and this is how we do load-balancing, resource management, etc. This makes sense for clients, and for the tasks that the application itself has to run (initialization, maintenance, etc.). It does not, however, address things like NPCs, which are also "identities" in the system, just without a human (usually) driving them. Why do they get short shrift here?

The answer is that they shouldn't; we just haven't gotten organized on this point yet. Ultimately, we want to expose some mechanism to the application code for defining new entities in the system that should have their own (little-i) identity. We don't know exactly what this will look like, and a lot of it has to do with how we handle movement through the cluster, something we're still learning as we go. When we address this issue, it may result in more explicit notions of Identity being exposed to the application, but nothing is firm here. If you have thoughts, please let us know!

One quick aside. Several folks have asked why there's no API to get the Identity associated with a given ClientSession at the Service level. This is a totally valid question. There probably should be some way to do this, and I expect that we'll add it in some form as we make some of the changes discussed above.

Whew. Sorry for the long caffeine-powered discussion, but given all the recent questions I figured it would be useful to try laying out all these issues in one place. I hope this helps provide some insight into what's going on, and where we're heading. I would definitely like to hear questions and suggestions about all of this, especially if you've been implementing against these APIs and have any good/bad/other experiences to report. The discussions have been great thus far, and very useful; please keep it going!

Friday May 16, 2008

performance analysis (1 of n)

One of the questions I get a lot about Darkstar has to do with performance. Let me re-phrase that. One class of questions that I get a lot has to do with performance. These usually are of the form "how many clients can be connected to a server" or "how large can a zone be" or "what kind of hardware is required for a game of size X". A question of the last form was recently asked on the Darkstar discussion forums.

One of the things that I spend a lot of my time thinking about is performance. It's not easy to observe and profile a system like Darkstar, and it's even harder to come up with quantifiable answers to these kinds of questions. A large part of this is because the performance of the system has a lot to do with the game that is running. If you've got a game that basically just does communication and persists little data, this will obviously behave very differently than one that does a lot of server-side manipulation of data. I end up spending a reasonable amount of time every week just looking at running applications, trying to understand what we can learn from these games and also how we can use this to improve the overall performance and features of Darkstar.

Given all this, I tend not to talk a lot about specific benchmarks, or quote specific numbers in answer to the questions above. It's not that I'm unwilling to talk about the system's performance; Darkstar is open source, and anyone can run it and see for themselves how it behaves. No, it's just that it's really hard to come up with general truisms, and I'd rather cite meaningful data. With regard to the forum post above, however, there was a question about specific hardware. After a short disclaimer that you really can't make general statements about how many players will fit on a given system, I made a somewhat flip comment about a specific Sun product that we've been using for testing. I say "somewhat" because I've played with this hardware enough to know that it's an excellent target system for the size of game that was being asked about. I say "flip" because I thought it would be obvious that I was joking around a little (note the smiley). I was very intrigued by the flurry of responses.

First off, I think a lot of the questions being asked are exactly right. You need to know what the tests are that are being used to come up with the numbers of clients that a piece of hardware can support. You need to understand what kind of behavior is being stressed or simulated, and why that looks at all like a real application. You need to know whether this is being run in different scenarios, with different kinds of networks, different client behaviors, etc. Without these kinds of data-points, metrics are pretty useless.

What tests have I been running? Unfortunately, the core team hasn't had a lot of time to write full applications, so the tests are still somewhat limited. One test we're running comes from Project Wonderland, which is building a set of automated stress tests. You can download these directly from their Subversion tree. Another source of data is the simple tests that we've got in the Darkstar test and example directories. We have a few test apps that we're working to clean up and get posted. We also have a few test cases that I can't share right now (sorry). I will take it as a task to get a wiki page going with performance results so some of the specific numbers we're seeing can be collected.

Second, while it's important to base metrics on understood test cases, I think it's also important not to get so caught up in the minutiae that systems become less usable. A lot of people come to Darkstar and want to get a rough sense of what they should expect. Is it reasonable to expect that an average game can host 200 clients on a server? If so, what kind of server? No, it's not very scientific, but answering these kinds of baseline questions helps people who are new to the technology and the model, and may get them interested enough to start learning more about what those answers actually mean.

This is what I was trying to do in the forums: engage a newcomer to the community, with the caveat that the answer doesn't mean very much. I think it's important to set the coarse-grained expectations, even if that doesn't match my instincts as a systems hacker. I also know from experience profiling many applications that the expectation I was setting is pretty reasonable. Besides trying to set some expectations, I did something else: I asked for more details about the application. This is really important, because the whole community really needs to hear more about how Darkstar is being used, and how well it's performing for any given design. I was pretty disappointed that no one picked up on this point, especially because everyone chiming in on that thread has at least one game that they could profile today.

So, finally, let me issue a challenge back to the community. Yes, the core Darkstar development team can and arguably should be posting benchmarks and profiling experience. However, anyone else in the community can be doing the same thing. I want to hear about your applications. I want to hear what your design looks like, and what assumptions you're making. I want to see detailed profiling results that illustrate what you're experiencing (I'll make it a priority to post to my blog soon about the profiling system in Darkstar). The questions that get asked in the forums are often very good, but I'd like to see more data get pooled. If I set up a wiki (or something) for collecting these experiences, who will step up and help people understand how Darkstar is actually working?

Friday Mar 21, 2008

transactions and timeout

Lately I've been spending a lot of time talking about (justifying?) the model that Darkstar uses for transaction timeout. Specifically, there are a lot of questions about why transactions need to be so short. I want to lay out a little of the reasoning behind this.

First, however, I want to talk about what transactions are. I realize that last week I wrote about scheduling without really going into depth on what a transaction really is. Essentially, a transaction represents a task where some collective work is done; at the end, either all of the work is done or none of it is. If you're familiar with databases, or with the cool stuff that happens in the Transactional Memory community, you know that it's common to think about transactions around data, probably for two reasons. First, it's nice to know that if you're updating two related values, there's no chance of changing one without changing the other. Second, you often want to change one value based on an observed value, and know that the observed value hasn't changed. These are the kinds of guarantees that you get out of ACID semantics.

Darkstar involves more than just data in its guarantees about transactions (although the data part is obviously critical). For instance, any messages you want to send or any tasks you schedule for future execution are delayed until the transaction commits. To say that a transaction commits means that the block of work is completed successfully; any updates can be made globally visible, and nothing observed has changed during the lifetime of the transaction. If anything has changed, then the transaction is aborted, and tried again as if it never happened.

When a Darkstar application processes an event (like a message arriving from a client or a timed event starting) it does so in the context of a transaction. This is done to help the developer. It makes it much harder to make common mistakes around data consistency and integrity. It also means that these tasks are bounded in the amount of time they're allowed to take. By default, they're not allowed to run for more than 100 milliseconds, but in practice a task that takes even 10-20 milliseconds is running long.

Why the short window for processing these tasks? There are many reasons, but probably two that are most important. First, all these transactions are executed by a pool of threads. A task is chosen, run, and then when the task is finished the next is chosen. These threads represent a limited resource, and so you want to share them as effectively as possible. There are many ways to do this, but since Darkstar is a latency-driven system, we have opted for a model where there are many short tasks so that we can respond to requests as quickly as possible, and minimize the overall jitter and delay. If these tasks start taking a long time, it will be harder to respond to all clients in a timely manner, which strongly affects the quality of any game.

Second, and perhaps more importantly, the longer a transaction runs, the more things it's likely to interact with, and the more likely it is (even if a given transaction only touches a few objects in the data store) that another transaction will update some state being used by the first. Remember, a transaction is an all-or-nothing mechanism, so when there's any conflict only one transaction can proceed. If a transaction is very short, it's much more likely to complete, or at least it will take much less time to abort and try again. The longer a transaction runs, the more likely there will be contention, and therefore wasted effort.

When a transactional system decides how to handle locking and contention, there are many strategies to take. You can be pessimistic and try locking each object as needed, trying to rush through your work before anyone else causes conflict. You can be optimistic and wait until commit time to see what happened to the rest of the system while you were working. You can version objects and try to detect compatible changes. There are many other strategies in between, but all deal with observing how transactions interact, and all are optimized in part based on some explicit notions of how long transactions are likely to run. If a transaction runs beyond its allowed window it won't get to commit, and will have to try again.

Can you change the timeout for transactions in Darkstar? Well, yes, you can. Should you? No. It's not just about having some number set, but about all of the reasons why we set the bar where we do, and what the implications are throughout the system. I know that it's not always easy to do all of your work in one transaction, and programming with continuations isn't always clean. Still, there's a reason we model the system the way we do. Really. No, really.
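
If you're bumping into the timeout, the usual answer isn't a bigger number but smaller pieces: do a bounded chunk of work per task, and re-schedule yourself for the rest. A sketch, where WorldBuilder and its region count are made up:

  public static class BuildWorldTask implements Task, Serializable {
      private static final long serialVersionUID = 1L;
      private int nextRegion = 0;
      public void run() {
          WorldBuilder.buildRegion(nextRegion++);   // one bounded chunk
          if (nextRegion < WorldBuilder.REGION_COUNT)
              AppContext.getTaskManager().scheduleTask(this);
      }
  }

Each run() call is its own short transaction, so no single task has to fit all the work into one window.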

The one exception is the initialization task. This is the first task run for an application. We let this task run as long as you'd like. Why? Because it's safe to do so. Because nothing else should be running yet, and so there shouldn't be any conflict. Because until this transaction commits, we don't have to worry about sharing resources or being responsive to clients. Because you need some place to initialize your world, and we want to make some things a little easier.
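
That first task is your AppListener's initialize() call, so this is where heavyweight setup belongs. A skeletal example (MyClientListener is a hypothetical listener class):

  public class MyApp implements AppListener, Serializable {
      private static final long serialVersionUID = 1L;
      public void initialize(Properties props) {
          // runs once, in the one transaction with no timeout: build the
          // world, create name bindings, schedule recurring tasks, etc.
      }
      public ClientSessionListener loggedIn(ClientSession session) {
          return new MyClientListener(session);
      }
  }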

What's your take on this? Do you like the trade-off of short events for the power of transactions? How would you model something like this?

Sunday Mar 16, 2008

scheduling transactions

Sigh. After finally getting re-started on my blog, I seem to have let two weeks elapse. This after I promised everyone (and myself) that I'd be better about posting. Oh well.

What kept me away? Lots of things, which is good because it's a sign that we're pretty active here getting the Darkstar technology ramped up. In particular, however, I've been busy working on some core features for the 1.0 release. I promised that I'd talk about some of this work, so I thought I'd spend a little time this week letting folks know what's been keeping me so busy.

At the core of each Darkstar node is a scheduler. For those of you familiar with operating systems, routers, or similar systems, the notion of a scheduler is nothing new. For those of you who haven't done much systems work, suffice it to say that a scheduler is (roughly speaking) a mechanism for resource management. You have tasks that need to get done, and the scheduler is the component that decides how to order those tasks. Generally speaking, the scheduling policy is based on which resources are scarce or valuable: CPU cycles, network bandwidth, available IO ports, etc. Any given system has different constraints and different priorities that help define how you order outstanding requests for pending tasks. Did I mention transactions? Yeah. That makes things more interesting. Because most of the tasks running through our system are transactional, a given scheduled task may conflict with other tasks, or may need to be aborted and re-tried. It also means that tasks that didn't have any explicit notion of ordering may suddenly have some dependency based on which objects can be modified or other factors. Fun, huh?

What I've been working on recently is this scheduling problem. This is not a new area of research, although the kinds of factors in our system make it unusual (if you're active in the field of transactional memory you may have seen some of these discussions). Essentially, what I care about is how to accept tasks that we're going to run, and then how to decide when to run those tasks. Based on transaction conflict we may want to re-order those tasks, and based on how tasks fail we probably want to re-try them in some intelligent manner. I've also been thinking about how to make it easy for the other components of our system (e.g., Services and other transactional components) to take advantage of this infrastructure. Lots of interesting problems.

In the last few months we've changed some properties of our system. Anyone who has tried to write a Service has probably seen some of the superficial changes in our interfaces. What you haven't seen are the more interesting and complex changes in our core. Last Friday I committed a lot of changes to our internal source tree. Essentially, what I was working on was updates to our scheduler, transaction, and dependency model. I've introduced two schedulers (one that handles transactions and one that doesn't), added dependency to the transactional scheduler, and simplified how transactions get setup and run in our system. Doing this let me remove a number of classes, which makes me pretty happy.

More important than just cleaning up the code (which is always a good reason for significant updates), this work let me re-factor our re-try logic into one place, co-located with how we do scheduling. Why is this important? When a transaction fails, there may be many causes. Before we re-try that transaction, we really want to figure out why it failed in the first place and whether that introduces any dependencies in the set of scheduled tasks. If you've been looking closely at the current source release, you'll see that we don't do much in this space; when you look at our next source release, you'll see that this will change.
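
To give a flavor of what I mean, here's the rough shape of the problem in purely conceptual pseudocode (all the names are hypothetical, and this is emphatically not our actual scheduler code):

  void runTransactionally(Runnable work) {
      while (true) {
          Transaction txn = begin();
          try {
              work.run();
              txn.commit();
              return;   // success
          } catch (ConflictException e) {
              txn.abort();
              // this is where retry policy lives: re-try immediately,
              // back off, or re-order behind the task we conflicted with
          }
      }
  }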

One of the things that I'm exploring now is exactly how we re-try failed transactions. This can have a significant impact on contention in the system, which in turn affects how many tasks succeed, and how many times we have to hit the data store. On the whole, it's all pretty interesting. When we release the code I've talked about here (which I hope will happen within the next month or so) I'll talk more about the specifics of this behavior. Until then, I think it'll probably be confusing, but feel free to mail me if you're curious and I'll be happy to tell you more. For now, suffice it to say that there's a lot of interesting stuff going on, and next week I'll try to continue this discussion by talking about the transaction model and some of the details around contention.

What interests you most about our model? What would you like to learn more about? Let me know!
