Saturday Feb 27, 2010

Vector Textures

Monday Feb 08, 2010

Digital Clock

Friday Feb 05, 2010


Thursday Jul 23, 2009


To conserve memory and minimize processor cache misses and GPU state changes, we need reasonable ways of "instancing" scene graph elements, meaning reusing the same objects in cases where the same element conceptually appears multiple times in the same scene.

It's possible to implement a transform and bounding volume hierarchy which supports instancing, however the overhead is significant, both in terms of its impact on processing costs and memory-use, as well as on ease-of-use for many use cases. If you introduce instancing at this level, every node potentially has many parents (and further ancestors) - which implies a node also has as many world space transforms and world space bounding volumes as the product of all its ancestors.

To avoid these costs but still retain some of the benefits of instancing, one approach is to make the instancing purely visual. In this case, a Node which appears multiple times in the same scene still may have at most one actual parent, and thus only a single world transform and bounding volume. A special auxiliary node type is provided which itself is part of the transform hierarchy and bounding volume hierarchy of the scene, however any nodes it contains remain disconnected from that, i.e it is not their parent. When this auxiliary node is rendered, it cancels the world transform of the actual parent (if any) of its child nodes and substitutes its own in place of that. In addition, it arranges for its children to derive their render states from its own rather than those of their parents.

In our system, this auxiliary node is called a "Lens" and has the following partial definition:

public class Lens extends Node {

    public var view: Node;


Example - Bubblemark

It's not a very interesting example, but it is (hopefully) somewhat familiar. In "Bubblemark" the same vector "bubble" graphic appears N times, each with a different translation. It would be extraordinarily wasteful to actually create N duplicate Bubble objects (which are quite heavyweight). Instead, we can use N Lens objects (which are lightweight by comparison) each viewing a single Bubble, and translate the lenses instead of the bubble. Here's the relevant code:

Stage {
    scene: Scene {
       def bubble = Bubble {}
       Group {
            content: bind lazy
                for (ball in model.balls)
                    Lens {
                        x: bind lazy ball.x;
                        y: bind lazy ball.y;
                        view: bubble;

With this formulation of instancing I no longer have transform and bounding volume information about the "instanced" bubbles, however in this case I didn't need that since it's already present in the Bubblemark application's "model".

Performance of "Bubblemark" in our system seems good compared to others:

By comparison, the vector Flex example on has roughly half the frame-rate, and more than double the CPU usage, even while rendering far fewer pixels. Of course, these discrepancies aren't just about instancing, but more so about using the GPU rather than the CPU for rasterization.

As a second example, here's a quickly thrown together example of a 3D chart, which was produced using one cylinder instance and one rectangle instance:

Saturday Jul 18, 2009

Lazy binding and functional programming

Functional programming techniques can be used in conjunction with lazy binding to rather easily and compactly express the complex multi-valued dependencies we require.

As an example, let's consider the humble HBox, a node which performs a simple horizontal layout of the nodes it contains:

public class HBox extends CustomNode {
    public var content: Node[];
    public var spacing: Number;

    bound lazy function layout(nodes:Node[], x:Number):Node[] {
        if (nodes == [])
        then []
        else {
           def theNextToLayout = nodes[0];
           def theRestToLayout = nodes[1..];
           [Group {content: theNextToLayout, x: bind lazy x},
            layout(theRestToLayout, x + spacing + theNextToLayout.bounds.width)]

    override var internalContent = Group {
         content: bind lazy layout(content, 0);


In our system CustomNode is defined like this:

public abstract class CustomNode extends Node {

    protected var internalContent: Node on replace { internalContent.parentNode = this };

    override var contentBounds = bind lazy internalContent.bounds;

Concrete subclasses of CustomNode are required to override its protected internalContent variable for their specific content, and CustomNode's contentBounds is defined as the bounds of its internal content.

HBox declares two public variables, "content", which is the sequence of nodes it will contain, and "spacing" which defines an additional uniform distance used to separate them. It overrides its inherited "internalContent" variable to consist of a Group containing a sequence of auxiliary nodes used to perform the layout, which are are set up in the recursive lazy bound function "layout".

The location on the x axis of a given child of HBox depends on the spacing and on the combined widths of all the nodes that precede it. All of these values can be directly or indirectly animated.

The layout function takes two parameters, a list of nodes on which to perform the layout, and an accumulated displacement along the x axis. Each time it's called, it wraps the first node in the list in a Group whose x coordinate is bound to the accumulated displacement, and then (lazily) calls itself with the remainder of the list, adding the spacing and the width of that node to the accumulator (note: because this is a bound function this expression is also bound). The layout group of the first node is then concatenated with the result of laying out the rest and returned. Note that calling a lazy bound function is quite unlike calling an unbound JavaFX function or a Java method. Basically all it does at the time of the call is build an unevaluated dependency tree.

The end result of this is that I can insert/delete/replace nodes in HBox.content, animate their transforms or other characteristics which affect their ultimate width, or animate the spacing, and the layout will be correctly recomputed - but lazily only when required.

Here's a simple example, which creates an HBox containing a variable number of spheres. The scale of the contained spheres is animated, as well as the count and the spacing.

def SPHERE_N = 30;
var count: Integer = 10;
var spacing: Number = 1;
var s: Number = 1.0;
var color: Color;

def shader = FixedFunctionShader {
    diffuse: bind lazy color;

def t = Timeline {
    [KeyFrame {
       time: 0s;
       [color => BLUE,
        s => 1.0,
        spacing => 1.0,
        count => 10]
    KeyFrame {
        time: 5s;
        [color => RED tween LINEAR,
         s => 3.0 tween LINEAR,
         spacing => 5.0 tween LINEAR,
         count => SPHERE_N tween LINEAR]
    autoplay: true;
    autoReverse: true;
    repeatCount: Timeline.INDEFINITE;

def spheres = bind lazy for (i in [1..SPHERE_N]) {
    Sphere {
        radius: 1;
        transform: bind lazy scale(s, s, s);
        shader: shader;

Stage {

   scene: Scene {
            HBox {
                spacing: bind lazy spacing;
                content: bind lazy spheres[0..<count];

This approach eliminates the need for any special procedural "layout" pass or protocol. Accessing any variable which depends on the layout (for example the bounds of the HBox itself, the parent transform or world transform of any of its contained nodes, etc), will implicitly evaluate the bindings woven together in our "layout" function as required, thus effecting the layout.

This technique is quite general, and is used throughout, for example in the case of aim, parent, orient, and point transform constraints. Here's part of the implementation of "point constraint". A point (or location) constraint is an operation which positions a node based on the weighted locations of a set of other nodes. For example, "position node A halfway between node B and node C".

 public class Constraint {
     public var node: Node;
     public var weight: Number = 1.0;


 bound lazy function pointConstraint(constraints:Constraint[], translation:Vec3, weight:Number):Vec3 {
        if (constraints == []) {
           if (weight == 0) then Vec3.ZERO else translation / weight;
        } else {
            def c = constraints[0];
            def cs = constraints[1..];
            def location = c.node.worldTransform.getTranslation();
                            translation + location \* c.weight,
                            weight + c.weight)

 bound lazy function pointConstraint(constraints:Constraint[]):Vec3 {
     pointConstraint(constraints, Vec3.ZER0, 0);

Here's an example of the how the above mentioned constraint might be expressed:

  var B:Node = ...;
  var C:Node = ...;
  def c1 = Constraint { node: B, weight 0.5 };
  def c2 = Constraint { node: C, weight: 0.5 }; 
  def transform = bind lazy translate(pointConstraint([c1, c2]));
  def A = Cube { transform: bind lazy transform, ... };
Note that the locations of the nodes and/or the weights associated with the constraints may be animated or otherwise change. The lazy bound recursive function "pointConstraint" above receives two accumulated values as it iterates the list of constraints, the accumulated translation and the accumulated weight. When the end of the list is reached the result is produced by dividing the total displacement (translation) by the total weight.

Here's a very simple test case of a point constraint imported from Maya. The red sphere is point constrained to the 4 yellow cubes. The test animates the position of the sphere around and among the cubes by simply animating the weights of the 4 constraints.

Wednesday Jul 15, 2009

The calm in the eye of the binding storm

Here's my attempt to write up a description of the benefits of lazy binding with respect to the construction and composition of 2d shapes. Don't be too surprised by the naming discrepancies compared to analagous classes from javafx 1.x. For the record, this project was no fork, rather its original version predated the release of javafx 1.0 by over a year, in fact it began prior to the start of any actual development of the javafx 1.0 runtime, aka "Reprise" - but that's another story. Anyway, with our requirements it wasn't technically feasible to maintain full compatibility with what was subsequently done in javafx 1.x, so no serious attempt to fully match the naming was ever attempted. Nevertheless, regardless of the naming or specific implementation details, there's no reason the general technique I'll describe here can't eventually make its way in some form into a future version of javafx, so here you go:

2D shapes extend an abstract base class class Shape2D, whose definition is essentially:

public abstract class Shape2D extends CustomNode, AbstractPathElement {
    public var strokeWidth:Number = 1.0;
    public var strokeLineJoin: StrokeLineJoin = StrokeLineJoin.MITER;
    public var strokeLineCap: StrokeLineCap = StrokeLineCap.SQUARE;
    public var strokeMiterLimit: Number = 10.0;
    public var strokeDashArray: Number[] = [];
    public var strokeDashOffset: Number = 0.0;
    public var fillRule: FillRule = FillRule.EVEN_ODD;
    public var fill: Paint;
    public var stroke: Paint;
    public var extrude: Number = 0;

    // Implementation details
    protected var theShape: com.sun.javafx.geom.Shape;

    protected var transformedShape: com.sun.javafx.geom.Shape =
        bind lazy applyTransform(localTransform, theShape);

    override var pathCommand = bind lazy getPathCommand(transformedShape);

    function getPathCommand(shape:com.sun.javafx.geom.Shape) {
        return function(path:Path2D, gp:com.sun.javafx.geom.Path2D) {
            if (shape != null) {
                gp.append(shape, false);

function applyTransform(t:Mat4, s:com.sun.javafx.geom.Shape):com.sun.javafx.geom.Shape
    if (t == Mat4.IDENTITY) {
	return s;
    var a = new com.sun.javafx.geom.AffineTransform(t.affine2d());
    return a.createTransformedShape(s);

I'll describe AbstractPathElement shortly.

The Java package com.sun.javafx.geom.\*, which is part of the implementation, is actually nothing more than a portable version of the familiar java.awt.geom.\* package, exorcised of its external AWT dependencies.

Concrete 2d shape classes extend Shape2D and override its "theShape" variable with an appropriate lazy binding which expresses the dependencies specific to that shape, for example Rectangle:

public class Rectangle extends Shape2D {

    public var height: Number;
    public var width: Number;
    public var arcHeight: Number;
    public var arcWidth: Number;

    override var theShape = bind lazy makeRect(width, height, arcWidth, arcHeight);

    function makeRect(width: Number, height: Number, arcWidth: Number, arcHeight:Number) {
        def x = width / 2;
	def y = height / 2;
        if (arcHeight == 0 and arcWidth == 0) {
            return new com.sun.javafx.geom.Rectangle2D.Float(-x, -y, width, height);
        return new com.sun.javafx.geom.RoundRectangle2D.Float(-x, -y, width, height, arcWidth, arcHeight);

Shape2D itself will ultimately (but lazily) produce 1 or 2 mesh nodes as its (internal) children (one for the fill shape and one for the stroke shape if present), taking into account theShape and all the other dependencies involved, of which there are many: stroke, fill, strokeLineCap/Join/DashArray, etc. Higher level lazy bindings in Shape2D's implementation (omitted here) express these dependencies. Its own contentBounds is also (lazily) computed from the bounds of theShape (or the stroke shape derived from that).

2d shapes may also be produced with constructive geometry operations or 2d path commands. The Area class and Path2D class represent these use cases respectively. These classes operate on the potentially transformed version of theShape - which is therefore an additional lazily bound variable of Shape2D - transformedShape.

Area composes other shapes with union/intersection/subtraction operations and is itself an instance of Shape2D. Its definition is quite simple and similar to that of Rectangle above in terms of how it uses lazy binding:

public class Area extends Shape2D {

    public var add: Shape2D[];
    public var remove: Shape2D[];
    public var intersect: Shape2D[];

    override var theShape = bind lazy makeArea(for (x in add) x.transformedShape,
                                               for (x in remove) x.transformedShape,
                                               for (x in intersect) x.transformedShape);

    function makeArea(add: com.sun.javafx.geom.Shape[],
                      remove: com.sun.javafx.geom.Shape[],
                      intersect: com.sun.javafx.geom.Shape[]) {
        var a0:com.sun.javafx.geom.Area;
        if (sizeof add > 0) {
            a0 = new com.sun.javafx.geom.Area();
            for (x in add) {
                a0.add(new com.sun.javafx.geom.Area(x));
        } else {
            return null;
        for (x in remove) {
            a0.subtract(new com.sun.javafx.geom.Area(x));
        for (x in intersect) {
            a0.intersect(new com.sun.javafx.geom.Area(x));
        def path2d = new com.sun.javafx.geom.Path2D.Float(a0); // reduce memory by converting to path
	return path2d;

Path2D objects are constructed from a sequence of commands which insert lines, elliptical arcs, quadratic curves, cubic curves, or other shapes.

AbstractPathElement is an interface which represents those things that may be inserted into a Path2D.

public mixin class AbstractPathElement {
    protected var pathCommand: function(path:Path2D, gp:com.sun.javafx.geom.Path2D):Void;

Command implementations such as Shape2D, LineTo, CurveTo, QuadTo, ArcTo, etc, extend AbstractPathElement and provide a lazy binding for its "pathCommand" variable. This is a function type variable which receives the path node and a path geometry object as input, and which must invoke appropriate operations on the geometry object and potentially update state in the path node in order to effect inserting itself into the path. The function value must lazily depend on the parameters of the command, for example in the case of LineTo, its absolute, x and y variables:

public class LineTo extends PathElement {
    /\*\* the x coordinate of the point \*/
    public var x: Number;
    /\*\* the y coordinate of the point \*/
    public var y: Number;

    override var pathCommand = bind lazy makePathCommand(absolute, x, y);

    function makePathCommand(absolute: Boolean, x:Number, y:Number) { return addTo }

    function addTo(path:Path2D, gp:com.sun.javafx.geom.Path2D):Void {
	if (absolute) {
            path.current = new Point2D.Double(x, y);
   = path.current;
            gp.lineTo(x, y);
	} else {
            path.current = new Point2D.Double(path.current.getX() + x,
                                              path.current.getY() + y);
   = path.current;
            var pt = gp.getCurrentPoint();
            gp.lineTo(pt.getX() + x, pt.getY() + y);

The ultimate geometry produced by these path commands is again created by means of a lazy binding analogous to how this was done in Rectangle and Area above.

public class Path2D extends Shape2D {

    protected var lastMoveTo: Point2D;
    protected var center: Point2D;
    protected var current: Point2D;
    public var content: AbstractPathElement[];

    override var theShape = bind lazy makePath(for (x in content) x.pathCommand);

    function makePath(cmds:Object[]) {
        def path2d = new com.sun.javafx.geom.Path2D.Float();
	center = null;
        lastMoveTo = null;
        current = null;
        for (x in cmds) {
            def cmd = x as function(path:Path2D, path2d:com.sun.javafx.geom.Path2D):Void;
            cmd(this, path2d);
        return path2d;


The net result (and benefit) of all this lazy wiring is that I can animate or otherwise modify the transform and/or any other public variable that affects the geometry or appearance of a shape (and possibly others that depend on it) and pay no price other than (transitively) invalidating precisely the dependent variables of my modification. Note that upon a subsequent modification which affects the same variables the propagation of such invalidation is "short-circuited" when a variable already marked invalid is reached.

None of the functions embedded in the lazy bindings above (whose operations are quite costly) will actually execute until the result of the binding is required, and the result will remain cached as long as it's valid. In typical programs (assuming changes were made) this update will occur exactly once per frame, which is ideal.

For example, if not touched earlier by any user defined event handler, then when the render pass occurs and view frustum culling is performed, the world bounds of the Shape2D node will be accessed (either directly, or indirectly from an ancestor). At that point, the value of the "theShape" variable will become (indirectly) required and its binding evaluated. However, the actual mesh nodes associated with the Shape2D object will not yet be created. If the Shape2D node is within the view frustum, then at some point its children will be requested by the renderer. As a result the lazy bindings that they in turn depend on will be evaluated, using the (now cached) value of "theShape" as one of their inputs.

This approach gives us a general, yet simple, elegant, extensible, declarative and compositional system for managing the very complex dependencies we encounter in scene graph design and elsewhere, which can at the same time be efficient.

Saturday Jul 11, 2009

3D Animation and Maya

Although I'm not a world class 3d animation expert, the artist member of our little team was. His name is John Yoon, and he's an amazing guy. He went to MIT with Ken and received a degree in computer science, then worked for a number of years at Alias/Wavefront as an engineer on Maya. Later he went to UCLA, received a masters of fine arts, and subsequently worked as a technical director at Disney and Dreamworks on animated feature films.

Thanks to John's expertise both in using Maya and with its internals, and his ability to educate us, Ken and I were able to build a pretty impressive content pipeline from Maya to our system in a pretty short amount of time (about a month). Sven Goethel helped out and did the normal maps.

Although it was our intention to also support other formats at some point, for the first go-around, we decided to support importing Maya ascii format files directly.

Maya is a huge and very impressive system. We worked incrementally out of necessity. Our first test case was a red spinning cube. John guided us along, creating incrementally more sophisticated test cases, deciphering the semantics of maya nodes for us, and helping with the math on our side.

Ken did the majority of the work and did an amazing job.

We translate Maya key frame animation into JavaFX script timelines. Ken wrote interpolators for Maya's tangent-based animation curves.

In general we translate [a subset of] Maya's dependency graph into a corresponding JavaFX script dependency graph, preserving the data flow. We're able to import lights, cameras, blinn-, lambert-, and phong- shaders, normal maps, light maps, point-, parent-, aim-, and orient- constraints, skin clusters, blend shapes, 2-bone ik, radial fields, motion paths, nurbs curves, triangle meshes, point emitters, particles, transform nodes, rigid bodies and rigid constraints, and utility nodes including math and logic

Here's a list of the Maya nodes we support in whole or part.

  • mesh
  • transform
  • bump2d
  • lambert
  • reflect
  • blinn
  • phong
  • file
  • animCurve
  • animCurveTA
  • animCurveUA
  • animCurveUL
  • animCurveUT
  • animCurveUU
  • materialInfo
  • skinCluster
  • blendShape
  • joint
  • ikHandle
  • ikEffector
  • poleVector
  • rigidConstraint
  • pointConstraint
  • aimConstraint
  • constraint
  • parentConstraint
  • orientConstraint
  • pointEmitter
  • radialField
  • particle
  • unitConversion
  • multiplyDivide
  • plusMinusAverage
  • condition
  • reverse
  • choice
  • layeredTexture
  • uvChooser
  • place2dTexture
  • clamp
  • blend
  • blendColors
  • blendWeighted
  • light
  • pointLight
  • spotLight
  • directionalLight
  • groupParts
  • polyPrimitive
  • polyCube
  • polyCylinder
  • polySphere
  • motionPath
  • nurbsCurve
  • addDoubleLinear

Using animated 3d models in our system is quite easy, for example:

var model = Model {
    url: "models/";

Animated models are Nodes which are also polymorphic with Timelines. You can play them, pause them, rewind them etc. The Model object also exports various named resources to the script, such as cameras, lights, shaders, meshes, transform nodes, etc. Thus after loading a model it's possible for the script to modify or replace elements, or parent script created nodes to imported ones. For example, the model could be created with a placeholder texture used by a shader. Upon loading the model the script could lookup the shader and replace the placeholder texture with, for example, a dynamically loaded movie. Anthony has a picture of this use case on his blog

Collision Detection and Dynamics

We require a high quality engine for both collision detection and dynamics animation. In the current version we utilized the open source Bullet engine which is quite good. It should be straightfoward to plug in alternate engines, although we haven't done so. The input to the engine is the initial world transform of the body, its collision geometry, mass, friction, restitution, initial velocity, initial spin, initial orientation, initial position, and center of mass. For each body, the output of the engine per frame is a new world space transform, and a set of contact points indicating its collisions with other bodies. Our current system operates uniformly on 2d and 3d objects.

Dynamics Worlds, Colliders, and Collidables

In our system the Node class also implements the Collidable interface, which is defined as follows:

public mixin class Collidable {
    public var collider: Collider;
    public var dynamicsWorld: DynamicsWorld;

    public abstract function setCollisionWorldTransform(mat:Mat4):Void;
    public abstract function getCollisionWorldTransform():Mat4;
    public abstract function getCollisionShape(): CollisionShape;

DynamicsWorld encapsulates the physics engine. There can be multiple such per scene. CollisionShape's are subset of general geometry which can be managed efficiently by the physics engine: convex hulls, boxes, spheres, etc. Dependending on their type, various Node subclasses override the getCollisionShape() function to return such appropriately. The other two functions provide the callbacks for the physics engine to obtain the body's current world space transform, and to subsequently assign a new one if the body is dynamic. Collider objects encapsulate the above mentioned inputs, provide dynamics operations such as applying forces, optionally record collisions and contact points, and provide user callbacks as collisions are entered and exited. Colliders may be dynamic or static. If static they still apply forces to dynamic objects, but do not themselves respond to forces applied to them.

public class Collider {

    public var mass: Number = 0;
    public var dynamic: Boolean = false;
    public var friction: Number = 0.5;
    public var restitution: Number = 0;
    public var tracksCollisions: Boolean = false;
    public var initialVelocity: Vec3 =  = Vec3.ZERO;
    public var initialSpin: Vec3 =  = Vec3.ZERO;
    public var initialOrientation: Vec3 = Vec3.Z_AXIS;
    public var initialPosition: Vec3 = Vec3.ZERO;
    public var centerOfMass: Vec3 = Vec3.ZERO;
    public var active: Boolean = true

    public-read protected var collisions: Collision[];
    public var onCollisionEnter: function(collision: Collision):Void;
    public var onCollisionLeave: function(collision: Collision):Void;

    public var collidable: Collidable;

    public var shape: Collidable;

    public function applyForce(v:Vec3):Void
    public function applyForce(x:Number, y:Number, z:Number):Void
    public function applyTarque(v:Vec3):Void
    public function applyTorque(x:Number, y:Number, z:Number):Void
    public function setAngularFactor(t:Number):Void
    public function getLinearVelocity():Vec3
    public function getAngularVelocity():Vec3
    public function setLinearVelocity(v:Vec3):Void
    public function setLinearVelocity(x:Number, y:Number, z:Number):Void
    public function setAngularVelocity(v:Vec3):Void
    public function setAngularVelocity(x:Number, y:Number, z:Number):Void
    public function applyImpulse(v:Vec3):Void;
    public function applyImpulse(x:Number, y:Number, z:Number):Void
    public function applyTarqueImpulse(v:Vec3):Void
    public function applyTorqueImpulse(x:Number, y:Number, z:Number):Void

The collision shape used by the Collider can be overridden by the user by assigning an alternate collidable to its shape variable.

Collision and ContactPoint are defined as follows:

public class ContactPoint {
    public-init var point: Vec3;
    public-init var normal: Vec3;
    public-init var thisCollider: Collider;
    public-init var otherCollider: Collider;

public class Collision {
    public var collider: Collider;
    public var contacts: ContactPoint[];

Force Fields and Constraints

In addition to gravity fields, we support the creation of arbitrary force fields, for example radial fields, vortex fields, turbulence fields, etc. Such fields extend the ForceField class and implement their behavior by overriding the applyForceField function and applying forces to the bodies passed in. The applyForceField function is called once per frame by the dynamics world.

public abstract class ForceField {
    public var active: Boolean = true;
    public var magnitude: Number;
    public var attenuation: Number;
    public var maxDistance: Number;
    public var worldTransform: Mat4;  
    public var restrictTo: Collider[];    
    public function apply(world:DynamicsWorld):Void {...}
    public abstract function applyForceField(bodies:Collider[]):Void;

The restrictTo variable allows the user to apply the field to only a subset of the bodies currently in the dynamics world.

For example, here's a possible implementation of a radial field:

public class RadialField extends ForceField {

    override public function applyForceField(bodies:Collider[]):Void {
        var fieldLocation = worldTransform.getTranslation();
        for (body in bodies) {
            var t = body.getCollisionWorldTransform();
            var bodyLocation = t.getTranslation();
            var dir = fieldLocation - bodyLocation;
            var dist = dir.length();
            if ((dist > 0 and dist <= maxDistance)) { 
                def force = dir \* -magnitude;
Rigid bodies may be linked with constraints, for example hinges, sliders, or springs. Implementations of such constraints extend the RigidBodyConstraint class. Calling its apply() function makes the constraint active in the dynamics world associated with body1 and body2, calling its remove() function deactivates it. Activation/deactivation of constraints is typically asynchronous, as such the onApply and onRemove functions are provided as callbacks to the constraint implementation, when the constraint actually becomes active or inactive in the dynamics world.

public abstract class RigidBodyConstraint {
    public-init var body1: Collidable;
    public-init var body2: Collidable;
    public function apply():Void { ... }
    public function remove():Void;
    protected abstract function onApply():Void;
    protected abstract function onRemove():Void;

Just Collision Detection

As mentioned, one can use this system for collision detection only. Here's part of a simple 2d example:

var rect1:Node = Rectangle {
    id: "draggable"
    collider: Collider {}
    height: 200, width: 200, fill: BLUE;
    onMouseDrag: function(e) {
        rect1.x += e.dragX;
        rect1.y += e.dragY;

var rect2 = Rectangle {
    id: "key-framed"
    var color = GREEN;
    height: 100, width: 100, fill: bind lazy color;
    var c:Collider;
    collider: c = Collider {
        onCollisionEnter: function(collision) {
            color = YELLOW;
        onCollisionLeave: function(collision) {
            if (c.collisions == []) {
                color = GREEN;

var t = Timeline {
    [KeyFrame {
        time: 0s;
        values: rect2.x => -300.0 tween LINEAR;
    KeyFrame {
        time: 2s;
        values: rect2.x => 300.0 tween LINEAR;
    autoReverse: true;
    autoplay: true;
    repeatCount: INDEFINITE;

In this example a green rectangle is key-framed along the x axis. If another object comes into contact with it, it turns yellow. There's also a blue rectangle which can be dragged with the mouse along its 2d plane. The two rectangles may come into contact as a result of the key frame animation or dragging. Here's what it looks like:

Friday Jul 10, 2009

Technical aspects Round 2


Triangle mesh geometry and texture mapping

The system under discussion here is fundamentally a 3d system. To get high performance, we must design our software systems in such a way as to be able to efficiently leverage the 3d graphics hardware. At the end of the day this boils down to constructing and populating sequences of vertex attributes and textures and making them available to the graphics driver.

The rendering system has several layers. At the lowest layer is a JNI interface to the graphics driver, in the case of OpenGL, JOGL. I should mention that starting during(!) JavaOne 2008, Ken and Sven Goethel completely rewrote JOGL to support portable profiles corresponding to desktop OpenGL, OpenGL ES 1.x, and OpenGL ES 2.x.

Above this layer is a "render context" which mirrors the state of the graphics driver, and eliminates any redundant state changes requested by the application. This optimization is essential as the application itself does not typically have full visibility of such state and cannot avoid such redundancies itself (and these are typically very costly operations). Our current system uses Josh Slack's open source Ardor3D project for this layer.

Vertex attribute arrays are generated on demand in the form of native nio Buffers allocated from the C heap, or as vertex buffer objects allocated by the graphics driver. Allocation and deallocation of native nio buffers and vbo's is costly. These resources are therefore stored and reused where possible in "mesh" objects associated with scene graph nodes that describe geometry. The same meshes are used for ray/triangle intersection tests to support picking.

2D geometry is treated uniformly with 3d geometry. There are 3 strategies available for rendering 2d geometry, depending on the capabilities of the available hardware. If multisample antialiasing is available it's possible to simply flatten the 2d shape into a set of vertices and tessellate it. This approach is used for extrusion of 2d shapes, and is occasionally useful in other cases, but typically requires generating a lot of polygons to look good. NVidia's proprietary 2d path rendering extension could also be leveraged if available. It does essentially the same thing, however without the cpu and data transfer overhead. If programmable shaders are available, it's possible to tessellate the shape directly without flattening, which requires far fewer polygons, and then generate an antialiased opacity mask dynamically per frame on the GPU with a pixel shader, which is very fast. Otherwise, we simply use a pair of triangles corresponding to the bounding box of the shape as its geometry and an antialiased opacity mask must be generated on the cpu side, uploaded, texture mapped, and blended in. This final strategy is quite slow, and typically impacts the overall frame rate in non-trivial cases. To partially counteract that we rely on caching such opacity masks and avoiding regenerating them where possible - for example in the common case of a 2d only scene where a shape is translated, but not rotated or scaled, the mask remains valid.

Depth buffer, transparency, and blending For high performance, we use the blending capability of the graphics driver to manage transparency. For 2d graphics, we have two issues to deal with. First, draw order: blending is performed based on draw order not based on z-depth. Second, in a 2d scene, elements are drawn on the same plane which causes artifacts (z-fighting) if depth writing and the depth test are enabled. Nevertheless, we want to leverage the hardware depth test, and we need non-transparent 2d shapes to occlude each other. Currently, we use the "polygon offset" feature of the graphics driver to make this work.

2D Clipping It's possible to use opacity masks and blending to perform 2d clipping, however this is costly. In some cases it can be accomplished more efficiently with acceptable results using the stencil buffer. Clipping involves using the intersection of a set of 2d shapes as a mask against which we draw other objects. It turns out you can use the stencil buffer itself to perform the intersection operation. When we add a shape to the clip, we draw it in the stencil buffer with the INCR stencil operation. When we remove a shape from the clip, we draw it in the stencil buffer with the DECR stencil operation. We then draw the scene with the stencil test set to EQ(number of shapes intersected in the clip). This approach works across both 2d and 3d transforms, and you can use any drawing (not just 2d shapes) as the "clip". If multisample antialiasing is available the result of the clipping operation is also properly antialiased. If not, this approach is still useful for the very common case of rectangular clipping.

The below is actually a blue rectangle clipped by some text, with a perspective transform

Thursday Jul 09, 2009

Technical aspects

Round 1

Scene graph design

The system mentioned in my previous post utilizes JavaFX script language features to construct a lazy dependency graph, part of which is the scene graph that provides the infrastructure for rendering, picking, and collision detection. This scene graph does not use the trigger feature of JavaFX script, but rather relies on its lazy binding feature.

Thus, when you modify the scene, nothing happens other than that things that depend on your modifications are (implicitly) marked invalid. Only when you subsequently access the scene does (minimal) reevaluation occur. In addition, due to the nature of binding, results are implicitly "memo-ized" and you get fast cached access to them if their dependencies haven't changed. Ultimately when the render pass occurs, it touches everything, bringing it up to date if necessary as a side effect.

Linear Math

Animation of 2d or 3d geometry consists of animating the components of 2d or 3d affine transformations that are applied to such geometry or of deforming the geometry itself. Linear algebra provides truly elegant mechanisms to express such, however, that elegance is severely diminished in programming languages that cannot provide the expected arithmetic operators. Therefore one of the language and compiler modifications I made is to provide overloaded arithmetic operators for vector and matrix operations. For our purposes (vector graphics) we only require 2d, 3d, and 4d vectors and 2x2, 3x3, and 4x4 matrices. These types are provided in a new package called javafx.math, namely Vec2, Vec3, Vec4, Mat2, Mat3, Mat4. Since rotations may be represented as a pair of angle/axis, or quaternion form, in addition to matrix form, we also provide the types AngleAxis, and Quat. All of the types mentioned are conceptually immutable value types, like Java/JavaFX primitive types. Ideally, they should be implemented with VM level value types.

Transform and bounding volume hierarchies

With these prerequisites, implementation of the transform hierarchy is trivial (simplified, with some details omitted - local transform consists of a number of sub-components):

import javafx.math.Mat4;

public abstract class Node {

    public var parentNode: Node;
    public var transform: Mat4 = IDENTITY;
    public def worldTransform: Mat4 =
        bind lazy
        if (parentNode == null)
        then localTransform
        else parentNode.worldTransform \* localTransform;

    public def localTransform = bind lazy transform \* ...

The above simply states that a Node has an optional parent, which is also a Node, and that its local to world space transformation is that of its parent multiplied by its own local to parent transform or just its own transform if it's the root of the hierarchy.

Implementing the bounding volume hierarchy is equally trivial:

public class Bounds {

    public-init var center: Vec3 = Vec3.ZERO;
    public-init var extent: Vec3 = Vec3.ZERO;

    public def width = bind lazy extent.x \* 2;
    public def height = bind lazy extent.y \* 2;
    public def depth = bind lazy extent.z \* 2;
    public def min = bind lazy center - extent;
    public def max = bind lazy center + extent;

public function transformBounds(b:Bounds, transform:Mat4):Bounds { ... }

public abstract class Node {
    public def worldBounds:Bounds = bind lazy Bounds.transformBounds(contentBounds,
    public def bounds:Bounds = bind lazy Bounds.transformBounds(contentBounds,

    public-read protected var contentBounds:Bounds;

The above states that the world space bounding box of a Node is the bounding box of whatever it contains multiplied by its local to world transform, and similarly for its local bounds in its parent's coordinate system.

This approach has a worst case cost which is the same as one which used eager bindings, for example if I modify a node's transform and then immediately read the root node's bounds, then modify another node's transform, then read the root node's bounds, etc. However, for real programs the average case cost is fantastically less, since reading all the dependents during a sequence of modifications rarely happens in practice. With eager bindings as in javafx 1.x and in prism, you pay this price all of the time regardless. I believe the "boundsInScene" variable of the javafx 1.1 Node class was actually removed in 1.2 due to this type of overhead.

A similar lazy binding strategy is used throughout, for the shading network, 2d shape creation and composition, for complex compositions of weighted transformations such as aim, point, and orient constraints, etc, all in contrast to what we have in javafx 1.x.

I had to make some compiler modifications to make lazy binding operate correctly. In particular ensuring that it truly is lazy - for example, in the above bounding volumes are not created at all until they're used.

Wednesday Jul 08, 2009

The real slim shady

The below are a few examples from a project I've worked on for the past 1+ years - all written in javafx script (and all run at 60fps) with a modified compiler and from the ground up custom runtime written by me and a few others - one of them Ken Russell - who sadly just left Sun. We weren't able to obtain the support we needed for this project to continue, which took its toll. For those of you who may be disappointed about this - as I am - you can blame me. It was my job to be persuasive enough to gain sufficient support for our efforts, and in the end I wasn't able to do that.

Extruded text with 3d physics animation

Physics animation with hinge constraints

Vanilla 2d vector graphics

Interactive, animated 2d vector scene inside a 3d model. 60 sec screen capture. Note the screen capture does not reflect the smoothness of the actual animation.


XHTML rendered with fixed-function 3d graphics primitives (uses Flying Saucer for layout)

The interesting part in this context is that on layout the entire document is baked onto one set of vertex arrays. The minimal number of draw calls are then issued based on overdraw requirements and texture mapping, yielding high performance - in this case around 5x that of webkit

GPU rendered antialiased 2d vector graphics with perspective transform

3d blend shape animation

Normal maps imported from Maya

Tuesday Jan 13, 2009

Collada MoonBuggy in JavaFX

Collada MoonBuggy with skin animation in JavaFX.

In the scripting API for this scene graph, animated Collada models are Nodes which are also polymorphic with Timelines. You can "play" them. Relevant code:

var model = Model {
    url: "lunar_lander_tris.dae"
    repeatCount: Timeline.INDEFINITE

var zoom = 0.0;

var scene = Scene {
   background: BLACK;
   camera: PerspectiveCamera {
       heightAngle: 15
       nearDistance: 1
       farDistance: 1000
       position: bind Transform.translate(0, 0, 30+zoom);
   lights: [...]
   content: model



Sunday Jan 04, 2009

From F3 to JavaFX 1.0 - Effects

An important and impressive innovation between F3 and JavaFX is the Effects framework created by Chris Campbell.

F3 had a simple system of software pixel filters, which could be applied to any Node or group of Nodes in a scene. However, thanks to Chris, JavaFX 1.0 includes a much more complete set of effects, and a sophisticated framework that enables GPU hardware acceleration where available.

Underlying the simple declarative expression of effects at the JavaFX script level, effect implementations are described in a GPU-shader-like procedural language, which Chris created, called JSL. Chris's JSL compiler then compiles to various targets, either GPU-based (GLSL/HLSL), or CPU-based (Java/Native).

Saturday Dec 27, 2008

Performance matters, but for whom?

Alexander A. Stepanov (inventor of C++ STL):
Computers that were able to deal just with numbers evolved into computers with byte-addressable memory, flat address spaces, and pointers. This was a natural evolution reflecting the growing set of problems that people were solving. C, reflecting the genius of Dennis Ritchie, provided a minimal model of the computer that had evolved over 30 years. C was not a quick hack. As computers evolved to handle all kinds of problems, C, being the minimal model of such a computer, became a very powerful language to solve all kinds of problems in different domains very effectively. This is the secret of C's portability: it is the best representation of an abstract computer that we have. Of course, the abstraction is done over the set of real computers, not some imaginary computational devices. Moreover, people could understand the machine model behind C. It is much easier for an average engineer to understand the machine model behind C than the machine model behind Ada or even Scheme. C succeeded because it was doing the right thing, not because of AT&T promoting it or Unix being written with it.

Yukihiro Matsumoto (inventor of Ruby):

In my point of view, efficiency and productivity are the same thing, because we focus on programming efficiency, not runtime efficiency. I don't care about runtime efficiency. The computer goes faster and faster every year, so I don't really care about performance. Actually, I care about performance when I'm implementing the Ruby interpreter, but not when I'm designing Ruby. The implementer should care about performance, but the designer shouldn't. If you care about performance, you will focus on machines in design instead of on humans. So you shouldn't focus on performance in design.

Stepanov definitely influenced me at the time (early/mid nineties) and I got plenty of mileage out of STL (in spite of c++ compiler limitations) and I found (and still find) STL amazing. I actually never used Java professionally until 2000, tbh - like Stepanov I felt it was unusably slow for my purposes (and it was). However, by 2000 Hotspot had achieved a very impressive level of performance.

Although I can't say the same for Matz and Ruby, I think it's interesting to juxtapose their points of view.

Although I've never actually written a Ruby program, the processing requirements for such must be quite low - since Ruby's runtime performance is unconscionably poor by Stepanov's standards. I guess I can imagine a web server application which is supported by mostly static cached documents and dominated by the file system synchronization of a transactional database could be such a low-performance use case, and that is where Ruby seems to have found a niche.

Programming language compilers are "translators", which we require to translate from human expression to that of the machine.

In my opinion, designing a programming language is always a "mediation" between the needs of the human programmer and those of the machine - contrary to Matz.

Now, given the fact that runtime performance is a factor to the end user (i.e the customer - and the customer always comes first), it cannot simply be sacrificed when required to make the programmer's job easier.

OTOH, if the programmer's burden is too high, runtime performance may never come into question, since you'll fail to even deliver a product.

The design of F3 (now JavaFX script) was largely based on the concept that many perceived programmer ease-of-use features are not primarily related to the typical implementation techniques of so-called scripting languages, but rather to their purely linguistic characteristics - i.e. their correspondence to the language of the mind, rather than that of the machine - and that such features could be provided thanks to the Java platform without severely compromising runtime performance.

Object-orientation is one such feature - it corresponds to the way we conceptualize the world around us and allows us to navigate such concepts in a way that is natural to us.

Interestingly, Stepanov completely rejects this, focusing instead completely on the "generic" processing that applies across collections of objects at very high levels of abstraction.

Admirably in my opinion, Stepanov steadfastly refused to sacrifice the needs of the machine, yet nevertheless produced with STL a level of programming expression which was quite lovely to the human programmer (given that former constraint and those of the C++ language).

The beauty of the Java platform in my opinion has been that it does the best job of mediating these needs, overall, and that largely explains its success.

Java has many detractors on both sides - the C/C++ world thinks it's too slow, the Ruby-and-various-other-scripting-languages world that it's too hard to use. And everybody knows it's long been too big and monolithic.

Now, no one language or platform can be all things to all people. However, it's my belief that the Java platform - in spite of its wide use - is actually still under-utilized for many important use-cases - given current available alternatives.

Obviously, it's clear that I think one of those is the world of high-performance graphics and multimedia, which is currently still dominated by C and C++.

Surprisingly (or maybe not?) a similar tension exists among my colleagues here at Sun with respect to JavaFX script. There are those who actually think they're designing a language to serve (from their point of view) some lesser form of "web scripter" for whom they (incorrectly) perceive performance doesn't matter. Conversely, there are those who are acutely aware of the performance requirements of graphics and multimedia, who are dubious of the necessity and viability of JavaFX script (although they tend not to be forthcoming with any credible alternative, other than "just use Java" - but the fact is that's been done, and the barrier to entry is still too high).

Since I happen to think both points of view are wrong, it's my job to prove that, and to bring us together, and that's what I'm currently working on.

Although its scope is limited, in terms of finding something closer to the right level of "mediation" of performance/programmer-ease-of-use/functionality for this problem domain (and based on the Java platform), a good place to start is here.

Tuesday Dec 23, 2008


For the observant who are in the "know" about making real media, there are a few glimpses (here, and here, and here, and here), and also the fact that the individuals who through self-motivation decided to revamp the Java browser plugin are actually 3d graphics experts, that we're not completely oblivious.

There's a broad cultural divide between the real media world, and the world of current 2d graphics toolkits, such as web browsers, Flash, JavaFX 1.0, Java Swing, Silverlight, QT, Cocoa, etc, etc.

An important thing to note is that this, and this (meaning the flash content on this web site), and this (meaning the actual product - not the web site), all run on the same consumer hardware. What differentiates them is the skill of the artists/designers involved, and the performance and capability of the software systems that enable such artists.

In spite of being composed of the very most simplistic graphics imaginable, the AJAX site above delivers unconscionably poor, stuttering, animation performance. In spite of being a premier 3d game engine provider, the Flash content on Crytek's web site is very poor animation-performance-wise (although a lot better than the web browser's) - and still remarkably simplistic (compared to Crytek's real product) - and obviously lack of availability of artists is not part of the equation in their case. Meanwhile EVE online brings you amazing graphics, sound, animation and interactivity with silky smooth performance.

At the end of the day, however, these discrepancies clearly are about hardware. It's impossible to get a sufficient frame-rate for smooth animation of non-trivial graphics without hardware acceleration (GPU + CPU SIMD) - even for 2d. This is the very reason for the existence of GPU hardware. By contrast, the web browser, Flash, Java2D, & etc have a software architecture which is not designed around leveraging such hardware, and such is the result you get.

The state of the art in real media is created and advanced in the world of movies, TV, and video games, not that of the PC desktop and web browser. Nevertheless, modern video game engines demonstrate that interactive, real-time, very high production quality, high-performance multimedia (2d and 3d animation and graphics, audio, and video), is possible on today's desktops and on the web, not to mention devices with multimedia hardware acceleration.




« April 2014