Technical aspects Round 2
By user12610627 on Jul 10, 2009
Rendering

Triangle mesh geometry and texture mapping
The system under discussion here is fundamentally a 3d system. To get high performance, we must design our software so that it can efficiently leverage the 3d graphics hardware. At the end of the day this boils down to constructing and populating sequences of vertex attributes and textures and making them available to the graphics driver.
The rendering system has several layers. At the lowest layer is a JNI binding to the graphics driver: in the case of OpenGL, JOGL. I should mention that, starting during(!) JavaOne 2008, Ken and Sven Goethel completely rewrote JOGL to support portable profiles corresponding to desktop OpenGL, OpenGL ES 1.x, and OpenGL ES 2.x.
Above this layer is a "render context" which mirrors the state of the graphics driver, and eliminates any redundant state changes requested by the application. This optimization is essential as the application itself does not typically have full visibility of such state and cannot avoid such redundancies itself (and these are typically very costly operations). Our current system uses Josh Slack's open source Ardor3D project for this layer.
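The redundant-state elimination this layer performs can be illustrated with a small sketch. This is a hypothetical example, not the actual Ardor3D implementation: the class and method names are mine, and the string-keyed shadow map stands in for whatever typed state records the real render context keeps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a render context that mirrors driver state and
// filters out redundant state changes before they reach the GPU.
public class RenderContextCache {
    private final Map<String, Object> shadow = new HashMap<>();
    private int driverCalls = 0; // stands in for real JOGL calls

    /** Returns true only if the change was actually forwarded to the driver. */
    public boolean set(String state, Object value) {
        Object prev = shadow.put(state, value);
        if (value.equals(prev)) {
            return false;  // redundant: the driver already holds this state
        }
        driverCalls++;     // here we would issue e.g. gl.glEnable(...)
        return true;
    }

    public int driverCalls() { return driverCalls; }

    public static void main(String[] args) {
        RenderContextCache ctx = new RenderContextCache();
        ctx.set("BLEND", true);
        ctx.set("BLEND", true);   // filtered: no driver call issued
        ctx.set("BLEND", false);
        System.out.println(ctx.driverCalls()); // prints 2
    }
}
```

The point is that the application can freely request the same state many times per frame; only genuine transitions cost a driver round-trip.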
Vertex attribute arrays are generated on demand, either as native nio Buffers allocated from the C heap or as vertex buffer objects (VBOs) allocated by the graphics driver. Allocation and deallocation of native nio buffers and VBOs is costly, so these resources are stored and reused where possible in "mesh" objects associated with the scene graph nodes that describe geometry. The same meshes are used for ray/triangle intersection tests to support picking.
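The allocate-once, reuse-often pattern for native buffers can be sketched like this. The class and method names are illustrative, not taken from the actual system; only the java.nio calls are real.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch: a mesh holds on to its direct ("C heap") vertex buffer and
// reuses it across frames, since direct allocation is the expensive step.
public class MeshBuffers {
    private FloatBuffer vertices;

    // Returns a direct FloatBuffer with room for vertexCount positions
    // (3 floats each), reusing the previous buffer when it is big enough.
    public FloatBuffer vertexBuffer(int vertexCount) {
        int floats = vertexCount * 3;
        if (vertices == null || vertices.capacity() < floats) {
            vertices = ByteBuffer.allocateDirect(floats * Float.BYTES)
                                 .order(ByteOrder.nativeOrder()) // drivers expect native order
                                 .asFloatBuffer();
        }
        vertices.clear(); // reset position/limit for refilling
        return vertices;
    }

    public static void main(String[] args) {
        MeshBuffers mesh = new MeshBuffers();
        FloatBuffer a = mesh.vertexBuffer(1024);
        FloatBuffer b = mesh.vertexBuffer(512);     // smaller: same buffer reused
        System.out.println(a.isDirect() && a == b); // prints true
    }
}
```

A buffer allocated this way can be handed to the driver directly (e.g. via a `glVertexPointer`/`glBufferData`-style call) without a copy through the Java heap.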
2D geometry is treated uniformly with 3d geometry. There are 3 strategies available for rendering 2d geometry, depending on the capabilities of the available hardware:

- If multisample antialiasing is available, it's possible to simply flatten the 2d shape into a set of vertices and tessellate it. This approach is used for extrusion of 2d shapes, and is occasionally useful in other cases, but it typically requires generating a lot of polygons to look good. NVidia's proprietary 2d path rendering extension could also be leveraged if available; it does essentially the same thing, but without the cpu and data transfer overhead.
- If programmable shaders are available, it's possible to tessellate the shape directly without flattening, which requires far fewer polygons, and then generate an antialiased opacity mask dynamically per frame on the GPU with a pixel shader. This is very fast.
- Otherwise, we simply use a pair of triangles corresponding to the bounding box of the shape as its geometry, and an antialiased opacity mask must be generated on the cpu side, uploaded, texture mapped, and blended in. This final strategy is quite slow, and typically impacts the overall frame rate in non-trivial cases. To partially counteract that, we cache such opacity masks and avoid regenerating them where possible. For example, in the common case of a 2d-only scene where a shape is translated but not rotated or scaled, the mask remains valid.
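The mask-reuse rule for the slow fallback path can be sketched as follows. This is an illustrative model, not the system's actual cache: all names are hypothetical, and the cached "mask" object stands in for a real alpha texture.

```java
import java.util.Arrays;

// Sketch: a cached opacity mask stays valid while only the translation of
// the 2d transform changes; any change to the linear 2x2 part (rotation,
// scale, shear) forces regeneration.
public class OpacityMaskCache {
    private double[] lastLinear;  // (m00, m01, m10, m11) the mask was built with
    private Object mask;          // stands in for the cached alpha texture

    public Object maskFor(double m00, double m01, double m10, double m11) {
        double[] linear = {m00, m01, m10, m11};
        if (mask == null || !Arrays.equals(linear, lastLinear)) {
            lastLinear = linear;
            mask = rasterize(); // the expensive CPU rasterize + upload step
        }
        return mask; // translation-only moves fall through to the cached mask
    }

    private Object rasterize() { return new Object(); }

    public static void main(String[] args) {
        OpacityMaskCache cache = new OpacityMaskCache();
        Object m1 = cache.maskFor(1, 0, 0, 1);  // identity
        Object m2 = cache.maskFor(1, 0, 0, 1);  // translated only: reused
        Object m3 = cache.maskFor(2, 0, 0, 2);  // scaled: regenerated
        System.out.println(m1 == m2);  // prints true
        System.out.println(m2 == m3);  // prints false
    }
}
```

Note that translation never enters the key: moving a shape just moves the textured quad, so the mask pixels remain correct.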
Depth buffer, transparency, and blending

For high performance, we use the blending capability of the graphics driver to manage transparency. For 2d graphics, we have two issues to deal with. First, draw order: blending is performed in draw order, not by z-depth. Second, in a 2d scene, elements are drawn on the same plane, which causes artifacts (z-fighting) if depth writing and the depth test are enabled. Nevertheless, we want to leverage the hardware depth test, and we need non-transparent 2d shapes to occlude each other. Currently, we use the "polygon offset" feature of the graphics driver to make this work.
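A minimal sketch of the polygon-offset trick, assuming one offset step per shape in draw order (the actual tuning in the system may differ). The `GL` interface here is a stand-in for the real JOGL binding so the sketch is self-contained; the enum value is the standard OpenGL one.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: 2d shapes share one plane, so each shape gets a slightly larger
// depth offset toward the viewer, letting the hardware depth test resolve
// occlusion in draw order without z-fighting.
public class PolygonOffsetSketch {
    static final int GL_POLYGON_OFFSET_FILL = 0x8037; // standard OpenGL enum

    interface GL { // stand-in for the real JOGL interface
        void glEnable(int cap);
        void glPolygonOffset(float factor, float units);
    }

    static void drawShapes(GL gl, int shapeCount) {
        gl.glEnable(GL_POLYGON_OFFSET_FILL);
        for (int i = 0; i < shapeCount; i++) {
            // Negative units pull later shapes toward the viewer, so each
            // one passes the depth test over the shapes drawn before it.
            gl.glPolygonOffset(0f, -i);
            // ... submit the geometry of shape i here ...
        }
    }

    public static void main(String[] args) {
        List<Float> units = new ArrayList<>();
        GL gl = new GL() {
            public void glEnable(int cap) {}
            public void glPolygonOffset(float factor, float u) { units.add(u); }
        };
        drawShapes(gl, 3);
        System.out.println(units); // prints [0.0, -1.0, -2.0]
    }
}
```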
2D Clipping

It's possible to use opacity masks and blending to perform 2d clipping, but this is costly. In some cases it can be accomplished more efficiently, with acceptable results, using the stencil buffer. Clipping involves using the intersection of a set of 2d shapes as a mask against which we draw other objects, and it turns out the stencil buffer itself can perform the intersection operation. When we add a shape to the clip, we draw it into the stencil buffer with the INCR stencil operation. When we remove a shape from the clip, we draw it into the stencil buffer with the DECR stencil operation. We then draw the scene with the stencil test set to EQUAL, with the reference value equal to the number of shapes currently intersected in the clip. This approach works across both 2d and 3d transforms, and any drawing (not just 2d shapes) can serve as the "clip". If multisample antialiasing is available, the result of the clipping operation is also properly antialiased; if not, this approach is still useful for the very common case of rectangular clipping.
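The stencil scheme above can be sketched as follows. The `GL` interface is again a stand-in for the real binding so the example is self-contained; the enum values are the standard OpenGL ones, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of stencil-buffer clipping: adding a clip shape draws it with
// INCR, removing it draws the same shape with DECR, and clipped content
// is drawn with the stencil test set to EQUAL against the nesting count.
public class StencilClipSketch {
    static final int GL_KEEP = 0x1E00, GL_INCR = 0x1E02, GL_DECR = 0x1E03;
    static final int GL_EQUAL = 0x0202;

    interface GL { // stand-in for the real JOGL interface
        void glStencilOp(int sfail, int dpfail, int dppass);
        void glStencilFunc(int func, int ref, int mask);
    }

    private int clipCount = 0;

    public void addClipShape(GL gl /*, Shape s */) {
        gl.glStencilOp(GL_KEEP, GL_KEEP, GL_INCR);
        // ... draw the shape into the stencil buffer only ...
        clipCount++;
    }

    public void removeClipShape(GL gl /*, Shape s */) {
        gl.glStencilOp(GL_KEEP, GL_KEEP, GL_DECR);
        // ... redraw the same shape to undo its increment ...
        clipCount--;
    }

    public void drawClipped(GL gl /*, Node scene */) {
        // Only fragments covered by every active clip shape carry a
        // stencil value equal to clipCount.
        gl.glStencilFunc(GL_EQUAL, clipCount, 0xFF);
        // ... draw the clipped content with the stencil test enabled ...
    }

    public static void main(String[] args) {
        List<Integer> refs = new ArrayList<>();
        GL gl = new GL() {
            public void glStencilOp(int a, int b, int c) {}
            public void glStencilFunc(int func, int ref, int mask) { refs.add(ref); }
        };
        StencilClipSketch clip = new StencilClipSketch();
        clip.addClipShape(gl);
        clip.addClipShape(gl);
        clip.drawClipped(gl);    // reference value 2: inside both shapes
        clip.removeClipShape(gl);
        clip.drawClipped(gl);    // reference value 1
        System.out.println(refs); // prints [2, 1]
    }
}
```

Because only the reference value changes as shapes are pushed and popped, nesting clips is cheap: no mask textures are generated at all.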
The image below is actually a blue rectangle clipped by some text, with a perspective transform applied.