What's new in Derby 10.7?

Apache Derby was released earlier this week. You'll find the binaries and the release notes at the download page. This is the first release from the 10.7 branch, and it contains some useful new features that were not part of the 10.6 releases. I will present some of these features here. There's also an interview with Rick Hillegas, the release manager for Derby 10.7.1, posted on the Java Spotlight Podcast, where the new features are discussed.

Boolean data type

Starting with 10.7, Derby supports the Boolean data type (yay, finally!). In earlier releases you would have to work around this limitation by using a CHAR(1) column or an SMALLINT column. Now you can declare the columns as BOOLEAN directly:

ij> create table t(b boolean);
0 rows inserted/updated/deleted
ij> insert into t values true, false, null;
3 rows inserted/updated/deleted
ij> select \* from t;

3 rows selected
ij> select \* from t where b;

1 row selected

BOOLEAN may also be used for parameters and return values in stored procedures and functions, just like any other data type.

Plan exporter tool

Sometimes you want to know exactly how Derby executes a query. For example, you may have a big join that takes a very long time to complete, and you want to check if Derby uses the relevant indexes to get the best performance.

It has always been possible to get runtime statistics from Derby (see the Working with RunTimeStatistics chapter in Tuning Derby), but the statistics output has been hard to analyze (read: unstructured text format). The 10.6 release improved the situation somewhat by introducing XPLAIN style statistics, which puts the collected data into database tables that can be queried from SQL. Still, it can be difficult to dig up the information you want, especially for complex queries.

Derby 10.7 further improves this by adding a tool that processes the data in the XPLAIN tables for you, and produces an XML document representing the execution plan for a specific query. It can also do some simple rendering and output a graphical version of the plan in HTML format. This new tool is documented in the Derby Tools and Utilities Guide, in the PlanExporter chapter.

Let's take a look at an example. I wanted to inspect the query plan for a big and ugly join performed by OpenGrok to get the commit log for a version-controlled file. Enabling collection of statistics is easy. Just call one system procedure to enable it, and another one to say that you want XPLAIN style statistics and where to store them. I added this code in the method that executed the query I was interested in:

And for good measure I also disabled collection before returning from the method, so that I wouldn't get lots of statistics for queries I didn't care about:

Then I ran a JUnit test that I knew would exercise this code. After it had completed, I could connect to the database with the ij tool and verify that the statistics had been collected into the XPLAIN tables:

ij version 10.7
ij> connect 'jdbc:derby:/path/to/DB';
ij> select stmt_id from xplain.sysxplain_statements
>          where stmt_text like 'SELECT CS.REVISION%';

5 rows selected

The query I was interested in had been executed five times. I picked one of the STMT_ID values and passed it to the PlanExporter tool, asking it to produce both XML and HTML versions of the selected plan:

$ java org.apache.derby.tools.PlanExporter \\
    jdbc:derby:/path/to/DB XPLAIN \\
    f93441e2-012c-e092-02e5-000003bb1070 \\
    -xml plan.xml -html plan.html

The above command produced the following two files: plan.xml and plan.html. If you open the XML version, you'll see that most modern web browsers allow you to expand and collapse the nodes, which is neat when you analyze a large query and you want to focus on one specific part of it. For comparison, I'm also including plan.txt, which shows what the plan would look like in the old format (which is still supported, by the way).

Definer's rights for routines

A Derby stored function or stored procedure may execute SQL statements against the database using a nested connection. If you have SQL authorization enabled, the user that invokes the routine must have been granted privileges for all the SQL that is executed by the routine. So if the routine deletes rows from a table, you would have to grant the user delete privileges on that table, even if you don't really want that user to be allowed to delete rows except via the routine.

To address this concern, Derby 10.7 adds an optional clause (EXTERNAL SECURITY { DEFINER | INVOKER }) to the CREATE FUNCTION and CREATE PROCEDURE statements. This clause lets you specify whose privileges the routine should use when executing SQL statements on a nested connection, and could be used to delegate more fine-grained privileges to users.

For example, to allow a user to execute a procedure that performs some insert and select statements, you would previously have to give the user select and insert privileges on the tables involved, giving him carte blanche to insert whatever he wants into the tables. The code would look something like this, with multiple GRANT statements:

ij> create procedure add_item(id int, name varchar(10), category int)
>          external name 'MyProcedures.addItem'
>          language java parameter style java;
0 rows inserted/updated/deleted
ij> grant execute on procedure add_item to user1;
0 rows inserted/updated/deleted
ij> grant insert on items to user1;
0 rows inserted/updated/deleted
ij> grant select on categories to user1;
0 rows inserted/updated/deleted

If you use the new EXTERNAL SECURITY clause introduced in Derby 10.7 and specify that the procedure should execute with the definer's privileges, you only need to grant the user privileges to execute the procedure, and he will no longer have privileges to insert data directly into the table:

ij> create procedure add_item(id int, name varchar(10), category int)
>          external name 'MyProcedures.addItem'
>          language java parameter style java
>          external security definer;
0 rows inserted/updated/deleted
ij> grant execute on procedure add_item to user1;
0 rows inserted/updated/deleted

Less code, easier to maintain, and more secure.


New in 10.7 is also the TRUNCATE TABLE statement, which is handy if you want to delete all rows in a table. It is faster than DELETE FROM <TABLE>, especially on large tables, because it wipes the entire table and its indexes in one go instead of deleting one row at a time.

ij> select count(\*) from t;

1 row selected
ij> truncate table t;
0 rows inserted/updated/deleted
ij> select count(\*) from t;

1 row selected

Unicode database names

The embedded driver has always supported all characters in the Unicode character set in database names, as long as the underlying file system allows those characters in file names. But the network protocol has only worked with ASCII characters in the database names until now, so such databases couldn't be created or accessed from the client driver. In this release, the network protocol switched to UTF-8 encoding in the fields that hold the database name, so that you now, for example, can create and use a database with Japanese characters:

ij version 10.7
ij> connect 'jdbc:derby://localhost/データベース;create=true';

This only affects the database names. The network protocol used UTF-8 encoding for SQL queries and results even before this improvement, so inserting non-ASCII text values into the database, or using non-ASCII characters in table and column names, is also supported, just like it was in all previous releases.


Awesome, Knut. I can not believe you pull this off with all the wonderful work you do for Derby. Fabulous!!!

Posted by Lily Wei on December 17, 2010 at 10:04 AM CET #

I like the PlanExporter feature.

Posted by Ashwin Jayaprakash on December 17, 2010 at 10:14 PM CET #

Post a Comment:
Comments are closed for this entry.



« June 2016