An early look at Derby 10.6

The Derby community is currently working on the next feature release, Derby 10.6, and a number of new features have already been implemented. If you haven't already done so, now would be a good time to download the development sources and kick the tires on your favourite new feature! Early feedback to the community (either on the mailing lists or in the JIRA bug tracker) increases the likelihood of getting improvements and bug fixes implemented before the final release.

If you don't feel like building jar files from the development sources yourself, you can download the binaries used in the nightly regression testing from http://dbtg.foundry.sun.com/derby/bits/trunk/. Make sure that you understand all the warnings about the difference between a production build and a development build before you start using it!

The release page on the wiki lists some of the features that are planned for 10.6. Let's take a quick look at some of those that are mostly done and ready for testing.

Store Java objects in columns

DERBY-651 tracks the ongoing work that enables storing of Java objects directly in the columns of a table. The objects must belong to a class that implements the java.io.Serializable interface.

For example, if you want to store java.util.List objects in a table, you first need to declare a user-defined type that maps to the Java type, and create a column with that type:

stmt.execute("CREATE TYPE JAVA_UTIL_LIST " +
             "EXTERNAL NAME 'java.util.List' " +
             "LANGUAGE JAVA");
stmt.execute("CREATE TABLE T(C1 JAVA_UTIL_LIST)");

Next, you prepare a statement and use setObject() to insert the Java object that you want to store:

PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO T(C1) VALUES (?)");

ArrayList lst = new ArrayList();
lst.add("First element");
lst.add("Second element");

ps.setObject(1, lst);
ps.execute();

Finally, you execute a SELECT query and use getObject() to restore the object from the database:

ResultSet rs = stmt.executeQuery("SELECT C1 FROM T");
while (rs.next()) {
    List list = (List) rs.getObject(1);
    System.out.println("Size of list: " + list.size());
}

You can also use your object types as arguments to user-defined functions or procedures:

stmt.execute("CREATE FUNCTION LIST_SIZE(LST JAVA_UTIL_LIST) " +
             "RETURNS INTEGER " +
             "EXTERNAL NAME 'MyListFunctions.listSize' " +
             "LANGUAGE JAVA PARAMETER STYLE JAVA");
ps = conn.prepareStatement("values list_size(?)");
ps.setObject(1, lst);

Multi-level grouping

Derby's GROUP BY syntax has been extended with the ROLLUP keyword, which allows for multi-level grouping. There's a fine write-up about the feature on this wiki page. The example below shows how you can ask for the total amount of sales of each product in each state, the total amount of sales in each state, the total amount of sales in each region, and the amount of sales in all regions, in one single query:

ij> SELECT REGION, STATE, PRODUCT, SUM(SALES) FROM SALES_HISTORY
        GROUP BY ROLLUP(REGION, STATE, PRODUCT)
        ORDER BY REGION, STATE, PRODUCT;
REGION    |STA&|PRODUCT   |4          
--------------------------------------
East      |MA  |Boats     |10     <-- all boat sales in Massachusetts
East      |MA  |Cars      |190        
East      |MA  |NULL      |200    <-- sum of all sales in Massachusetts
East      |NY  |Boats     |570        
East      |NY  |Cars      |10         
East      |NY  |NULL      |580        
East      |NULL|NULL      |780    <-- sum of all sales in region East
West      |AZ  |Boats     |40         
West      |AZ  |Cars      |300        
West      |AZ  |NULL      |340        
West      |CA  |Boats     |570        
West      |CA  |Cars      |750        
West      |CA  |NULL      |1320       
West      |NULL|NULL      |1660       
NULL      |NULL|NULL      |2440   <-- sum of all sales in all regions

15 rows selected

Richer JOIN syntax

Derby already supports INNER JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN. In 10.6, the syntax will be enhanced with support for

  • the CROSS JOIN operation
  • named column joins
  • sub-queries in ON clauses

CROSS JOIN is the simplest of the join operations, and SELECT \* FROM T1 CROSS JOIN T2 is just another way of writing SELECT \* FROM T1, T2. It may also be combined with other join operations:

ij> SELECT \* FROM T1 CROSS JOIN T2 LEFT OUTER JOIN T3 ON T1.A = T3.B;
A          |B          |A          |B          |A          |B          
-----------------------------------------------------------------------
5          |2          |2          |7          |NULL       |NULL       

1 row selected

Named columns join may save you some typing when the columns in the join key have the same name in the two tables being joined. In the example below, where two tables COUNTRIES and CITIES are joined, you no longer need to write the full join condition ON COUNTRIES.COUNTRY = CITIES.COUNTRY. Instead, you just say that the tables should be joined on the COUNTRY column with a USING clause:

ij> SELECT COUNTRY, COUNT(CITY_ID) FROM COUNTRIES JOIN CITIES
        USING (COUNTRY) GROUP BY COUNTRY;
COUNTRY                   |2          
--------------------------------------
Afghanistan               |1          
Argentina                 |1          
Australia                 |2          
.
.
.
United Kingdom            |1          
United States             |37         
Venezuela                 |1          

45 rows selected

Explain plan

The XPLAIN functionality makes it possible to access run-time statistics with SQL queries. See this section in the development version of Tuning Derby for details about how to enable it and use it. Here's one of the more advanced examples from the manual, which shows how to get all queries that performed a table scan on the COUNTRIES table, and the number of pages and rows visited in the scan:

ij> SELECT ST.STMT_TEXT,
           SP.NO_VISITED_PAGES AS PAGES,
           SP.NO_VISITED_ROWS AS "ROWS"
    FROM STATS.SYSXPLAIN_SCAN_PROPS SP, 
         STATS.SYSXPLAIN_RESULTSETS RS, 
         STATS.SYSXPLAIN_STATEMENTS ST 
    WHERE ST.STMT_ID = RS.STMT_ID AND 
          RS.SCAN_RS_ID = SP.SCAN_RS_ID AND 
          RS.OP_IDENTIFIER = 'TABLESCAN' AND 
          SP.SCAN_OBJECT_NAME = 'COUNTRIES';
STMT_TEXT                     |PAGES      |ROWS       
------------------------------------------------------
SELECT \* FROM COUNTRIES       |2          |114        

1 row selected

ORDER BY in sub-queries

Up till now, Derby has only accepted ORDER BY clauses in top-level SELECT queries. The ongoing work on DERBY-4397 will allow ORDER BY clauses in sub-queries as well. This is going to resolve the very old request for ordering of inserts (DERBY-4). There's also a request for OFFSET/FETCH in sub-queries (DERBY-4398), which would be very useful in combination with ORDER BY.

Dropping of in-memory databases

Derby 10.5 added support for in-memory databases, but there was no convenient API for deleting the in-memory databases and reclaiming the memory without taking down the Java process. In Derby 10.6, in-memory databases can be destroyed with an API similar to the one used for creating and shutting down databases. Simply add drop=true to the JDBC URL.

ij> connect 'jdbc:derby:memory:mydb;create=true';
ij> create table t (x int);
0 rows inserted/updated/deleted
ij> connect 'jdbc:derby:memory:mydb;drop=true';
ERROR 08006: Database 'memory:mydb' dropped.

Very nice if you're running JUnit tests and want to clear the database and free up the memory between each test case.

SHOW FUNCTIONS

Derby's interactive tool for running scripts or queries against a database, ij, lacked a command to show the functions that are stored in the database. Derby 10.6 adds SHOW FUNCTIONS to ij's list of SHOW commands, so now you can issue the following command to view all the functions in the APP schema:

ij> SHOW FUNCTIONS IN APP;
FUNCTION_SCHEM|FUNCTION_NAME               |REMARKS                            
----------------------------------------------------------
APP           |MY_FUNC                     |MyFunctions.f                      

1 row selected

Restricted table functions

In Derby 10.6 it will be possible to push predicates into table functions and process the restrictions before the function returns the results to the SQL engine. This can speed up execution of some table function queries significantly. Take this example:

SELECT \* FROM TABLE ( MY_FUNCTION() ) AS T WHERE X = 5

If MY_FUNCTION() returns thousands of rows, and only a handful of them match the restriction X = 5, a lot of work is wasted generating and scanning rows that are just thrown away. With restricted table functions, Derby's SQL engine will pass information about the restriction down to the table function, and the table function may use this information to produce and return only the rows that are actually needed.

Details about how to use restricted table functions can be found in the functional spec attached to DERBY-4357.

ROW_NUMBER improvements

Previous releases of Derby only allowed the ROW_NUMBER function to appear in the select list of a query, and it could not be used to build up more complex expressions. Now these limitations have been removed, and you can write queries like this one without getting syntax errors:

SELECT X / ROW_NUMBER() OVER ()
  FROM T
  ORDER BY ROW_NUMBER() OVER () DESC
Comments:

I don't want to replace SQLite in my Java projects until Boolean is supported. I hope it'll be soon.

Posted by Gabriele on December 13, 2009 at 08:07 AM CET #

Post a Comment:
Comments are closed for this entry.
About

kah

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today