Monday Apr 27, 2009

Derby 10.5 preview: SQL roles

Derby has supported SQL authorization with GRANT/REVOKE since version 10.2. In the upcoming 10.5 release, this is taken one step further with the addition of the concept of roles defined by the SQL:2003 standard.

Before, with SQL authorization enabled, you'd need to grant each required privilege explicitly to each user. With roles, this can be simplified by creating different roles which are granted sets of privileges, and granting roles instead of privileges to the users.

So instead of granting the same four privileges to three different users, you grant the privileges to a role and grant the role to those users.

ij> create role ordinary_user;
0 rows inserted/updated/deleted
ij> grant select on t1 to ordinary_user;
0 rows inserted/updated/deleted
ij> grant select on t2 to ordinary_user;
0 rows inserted/updated/deleted
ij> grant update on t2 to ordinary_user;
0 rows inserted/updated/deleted
ij> grant execute on procedure p1 to ordinary_user;
0 rows inserted/updated/deleted
ij> grant ordinary_user to username1, username2, username3;
0 rows inserted/updated/deleted

Later, if you want to take back the update privilege on T2 from those users, you simply revoke the privilege from the role, which is a simpler and less error-prone process than revoking the privilege from each single user.

ij> revoke update on t2 from ordinary_user;
0 rows inserted/updated/deleted

A role can also inherit privileges from another role. To create a superuser role which has all the privileges of the ordinary_user role, plus the privilege to insert rows into T1, do this:

ij> create role superuser;
0 rows inserted/updated/deleted
ij> grant ordinary_user to superuser;
0 rows inserted/updated/deleted
ij> grant insert on t1 to superuser;
0 rows inserted/updated/deleted

More detailed information about how to use SQL roles in Derby can be found in this section of Derby's reference manual.

Wednesday Apr 22, 2009

Derby 10.5 preview: FETCH/OFFSET

One of the most frequently requested features for Derby is the LIMIT/OFFSET syntax supported by many other databases. That syntax allows you to skip the first N rows from the result and return the next M rows. In the upcoming 10.5 release, it will finally be available to Derby's users too, except that it's called FETCH/OFFSET and uses the somewhat more verbose syntax dictated by the SQL:2008 standard.

The new clauses can be added to any SELECT statement, and they're defined like this:

[ OFFSET integer-literal {ROW | ROWS} ]
[ FETCH { FIRST | NEXT } [integer-literal] {ROW | ROWS} ONLY ]

(No, I'm not kidding, the standard actually requires you to type all those seemingly redundant keywords...)

You can use FETCH and OFFSET both alone and in combination with each other. For example, to select the three persons with the highest score from a table, you'd use a FETCH clause and no OFFSET clause, like this:

ij> SELECT NAME, SCORE FROM RESULTS ORDER BY SCORE DESC
> FETCH FIRST 3 ROWS ONLY;
NAME      |SCORE      
----------------------
John      |33         
Anne      |28         
Sue       |21         

3 rows selected

If you want to select all persons except the two with the highest score, you could use an OFFSET clause and skip the FETCH clause:

ij> SELECT NAME, SCORE FROM RESULTS ORDER BY SCORE DESC OFFSET 2 ROWS;
NAME      |SCORE      
----------------------
Sue       |21         
Peter     |19         
Bob       |7          

3 rows selected

And if you're only interested in the fourth person, you could use a SELECT statement that combines OFFSET and FETCH like this:

ij> SELECT NAME, SCORE FROM RESULTS ORDER BY SCORE DESC
> OFFSET 3 ROWS FETCH NEXT ROW ONLY;
NAME      |SCORE      
----------------------
Peter     |19         

1 row selected

The keywords FIRST and NEXT are synonyms, and so are ROW and ROWS, so all of these these four clauses are allowed and have the same meaning:

FETCH FIRST 10 ROWS ONLY
FETCH NEXT 10 ROWS ONLY
FETCH FIRST 10 ROW ONLY
FETCH NEXT 10 ROW ONLY

SQL:2008 also allows you to have ORDER BY/OFFSET/FETCH clauses in subqueries, but Derby doesn't implement that yet, so you'll be limited to using them on the top level of your query for now.

If you're interested in helping out testing the Derby 10.5 release candidate, more information (including download links) can be found here.

Tuesday Apr 21, 2009

Derby 10.5 preview: In-memory backend

The upcoming Derby 10.5 release will contain a long-awaited feature: an in-memory storage backend. With this backend, your entire database will be stored in main memory instead of on disk.

But isn't the whole point of using a database that the data should be stored safely on disk or some other kind of persistent storage? Normally, yes, but there are cases where you don't really care if you lose the database when the application crashes.

For instance, if you are running unit tests against your application, it's probably more important that the tests run fast and that it's easy to clean up after the tests. With the in-memory storage backend, you'll notice that many database operations (like database creation, inserts and updates) are a lot faster because they don't need to access the disk. Also, there's no need to clean up and delete the database files after the tests have completed, since the database goes away when the application terminates.

So how is the in-memory backend enabled? That's simple, you just add the memory subprotocol to the JDBC connection URL, and no other changes should be needed to your application. If you normally connect to the database with the URL jdbc:derby:MyDB you should instead use jdbc:derby:memory:MyDB (and of course add any of the connection attributes needed, like create=true to create the database). Here's an example in IJ, Derby's command line client:

$ java -jar derbyrun.jar ij
ij version 10.5
ij> connect 'jdbc:derby:memory:MyDB;create=true';
ij> create table my_table(x int);
0 rows inserted/updated/deleted
ij> insert into my_table values 1, 2, 3;
3 rows inserted/updated/deleted
ij> exit;

After exiting, you can verify that no database directory was created:

$ ls MyDB
MyDB: No such file or directory

More or less everything you can do with an ordinary database should be possible to do with an in-memory database, including taking a backup and restoring it. This can be useful, as it allows you to dump the database to disk before you shut down your application, and reload it into memory the next time you start the application. Looking again at the example above, you could issue this command before typing exit in IJ:

ij> call syscs_util.syscs_backup_database('/var/backups');
0 rows inserted/updated/deleted

Later, when you restart the application, you can load the database back into the in-memory store by using the createFrom connection attribute:

$ java -jar derbyrun.jar ij
ij version 10.5
ij> connect 'jdbc:derby:memory:MyDB;createFrom=/var/backups/MyDB';
ij> select \* from my_table;
X          
-----------
1          
2          
3          

3 rows selected

Thursday Apr 16, 2009

Derby 10.5 preview: Generated Columns

In the Apache Derby community we're currently working on a new feature release (version 10.5) which we hope to have ready very soon now. While we're waiting for it to be properly tested and approved, I'll give a sneak preview of some of the features that we've added since 10.4. In this entry, I'll take a look at generated columns.

A generated column is a column whose value is defined as an expression. The expression may refer to other columns in the same table, and the value of a generated column will be automatically updated if any of the columns on which it depends are modified.

Let's look at a simple example. The statement below will create a table with three integer columns, where the last one is automatically generated as the sum of the first two columns.

ij> CREATE TABLE T(COL1 INT, COL2 INT, COL3 GENERATED ALWAYS AS (COL1+COL2));
0 rows inserted/updated/deleted

Now insert a couple of rows, and see that the value of the third column is calculated automatically:

ij> INSERT INTO T (COL1, COL2) VALUES (1,2), (3,4), (5,6);
3 rows inserted/updated/deleted
ij> SELECT \* FROM T;
COL1       |COL2       |COL3       
-----------------------------------
1          |2          |3          
3          |4          |7          
5          |6          |11         

3 rows selected

Also, any updates of COL1 or COL2 will immediately trigger an update of COL3:

ij> UPDATE T SET COL2 = 100 WHERE COL1 = 1;
1 row inserted/updated/deleted
ij> UPDATE T SET COL1 = NULL WHERE COL1 = 5;
1 row inserted/updated/deleted
ij> SELECT \* FROM T;
COL1       |COL2       |COL3       
-----------------------------------
1          |100        |101        
3          |4          |7          
NULL       |6          |NULL       

3 rows selected

A somewhat more useful use case for generated columns is case-insensitive search. Traditionally, this has been solved by converting the column in question to upper case on the fly in the select statement. For example like this to find all the values starting with a 'c' regardless of case: SELECT ... WHERE UPPER(WORD) LIKE 'C%'

The traditional approach has some disadvantages:

  • The upper case values must be regenerated each time the query is performed
  • Derby doesn't currently let you create an index on an expression (like UPPER(WORD)), so the query must always go through the entire table to find the rows you are looking for

With generated columns, the upper case values are only generated once when the value is inserted or modified. You're also allowed to create an index on a generated column to speed up queries that access it. (Yes, you could achieve the same by using insert/update triggers to keep an ordinary, indexed column updated, but it's a lot easier with generated columns, and insert/update performance should also be better.) To make it possible to perform an efficient case-insensitive search on a column, add a generated column which contains an upper case copy of it and create an index on that column, as shown in the code below:

ij> CREATE TABLE WORDS(WORD VARCHAR(20), UWORD GENERATED ALWAYS AS (UPPER(WORD)));
0 rows inserted/updated/deleted
ij> CREATE INDEX IDX_UWORD ON WORDS(UWORD);
0 rows inserted/updated/deleted
ij> INSERT INTO WORDS(WORD) VALUES 'chocolate', 'Coca-Cola', 'hamburger', 'carrot';
4 rows inserted/updated/deleted
ij> SELECT WORD FROM WORDS WHERE UWORD LIKE 'C%';
WORD                
--------------------
chocolate           
Coca-Cola           
carrot              

3 rows selected

If you want to try it out yourself, you can download the latest Derby 10.5 release candidate (RC2 at the time of writing this).

Friday Dec 05, 2008

User-defined ordering of strings in Apache Derby

Apache Derby (and Java DB) has supported language-based ordering of strings (aka territory-based collation) since version 10.3.1.4. This means for instance that you can get an ORDER BY clause in an SQL query to sort the results alphabetically according to the rules of the language of your choice, instead of sorting the strings by the value of their Unicode codepoints, which is the default.

So when you're using the default codepoint-based ordering, your results would be ordered like this (note that 'a' comes after 'B'):

ij> connect 'jdbc:derby:TestDB;create=true';
ij> select \* from (values 'a','A','B') as s(x) order by x;
X
-
A
B
a

3 rows selected

Using language-based ordering for the en_US locale would give you a more natural ordering of the results (both 'a' and 'A' come before 'B'):

ij> connect 'jdbc:derby:EnglishDB;create=true;territory=en_US;collation=TERRITORY_BASED';
ij> select \* from (values 'A','B','a') as s(x) order by x;
X
-
a
A
B

3 rows selected

This feature is very useful as long as you're happy with the ordering provided by the locales supported by your Java Runtime Environment. But sometimes you may want to sort your strings based on some other rules than those in the languages supported by the JRE. The most common example is that users want case-insensitive string matching. Another example came up recently in this thread on derby-user, where a user wanted to modify the German collation rules to make Greek characters sort near their Latin equivalents ('α' near 'a', 'β' near 'b', and so on).

Derby currently (as of 10.4) doesn't allow you to define your own collation rules, it merely relies on the collators returned by java.text.Collator.getInstance(java.util.Locale). Fortunately, the JRE allows you to define new locales with their own collation rules. And it turns out that it's quite easy, just perform these three simple steps:

  • Create a class that extends java.text.spi.CollatorProvider and returns a collator that orders strings the way you want it to
  • Create a text file named META-INF/services/java.text.spi.CollatorProvider which contains one line with the name of your collator provider class
  • Put the compiled class file and the text file in a jar file which you drop into your JRE's lib/ext directory or in one of the directories specified by the java.ext.dirs property

Let's look at an example. Collators can have different strengths, which tells which differences to regard as significant when comparing strings. Case difference is a tertiary difference, so to get a case-insensitive collator we could set the strength to secondary and the case differences will be ignored. If we want to define a variant of the en_US locale, say en_US_caseinsensitive, whose collators follow the ordering rules of the English language and ignore case differences, we could write this small CollatorProvider:

public class MyCollatorProvider extends CollatorProvider {
    public Locale[] getAvailableLocales() {
        return new Locale[] {
            new Locale("en", "US", "caseinsensitive")
        };
    }
    public Collator getInstance(Locale locale) {
        Collator c = Collator.getInstance(Locale.US);
        // Ignore tertiary differences (case differences)
        c.setStrength(Collator.SECONDARY);
        return c;
    }
}

Now compile the class and package it in a jar, together with a text file in META-INF/services as described above. In the directory where the jar is stored, start ij (Derby's SQL client) with the following command:

$ java -Djava.ext.dirs=. -jar /path/to/derbyrun.jar ij
ij version 10.4
ij> 

Then create a database which uses the new locale that you just defined, and insert some test data and see that the string matching really is case-insensitive:

ij> connect 'jdbc:derby:NoCaseDB;territory=en_US_caseinsensitive;collation=TERRITORY_BASED;create=true';
ij> create table t (x varchar(12));
0 rows inserted/updated/deleted
ij> insert into t values 'hi!','Hi!','hello','Hello','HELLO','HeLlO';
6 rows inserted/updated/deleted
ij> select \* from t where x = 'hi!';
X           
------------
hi!         
Hi!         

2 rows selected
ij> select distinct \* from t order by x;
X           
------------
hello       
hi!         

2 rows selected
ij> select x, count(\*) from t group by x;
X           |2          
------------------------
hello       |4          
hi!         |2          

2 rows selected

If you want to define your own custom ordering, like having Greek characters in between Latin characters, you could define another locale with a CollatorProvider that returns a RuleBasedCollator with whichever rules you want. See its class javadoc for details about how you specify rules. In its simplest form, a set of rules might look like "a,A < b,B < c,C", which means more or less that a and A should be sorted before b and B, which should be sorted before c and C. So to get the Greek characters sorted near similar Latin characters, define a CollatorProvider that looks like this one:

public class MyCollatorProvider extends CollatorProvider {
    public Locale[] getAvailableLocales() {
        return new Locale[] {
            new Locale("en", "US", "greek")
        };
    }
    public Collator getInstance(Locale locale) {
        StringBuilder rules = new StringBuilder();
        // alpha should go between a and b
        rules.append("< a,A < \\u03b1 < b,B");
        // beta should go between b and c
        rules.append("& b,B < \\u03b2 < c,C");
        // add more rules here ....
        try {
            return new RuleBasedCollator(rules.toString());
        } catch (ParseException pe) {
            throw new Error(pe);
        }
    }
}

Again, put the compiled class and META-INF/services/java.text.spi.CollatorProvider in a jar file, and start ij with -Djava.ext.dirs=. in the directory where the jar file is located. Create a database which uses the new locale and insert some data with both Greek and Latin characters:

ij> connect 'jdbc:derby:GreekDB;territory=en_US_greek;collation=TERRITORY_BASED;create=true';
ij> create table t (x varchar(12));
0 rows inserted/updated/deleted
ij> insert into t values 'a', 'b', 'c', 'α', 'β';
5 rows inserted/updated/deleted
ij> select \* from t order by x;
X           
------------
a           
α           
b           
β           
c           

5 rows selected

Look, the ordering was just like we wanted it to be, with the Greek characters in between the Latin ones, and not at the end where they would normally be located!

One final word of caution: If you ever update your custom CollatorProvider so that the ordering is changed after you have created a database, you will need to recreate the database. This is because the indexes in the database are ordered, and you may see strange results if the indexes are ordered with a different collator than the one your database is currently using.

Want to try this out yourself? Here's a Java class that implements the CollatorProvider for the two examples. Put it in a directory together with this Ant script, and execute ant to create a jar file with the required file in META-INF/services (requires Apache Ant 1.7.0 or later).

About

kah

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today