First pass at JSR223 support

With encouragement from Steve, I went ahead and put in the code to allow attention processing with scripts in the data store, as proposed in my last post. To keep things simple, I added a single method for attention and didn't try to expand to other data types yet. The method signature looks like:

public Object processAttention(AttentionConfig ac,
                               String script, String language)
         throws AuraException, RemoteException;

The method is implemented in part in the BerkeleyItemStore class. The AttentionConfig describes which attentions to process. The BerkeleyItemStore queries its database for the matching attentions, then passes them as a list into the script.

The script is expected to implement two methods. A process method is called by the BerkeleyItemStore with the matching attentions and can return any object. The results from each call are gathered by the DataStoreHead (in a different process) and passed as a list into a collect method. That method also returns an Object, which is handed back to the caller of processAttention.
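To make the two-phase flow concrete, here's a rough sketch in plain Python that simulates it in a single process. Everything in it is a stand-in for illustration: run_script, the partitions list, and the dictionary-based attentions are hypothetical, and the real store hands the script Java lists of Attention objects instead.

```python
# Hypothetical stand-in for a script stored as text, as it would be passed
# to processAttention. Attentions are plain dicts here, not real Attention objects.
SCRIPT = """
def process(attns):
    return sum(a["number"] for a in attns)

def collect(results):
    return sum(results)
"""

def run_script(script, partitions):
    """Simulate the flow: call process once per partition, then collect once."""
    namespace = {}
    exec(script, namespace)  # defines process and collect in namespace
    # Each item store would run process against its own matching attentions.
    partials = [namespace["process"](attns) for attns in partitions]
    # The head would gather the partial results and run collect over them.
    return namespace["collect"](partials)

# Three pretend item stores, one of which has no matching attentions.
partitions = [
    [{"number": 3}, {"number": 5}],
    [{"number": 10}],
    [],
]
total = run_script(SCRIPT, partitions)
print(total)  # prints 18
```

The thing to notice is that collect only ever sees one value per partition, so whatever process returns needs to be something collect knows how to combine.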

So to sum up the play counts for a particular artist, we'd set the AttentionConfig's target to the band's key, and the type to "PLAYED". Using JavaScript, our script would look something like this:

// process takes a list of attentions as its input
function process(attns) {
    var attentions = attns.toArray();
    var count = 0;

    for (var x in attentions) {
        count += attentions[x].getNumber() - 0;
    }
    return count;
}

// collect takes a list of the results returned by process
function collect(results) {
    var counts = results.toArray();
    var count = 0;
    for (var x in counts) {
        count += counts[x] - 0;
    }
    return count;
}

Or in Python like this:

def process(attns):
    return sum([int(x.getNumber()) for x in attns.toArray()])

def collect(results):
    return sum(results.toArray())

In the JavaScript version, the odd-looking subtraction of zero ensures that the value is treated as a number rather than a string, so that += adds instead of concatenating.

To test the performance, I tried two different counting methods. I selected about 14,000 attention objects to use, representing about 24 million plays (this is unusually high because of a bug in some test code I was running, but the number of plays isn't too important here). In the first method, I pulled the attention objects to the client and added up their counts there. On average, this took about 320ms. For the second method, I used the above script to do the counting deeper in the data store. This took only about 120ms, including the time to compile the script. I suspect that caching the compiled scripts (since they're likely to run more than once) will save a chunk of that time as well!

About

Jeff Alexander is a member of the Information Retrieval and Machine Learning group in Oracle Labs.
