Monday Jun 09, 2014

Master-slave vs. peer-to-peer architecture: benefits and problems

Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes and corrupt the data. In everyday life, traffic lights serve the same purpose: they ensure that traffic flows smoothly and, when everyone follows the rules, there are no accidents at intersections. In a nutshell, that’s the issue we were trying to address.

As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean.

The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held by detecting whether there were one or more transactions that needed conflicting access to the same data items; if there were none, you could get by without holding any locks at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost.

We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency.

The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness.

So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the change, synchronously or asynchronously, to multiple copies of the data (kept for availability). Since the change can be propagated asynchronously, there will be some interval of time during which some copies have received the update and others haven’t.

What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies.
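To make the cost concrete, here is a toy sketch of a quorum read with read-repair. The Replica and Versioned types are invented for illustration; this is not any particular product's API:

import java.util.*;

// Toy model of a peer-to-peer quorum read with read-repair.
// Replica and Versioned are hypothetical types, not a real driver API.
record Versioned(long version, String value) {}

interface Replica {
    Versioned read(String key);
    void write(String key, Versioned v);      // used below for read-repair
}

class QuorumReader {
    String read(String key, List<Replica> replicas) {
        Versioned newest = null;
        Map<Replica, Versioned> answers = new HashMap<>();
        for (Replica r : replicas) {           // one network hop per copy
            Versioned v = r.read(key);
            answers.put(r, v);
            if (newest == null || v.version() > newest.version()) {
                newest = v;
            }
        }
        for (Map.Entry<Replica, Versioned> e : answers.entrySet()) {
            if (e.getValue().version() < newest.version()) {
                e.getKey().write(key, newest); // repair the stale copy
            }
        }
        return newest.value();
    }
}

Note that every single read touches all the copies it queries, even when nothing has changed.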

So what’s the problem with this? There are two major issues:

First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high.

Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the outdated copies. This means the application program has to handle discrepancies between the different versions of the data item and resolve the issue (which further adds to cost and operation latency).

Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might end up with different versions of the data item in different copies. Now the application program also has to resolve conflicts and then propagate the correct value to the copies that are out of date or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine having hundreds of millions of records in your database: how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay to ensure that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was read a data item.

Wouldn’t it be simpler to avoid this problem in the first place? A master-slave architecture like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and they are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is.

This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request and route it to the most efficient copy. This results in a dramatic simplification of application logic and also minimizes network requests (the driver sends the request to exactly the right replica, not many).
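In code, the difference is just a consistency policy on the read. Here is a minimal sketch against the Oracle NoSQL Database Java driver as I recall it (the store name, host and key are made up): Consistency.ABSOLUTE forces the master, while Consistency.Time lets the driver pick any replica that is fresh enough.

import java.util.concurrent.TimeUnit;
import oracle.kv.*;

// Sketch: routing reads by consistency requirement. Store name, host
// and key are invented for illustration.
public class ConsistencyDemo {
    public static void main(String[] args) {
        KVStore store = KVStoreFactory.getStore(
            new KVStoreConfig("mystore", "host01:5000"));
        Key key = Key.createKey("user42");

        // Must have the latest value: the driver routes to the master.
        ValueVersion latest =
            store.get(key, Consistency.ABSOLUTE, 500, TimeUnit.MILLISECONDS);

        // Can tolerate up to 1 second of staleness: the driver may route
        // to any replica known to be within that lag; one request, no
        // cross-copy comparison needed.
        Consistency within1s = new Consistency.Time(
            1, TimeUnit.SECONDS, 500, TimeUnit.MILLISECONDS);
        ValueVersion recent =
            store.get(key, within1s, 500, TimeUnit.MILLISECONDS);

        store.close();
    }
}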

So, back to my original point. A well-designed and well-architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient, high performance system. If you’ve ever programmed an Oracle NoSQL Database application, you’ll know the difference!

Monday May 12, 2014

Article on NoSQL Database by James Anthony (CTO - e-DBA)

James Anthony recently published this article about the latest release of Oracle NoSQL Database.  James does an excellent job describing basic NoSQL concepts such as the CAP theorem and ACID transactions.  His insights into how to use database systems and NoSQL systems are based on extensive experience building large production applications.

Definitely worth a read.

Friday Dec 20, 2013

Big Data and NoSQL expert e-DBA lends a helping hand

Found this offer on one of the Oracle Partner sites and thought it would be worth sharing with the community.

Oracle partner e-DBA offers an Oracle NoSQL Database free trial, including personalized assistance with getting set up, building applications, loading your data, and highlighting its scalable performance for your specific project objectives.  This is an excellent opportunity to get direct support from a partner who has successfully deployed the Oracle NoSQL Database into production for mission-critical, high-value, low-latency application requirements. See the following site for complete details:

Hope everyone is having an awesome holiday season.  Cheers .... Robert

Monday Dec 09, 2013

Look, I shrunk the keys

Whether we use a relational or a non-relational database to durably persist our data on disk, we are all aware of the major role indexes play in accessing that data in real time. There is one aspect most of us tend to overlook while designing the index key: how to size it efficiently.

Not to discount the fact, but in traditional databases, where we used to store hundreds of thousands to a few million records, sizing the index key didn’t come up (that often) as a very high priority. In a NoSQL database, where you are going to persist a few billion to trillions of records, every byte saved in the key goes a long mile.

That is exactly what came up this week while working on a POC, and I thought I should share the best practices and tricks of the trade that you can use in developing your own applications. So here is my numero uno recommendation for designing index keys:

  • Keep it Small.

Now there is nothing new there that you didn't know already, right? Right, but I just wanted to highlight it: if there is one thing you remember from this post, let it be this one bullet item.

All right, here is what I was dealing with this week: a couple billion records of telematics/spatial data that we needed to capture and query based on the timestamp (of the feed that was received) and the x and y co-ordinates of the system. To run the kind of queries we wanted to run (on spatial data), we came up with an index-key of the form:

TablePrefix + Timestamp + X co-ordinate + Y co-ordinate

How we used the above key structure to run spatial queries is another blog post, but for this one I will just say that when we plugged the values into the variables, our key became 24 bytes (1+13+5+5) long. Here’s how:

Table Prefix => type char = 1 byte (e.g. S)

Timestamp => 13 digits, stored as a string = 13 bytes (e.g. 1386286913165)

X co-ordinate => type string = 5 bytes (e.g. 120.78 degrees, 31.87 degrees, etc.)

Y co-ordinate => type string = 5 bytes (e.g. 132.78 degrees, 33.75 degrees, etc.)

With the amount of hardware we had available (for the POC), we could only create a 4-shard cluster. So to store two billion records, we needed to store (2B records / 4 shards) 500 million records on each of the four shards. Using the DbCacheSize utility, we calculated that we would need about 32 GB of JE cache on each Replication Node (RN).

$ java -d64 -XX:+UseCompressedOops -jar $KVHOME/lib/je.jar DbCacheSize -records 500000000 -key 24

=== Database Cache Size ===
 Minimum Bytes        Maximum Bytes          Description
---------------       ---------------        -----------
 29,110,826,240       32,019,917,056         Internal nodes only

But we knew that if we could shrink the key size (without losing information), we could save a lot of memory and improve query time as well (search being a function of the number of records and the size of each record). So we built a simple encoding program that uses a range of 62 ASCII characters (0-9, a-z, A-Z) to encode any numeric value. You can find the program from here or build your own, but what is important to note is that we were able to represent the same information with fewer bytes:

The 13-byte timestamp (e.g. 1386286913165) became 7 bytes (e.g. opc2BTn)

Each 5-byte X/Y co-ordinate (e.g. 132.78) became 3 bytes (e.g. a9X/6dF)

That is a 14-byte encoded key (1 + 7 + 3 + 3). So what’s the fuss about shrinking the keys (it’s just a 10-byte saving), you ask? Well, we plugged the numbers into the DbCacheSize utility again, and this time the verdict was that we needed only 20 GB of JE cache to store the same half a billion records on each RN. That’s a nearly 40% improvement (a saving of 12 GB per Replication Node) and definitely an impressive start.

$ java -d64 -XX:+UseCompressedOops -jar $KVHOME/lib/je.jar DbCacheSize -records 500000000 -key 14

=== Database Cache Size ===
 Minimum Bytes        Maximum Bytes          Description
---------------       ---------------        -----------
 16,929,008,448       19,838,099,264         Internal nodes only
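For reference, an encoder along those lines is only a few lines of Java. Here is my own minimal sketch; the alphabet ordering (and hence the exact encoded strings) may differ from the program we used:

// Minimal base-62 sketch. 62^7 > 10^13, so any 13-digit timestamp fits
// in 7 characters; a co-ordinate scaled to an integer (e.g. 132.78 ->
// 13278) fits in 3 characters, since 62^3 = 238,328.
public class Base62 {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Encode a non-negative long into base-62 text.
    public static String encode(long value) {
        if (value == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.reverse().toString();
    }

    // Decode base-62 text back into a long.
    public static long decode(String encoded) {
        long value = 0;
        for (char c : encoded.toCharArray()) {
            value = value * 62 + ALPHABET.indexOf(c);
        }
        return value;
    }

    public static void main(String[] args) {
        long ts = 1386286913165L;       // 13-digit timestamp
        String enc = encode(ts);        // 7 characters
        System.out.println(enc + " -> " + decode(enc));
    }
}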

To conclude: you have just seen how a simple encoding technique can save you big time when you are dealing with billions of records. Next time you design an index key, think a little harder about how you can shrink it down!

Thursday Oct 31, 2013

Oracle Social Analytics with the Big Data Appliance

Found an awesome demo put together by one of the Oracle NoSQL Database partners, e-DBA, on using the Big Data Appliance to do social analytics.

In this video, James Anthony shows off the BDA, Hadoop, and the Oracle Big Data Connectors, and how they can be used and integrated with the Oracle Database to do end-to-end sentiment analysis leveraging Twitter data. A really great demo, worth the view.

Thursday Oct 24, 2013

Fast Data - Big Data's Achilles' heel

At OOW 2013, in Mark Hurd and Thomas Kurian's keynote, they discussed Oracle's Fast Data software solution stack and a number of customers deploying Oracle's Big Data / Fast Data solutions, in particular Oracle's NoSQL Database.  Since then, there have been a large number of requests seeking clarification on how the Fast Data software stack works together to deliver on the promise of real-time Big Data solutions.   Fast Data is a software solution stack that deals with one aspect of Big Data: high velocity.   The Fast Data solution stack involves 3 key pieces and their integration:  Oracle Event Processing, Oracle Coherence, and Oracle NoSQL Database.   All three of these technologies address high-throughput, low-latency data management requirements.  [Read More]

Friday Oct 11, 2013

Accolades - Oracle NoSQL customers speak out with praise

For all of those participating in the Oracle NoSQL Database community and following the product evolution, there have been a number of changes emerging on Oracle OTN for the NoSQL Database.

In particular, on the main page, Dave Segleau's NoSQL Now presentation on Enterprise NoSQL is prominently displayed.  It is a great discussion of the trends involved in NoSQL adoption, highlighting the most important aspects of NoSQL technology selection and what Oracle in particular is bringing to the movement.    Many of you know that for Oracle, getting companies to speak up publicly about their use of our technology is much harder than it is for pure open source startups.  So, I am particularly pleased with the accolades starting to emerge from the users of Oracle NoSQL.   Plus, there is new content getting published every day to help our growing community champion NoSQL technology adoption within their teams and organizations.

Starting to grow: I've noticed that our Meetup group is also gaining a lot of momentum.  We are now over 400 members strong and growing aggressively.   There is an awesome Meetup coming next week ( Oct 15th at Elance, 441 Logue Avenue, Mountain View, CA ) where Mike Olson, co-founder and Chief Strategy Officer of Cloudera, will be talking about the virtues of NoSQL key-value stores.  There are already 88 people signed up for this event, so hurry up and join now or you may end up on a wait-list.

Spread the word, tell your friends: an Enterprise-backed NoSQL is on the move!!

Friday Oct 04, 2013

Flexible schema management - NoSQL sweet spot

I attended a few colleagues' sessions at Oracle OpenWorld focusing on NoSQL Database use cases.   Dave Segleau from the Oracle NoSQL Database team presented on the challenges associated with web-scale personalization.   The main point he was emphasizing is that these personalization applications have very simple data lookup semantics, but the data itself is quite volatile in nature and comes in all shapes and sizes, making it difficult to store in traditional relational database technology.   The other challenges then follow, common to most NoSQL-based applications: dealing with this variety of data at scale and in near real-time. Here are some references to those sessions, which are worth a review:

Then the other day, I stumbled upon this story about how airlines are planning to provide a more personalized shopping experience in the travel process.  I could not help but see the parallels between the requirements found in the online shopping world and those found in ticketing within the airline industry's plans to roll out new personalized services to travelers.   Clearly, this is a great application area to be considering the use of NoSQL Database technology: data variety, scale, responsiveness, all the ingredients that make for an ideal use case for employing NoSQL technology in the solution.

Wednesday Sep 04, 2013

Oracle delivering value to the Startup while embracing the Enterprise

At the recent NoSQL Now! conference in San Jose, Andy Mendelsohn, SVP Database Server Technologies at Oracle, delivered a double-punch announcement of both the world’s first Engineered System for NoSQL and a move to the open source business model of a per-server annual support subscription.

These two options highlight Oracle's drive to provide value to developers at both the high-end Enterprise and the Startup alike. Surprisingly, both of these announcements reveal low Total Cost of Ownership solutions for both ends of the business spectrum: Startups who are just getting their business off the ground, controlling costs with open source packages and renting their infrastructure in the cloud, and Enterprise companies who are controlling expenses while building out substantial Big Data clusters to leverage well-understood reserves of corporate data.   Read more to find out what these announcements mean for the youthful Startup and the established Enterprise.

[Read More]

Tuesday Aug 13, 2013

Oracle NoSQL Database Events

Hey all you Oracle NoSQL Database (OnDB) fans, here is a note regarding some of our upcoming activities.  In particular, the Oracle team is giving several session presentations at the upcoming NoSQL Now! conference in San Jose.   Andy Mendelsohn, SVP Database Server Technologies at Oracle, is actually giving the keynote presentation for this conference. It's going to be awesome, and I highly recommend folks come out and meet the NoSQL team.

Also, there is a webcast coming up with the Oracle Development Tools User Group (ODTUG) that will give a great overview of application development using OnDB.

NoSQL Now 2013 ( Aug 20th -22nd )

San Jose Convention Center 150 West San Carlos St., San Jose, CA 95110

Register to attend this event with an Oracle exclusive discount.

Keynote – Andrew Mendelsohn, SVP Database Server Technologies ( Wed 8/21 9-9:30am )

In this keynote session, Andy Mendelsohn, Oracle's Senior Vice President of Database Server Technologies, discusses how Oracle NoSQL Database is delivering on the market's expectations for flexibility, performance and scalability without compromising data reliability. In addition, you'll learn how Oracle is helping customers take advantage of Big Data by integrating NoSQL technologies into enterprise data centers and evolving its portfolio of Engineered Systems. - See info here.

Panel: Enterprise NoSQL, Where next? - David Rubin, Director NoSQL Engineering ( Wed 8/21 2:15-3:00pm )

Enterprise customers are waking up to the NoSQL message in a big way, comfortable now that the technology is robust enough for many enterprise applications and that the leading companies developing NoSQL products will be around to provide support. Meanwhile, NoSQL vendors are doing their part by addressing enterprise requirements for security, ACID transactions, and even SQL compatibility. The latest moves by major database incumbents (such as Oracle) give further reassurance to enterprise customers that NoSQL technologies should become part of the data management and software portfolio. - See info here.

Session: NoSQL and Enterprise Apps - Dave Segleau ( Thur 8/22 9:30-10:00am )

This session will discuss the key database and application requirements that we hear from our enterprise customers and explore how these requirements are addressed in Oracle NoSQL Database. It will describe several real-world use cases from customers who are using Oracle NoSQL Database today. From a technical perspective, this session will focus on:

  • Performance, Scalability and Predictability
  • NoSQL cluster management
  • Integration with Hadoop, Oracle Database, Coherence and Event Processing
  • Real-world use cases

- See info here.

Session: Graphs, RDF and Oracle NoSQL - Zhu Wu ( Thur 8/22 2:00-2:45pm )

Graph and NoSQL are both hot areas of activity, and are seriously considered for Big Data modeling and storage, respectively. Graph, a different data modeling paradigm than the traditional relational and XML data modeling, provides intuitive and flexible data construction, manipulation, query and navigation. NoSQL, in turn, is a database repository, providing an excellent storage architecture that is distributed, horizontally scalable and fault tolerant. We believe that integration of the Graph data model with the NoSQL storage model can provide a robust and massively scalable platform for managing Big Data. In this talk, we will share our first-hand experience in implementing RDF Graph (a W3C standards-based Graph language) capabilities on the Oracle NoSQL Database, a K/V based, distributed and horizontally scalable platform - See info here.

Oracle Developer Users Group Webcast ( Aug 27th )

ODTUG Webcast: Oracle & NoSQL from the Inside – Aug. 27, 9:00am PT
The NoSQL (Not Only SQL) database market is hot with activity, serving new requirements for application development and deployment.  Get up-to-date on this subject, take a dive into how Oracle's NoSQL database has evolved to integrate with the Oracle ecosystem of projects, and get a hands-on tour of what it takes to build an application with the Oracle NoSQL Database and deploy its distributed, highly available, scale-out architecture. Click to Register

Friday Jul 26, 2013

InfoQ looking for users to categorize NoSQL Tech use

Often, when people talk about NoSQL technologies, there is an attempt to categorize the solutions.  In a new Adoption Trends breakdown by InfoQ, they also take this tack, providing the following categorizations:   Columnar, Document, Graph, In-Memory Grid, Key-Value.  I think this definitely has some utility, yet in another respect it misses the main point about this technology segment.  These technologies have come into existence to fulfill one primary and one ancillary core need.  The primary ability, surfaced by Amazon, LinkedIn, Facebook, etc., is to scale and remain available in the face of tremendous data volumes and concurrent users.  The ancillary ability is to provide more agility for the line of business, which is constantly adjusting its solutions to changing needs and understanding of the consumer.  What considerations should drive the thought process of adoption for NoSQL?

[Read More]

Monday Jul 15, 2013

High-Five for Rolling Upgrade in Oracle NoSQL Database

In today’s world of e-commerce, where businesses operate 24/7, the equations for revenue made or lost are sometimes expressed in terms of latency, i.e. the faster you serve your customers, the more likely they are to do business with you. Imagine, in this scenario (where every millisecond counts), your online business becoming inaccessible for a few minutes to a couple of hours because you needed to apply an important patch or upgrade.

I think you get the idea of how important it is to stay online and available even during a planned hardware or software upgrade. The Oracle NoSQL Database 12c R1 release puts you on a track where you can upgrade your NoSQL cluster with no disruption to your business services. It makes this possible by providing smart administration tools that calculate the safest combination of storage nodes that can be brought down in parallel and upgraded, keeping all the shards in the database available for reads and writes at all times.

Let's take a look at a real-world example. Say I have deployed a 9x3 database cluster, i.e. 9 shards with 3 replicas per shard (a total of 27 replication nodes) on 9 physical nodes. I've got a highly available cluster (thanks to the intelligent topology feature shipped in 11gR2.2.0.23), with the replicas of each shard spread across three physical nodes so that there is no single point of failure. All right, so here is what my topology looks like:

[user@host09 kv-2.1.1]$ java -jar lib/kvstore.jar runadmin -port 5000  -host host01

kv-> ping
Pinging components of store mystore based upon topology sequence #136
mystore comprises 90 partitions and 9 Storage Nodes
Storage Node [sn1] on host01:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg2-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5012
        Rep Node [rg3-rn1]      Status: RUNNING,MASTER at sequence number: 41 haPort: 5013
        Rep Node [rg1-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
Storage Node [sn2] on host02:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg1-rn2]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5010
        Rep Node [rg3-rn2]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012
        Rep Node [rg2-rn2]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
Storage Node [sn3] on host03:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg2-rn3]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5011
        Rep Node [rg3-rn3]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012
        Rep Node [rg1-rn3]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5010
Storage Node [sn4] on host04:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg6-rn1]      Status: RUNNING,MASTER at sequence number: 41 haPort: 5012
        Rep Node [rg4-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5010
        Rep Node [rg5-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
Storage Node [sn5] on host05:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg6-rn2]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012
        Rep Node [rg4-rn2]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5010
        Rep Node [rg5-rn2]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5011
Storage Node [sn6] on host06:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg5-rn3]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
        Rep Node [rg4-rn3]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5010
        Rep Node [rg6-rn3]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012
Storage Node [sn7] on host07:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg9-rn1]      Status: RUNNING,MASTER at sequence number: 41 haPort: 5012
        Rep Node [rg7-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5010
        Rep Node [rg8-rn1]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
Storage Node [sn8] on host08:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg8-rn2]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5011
        Rep Node [rg9-rn2]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012
        Rep Node [rg7-rn2]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5010
Storage Node [sn9] on host09:5000    Datacenter: Boston [dc1]    Status: RUNNING   Ver: 12cR1.2.1.1
        Rep Node [rg7-rn3]      Status: RUNNING,MASTER at sequence number: 45 haPort: 5010
        Rep Node [rg8-rn3]      Status: RUNNING,REPLICA at sequence number: 45 haPort: 5011
        Rep Node [rg9-rn3]      Status: RUNNING,REPLICA at sequence number: 41 haPort: 5012

Notice that each storage node (sn1-sn9) is hosting one MASTER node and two REPLICA nodes from entirely different shards. Now, if you would like to upgrade the active cluster to the latest version of Oracle NoSQL Database without downtime, all you need to do is grab the latest binaries from OTN and lay down the bits (NEW_KVHOME) on each of the 9 nodes (only once if you have a shared drive accessible from all the nodes). Then, from the administration command-line interface (CLI), simply perform 'show upgrade':

[user@host09 kv-2.1.8]$ java -jar lib/kvstore.jar runadmin -port 5000  -host host01

kv-> show upgrade
Calculating upgrade order, target version:, prerequisite:
sn3 sn4 sn7
sn1 sn8 sn5
sn2 sn6 sn9

The SNs in each horizontal row represent the storage nodes that can be patched/upgraded in parallel, and the multiple rows represent the sequential order, i.e. you can upgrade sn3, sn4 & sn7 in parallel, and once all three are done, you can move to the next row (with sn1, sn8 & sn5), and so on and so forth. You must be asking: what if you have a fairly large cluster and you don't want to upgrade the nodes manually, can you not automate this process by writing a script? Well, we have already done that for you as well. An example script is available for you to try out and can be found from:
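If you are curious about the logic behind the computed order: two storage nodes can go into the same batch only if they host no replication nodes from the same shard, so every shard keeps replicas online throughout. Here is a toy sketch of that idea (my own illustration with invented data mirroring the 9x3 layout, not the example script above; the real tool considers much more, such as master placement and node state):

import java.util.*;

// Toy version of the 'show upgrade' batching idea: greedily group
// storage nodes whose shard sets do not overlap.
public class UpgradeOrder {
    public static void main(String[] args) {
        // storage node -> shards it hosts replicas of (invented layout)
        Map<String, Set<Integer>> hosts = Map.of(
            "sn1", Set.of(1, 2, 3), "sn2", Set.of(1, 2, 3),
            "sn3", Set.of(1, 2, 3), "sn4", Set.of(4, 5, 6),
            "sn5", Set.of(4, 5, 6), "sn6", Set.of(4, 5, 6),
            "sn7", Set.of(7, 8, 9), "sn8", Set.of(7, 8, 9),
            "sn9", Set.of(7, 8, 9));

        List<String> pending = new ArrayList<>(new TreeSet<>(hosts.keySet()));
        while (!pending.isEmpty()) {
            List<String> batch = new ArrayList<>();
            Set<Integer> touched = new HashSet<>();
            for (Iterator<String> it = pending.iterator(); it.hasNext(); ) {
                String sn = it.next();
                if (Collections.disjoint(hosts.get(sn), touched)) {
                    batch.add(sn);                  // no shard in common
                    touched.addAll(hosts.get(sn));
                    it.remove();
                }
            }
            System.out.println(String.join(" ", batch)); // one parallel batch
        }
    }
}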

I hope you find this feature as useful as I do. If you need more details on this topic, I recommend that you visit Upgrading an Existing Oracle NoSQL Database Deployment in the Administrator's Guide. If you are new to Oracle NoSQL Database, get the complete product documentation from here and learn about the product from self-paced web tutorials with some hands-on exercises as well.

Wednesday Jul 10, 2013

EclipseLink JPA and Oracle NoSQL Database (ONDB)

Back in 2005, I was the Project Lead for JSR220-ORM tooling in Eclipse. Sun’s JSR220 project was the early POJO persistence standard for Java that became the EJB 3.0 spec, the predecessor of the JPA interface found today in every Java download.

In the same timeframe, Oracle announced it was joining the Eclipse Foundation and, in the process, launching a competing JSR220 tooling project called Dali. Needless to say, it did not take long for the JSR220-ORM and Dali project teams to merge into a single project. Eventually, the Oracle team took the lead on Dali and marched into the future.

Here I am now, 8 years later, at Oracle helping drive the standardization of NoSQL technology. JPA presents a great abstraction layer for database persistence, allowing Java users to persist their application objects with literally the push of a button.  Plus, when using JPA with a NoSQL database, it allows the developer to take a soft-schema approach to application development, where the data model is driven from the application space rather than the database design, and evolution of the application can occur much more rapidly.  In fact, in 2005, when I was V.P. Technology for a NoSQL database company, one of the things we did was create a JPA interface for standards-based access to our NoSQL store, which is the reason we launched that JSR220 tooling project in Eclipse. So, I thought I would poke around a little bit with the Oracle NoSQL Database (ONDB) and JPA interfaces. To my surprise, I found that some folks had already made a great start down that path….pretty cool.   There is an EclipseLink plugin that supports NoSQL Databases, including ONDB.
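To give a flavor of what this looks like, here is a hedged sketch of a JPA entity mapped through the EclipseLink NoSQL annotations. The entity and its fields are invented, and the persistence-unit wiring for ONDB is omitted:

import javax.persistence.*;
import org.eclipse.persistence.nosql.annotations.DataFormatType;
import org.eclipse.persistence.nosql.annotations.NoSql;

// Sketch: a plain JPA entity stored in a NoSQL store via EclipseLink.
// The soft schema lives in the class; add a field and redeploy, with no
// ALTER TABLE required. Entity and field names are invented.
@Entity
@NoSql(dataFormat = DataFormatType.MAPPED)
public class UserProfile {
    @Id
    @GeneratedValue
    private String id;          // maps to the store's key

    private String name;
    private String email;       // new fields can be added as the app evolves

    public String getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}

Persisting one is then ordinary JPA: create an EntityManager from a persistence unit configured for the EclipseLink NoSQL platform and call em.persist(new UserProfile()).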

[Read More]

Wednesday Jul 03, 2013

NoSQL : Key-Value stores and range based operations

In the NoSQL database space, the key-value store (as opposed to Document, Graph, Object, or XML stores) has the strongest reputation for being able to scale out well, providing high-throughput writes along with highly concurrent reads.  That is possible because the database architecture is built over a simple data model in which the key is easily hashed into a growing number of logical groups (physical nodes).

However, in some cases, that simplistic key-value model can tie your hands when you want to support a use case that needs a data access pattern involving a range or a collection (especially if large) of data.  It turns out this is a fairly common pattern, showing up in areas of analysis involving time-series data, sensor data aggregation, activity grouping, and a whole bunch of use cases where you've got logically related, nested data models.

So, one of the key questions to ask is whether or not your key-value store is providing facilities to help in the implementation of those types of use cases.
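For example, the Oracle NoSQL Database major/minor key model supports exactly this kind of access. Here is a hedged sketch, as I recall the key/value API (the store, host, key and range values are all invented): readings for one sensor share a major key, so they live together on one shard, and their timestamp minor keys can be fetched as a range in a single call.

import java.util.Map;
import java.util.SortedMap;
import oracle.kv.*;

// Sketch: range read over minor keys under one major key.
// Store name, host and key values are invented for illustration.
public class RangeReadDemo {
    public static void main(String[] args) {
        KVStore store = KVStoreFactory.getStore(
            new KVStoreConfig("mystore", "host01:5000"));

        // Major key groups one sensor's readings on a single shard;
        // fixed-width timestamp minor keys sort lexicographically.
        Key parent = Key.createKey("sensor17");
        KeyRange window = new KeyRange("1386286000000", true,
                                       "1386286999999", true);

        SortedMap<Key, ValueVersion> readings =
            store.multiGet(parent, window, Depth.PARENT_AND_DESCENDANTS);
        for (Map.Entry<Key, ValueVersion> e : readings.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().getValue());
        }

        store.close();
    }
}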
[Read More]

Tuesday Jul 02, 2013

Oracle's Director of NoSQL Database Product Management talks with ODBMS.ORG

I was pinged by one of my favorite database technology sites today, ODBMS.ORG, informing me that Dave Segleau, the Director of Oracle NoSQL Database Product Management, spent some time talking with their editor Roberto Zicari about the product.   It's a great interview and I highly recommend the read.  I think it's important to understand the connection that Oracle NoSQL Database (ONDB) has with Berkeley DB, as it says a lot about the maturity of ONDB as it relates to data integrity and reliability. [Read More]

This blog is about everything NoSQL. An open place to express thoughts on this exciting topic and exchange ideas with other enthusiasts learning and exploring what the coming generation of data management will look like in the face of social digital modernization. A collective dialog to invigorate the imagination and drive innovation straight into the heart of our efforts to better our existence through technological excellence.

