Series: Committing data to the database - Introduction
By Gerald Venzl-Oracle on Aug 29, 2014
Database Version: 1.0+
Availability: All Database Editions
You may think that I'm starting my blog with a really trivial topic, but believe me, there are still people out there who don't know how and when to commit data to the Oracle Database. Of course, the answer to "how" is via COMMIT, and the answer to "when" is whenever your business logic requires it. What most people don't know, however, is the overhead that a commit actually brings with it, and how they limit performance by something that could be described as "over-committing".

Before I dig right into code, I need to go back to the basics and ask what a commit actually is and what happens when you issue one. As you may know, a commit ends the current transaction and makes all changes performed in the transaction permanent and visible to other users. That is all well and good; however, in order to do so the database has to perform some tasks in the background. When a transaction commits, the following actions occur:
- The log writer process (LGWR) writes the remaining redo log entries in the redo log buffer to the online redo log and writes the transaction SCN to the online redo log. This atomic event constitutes the commit of the transaction.
- Oracle Database deletes savepoints.
- Oracle Database performs a commit cleanout: if modified blocks containing data from the committed transaction are still in the SGA, and if no other session is modifying them, then the database removes lock-related transaction information (the ITL entry) from the blocks.
- Oracle Database marks the transaction complete.

After a transaction commits, other users can view the changes.
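The visibility rule above can be sketched in a few lines. This is a minimal illustration only, using Python's sqlite3 module as a stand-in for an Oracle session (the `orders` table and both connections are hypothetical); the isolation mechanics differ between the two databases, but the observable rule is the same: a second session sees a change only once the first session commits.

```python
# Illustration: uncommitted changes are invisible to other sessions.
# sqlite3 stands in for Oracle here; table and data are made up.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)  # "session" doing the work
reader = sqlite3.connect(path)  # a second, independent "session"

writer.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
writer.commit()  # commit the DDL so both sessions see the table

writer.execute("INSERT INTO orders VALUES (1, 'NEW')")  # not committed yet

# The reader still sees the state as of the last commit: no rows
before = reader.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

writer.commit()  # now the change is permanent and visible

after = reader.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(before, after)  # 0 1
```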
As you can see, there are a couple of steps required to process a commit. Typically, a commit is a fast operation regardless of the transaction size: its speed does not increase with the amount of data modified in the transaction. The lengthiest part of the commit is the physical disk I/O performed by LGWR. However, the amount of time spent by LGWR is reduced because it has been incrementally writing the contents of the redo log buffer in the background.
Now you know what actually happens under the covers when you issue a COMMIT. But what do I mean by over-committing? Well, every time you issue a COMMIT, all of the steps above need to be performed. There is nothing wrong with that; they are necessary to keep your data safe. However, a lot of applications commit after every row, even though there is no reason why they have to.

A very good example is batch-related programs: a file comes in at the end of the day and needs to be processed. The application reads the file and processes the messages within it in parallel. Every thread gets one message, processes it, and then commits its work to the database. For every message a new SCN is generated, the redo log buffer gets written to the online redo log, and so forth. But there is actually no need to do all that work for every message. All the required data is within the file, and the file resides on permanent storage, i.e. disk. If there happens to be a failure during processing, the application can always restart and continue where it stopped; all the information is still there. It doesn't have to commit after every message; it can quite happily commit after 10 or even 100 messages, which makes no difference to the commit duration, nor does it matter to the business logic. What it does matter for, though, is the throughput on the database, and that's what I'll show in the next post.
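The batching idea can be sketched roughly as follows. This is a hedged sketch, not the post's actual benchmark code: sqlite3 again stands in for Oracle, and the `processed` table, the message list, and the batch size of 100 are all illustrative assumptions. The point is only the commit placement: one commit per batch instead of one per message.

```python
# Sketch: batching commits when processing a replayable input file.
# sqlite3 stands in for Oracle; table, messages, and BATCH_SIZE are
# illustrative. If the job fails mid-batch, the source file still holds
# every message, so reprocessing from the last commit is safe.
import sqlite3

BATCH_SIZE = 100  # commit once per 100 messages instead of per message

def process(conn, messages, batch_size=BATCH_SIZE):
    cur = conn.cursor()
    for i, msg in enumerate(messages, start=1):
        cur.execute("INSERT INTO processed (payload) VALUES (?)", (msg,))
        if i % batch_size == 0:
            conn.commit()  # one commit covers the whole batch
    conn.commit()  # commit the final, possibly partial, batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (payload TEXT)")
messages = [f"msg-{n}" for n in range(250)]
process(conn, messages)  # commits at 100, 200, and once at the end

count = conn.execute("SELECT COUNT(*) FROM processed").fetchone()[0]
print(count)  # 250
```

Note that the final `conn.commit()` outside the loop matters: without it, any trailing messages that don't fill a whole batch would be lost when the connection closes.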