MySQL 5.5: InnoDB adaptive_flushing – How it works?
By Calvin Sun on Sep 19, 2010
Note: this article was originally published on http://blogs.innodb.com on Sept 19, 2010 by Inaam Rana.
Write-heavy workloads can reach a situation where InnoDB runs out of usable space in its redo log files. When that happens, InnoDB does a lot of disk writes to create space and you can see a drop in server throughput for a few seconds. From InnoDB plugin 1.0.4 we have introduced the ‘innodb_adaptive_flushing’ method that uses a heurstic to try to flush enough pages in the background so that it is rare for the very active writing to happen. In this note I’ll try to explain how the heuristic works i.e.: what factors are taken into account when deciding how many dirty pages to flush from the buffer pool in the background. I’ll skip some details for the sake of clarity.
You may find InnoDB glossary useful to understand the terminology used in this note. adaptive_flushing is of consequence to you if your workload involves significant write activity. For example, if you are running MySQL 5.0 or MySQL 5.1 with built-in InnoDB and you see periodic drop in the performance with corresponding spikes in write activity then you should consider upgrading to InnoDB plugin.
Let us first familiarize ourselves with why and how flushing takes place in the buffer pool. Flushing is the activity of writing dirty pages to the disk. We need to flush because of two main reasons. First, we need to read in a page into the buffer pool and there are no free buffers. In this case we select a page as victim (based on LRU) and if the victim is dirty we write it to the disk before reusing it. This, I’ll call, LRU_list flush. Secondly, we need to write dirty pages to the disk when we want to reuse the redo logs. Redo logs in InnoDB (or for that matter in most other database engines as well) are used in a circular fashion. We can only overwrite redo log entries if, and only if, the dirty pages corresponding to these entries have been flushed to the disk. This type of flushing is called flush_list flush.
adaptive_flushing is related to the flush_list flush (though it is also influenced by the LRU_list flush as we’ll see shortly). If we allow the redo logs to fill up completely before we trigger flushing of the dirty pages then it will cause a serious drop in performance during this spike in the I/O. To avoid this we’d like to keep flushing dirty pages in the background at a rate that is just enough to avoid the I/O bursts. Note that we don’t want to flush too aggressively because that will reduce the chances of write combining thus resulting in more I/O than is required. So how do we determine the rate at which we do flush_list flush in the background? That is where the adaptive_flushing heuristic comes into play.
We do know the capacity of the redo logs which is constant and is equal to the combined size of the log files. If we know the rate at which redo log is being generated then we can figure out how many seconds we have before the whole redo log is filled up.
n = log_capacity / redo_gen_rate
Assume there are n_dirty dirty pages in the buffer pool. If we have to avoid the I/O burst we must be able to flush n_dirty pages in n seconds. In other words, our required flushing rate can be given by the following formula:
required_flush_rate = n_dirty / n = (n_dirty * redo_gen_rate) / log_capacity
There is one bit missing in the above formula. What if there is also an LRU_list flush also happening? That will be flushing dirty pages as well. We don’t want to over do the flushing effort and it makes sense to deduct LRU_list flush rate from the above.
desired_flush_rate = (n_dirty * redo_gen_rate) / log_capacity – LRU_flush_rate
That is all there is to it. In order to avoid sudden spikes we keep the history of the LRU_flush_rate and the redo_gen_rate and use weighted averages in our calculations.
As is the case with most heuristics the above may not work under some workloads. That is why we introduced the innodb_adaptive_flushing parameter which can be used to disable this feature (it is enabled by default). In MySQL 5.5.4 we introduced multiple buffer pools meaning there are multiple LRU lists and multiple flush lists. Obviously adaptive flushing has to be adapted for this. In MySQL 5.5.6 we have worked on this further by limiting the excess flushing caused by rounding off and flushing neighbors of a victim. But this is definitely not the end of the story. Consider this an area under focus. We continue to work towards finding better solution for the overall flushing inside InnoDB. One more reason why your feedback will be more than welcome. All I can say at this stage is give us your feedback and stay tuned!