Introduction
The Introduction to XFS Transaction Mechanism blog provided an overview of Checkpoint transactions. The metadata tracked by these transactions ends up in the on-disk log. This article describes the on-disk layout of checkpoint transactions.
On-disk log layout
The on-disk log’s contents can be visualized as a sequence of log records.
The pointers Tail and Head demarcate the boundaries of the active region of the log (indicated with the blue color). The Head of the log is where new log records are supposed to be written. The Tail of the log points to the oldest log record whose metadata items are yet to be written to their respective locations on disk. It is moved forward when the metadata in the oldest log record is written back.
A filesystem operation which modifies an inode will cause the modified parts of the inode to be written into a new log record, LR(q+1). The Head pointer will then point to the location immediately after LR(q+1).
Assuming the log record LR(p) contained modifications made to the superblock, updating the corresponding on-disk superblock with these modifications will move the Tail to point to LR(p+1).
The Head wraps around to the start of the on-disk log when the end of the log is reached.
Each of the Head and Tail pointers is actually composed of two values:
- Cycle number of the log
- Offset inside the on-disk log.
The cycle number component is incremented when the Head pointer wraps around the log.
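The two-part pointer can be sketched in C as follows. This is a minimal illustration, not the kernel's code: the names log_pos, log_pos_pack, log_pos_unpack, log_pos_advance and log_pos_cmp are hypothetical, the offset is assumed to be counted in blocks, and the packing (cycle number in the upper 32 bits of a 64-bit value, offset in the lower 32 bits) is an assumption about how the two values are combined.

```c
#include <stdint.h>

/* A log position: cycle number plus offset (assumed to be in blocks). */
struct log_pos {
    uint32_t cycle;
    uint32_t block;
};

/* Pack a position into one 64-bit value: cycle high, offset low (assumed layout). */
static inline uint64_t log_pos_pack(struct log_pos p)
{
    return ((uint64_t)p.cycle << 32) | p.block;
}

/* Split a packed 64-bit value back into cycle and offset. */
static inline struct log_pos log_pos_unpack(uint64_t lsn)
{
    struct log_pos p = { (uint32_t)(lsn >> 32), (uint32_t)lsn };
    return p;
}

/*
 * Advance a position by nblocks in a log that is log_blocks long.
 * When the end of the log is reached, the offset wraps to the start
 * and the cycle number is incremented.
 * Assumes p.block < log_blocks and nblocks <= log_blocks.
 */
static inline struct log_pos log_pos_advance(struct log_pos p,
                                             uint32_t nblocks,
                                             uint32_t log_blocks)
{
    uint64_t next = (uint64_t)p.block + nblocks;

    if (next >= log_blocks) {
        next -= log_blocks;
        p.cycle++;
    }
    p.block = (uint32_t)next;
    return p;
}

/* Order two positions: negative if a is older, 0 if equal, positive if newer. */
static inline int log_pos_cmp(struct log_pos a, struct log_pos b)
{
    if (a.cycle != b.cycle)
        return a.cycle < b.cycle ? -1 : 1;
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    return 0;
}
```

Comparing the cycle number first is what keeps the ordering correct after a wrap: a Head at a small offset in cycle 4 is still newer than a Tail at a large offset in cycle 3.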
Each log record contains a subset of fields of one or more metadata items among other things. Also, log records are of varying sizes based on the number and size of the metadata items modified.
Log record structure
The log record starts with a log record header. It is followed by a sequence of operation headers, each optionally followed by a payload containing the metadata (e.g. an inode) that has been modified.
Log record header
The log record header is represented by struct xlog_rec_header. The following are some of the important fields of the header:
- h_magicno: Set to XLOG_HEADER_MAGIC_NUM.
- h_len: The length of the log record.
- h_lsn: The Log Sequence Number of the log record. It indicates the location in the on-disk log where the log record starts. Similar to the Tail and Head pointers, this field is composed of a cycle number and an offset.
- h_tail_lsn: The value of the Tail pointer at the time of writing this log record to the on-disk log.
- h_num_logops: The number of Operation headers in the log record.
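As a rough sketch, the fields listed above could be pulled out of a raw header buffer as shown below. The struct rec_header_view and the helper names are hypothetical. The byte offsets, the big-endian encoding, and the magic value 0xFEEDbabe are assumptions based on the struct xlog_rec_header definition in the kernel's xfs_log_format.h and should be checked against the kernel version in use.

```c
#include <stddef.h>
#include <stdint.h>

#define XLOG_HEADER_MAGIC_NUM 0xFEEDbabeU  /* assumed value, see lead-in */

/* Hypothetical in-memory view of the header fields discussed in the text. */
struct rec_header_view {
    uint32_t magicno;
    uint32_t len;
    uint64_t lsn;
    uint64_t tail_lsn;
    uint32_t num_logops;
};

/* Read a big-endian 32-bit value from a byte buffer. */
static inline uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value from a byte buffer. */
static inline uint64_t get_be64(const uint8_t *p)
{
    return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/*
 * Decode the selected header fields from buf.
 * Returns 0 on success, -1 if the buffer is too short or the magic
 * number does not match. Offsets are assumptions (see lead-in).
 */
static inline int rec_header_decode(const uint8_t *buf, size_t buflen,
                                    struct rec_header_view *out)
{
    if (buflen < 44)
        return -1;

    out->magicno = get_be32(buf + 0);
    if (out->magicno != XLOG_HEADER_MAGIC_NUM)
        return -1;

    out->len        = get_be32(buf + 12);
    out->lsn        = get_be64(buf + 16);
    out->tail_lsn   = get_be64(buf + 24);
    out->num_logops = get_be32(buf + 40);
    return 0;
}
```

Checking the magic number before trusting any other field is a cheap way to reject a buffer that does not start at a log record boundary.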
Checkpoint transactions
A Checkpoint transaction consists of an alternating sequence of Operation headers and Metadata. Checkpoint transactions can be laid out in the on-disk log in many ways. The following list illustrates a subset of the possible cases.
- One checkpoint transaction fits exactly in a single log record.
- One checkpoint transaction is spread across several log records. The value of h_num_logops in each log record will be set to the number of operation headers in that log record rather than the total number of operation headers in the checkpoint transaction.
- Multiple checkpoint transactions are written inside a single log record. The value of h_num_logops will be set to the total number of operation headers across all the checkpoint transactions.
Log Operation header
An Operation header describes the metadata that is logged right after it. The operation header is represented by struct xlog_op_header. The following are some of the important fields of this structure:
- oh_tid: The ID of the checkpoint transaction.
- oh_len: The length of the metadata payload.
- oh_flags: Valid values for this field are:
  - XLOG_START_TRANS: Indicates that this is a start record.
  - XLOG_COMMIT_TRANS: Indicates that the entire checkpoint transaction has been safely written to the on-disk log.
  - XLOG_CONTINUE_TRANS: In the case where a checkpoint transaction is spread across multiple log records, the last metadata of a log record can be partially written. The corresponding operation header will have the continue flag set to indicate that the remaining part of the metadata can be obtained from the next log record in the log.
  - XLOG_WAS_CONT_TRANS: This flag indicates that the operation header’s payload holds the remaining part of the metadata whose initial content was written at the end of the previous log record.
  - XLOG_END_TRANS: This flag is set when the payload contains all of the remaining metadata whose initial content was written at the end of the previous log record.
  - XLOG_UNMOUNT_TRANS: An operation header with this flag set is written to indicate that the filesystem has been unmounted cleanly.
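The alternating header and payload layout can be walked with a short loop, sketched below. The names op_header_view, op_header_decode and count_op_headers are hypothetical. The 12-byte header size, the field offsets, and the flag values are assumptions based on the struct xlog_op_header definition in the kernel's xfs_log_format.h. The sketch also assumes the region after the log record header has already been copied into one contiguous buffer, with any per-block bookkeeping the log applies on disk already undone.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values as assumed from the kernel header (see lead-in). */
#define XLOG_START_TRANS     0x01
#define XLOG_COMMIT_TRANS    0x02
#define XLOG_CONTINUE_TRANS  0x04
#define XLOG_WAS_CONT_TRANS  0x08
#define XLOG_END_TRANS       0x10
#define XLOG_UNMOUNT_TRANS   0x20

#define OP_HEADER_SIZE 12  /* assumed on-disk size of an operation header */

/* Hypothetical in-memory view of the fields discussed in the text. */
struct op_header_view {
    uint32_t tid;
    uint32_t len;
    uint8_t  flags;
};

/* Read a big-endian 32-bit value from a byte buffer. */
static inline uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode one operation header; returns 0 on success, -1 if buf is too short. */
static inline int op_header_decode(const uint8_t *buf, size_t buflen,
                                   struct op_header_view *out)
{
    if (buflen < OP_HEADER_SIZE)
        return -1;
    out->tid   = get_be32(buf + 0);
    out->len   = get_be32(buf + 4);
    out->flags = buf[9];
    return 0;
}

/*
 * Walk a buffer of alternating operation headers and payloads.
 * Returns the number of operation headers found, or -1 if a header
 * or its payload runs past the end of the buffer.
 */
static inline int count_op_headers(const uint8_t *buf, size_t buflen)
{
    size_t off = 0;
    int count = 0;

    while (off < buflen) {
        struct op_header_view oh;

        if (op_header_decode(buf + off, buflen - off, &oh) != 0)
            return -1;
        off += OP_HEADER_SIZE;
        if (oh.len > buflen - off)
            return -1;
        off += oh.len;  /* skip the payload that follows the header */
        count++;
    }
    return count;
}
```

For a well-formed log record, the count returned by such a walk would be expected to match the h_num_logops field of the record header.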
The following operation headers of a checkpoint transaction have special significance:
- Zeroth operation header: This indicates the beginning of a new checkpoint transaction.
- First operation header: The payload of this operation header holds the following information about the checkpoint transaction:
  - Transaction ID
  - Number of metadata items being logged
- Last operation header: This indicates the end of the checkpoint transaction.
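The role of the zeroth and last operation headers can be illustrated with a small scan over already-decoded operation headers. This is only a sketch: the struct op, the ckpt_state enum and ckpt_scan are hypothetical names, the flag values are assumed from the kernel's xfs_log_format.h, and it assumes that the zeroth operation header carries XLOG_START_TRANS and the last one carries XLOG_COMMIT_TRANS, as the flag descriptions suggest.

```c
#include <stddef.h>
#include <stdint.h>

#define XLOG_START_TRANS   0x01  /* assumed value */
#define XLOG_COMMIT_TRANS  0x02  /* assumed value */

/* Hypothetical decoded operation header: only the fields this scan needs. */
struct op {
    uint32_t tid;
    uint8_t  flags;
};

enum ckpt_state {
    CKPT_UNSEEN,     /* no start header seen for this transaction ID */
    CKPT_OPEN,       /* start header seen, end not yet seen */
    CKPT_COMMITTED   /* both start and commit headers seen */
};

/*
 * Scan ops[0..n) in log order and report how far the checkpoint
 * transaction identified by tid has progressed.
 */
static inline enum ckpt_state ckpt_scan(const struct op *ops, size_t n,
                                        uint32_t tid)
{
    enum ckpt_state st = CKPT_UNSEEN;

    for (size_t i = 0; i < n; i++) {
        if (ops[i].tid != tid)
            continue;
        if (ops[i].flags & XLOG_START_TRANS)
            st = CKPT_OPEN;
        else if ((ops[i].flags & XLOG_COMMIT_TRANS) && st == CKPT_OPEN)
            st = CKPT_COMMITTED;
    }
    return st;
}
```

Because operation headers from several checkpoint transactions can share a log record, the scan filters on the transaction ID rather than assuming that consecutive headers belong to the same transaction.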
Conclusion
This article provided an overview of the on-disk layout of XFS’ checkpoint transactions. A future article will provide actual examples of metadata being logged.