The Resize Inode in the Ext4 Filesystem

January 16, 2024 | 9 minute read

In an ext4 filesystem, there are 11 inodes which are referred to as special inodes, and the resize inode is one of them. These special inodes are initialized by mkfs.ext4. Each of them has a special purpose. The following table lists all of these special inodes and their purpose:

 

inode   Purpose
0       NULL inode, used to indicate that there is no inode
1       Keeps track of defective blocks of the disk
2       Root directory inode
3       User Quota
4       Group Quota
5       Boot loader
6       Undelete directory
7       Resize Inode
8       Journal Inode
9       The “exclude” inode, for snapshots
10      Replica inode, used for some non-upstream feature
11      Traditional first non-reserved inode. Usually, this is the lost+found directory.

 

In this blog we restrict our attention to the resize inode. The resize inode facilitates the resizing of the filesystem. It maintains a map of the reserved GDT (group descriptor table) blocks, which are used when the filesystem is expanded.

The number of reserved GDT blocks depends on the number of block groups in the filesystem. The following formula gives the number of reserved blocks:

reserved_gdt_blocks = ceil(rsv_grps * (group_descriptor_size/blocksize)) - ceil(number_of_block_groups * group_descriptor_size / blocksize)
where rsv_grps = min(0xffffffff, total_number_of_blocks * 1024) / blocks_per_group

Consider a 1 GiB ext4 filesystem with a 4 KiB block size. It has 8 block groups, and the group descriptor size is 64 bytes. Using the formula above, the number of reserved_gdt_blocks is calculated as follows:

rsv_grps = min(0xffffffff, 262144 * 1024) / 32768 = 8192
reserved_gdt_blocks = ceil(8192 * 64 /4096) - ceil(8 * 64/4096)
                    = 128 - 1 = 127 blocks
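
The same arithmetic can be checked with a short Python sketch. The function and parameter names are illustrative and are not taken from e2fsprogs. The sketch assumes the division in rsv_grps rounds up; the 1 GiB example divides evenly, so that assumption does not affect it.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)

def reserved_gdt_blocks(total_blocks, blocks_per_group, block_size, desc_size):
    """Evaluate the formula from the text (a sketch, not mke2fs's actual code)."""
    # Growth target: 1024 times the current size, capped at 0xffffffff blocks.
    max_blocks = min(0xFFFFFFFF, total_blocks * 1024)
    # Assumption: round up when converting blocks to block groups.
    rsv_grps = ceil_div(max_blocks, blocks_per_group)
    groups_now = ceil_div(total_blocks, blocks_per_group)
    # GDT blocks needed at the growth target minus GDT blocks in use today.
    future = ceil_div(rsv_grps * desc_size, block_size)
    current = ceil_div(groups_now * desc_size, block_size)
    return future - current

# 1 GiB filesystem: 262144 blocks of 4096 bytes, 32768 blocks per group.
print(reserved_gdt_blocks(262144, 32768, 4096, 64))  # 127
```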

This value is stored in superblock->s_reserved_gdt_blocks.

These reserved GDT blocks belong to the resize inode, as the stat output of the resize inode shows:

debugfs:  stat <7>
Inode: 7   Type: regular    Mode:  0600   Flags: 0x0
Generation: 0    Version: 0x00000000:00000000
User:     0   Group:     0   Project:     0   Size: 4299210752
File ACL: 0
Links: 1   Blockcount: 5088
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x6422c9bc:00000000 -- Tue Mar 28 11:04:28 2023
 atime: 0x6422c9bc:00000000 -- Tue Mar 28 11:04:28 2023
 mtime: 0x6422c9bc:00000000 -- Tue Mar 28 11:04:28 2023
crtime: 0x6422c9bc:00000000 -- Tue Mar 28 11:04:28 2023
Size of extra inode fields: 32
Inode checksum: 0xc96421a4
BLOCKS:
(DIND):4246, (IND):2, (2060):32770, (2061):98306, (2062):163842, (2063):229378,
(IND):3, (3084):32771, (3085):98307, (3086):163843, (3087):229379
...
(IND):128, (131084):32896, (131085):98432, (131086):163968, (131087):229504
TOTAL: 636

Under BLOCKS:, we see TOTAL: 636. This number comes from 127*5 + 1 = 636, where 127 is the number of reserved_gdt_blocks. In a 1 GiB filesystem, the primary copy of the superblock and GDT is stored in block group 0, and backup copies are stored in block groups 1, 3, 5, and 7. (Backups are kept in block group 1 and in the block groups whose number is a power of 3, 5, or 7, e.g. 1, 3, 5, 7, 9, 25, 27, 49.) Each of these block groups also reserves its own set of blocks as backup reserved_gdt_blocks. Hence, 127 is multiplied by 5 (1 primary copy + 4 backup copies). The extra block is the first data block of the resize inode (the DIND block in the listing above), which stores the indirect map of all reserved_gdt_blocks.
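
As a rough check of this count, the sketch below lists the backup block groups and reproduces the TOTAL value. The helper names are hypothetical. It assumes, as described above, that backups live in block group 1 and in groups numbered as powers of 3, 5, and 7.

```python
def backup_groups(num_groups):
    """Block groups that hold backup copies: 1 and powers of 3, 5, 7."""
    groups = set()
    if num_groups > 1:
        groups.add(1)
    for base in (3, 5, 7):
        g = base
        while g < num_groups:
            groups.add(g)
            g *= base
    return sorted(groups)

def resize_inode_total_blocks(reserved_blocks, num_groups):
    """reserved blocks * (primary + backups), plus one block for the DIND."""
    copies = 1 + len(backup_groups(num_groups))
    return reserved_blocks * copies + 1

print(backup_groups(8))                   # [1, 3, 5, 7]
print(resize_inode_total_blocks(127, 8))  # 636
```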

BLOCKS

resize-inode-blocks

Now, let us shift our focus to the BLOCKS field in the stat output. The resize inode uses indirect mapping instead of extent-based mapping to map its blocks. The first entry is the DIND (highlighted in red), which stands for Double Indirect Block. Following it are the INDs (highlighted in yellow), numbered from 2 to 128, which are Indirect Blocks. There are 127 INDs, which equals reserved_gdt_blocks. Before diving into what all these numbers mean, we first need to understand how indirect mapping works, especially the double indirect case:

A DIND can map from 1 to $blocksize/4 INDs, and each IND can map up to $blocksize/4 blocks (each block pointer is 4 bytes).

We can see that the DIND is at block number 4246 and that it can map up to 1024 INDs. In our case it maps only 127 INDs, from 2 to 128, because we have only 127 reserved GDT blocks. In general, DIND indexing would start from IND 1, but here it starts from IND 2 (we will see why in the next section). As already mentioned, each IND can map $blocksize/4 blocks, which is 1024 blocks here.

By convention, in indirect block addressing the first 12 blocks are mapped directly, the next 1024 blocks are mapped by an indirect block, and the 1037th through 1049612th blocks are mapped by double indirect blocks. The resize inode uses only double indirect mapping, so the INDs start indexing blocks from 1036 (i.e. the 1037th block; the first 1036 indices are unused). Each IND can map 1024 blocks, so the 1st IND covers indices 1036 to 2059, the 2nd IND covers 2060 to 3083, the 3rd IND covers 3084 to 4107, and so on.
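
The index ranges above follow from simple arithmetic, sketched here in Python. The function names are illustrative, and a 4 KiB block size with 4-byte block pointers is assumed.

```python
def ind_logical_range(n, block_size=4096, direct=12):
    """Logical block indices covered by the n-th IND (1-based) under a DIND."""
    ptrs = block_size // 4          # pointers per block: 1024 for 4 KiB blocks
    first_dind_index = direct + ptrs  # 12 direct + 1024 single-indirect = 1036
    start = first_dind_index + (n - 1) * ptrs
    return start, start + ptrs - 1

def max_mapped_bytes(block_size=4096, direct=12):
    """Bytes addressable through direct, indirect and double indirect blocks."""
    ptrs = block_size // 4
    return (direct + ptrs + ptrs * ptrs) * block_size

print(ind_logical_range(2))    # (2060, 3083)
print(ind_logical_range(128))  # (131084, 132107)
print(max_mapped_bytes())      # 4299210752
```

The last value happens to equal the Size field shown in the stat output of the resize inode, and the start indices 2060 and 131084 match the first index listed under IND 2 and IND 128.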

Now, let’s shift our focus to the INDs. IND 2 (block number 2) and IND 3 (block number 3) store the maps that point to the next-level blocks (leaf blocks). In the above image these leaf blocks are underlined, and the numbers in brackets are their indices. Each IND points to 4 blocks because, in this filesystem, backup copies are stored in 4 block groups. In other words, the number of blocks each IND points to equals the number of block groups that store backup copies.

The image below shows the graphical view of indirect mapping:

resize inode mapping

Now, let’s look at the contents of the DIND block i.e. block 4246.

Note: Ext4 stores on-disk values in little-endian byte order.

DIND before resizing

The first 4 bytes of the block are zero, and each subsequent 4-byte entry points to a next-level block (i.e. an IND).
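
To make the layout concrete, here is a minimal Python sketch that decodes a pointer block as little-endian 32-bit block numbers. The block contents are synthetic, built to mimic the layout just described (slot 0 zeroed, then blocks 2 to 128); they are not read from a real device, and the function names are hypothetical.

```python
import struct

def decode_pointer_block(raw):
    """Return {slot: block_number} for the non-zero 4-byte entries of a block."""
    count = len(raw) // 4
    entries = struct.unpack("<%dI" % count, raw)  # "<" means little-endian
    return {slot: blk for slot, blk in enumerate(entries) if blk != 0}

def synthetic_dind(block_size=4096, first=2, last=128):
    """Build a fake DIND block: block number b is stored in slot b - 1."""
    entries = [0] * (block_size // 4)
    for blk in range(first, last + 1):
        entries[blk - 1] = blk
    return struct.pack("<%dI" % len(entries), *entries)

raw = synthetic_dind()
print(raw[:12].hex())  # 000000000200000003000000
decoded = decode_pointer_block(raw)
print(len(decoded), decoded[1], decoded[127])  # 127 2 128
```

On a real filesystem, the raw bytes would come from reading block 4246 of the device, for example with dd or the debugfs block dump facility.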

Let’s also look at the contents of an IND block. The following image shows the contents of IND 3.

IND3 before resizing

Here too, we can see the map, where every four bytes (underlined in red) point to a next-level block (i.e. a leaf block).

This is the same information presented in the BLOCKS field of the stat output.

After resizing the filesystem

The resize2fs utility is used to resize an ext4 filesystem. Before expanding a filesystem, make sure that the partition on which the filesystem resides has enough space to hold the expanded filesystem; otherwise the resize fails.

resize2fs /dev/sdc4 10G

The stat output of the resize inode after resizing:

debugfs:  stat <7>
Inode: 7   Type: regular    Mode:  0600   Flags: 0x0
Generation: 0    Version: 0x00000000:00000000
User:     0   Group:     0   Project:     0   Size: 4299210752
File ACL: 0
Links: 1   Blockcount: 9080
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x6422c9bc:00000000 -- Tue Mar 28 11:04:28 2023
 atime: 0x6422cc33:00000000 -- Tue Mar 28 11:14:59 2023
 mtime: 0x6422cc33:00000000 -- Tue Mar 28 11:14:59 2023
crtime: 0x6422cc33:00000000 -- Tue Mar 28 11:14:59 2023
Size of extra inode fields: 32
Inode checksum: 0x7d36a479
BLOCKS:
(DIND):4246, (IND):3, (3084):32771, (3085):98307, (3086):163843, (3087):229379, (3088):294915, (3089):819203, (3090):884739, (3091):1605635
...
(IND):128, (131084):32896, (131085):98432, (131086):163968, (131087):229504, (131088):295040, (131089):819328, (131090):884864, (131091):1605760
TOTAL: 1135

From the BLOCKS field, we see that the DIND is at block 4246, the same as before. However, the first IND is now IND 3, whereas it was IND 2 before. When the filesystem was 1 GiB, it had 8 block groups and therefore 8 block group descriptors, which fit in a single block, block number 1. After the resize to 10 GiB, the number of block groups increased to 80. Each block group descriptor is 64 bytes, so a single 4 KiB block can hold only 64 of them, and an extra block is needed for the remaining 16. That block is taken from the reserved_gdt_blocks: block number 2. This is why block number 2, which used to be IND 2 and part of the resize inode, has disappeared.

Similarly, IND 1 was absent before the resize because block 1 was already in use by the GDT. If we had created a 10 GiB filesystem directly, the first two blocks would have been in use from the beginning, so both IND 1 and IND 2 would have been absent. Whenever a resize is done, blocks are taken from reserved_gdt_blocks as required, and the corresponding INDs vanish from the stat output.
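
The block accounting in this explanation can be sketched as follows. The names are illustrative, and the sketch assumes 64-byte descriptors, 4 KiB blocks, and a GDT that starts at block 1 (the superblock being in block 0), as in this example.

```python
def gdt_blocks_needed(num_groups, desc_size=64, block_size=4096):
    """Blocks required to hold one descriptor per block group."""
    per_block = block_size // desc_size  # 64 descriptors per 4 KiB block
    return -(-num_groups // per_block)   # integer division rounded up

def first_reserved_block(num_groups):
    """First block still held by the resize inode, with the GDT starting at block 1."""
    return 1 + gdt_blocks_needed(num_groups)

print(gdt_blocks_needed(8), first_reserved_block(8))    # 1 2
print(gdt_blocks_needed(80), first_reserved_block(80))  # 2 3
```

With 8 groups the first reserved block is 2, and with 80 groups it is 3, matching the first IND in each stat output.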

One more change is that each IND previously mapped 4 next-level blocks, whereas now each IND maps 8 (count the indices under each IND). This is because the number of block groups that store backup copies also increased when the filesystem grew. Previously there were 4 (block groups 1, 3, 5, 7); now there are 8 (1, 3, 5, 7, 9, 25, 27, 49).
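
The leaf block numbers in both listings follow a regular pattern: each one equals the backup group number times the blocks per group, plus the IND's own block number. This is an observation drawn from the output shown here, not a description of how resize2fs computes them. The sketch assumes 32768 blocks per group, and the function name is hypothetical.

```python
def backup_leaf_blocks(ind_block, backup_group_numbers, blocks_per_group=32768):
    """Leaf blocks an IND would point to, one per backup block group."""
    return [g * blocks_per_group + ind_block for g in backup_group_numbers]

before = [1, 3, 5, 7]                # backup groups in the 1 GiB filesystem
after = [1, 3, 5, 7, 9, 25, 27, 49]  # backup groups after growing to 10 GiB

print(backup_leaf_blocks(3, before))
# [32771, 98307, 163843, 229379]
print(backup_leaf_blocks(3, after))
# [32771, 98307, 163843, 229379, 294915, 819203, 884739, 1605635]
```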

Now, let’s look at the contents of DIND block 4246:

DIND after resizing

We find that almost everything is the same, except that bytes 4-7, which earlier pointed to block 2, are now zeroed. Also note that the indexing now starts from 3.

The contents of IND 3 block:

IND3 after resizing

The difference here is that it used to point to 4 blocks, but now it points to 8 blocks.

Conclusion

In this blog, we have seen the role of the resize inode and the significance of reserved_gdt_blocks in filesystem resizing. We also took a dive into the complex-looking BLOCKS field of the resize inode to better understand its usage and meaning.

Srivathsa Dara

