Many symlinks can exist under /dev/disk. They are created by systemd-udevd while it processes udev rules. This blog discusses the kernel and udev events, how they are processed, and how that processing causes the various symlinks under /dev/disk/ to be created. Below is a listing of these entries, collected from console output:
[root@20230327-1240 disk]# pwd
/dev/disk
[root@20230327-1240 disk]# ls -l
total 0
drwxr-xr-x. 2 root root 320 Apr 1 13:47 by-id
drwxr-xr-x. 2 root root 60 Mar 18 10:13 by-label
drwxr-xr-x. 2 root root 60 Mar 18 10:12 by-partlabel
drwxr-xr-x. 2 root root 100 Mar 18 10:12 by-partuuid
drwxr-xr-x. 2 root root 180 Apr 1 13:47 by-path
drwxr-xr-x. 2 root root 120 Jun 17 05:26 by-uuid
[root@20230327-1240 disk]#
These symlinks are either created during the filesystem creation by mkfs or during the system boot up.
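To see which device node each of these links resolves to, the directory tree can be walked from a script. The following is a minimal sketch; the function name list_disk_links is made up for this post, and it only reads whatever links are present under the given root:

```python
import os


def list_disk_links(root="/dev/disk"):
    """Return {category: {link_name: resolved_target}} for symlinks under root."""
    result = {}
    if not os.path.isdir(root):
        return result
    for category in sorted(os.listdir(root)):
        category_dir = os.path.join(root, category)
        if not os.path.isdir(category_dir):
            continue
        links = {}
        for name in sorted(os.listdir(category_dir)):
            path = os.path.join(category_dir, name)
            if os.path.islink(path):
                # realpath follows ../../sdb1 style targets to an absolute path
                links[name] = os.path.realpath(path)
        result[category] = links
    return result
```

Calling list_disk_links() on a running system returns one dictionary per by-* directory, keyed by link name.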
During mkfs
The mkfs utility creates the desired filesystem on the disk. As part of this, various events are generated, both KERNEL events and udev events. systemd-udevd processes these events based on the rules defined under /usr/lib/udev/rules.d/.
The symlinks under by-uuid and by-label are directly related to mkfs. systemd-udevd is perpetually listening for events, so when mkfs creates a filesystem, writes it to disk, and closes the device file, inotify generates an IN_CLOSE_WRITE event. While systemd-udevd processes this event, it disables inotify for that block device and runs a blkid scan on it. After the blkid scan is done, it re-enables inotify listening.
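The IN_CLOSE_WRITE notification itself can be observed from user space. The sketch below is not how systemd-udevd is implemented (it is written in C); it only calls the same inotify system calls through ctypes on Linux. The helper name wait_for_close_write is hypothetical, and the constant value is taken from <sys/inotify.h>:

```python
import ctypes
import os
import struct

IN_CLOSE_WRITE = 0x00000008  # from <sys/inotify.h>
# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_EVENT_HEADER = struct.Struct("iIII")


def wait_for_close_write(path, writer):
    """Watch path, run writer(path), and return the mask of the first event."""
    libc = ctypes.CDLL(None, use_errno=True)  # Linux only
    libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(path), IN_CLOSE_WRITE)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        writer(path)  # stands in for mkfs: open, write, close
        data = os.read(fd, 4096)  # blocks until the kernel queues an event
        _wd, mask, _cookie, _name_len = _EVENT_HEADER.unpack_from(data)
        return mask
    finally:
        os.close(fd)
```

Any callable that opens the file for writing and closes it again can play the role of mkfs here; the returned mask has the IN_CLOSE_WRITE bit set once the file is closed.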
Below is the sequence of actions involved in this workflow:
- mkfs opens device file
- mkfs creates the filesystem
- mkfs closes device file which results in inotify IN_CLOSE_WRITE event
- systemd-udevd temporarily disables inotify and performs a blkid scan
- systemd-udevd writes into kernfs to trigger a synthetic uevent, delivered as a kernel event
- While processing that kernel event, systemd-udevd reads the UUID and label from the filesystem and creates the by-uuid and by-label links
- Then, a uevent is triggered for any userspace consumers.
- And after all these steps, udev re-enables inotify
systemd-udevd processes IN_CLOSE_WRITE events and writes into kernfs, which is an in-memory pseudo filesystem. For this, it opens the ‘uevent’ file corresponding to the device or partition (for example, /sys/devices/platform/host7/session1/target7:0:0/7:0:0:2/block/sdb/sdb1/uevent) and writes the action “change” into it. This causes uevent_store() to be called, which sends a synthetic uevent in the form of a ‘KERNEL’ event indicating a change in the corresponding device. A sample KERNEL event containing the details of the device is shown below:
KERNEL[10290436.775270] change /devices/pci0000:00/0000:00:04.7/0000:18:00.0/virtio1/host5/target5:0:0/5:0:0:3/block/sda/sda1 (block)
ACTION=change
DEVNAME=/dev/sda1
DEVPATH=/devices/pci0000:00/0000:00:04.7/0000:18:00.0/virtio1/host5/target5:0:0/5:0:0:3/block/sda/sda1
DEVTYPE=partition
DISKSEQ=6
MAJOR=8
MINOR=1
PARTN=1
SEQNUM=3573
SUBSYSTEM=block
SYNTH_UUID=0
As part of the processing of this KERNEL event, systemd-udevd reads various parameters from the filesystem such as UUID, label (if the filesystem supports label) and processes the information based on the udev rules.
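The write into the ‘uevent’ file described above can be imitated by hand. This is only a sketch of that single step, with a hypothetical function name; on a real system it needs root, and the sysfs path passed in is up to the caller:

```python
import os


def trigger_change(uevent_file):
    """Write the action "change" into an existing sysfs uevent file."""
    # No O_CREAT: sysfs attributes already exist and cannot be created by users.
    fd = os.open(uevent_file, os.O_WRONLY)
    try:
        os.write(fd, b"change")
    finally:
        os.close(fd)
```

For example, trigger_change("/sys/class/block/sdb1/uevent") would request a synthetic change event for sdb1, assuming that device exists. The udevadm trigger command is the packaged way to do the same thing.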
During Bootup
initramfs contains the /init process. This interacts with PCI, IDE, SATA, SCSI, and NVMe controllers, and detects the block devices. For each of the detected disks, the kernel sends an ‘add’ KERNEL event. An example ‘add’ event is shown below:
ACTION=add
DEVNAME=/dev/sdb
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.0/host6/target6:0:0/6:0:0:0/block/sdb
DEVTYPE=disk
SUBSYSTEM=block
MAJOR=8
MINOR=16
ID_SERIAL=SanDisk_Cruzer_Glide_4C530001220702114173
ID_VENDOR=SanDisk
ID_MODEL=Cruzer_Glide
ID_FS_TYPE=ext4
ID_FS_LABEL=MYUSB
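Kernel events like this one reach user space over a netlink socket. The sketch below shows one way a script could receive and decode them. It rests on a few assumptions that are not stated in this post: the protocol number 15 (NETLINK_KOBJECT_UEVENT in <linux/netlink.h>), multicast group 1 for raw kernel events, and a message layout of an action@devpath header followed by NUL-separated KEY=VALUE fields. The function names are made up for illustration:

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # assumed value, from <linux/netlink.h>
KERNEL_GROUP = 1             # assumed multicast group for raw kernel uevents


def parse_kernel_uevent(payload):
    """Split b'action@devpath\\0KEY=VALUE\\0...' into (header, properties)."""
    fields = [f for f in payload.split(b"\0") if f]
    if not fields:
        return "", {}
    header = fields[0].decode("utf-8", "replace")
    props = {}
    for field in fields[1:]:
        key, sep, value = field.decode("utf-8", "replace").partition("=")
        if sep:
            props[key] = value
    return header, props


def listen_for_block_events():
    """Print block-device uevents until interrupted (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_GROUP))  # (pid, groups); pid 0 lets the kernel choose
    while True:
        header, props = parse_kernel_uevent(sock.recv(65536))
        if props.get("SUBSYSTEM") == "block":
            print(header, props.get("DEVTYPE"), props.get("MAJOR"), props.get("MINOR"))
```

listen_for_block_events() is deliberately not called here; running it while a disk is attached should print one line per disk and partition.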
The first sectors of the disk contain either an MBR signature or a GPT signature. Depending on which is found, the MBR or the GPT is read and an ‘add’ KERNEL event is sent for each partition:
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sda/sda1
DEVNAME=/dev/sda1
DEVTYPE=partition
ID_BUS=ata
ID_MODEL=Samsung_SSD_860_EVO_1TB
ID_MODEL_ENC=Samsung\x20SSD\x20860\x20EVO\x201TB
ID_PART_ENTRY_DISK=8:0
ID_PART_ENTRY_NAME=
ID_PART_ENTRY_NUMBER=1
ID_PART_ENTRY_OFFSET=2048
ID_PART_ENTRY_SIZE=488397312
ID_PART_ENTRY_TYPE=0x83
ID_PART_TABLE_TYPE=dos
MAJOR=8
MINOR=1
SUBSYSTEM=block
TAGS=:systemd:
USEC_INITIALIZED=5632345
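The MBR-versus-GPT decision can be illustrated with a few lines of code. This is a simplified sketch, not the kernel's actual partition scanner: it assumes 512-byte sectors, looks for the "EFI PART" header signature in the second sector, and otherwise falls back to the 0x55 0xAA boot signature at the end of the first sector. The function name is hypothetical, and the return values mirror the ID_PART_TABLE_TYPE strings seen in the events:

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def detect_partition_table(data):
    """Return 'gpt', 'dos' or None from the first two sectors of a disk."""
    second = data[SECTOR_SIZE:2 * SECTOR_SIZE]
    # A GPT header starts with the ASCII signature "EFI PART".
    if second[:8] == b"EFI PART":
        return "gpt"
    # A classic MBR ends with the boot signature bytes 0x55 0xAA.
    if len(data) >= SECTOR_SIZE and data[510:512] == b"\x55\xaa":
        return "dos"
    return None
```

The GPT check comes first because a GPT disk also carries a protective MBR with the same boot signature.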
/init runs the udevadm trigger command to force udev to process these events. udevd sets up the block device entries under /dev, such as /dev/sda and /dev/sda1, for the disks and partitions. However, the /dev/disk/by-* links are created either in the initramfs or after / is mounted, depending on the initramfs contents.
systemd-udevd uses 60-persistent-storage.rules for general disks, 13-dm-disk.rules for device-mapper devices, and 63-md-raid-arrays.rules for md arrays (software RAID arrays created with the mdadm command) to create the by-uuid and by-label entries.
by-uuid and by-label: by-uuid links are based on the UUID and by-label links are based on the filesystem label, if found.
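As an illustration of where the UUID and label come from, the sketch below reads them out of an ext2/3/4 superblock. It is not how blkid is implemented; it relies on the ext superblock layout as I understand it (superblock at byte 1024, magic 0xEF53 at offset 0x38, UUID at 0x68, volume name at 0x78) and handles only that one filesystem family. The function name is made up for this post:

```python
import struct
import uuid

EXT_SUPERBLOCK_OFFSET = 1024
EXT_MAGIC = 0xEF53


def probe_ext(image):
    """Return by-uuid/by-label link paths for an ext2/3/4 image, else None."""
    sb = image[EXT_SUPERBLOCK_OFFSET:EXT_SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 0x88:
        return None
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT_MAGIC:
        return None
    fs_uuid = str(uuid.UUID(bytes=bytes(sb[0x68:0x78])))
    label = bytes(sb[0x78:0x88]).split(b"\0", 1)[0].decode("utf-8", "replace")
    links = {"by-uuid": "/dev/disk/by-uuid/" + fs_uuid}
    if label:  # a by-label link only makes sense when a label is set
        links["by-label"] = "/dev/disk/by-label/" + label
    return links
```

Passing the first 2 KiB of a partition, read as root, is enough for this check.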
systemd-udevd uses the 60-persistent-storage.rules file to create entries under /dev/disk/by-path.
by-path: This link provides a symbolic name that refers to the storage device by the hardware path used to access it. The name can change if the PCI ID, target port, or LUN number changes. The SCSI portion of a by-path name has the format HOST:BUS:TARGET:LUN (H:B:T:L).
[root@20240125-1442 rules.d]# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx. 1 root root 9 Mar 6 13:33 pci-0000:18:00.0-scsi-0:0:0:1 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 pci-0000:18:00.0-scsi-0:0:0:1-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 pci-0000:18:00.0-scsi-0:0:0:1-part2 -> ../../sdb2
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 pci-0000:18:00.0-scsi-0:0:0:1-part3 -> ../../sdb3
lrwxrwxrwx. 1 root root 9 Jul 1 08:16 pci-0000:18:00.0-scsi-0:0:0:3 -> ../../sda
lrwxrwxrwx. 1 root root 10 Jul 1 09:21 pci-0000:18:00.0-scsi-0:0:0:3-part1 -> ../../sda1
[root@20240125-1442 rules.d]#
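The names in this listing can be taken apart mechanically. The sketch below handles only the pci-...-scsi-... shape shown above (other buses produce different names) and labels the four numbers following the H:B:T:L reading; the function name is hypothetical:

```python
import re

_BY_PATH = re.compile(
    r"^pci-(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"
    r"-scsi-(?P<host>\d+):(?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)"
    r"(?:-part(?P<part>\d+))?$"
)


def parse_by_path(name):
    """Split a pci-*-scsi-H:B:T:L[-partN] link name into its components."""
    match = _BY_PATH.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "pci": fields["pci"],
        "host": int(fields["host"]),
        "bus": int(fields["bus"]),
        "target": int(fields["target"]),
        "lun": int(fields["lun"]),
        # None means the link refers to the whole disk
        "part": int(fields["part"]) if fields["part"] else None,
    }
```

For the last entry in the listing, parse_by_path("pci-0000:18:00.0-scsi-0:0:0:3-part1") reports LUN 3 and partition 1.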
systemd-udevd uses the 60-persistent-storage.rules file to create entries under /dev/disk/by-id.
by-id: This is related to the device itself. It is populated when a disk is scanned. The ‘id’ is part of the device itself and not a part of the data stored on the device. This is retrieved by the SCSI Inquiry command. Hence this is more related to a blkid scan than to mkfs. Entries are made for the whole disk as well as each of the partitions.
/dev/disk/by-id has two entries per device, scsi-* and wwn-*. The rules in 63-scsi-sg3_symlink.rules define what data is used for the scsi-* format, and 60-persistent-storage.rules defines the same for the wwn-* format.
lrwxrwxrwx. 1 root root 9 Mar 6 13:33 scsi-3605447b1be9845b48e01deef4694cd48 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 scsi-3605447b1be9845b48e01deef4694cd48-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root 9 Mar 6 13:33 wwn-0x605447b1be9845b48e01deef4694cd48 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 wwn-0x605447b1be9845b48e01deef4694cd48-part1 -> ../../sdb1
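The relationship between a device's properties and these two link names can be sketched as follows. The composition used here (ID_BUS plus ID_SERIAL for the first form, ID_WWN_WITH_EXTENSION for the second, and a -partN suffix for partitions) is inferred from the sample UDEV event at the end of this post, not copied from the rule files, so treat it as an assumption. The function name is made up:

```python
def by_id_links(props):
    """Guess the /dev/disk/by-id link names from a device's udev properties."""
    suffix = ""
    if props.get("DEVTYPE") == "partition" and props.get("PARTN"):
        suffix = "-part" + props["PARTN"]
    links = []
    if props.get("ID_BUS") and props.get("ID_SERIAL"):
        links.append("/dev/disk/by-id/%s-%s%s"
                     % (props["ID_BUS"], props["ID_SERIAL"], suffix))
    if props.get("ID_WWN_WITH_EXTENSION"):
        links.append("/dev/disk/by-id/wwn-%s%s"
                     % (props["ID_WWN_WITH_EXTENSION"], suffix))
    return links
```

Fed with the properties of that sample event, the function reproduces the two by-id entries listed in its DEVLINKS field.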
There are also entries for dm devices under by-id. These entries are made as per the rules defined in 13-dm-disk.rules and 69-dm-lvm-metad.rules:
lrwxrwxrwx. 1 root root 10 Mar 4 06:54 dm-name-ocivolume-root -> ../../dm-0
lrwxrwxrwx. 1 root root 10 Mar 4 06:54 dm-uuid-LVM-mHOQeI3O0o87nHzXs0kOjaSeEVLcrVPQObC8eTo1qs0g1DwVXtm9fk7OzGrHfYRp -> ../../dm-0
lrwxrwxrwx. 1 root root 10 Mar 6 13:33 lvm-pv-uuid-0ntgiN-eNi7-R4fR-rc2o-vynb-pc4y-qrfrra -> ../../sdb3
After reading the partition table, systemd-udevd processes rules in 60-persistent-storage.rules, 13-dm-disk.rules and 63-md-raid-arrays.rules in order to create entries under /dev/disk/by-partuuid.
by-partuuid: This symbolic link is created from the PARTUUID, a unique identifier stored in the partition table. For GPT partitions it is a GUID; for MBR it is the disk ID concatenated with the partition number (for example, 23ea5dc6-01 in the sample event later in this post).
Additionally, systemd-udevd processes 60-persistent-storage.rules and 13-dm-disk.rules to create entries under /dev/disk/by-partlabel, as this information is also obtained from the partition table itself.
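For the MBR case, the concatenation can be worked through in code. The sketch assumes the 32-bit disk ID sits at byte offset 440 of the first sector in little-endian order, and that the partition number is appended as two hexadecimal digits; the second point is my understanding of libblkid's behaviour rather than something shown in this post. Both function names are hypothetical:

```python
import struct


def mbr_disk_id(sector0):
    """Return the 32-bit MBR disk ID as eight lowercase hex digits."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not an MBR sector")
    (disk_id,) = struct.unpack_from("<I", sector0, 440)
    return "%08x" % disk_id


def mbr_partuuid(disk_id, partition_number):
    """Join the disk ID and the partition number into a PARTUUID string."""
    # Assumption: two hex digits, so partition 10 would become '...-0a'.
    return "%s-%02x" % (disk_id, partition_number)
```

With a disk ID of 23ea5dc6 and partition number 1, mbr_partuuid gives 23ea5dc6-01, the same value that appears as ID_PART_ENTRY_UUID in the sample btrfs event.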
by-partlabel: These labels are also stored in partition tables along with the partition IDs. A symbolic link is created using these labels. The partition labels do not change if the filesystems on the partitions change.
When all the rules are processed, at the end systemd-udevd sends out a uevent. The user can monitor such events to understand if mkfs created the filesystem successfully.
A sample uevent after rule processing is given below for mkfs.btrfs:
UDEV [10290436.879909] change /devices/pci0000:00/0000:00:04.7/0000:18:00.0/virtio1/host5/target5:0:0/5:0:0:3/block/sda/sda1 (block)
.ID_FS_TYPE_NEW=btrfs
ACTION=change
DEVLINKS=/dev/disk/by-partuuid/23ea5dc6-01 /dev/disk/by-id/scsi-360c098be9b3d46079481cfe39aae948e-part1 /dev/oracleoci/oraclevdc1 /dev/disk/by-id/wwn-0x60c098be9b3d46079481cfe39aae948e-part1 /dev/disk/by-label/data-btrfs /dev/disk/by-path/pci-0000:18:00.0-scsi-0:0:0:3-part1 /dev/disk/by-uuid/cd530dea-7f4d-45be-a307-f683fa43c2cc
DEVNAME=/dev/sda1
DEVPATH=/devices/pci0000:00/0000:00:04.7/0000:18:00.0/virtio1/host5/target5:0:0/5:0:0:3/block/sda/sda1
DEVTYPE=partition
DISKSEQ=6
ID_BTRFS_READY=1
ID_BUS=scsi
ID_FS_LABEL=data-btrfs
ID_FS_LABEL_ENC=data-btrfs
ID_FS_TYPE=btrfs
ID_FS_USAGE=filesystem
ID_FS_UUID=cd530dea-7f4d-45be-a307-f683fa43c2cc
ID_FS_UUID_ENC=cd530dea-7f4d-45be-a307-f683fa43c2cc
ID_FS_UUID_SUB=8aa1005f-d98b-46a5-95a8-3573d192b1bf
ID_FS_UUID_SUB_ENC=8aa1005f-d98b-46a5-95a8-3573d192b1bf
ID_MODEL=BlockVolume
ID_MODEL_ENC=BlockVolume\x20\x20\x20\x20\x20
ID_PART_ENTRY_DISK=8:0
ID_PART_ENTRY_NUMBER=1
ID_PART_ENTRY_OFFSET=2048
ID_PART_ENTRY_SCHEME=dos
ID_PART_ENTRY_SIZE=104855552
ID_PART_ENTRY_TYPE=0x83
ID_PART_ENTRY_UUID=23ea5dc6-01
ID_PART_TABLE_TYPE=dos
ID_PART_TABLE_UUID=23ea5dc6
ID_PATH=pci-0000:18:00.0-scsi-0:0:0:3
ID_PATH_TAG=pci-0000_18_00_0-scsi-0_0_0_3
ID_REVISION=1.0
ID_SCSI=1
ID_SCSI_INQUIRY=1
ID_SERIAL=360c098be9b3d46079481cfe39aae948e
ID_SERIAL_SHORT=60c098be9b3d46079481cfe39aae948e
ID_TYPE=disk
ID_VENDOR=ORACLE
ID_VENDOR_ENC=ORACLE\x20\x20
ID_WWN=0x60c098be9b3d4607
ID_WWN_VENDOR_EXTENSION=0x9481cfe39aae948e
ID_WWN_WITH_EXTENSION=0x60c098be9b3d46079481cfe39aae948e
MAJOR=8
MINOR=1
PARTN=1
SCSI_IDENT_LUN_NAA_REGEXT=60c098be9b3d46079481cfe39aae948e
SCSI_MODEL=BlockVolume
SCSI_MODEL_ENC=BlockVolume\x20\x20\x20\x20\x20
SCSI_REVISION=1.0
SCSI_TYPE=disk
UDEV events of this type are meant for applications, which can register to receive notifications for them and take whatever action their design requires.
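An application that receives such an event as text, for example from captured monitor output, might pick out the symlinks like this. It is a minimal sketch with hypothetical function names; it only assumes the KEY=VALUE line format and the space-separated DEVLINKS field visible in the sample above:

```python
def parse_properties(lines):
    """Turn KEY=VALUE lines into a dict; lines without '=' are skipped."""
    props = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep:
            props[key] = value
    return props


def links_by_category(props):
    """Group the /dev/disk/by-*/ entries of DEVLINKS by their directory."""
    grouped = {}
    for link in props.get("DEVLINKS", "").split():
        parts = link.split("/")
        # '/dev/disk/by-uuid/x' splits into ['', 'dev', 'disk', 'by-uuid', 'x']
        if len(parts) == 5 and parts[1:3] == ["dev", "disk"]:
            grouped.setdefault(parts[3], []).append(parts[4])
    return grouped
```

Links outside /dev/disk, such as /dev/oracleoci/oraclevdc1 in the sample, are left out of the grouping. Checking that by-uuid and by-label appear in the result is one way to confirm that mkfs completed and the rules ran.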