Exadata X9M was announced not so long ago, so there's a lot of information to get through. However, I came across something just the other day that I thought I'd highlight: an enhancement in Exadata X9M that has received zero publicity. In fact, it would have remained hidden had we not been the inquisitive type of Product Managers that we are...
I'm calling it out now because, to me, it encapsulates the story of Exadata and proves that our Development team is fully engaged with the Exadata vision.
The Exadata Database Server Disk Expansion Kit installation procedure.
I know, exciting, right?...
For those who haven't seen (or needed) this, the Database Server Disk Expansion Kit allows you to add space to the Database Server's local storage. So if you're looking to increase the number of ORACLE_HOMEs, or add space to a VM for extra logging or some such, the disk expansion kit is available.
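If you're wondering whether you actually need it, a quick check of current usage on the database server is enough before ordering anything. A minimal sketch, assuming a KVM host where the VM images live under /EXAVMIMAGES (bare metal servers will have different mount points):
[root@xdpm01adm01 ~]# df -h /EXAVMIMAGES
[root@xdpm01adm01 ~]# vgs VGExaDb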
In (recent) previous generations, the local storage on the Database Server was made up of 4x 1.2TB hard drives, configured as RAID5, using a RAID HBA device to ensure redundancy and availability. The expansion kit was also 4x 1.2TB hard drives. Adding the new drives automatically reconstructed the RAID5 with the additional disks (all online of course), and took a few hours to complete.
However in Exadata X9M, we use NVMe SSD drives, 2x drives as standard, and 2x drives in the expansion kit. RAID5 is obviously out, and there's no RAID HBA mentioned in the hardware bill of materials (you'll just have to trust me on this one)... So what's going on here? Enquiring minds want to know.
Let's start with the OOTB configuration. This being an Exadata, we can expect the 2x drives to be in some form of highly available configuration. We're on the database server for this exercise, so dbmcli is where we start:
DBMCLI> list physicaldisk attributes name,devicename,luns,disktype,makemodel
NVME_0 /dev/nvme0 0_2,0_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
NVME_1 /dev/nvme1 1_2,1_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
OK, we can see the two drives, their underlying device names, associated LUNs, disk type and make/model. Let's drop down into OS land to dig deeper.
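First, a quick sanity check that the OS sees the two NVMe devices. A minimal sketch, assuming the nvme-cli package is installed (lsblk would do the job just as well); output omitted here as it varies by system:
[root@xdpm01adm01 ~]# nvme list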
Let's look for a RAID controller:
[root@xdpm01adm01 ~]# lspci -s `lspci |grep -i raid |awk ' { print $1}'`
0000:64:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 04)
Nice, so since this is a "software RAID" device, we should be able to see what's going on using the standard multi-device admin tool (mdadm):
[root@xdpm01adm01 ~]# cat /proc/mdstat |egrep "nvme[0|1]"|grep " active"
md25 : active raid1 nvme1n2[1] nvme0n2[0]
md24 : active raid1 nvme1n1[1] nvme0n1[0]
Two devices surfaced as md24 and md25. What do they map to?
[root@xdpm01adm01 ~]# cat /etc/mdadm.conf |grep dev
ARRAY /dev/md/24 container=ffee1199:aa990000:a9a9a9a9:0aa999aa member=0 UUID=9991111a:aaa9a9a9:aa99aa99:ff0f00ff
ARRAY /dev/md/25 level=raid1 metadata=1.2 num-devices=2 UUID=1119999f:fff0f0f0:ff00ff00:ff0f00fe name=localhost:25
So we have an MD container, and a standard MD device. We're not interested in the md24 container, as this houses the /boot and /boot/efi filesystems:
[root@xdpm01adm01 ~]# df |grep md24
/dev/md24p1 7546828 85560 7461268 2% /boot
/dev/md24p2 260094 7596 252498 3% /boot/efi
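As an aside, if mdadm.conf doesn't give you enough detail, mdadm can report on each array directly (run as root; I've omitted the output here as it will vary by system):
[root@xdpm01adm01 ~]# mdadm --detail /dev/md24
[root@xdpm01adm01 ~]# mdadm --detail /dev/md25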
Quick look at which volume group maps to the physical md device:
[root@xdpm01adm01 ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/md25 VGExaDb lvm2 a-- 3.48t 0
A single Volume Group (which confirms my thinking on /dev/md24, since it doesn't appear here). Let's look at the volume group.
[root@xdpm01adm01 ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VGExaDb 1 11 0 wz--n- 3.48t 39.75g
So we have a single volume group (VG), with one physical volume (PV), eleven logical volumes (LV), 3.48t allocated (VSize) and ~40g free (VFree). We can use lvs to see the logical volumes:
[root@xdpm01adm01 ~]# lvs
LV VG Attr LSize
LVDbExaVMImages VGExaDb -wi-ao---- 3.37t
LVDbHome VGExaDb -wi-ao---- 4.00g
LVDbSwap1 VGExaDb -wi-ao---- 16.00g
LVDbSys1 VGExaDb -wi-ao---- 15.00g
LVDbSys2 VGExaDb -wi-a----- 15.00g
LVDbTmp VGExaDb -wi-ao---- 3.00g
LVDbVar1 VGExaDb -wi-ao---- 2.00g
LVDbVar2 VGExaDb -wi-a----- 2.00g
LVDbVarLog VGExaDb -wi-ao---- 18.00g
LVDbVarLogAudit VGExaDb -wi-ao---- 1.00g
LVDoNotRemoveOrUse VGExaDb -wi-a----- 2.00g
OK. All looks as expected. This is a KVM host, hence the VMImages logical volume and not Oracle and Grid Home logical volumes.
So after all that, here's what we know:
- Two 3.84TB NVMe SSDs (NVME_0 and NVME_1), each presenting two LUNs, mirrored in software with Linux MD RAID - no RAID HBA involved.
- /dev/md24 is a RAID1 device carrying the /boot and /boot/efi filesystems.
- /dev/md25 is a RAID1 device acting as the single LVM physical volume behind the VGExaDb volume group.
- VGExaDb holds the eleven logical volumes, with LVDbExaVMImages taking the lion's share on this KVM host.
Diagrammatically, it would look something like this:
Let's see what happens when we add the two drives in the Expansion Disk Kit.
After physically inserting the disks, check if they've turned up using dbmcli:
DBMCLI> list physicaldisk attributes name,deviceName,luns,diskType,makeModel
NVME_0 /dev/nvme0 0_2,0_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
NVME_1 /dev/nvme1 1_2,1_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
NVME_2 /dev/nvme2 2_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
NVME_3 /dev/nvme3 3_1 FlashDisk "Oracle 3.84 TB 2.5-inch NVMe PCIe 4.0 SSD"
We can see the additional drives are there (NVME_2 and NVME_3), along with their underlying device names, and this time there's a single LUN on each drive, unlike the original drives which each present two.
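A quick cross-check from the OS side confirms the same picture. A hedged sketch, assuming the new namespaces follow the same naming pattern as the original drives:
[root@xdpm01adm01 ~]# lsblk -d -o NAME,SIZE,MODEL | grep nvme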
Digging deeper in the OS to see what's been done in terms of RAID devices:
[root@xdpm01adm01 ~]# cat /proc/mdstat |grep "nvme"|grep " active"
md24 : active raid1 nvme1n1[1] nvme0n1[0]
md25 : active raid1 nvme1n2[1] nvme0n2[0]
md26 : active raid1 nvme3n1[1] nvme2n1[0]
The nvme2 and nvme3 devices are already partitioned and a 3rd RAID1 device has been created. The resulting device is /dev/md26. Let's see the physical volume mapping:
[root@xdpm01adm01 ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/md25 VGExaDb lvm2 a-- 3.48t 0
/dev/md26 VGExaDb lvm2 a-- 3.49t 3.49t
So /dev/md26 is already added to the Volume Group. Let's see a few more details:
[root@xdpm01adm01 ~]# pvscan
PV /dev/md25 VG VGExaDb lvm2 [3.48 TiB / 39.75 GiB free]
PV /dev/md26 VG VGExaDb lvm2 [3.49 TiB / 3.49 TiB free]
Total: 2 [<6.98 TiB] / in use: 2 [<6.98 TiB] / in no VG: 0 [0 ]
So in theory, if we look at the volume group, we should see the increased size from the addition of md26:
[root@xdpm01adm01 ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VGExaDb 2 11 0 wz--n- <6.98t 3.53t
And there you go: the VGExaDb volume group now has 2 physical volumes (PV), with a total of 6.98t allocated and 3.53t free, ready for allocation (and we haven't actually needed to do anything yet!).
The only thing left to do (which won't be done automatically for you, and for good reason) is to grow the logical volume of the filesystem we want to add the space to.
In my case, being a KVM Host, I want to give (almost) all of the new space to the VMImages filesystem.
Reminder, here's what we have currently:
[root@xdpm01adm01 ~]# lvs
LV VG Attr LSize
LVDbExaVMImages VGExaDb -wi-ao---- 3.37t
LVDbHome VGExaDb -wi-ao---- 4.00g
LVDbSwap1 VGExaDb -wi-ao---- 16.00g
LVDbSys1 VGExaDb -wi-ao---- 15.00g
LVDbSys2 VGExaDb -wi-a----- 15.00g
LVDbTmp VGExaDb -wi-ao---- 3.00g
LVDbVar1 VGExaDb -wi-ao---- 2.00g
LVDbVar2 VGExaDb -wi-a----- 2.00g
LVDbVarLog VGExaDb -wi-ao---- 18.00g
LVDbVarLogAudit VGExaDb -wi-ao---- 1.00g
LVDoNotRemoveOrUse VGExaDb -wi-a----- 2.00g
Using the standard lvm tool we can now extend the LVDbExaVMImages volume:
[root@xdpm01adm01 ~]# lvm lvextend -l +98%FREE /dev/VGExaDb/LVDbExaVMImages
Size of logical volume VGExaDb/LVDbExaVMImages changed from <3.37 TiB (882669 extents) to <6.83 TiB (1790003 extents).
Logical volume VGExaDb/LVDbExaVMImages successfully resized.
(Allocating 98%, just in case I need a bit of space somewhere else down the line).
Verify the new size of the logical volume:
[root@xdpm01adm01 ~]# lvs
LV VG Attr LSize
LVDbExaVMImages VGExaDb -wi-ao---- <6.83t
LVDbHome VGExaDb -wi-ao---- 4.00g
LVDbSwap1 VGExaDb -wi-ao---- 16.00g
LVDbSys1 VGExaDb -wi-ao---- 15.00g
LVDbSys2 VGExaDb -wi-a----- 15.00g
LVDbTmp VGExaDb -wi-ao---- 3.00g
LVDbVar1 VGExaDb -wi-ao---- 2.00g
LVDbVar2 VGExaDb -wi-a----- 2.00g
LVDbVarLog VGExaDb -wi-ao---- 18.00g
LVDbVarLogAudit VGExaDb -wi-ao---- 1.00g
LVDoNotRemoveOrUse VGExaDb -wi-a----- 2.00g
Then, to finish it off, grow the filesystem using xfs_growfs:
[root@xdpm01adm01 ~]# xfs_growfs /EXAVMIMAGES/
meta-data=/dev/mapper/VGExaDb-LVDbExaVMImages isize=512 agcount=4, agsize=225963264 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0 spinodes=0 rmapbt=0
= reflink=1
data = bsize=4096 blocks=903853056, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=441334, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 903853056 to 1832963072
That's it, two commands! lvm lvextend and xfs_growfs, and you're done!
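As an aside, if you prefer a single step, lvextend can also grow the filesystem for you via its -r (--resizefs) option. A hedged alternative to the two separate commands I used above, assuming your lvm2 build supports it:
[root@xdpm01adm01 ~]# lvm lvextend -r -l +98%FREE /dev/VGExaDb/LVDbExaVMImages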
Behind the scenes, quite a few things were automated, as we saw. A quick run-through:
- The new drives were detected and surfaced in dbmcli as NVME_2 and NVME_3, each with a single LUN.
- A third software RAID1 device, /dev/md26, was created across the two new drives.
- /dev/md26 was initialised as an LVM physical volume and added to the VGExaDb volume group.
All that was left for us to do was extend the logical volume and grow the filesystem.
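For the curious, the manual equivalent of those automated steps would look roughly like the sketch below. This is purely illustrative (the Exadata System Software does all of this for you, so don't run these by hand on a real system); the device names are the ones we saw above:
[root@xdpm01adm01 ~]# mdadm --create /dev/md26 --level=1 --raid-devices=2 /dev/nvme2n1 /dev/nvme3n1  # mirror the two new drives
[root@xdpm01adm01 ~]# pvcreate /dev/md26        # turn the mirror into an LVM physical volume
[root@xdpm01adm01 ~]# vgextend VGExaDb /dev/md26  # add it to the existing volume group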
Here's the result, diagrammatically:
One thing I haven't mentioned yet is how long these automated tasks take.
Basically, it's all completed and ready for expanding the logical volume in the time it takes the Oracle Field Engineer to say
"Hang on a sec, I'll just put down the phone to insert the new drives......ok they're in".
As I mentioned at the beginning, in my mind this procedure encapsulates the essence of one of our Exadata vision statements: to provide "Automated Management - Fully automating and optimizing the end-to-end management for Exadata."
While adding disk expansions may only appeal to or be required by some customers, this procedure, hidden in the depths of Exadata System Software, shows the attention to detail this development team puts in to make the Exadata Platform the best database platform in the world.
A big thank you to Anna and Ryan for help on this (part of our fantastic Development team working to make this platform what it is).
We are always interested in your feedback. You're welcome to engage with us via Twitter @ExadataPM, @GavinAtHQ or @alex_blyth.
Gavin is a product manager in Oracle’s Exadata team, with a focus on software and hardware roadmap. Prior to the Exadata team, Gavin was a founding member of the team responsible for launching the industry’s first on-premises public cloud technology - Oracle Cloud at Customer. In his 15 years in Oracle Product Management, Presales and Consulting roles, Gavin has developed a robust understanding of all things Oracle, helping customers architect and implement a variety of infrastructure and application technologies.