In today’s enterprise environments, high availability is a necessity no matter which operating system you are running. Whether you’re running mission-critical applications or managing complex workloads, maintaining uptime and fault tolerance is key. This blog post dives into the essential steps required to configure a Windows Server Failover Cluster (WSFC) within virtual machines hosted on Oracle Virtualization.

You’ll learn the prerequisites for creating the instances, the shared storage requirements, and how to validate the Windows Server configuration before running a Failover Cluster in instances on Oracle Virtualization. Whether you’re a seasoned sysadmin or just exploring clustering in virtualized environments, this article explains how to configure the Oracle Virtualization infrastructure to run WSFC.

Let’s get started on building a robust, scalable, and resilient cluster that can weather any storm.

Prerequisites from the Oracle Virtualization perspective

  • An up-to-date Oracle Virtualization deployment
  • Update to ovirt-engine-4.5.5-1.50.el8.noarch.rpm
  • Oracle Linux KVM hosts running qemu-7.2.0-23 or newer, libvirt-9.0.0-12 or newer
  • Shared storage providing Direct LUNs
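The minimum versions above can be checked on each KVM host with a short script. This is a minimal sketch, not an official check: the helper names `version_ge` and `check_pkg` are hypothetical, the comparison relies on GNU `sort -V` (which approximates RPM's own version ordering but is not identical to it), and the package names `qemu-kvm` and `libvirt` used in the usage note are assumptions about how the host packages are named.

```shell
#!/usr/bin/env bash
# version_ge INSTALLED MINIMUM
# Succeeds when INSTALLED is the same as or newer than MINIMUM according to
# GNU sort's version ordering (an approximation of RPM's comparison rules).
version_ge() {
    local installed="$1" minimum="$2"
    # After a version sort, MINIMUM must be the first (lowest) entry.
    [ "$(printf '%s\n%s\n' "$installed" "$minimum" | sort -V | head -n 1)" = "$minimum" ]
}

# check_pkg PACKAGE MINIMUM
# Queries the installed VERSION-RELEASE of PACKAGE and compares it to MINIMUM.
# The package names you pass in are assumptions; adjust them to your hosts.
check_pkg() {
    local pkg="$1" minimum="$2" installed
    installed="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' "$pkg")" || return 2
    if version_ge "$installed" "$minimum"; then
        echo "$pkg $installed: OK (minimum $minimum)"
    else
        echo "$pkg $installed: too old (minimum $minimum)"
        return 1
    fi
}
```

On a host, this could be called as `check_pkg qemu-kvm 7.2.0-23` and `check_pkg libvirt 9.0.0-12`.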

Configuration

KVM hosts

On each KVM host, create the /etc/multipath/conf.d/scsi3-reservation.conf file with the following content:

defaults {
    reservation_key file
}

multipathd merges this option into the existing defaults section of its configuration.
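If you manage several KVM hosts, the file can also be written by a small script instead of by hand. The following is a minimal sketch: the function name `write_reservation_conf` is hypothetical, and the optional directory argument exists only so the function can be tried against a scratch directory before running it as root against /etc/multipath/conf.d.

```shell
#!/usr/bin/env bash
# write_reservation_conf [DIRECTORY]
# Writes scsi3-reservation.conf into DIRECTORY
# (default: /etc/multipath/conf.d, which requires root privileges).
write_reservation_conf() {
    local conf_dir="${1:-/etc/multipath/conf.d}"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/scsi3-reservation.conf" <<'EOF'
defaults {
    reservation_key file
}
EOF
}
```

Calling `write_reservation_conf` with no argument writes to the path used in this article. The multipathd.service still has to be reloaded afterwards.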

Reload the multipathd.service:

[root@kvm1 ~]# systemctl reload multipathd.service

To confirm the option is enabled, you can run the following command:

[root@kvm1 ~]# multipath -t
defaults {
        …
        reservation_key file
        …
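Because the `multipath -t` dump is long, the check can be reduced to a yes/no answer. This is a rough sketch with a hypothetical function name, `has_reservation_key_file`. It only looks for a `reservation_key file` line anywhere in the input (allowing for the value being quoted, which is an assumption about how some versions print it) and does not verify that the line sits inside the defaults section.

```shell
#!/usr/bin/env bash
# has_reservation_key_file
# Reads a multipath configuration dump on stdin and succeeds when it contains
# a "reservation_key file" line. The value may or may not be quoted.
has_reservation_key_file() {
    grep -Eq '^[[:space:]]*reservation_key[[:space:]]+"?file"?[[:space:]]*$'
}
```

On a KVM host it could be used as `multipath -t | has_reservation_key_file && echo "reservation_key file is set"`.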

Engine server

Enable the PropagateDiskErrors option of the ovirt-engine application, restart the ovirt-engine.service, and confirm the new value:

[root@engine ~]# engine-config -s PropagateDiskErrors=true
[root@engine ~]# systemctl restart ovirt-engine.service
[root@engine ~]# engine-config -g PropagateDiskErrors
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
PropagateDiskErrors: true version: general
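To use this verification in a script, the `engine-config -g` output can be parsed rather than read by eye. The sketch below assumes the output format shown above; the function name `propagate_disk_errors_enabled` is hypothetical.

```shell
#!/usr/bin/env bash
# propagate_disk_errors_enabled
# Reads the output of `engine-config -g PropagateDiskErrors` on stdin and
# succeeds only when the option is reported as true. Other lines, such as the
# JAVA_TOOL_OPTIONS notice, are ignored.
propagate_disk_errors_enabled() {
    awk '$1 == "PropagateDiskErrors:" { found = ($2 == "true") }
         END { exit !found }'
}
```

On the engine server it could be used as `engine-config -g PropagateDiskErrors | propagate_disk_errors_enabled || echo "PropagateDiskErrors is not enabled"`.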

Windows Failover Clustering

Windows Server Failover Cluster software requires a Windows domain controller running Active Directory Domain Services (AD DS) and Domain Name System (DNS). The AD DS domain controller must run on a machine other than the Windows Failover Clustering nodes. All Windows Failover Clustering nodes must join the same Windows domain, run the same Windows Server version, and be in the same time zone.

Windows Server 2016 introduced the Workgroup Cluster, an alternative configuration that doesn’t require a domain controller running AD DS. In a Workgroup Cluster, the nodes join a workgroup instead of a domain. A DNS server is, however, still required. A Workgroup Cluster can be less expensive and require fewer hardware resources than a traditional Failover Cluster, without sacrificing high availability.

Attaching Direct LUNs to the first Windows Failover Clustering node

Assumptions:

  • The Windows Failover Clustering VMs are already created in Oracle Virtualization.
  • All nodes run the same Windows Server version, are configured in the same time zone, and have joined the Windows domain.
  • The Failover Clustering feature is already installed on the Windows Server cluster nodes.
  • At least one shared LUN is available to be attached to the cluster nodes.
  • The LUN is not yet configured as a Direct LUN disk.

Procedure:

  • Edit the first Windows cluster VM and create a new disk by clicking the + button, then clicking the Create button.
  • In the New Virtual Disk dialog, click the Direct LUN button:
    • Give it a name.
    • In the Discover Targets section, enter the portal Address and click Discover. Log in to the target that provides the LUN you want to use.
    • After logging in, expand the Target Name section and select the radio button for the LUN to use.
    • Check both the Shareable and Enable SCSI-3 Reservation checkboxes.
  • Click OK to confirm and close.

Note

The New dialog under Storage > Disks does not show the options that must be enabled. Use the New Virtual Disk dialog, opened while editing the VM, instead.

Attaching the Direct LUN to the remaining cluster nodes

Procedure:

  • Edit the next Windows cluster VM and attach the existing disk by clicking the + button, then clicking the Attach button.
  • In the Attach Virtual Disk dialog, click the Direct LUN button:
    • Select the radio button for the desired LUN from the list and click OK.
    • Click OK again to close the Edit Virtual Machine window and save the configuration.
    • Edit the VM again and edit the recently added Direct LUN to configure the SCSI-3 Reservation option.
    • Check the Enable SCSI-3 Reservation checkbox and click OK to close the Virtual Disk definition.
  • Click OK to confirm and close.
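With more than a couple of nodes, repeating these clicks gets tedious, so the attach step might be scripted against the engine's REST API instead. Treat the following as an untested sketch under several assumptions: that your Oracle Virtualization engine exposes the oVirt `diskattachments` collection and its `uses_scsi_reservation` attribute, and that the disk should use the `virtio_scsi` interface. The function name, engine hostname, credentials, and IDs are all hypothetical placeholders. The function only builds the request body; the `curl` call is left as a comment.

```shell
#!/usr/bin/env bash
# disk_attachment_body DISK_ID
# Prints an XML request body that attaches an existing disk to a VM with
# SCSI reservation enabled. The virtio_scsi interface is an assumption.
disk_attachment_body() {
    local disk_id="$1"
    cat <<EOF
<disk_attachment>
  <active>true</active>
  <bootable>false</bootable>
  <interface>virtio_scsi</interface>
  <uses_scsi_reservation>true</uses_scsi_reservation>
  <disk id="${disk_id}"/>
</disk_attachment>
EOF
}

# Hypothetical usage (hostname, credentials, CA file, and IDs are placeholders):
# curl --fail --user 'admin@internal:PASSWORD' \
#      --cacert /path/to/engine-ca.pem \
#      --header 'Content-Type: application/xml' \
#      --data "$(disk_attachment_body "$DISK_ID")" \
#      "https://engine.example.com/ovirt-engine/api/vms/$VM_ID/diskattachments"
```

Whichever method you use, confirm in the Edit Virtual Machine window that the Enable SCSI-3 Reservation checkbox is set on every node.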

Windows Failover Clustering validation

Before creating a Failover Cluster, validate the available resources.

Activating the shared Direct LUN

Using the Disk Management tool, bring the virtual disk online and initialize it.

Testing resources

Assuming the Failover Clustering feature is already installed, open Failover Cluster Manager. Click the Validate Configuration link, add both nodes’ hostnames to the list of hosts to test, and click Next. Run all tests by clicking Next.

When the tests finish, a summary is shown:

[Figure: remote viewer screenshot of the validation summary]

Conclusion

After following the previous steps, you will be able to deploy a Windows Failover Cluster running on virtual machines hosted in Oracle Virtualization.


Credit to Marcos Sungaila