How to Set Up Windows Server 2016 Failover Clustering
Failover Clustering in Windows Server 2016 provides high availability for critical services and applications by allowing multiple servers to work together as a group. If one node fails, another node automatically takes over its workloads, minimizing downtime. This guide walks you through the process of installing and configuring a Windows Server 2016 Failover Cluster.
Prerequisites
Before creating a failover cluster, ensure the following:
- A minimum of two servers running Windows Server 2016 with identical or compatible hardware.
- All cluster nodes are joined to the same Active Directory domain.
- Shared storage (SAN, iSCSI, or Storage Spaces Direct) is accessible to all nodes.
- All nodes have the same version of Windows Server 2016 and current updates installed.
- Network adapters are configured for cluster communication (dedicated heartbeat network recommended).
Step 1: Install the Failover Clustering Feature
Run the following on all cluster nodes:
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
Restart the servers if prompted. Verify the feature is installed:
Get-WindowsFeature -Name Failover-Clustering
Step 2: Validate the Cluster Configuration
Before creating the cluster, run the Cluster Validation Wizard to check hardware, software, and network compatibility. This is required for Microsoft support:
Test-Cluster -Node "Node1", "Node2" -Include "Storage", "Network", "System Configuration", "Inventory"
Review the HTML report generated by the validation. Address any warnings or failures before proceeding.
Step 3: Create the Failover Cluster
Once validation passes, create the cluster using PowerShell from any node:
New-Cluster -Name "Cluster01" -Node "Node1", "Node2" -StaticAddress "192.168.1.100" -NoStorage
Use -NoStorage if you plan to add storage separately. The -StaticAddress parameter assigns the cluster IP address.
Step 4: Configure Cluster Quorum
Quorum determines how many votes are required for the cluster to remain online. For a two-node cluster, configure a file share witness or cloud witness:
Set-ClusterQuorum -FileShareWitness "\fileserverClusterWitness"
For a cloud witness using Azure:
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "youraccesskey"
View current quorum configuration:
Get-ClusterQuorum
Step 5: Add Shared Storage to the Cluster
If storage was not added during cluster creation, add it now. First ensure shared disks are visible in Disk Management on all nodes. Then add to the cluster:
Get-ClusterAvailableDisk | Add-ClusterDisk
Assign a disk as the cluster witness quorum disk if applicable:
Set-ClusterQuorum -DiskWitness "Cluster Disk 1"
Step 6: Add Cluster Roles
Cluster roles are services or applications that run on the cluster. To add a generic service:
Add-ClusterGenericServiceRole -ServiceName "MyService" -Name "MyClusteredService" -StaticAddress "192.168.1.101"
For a File Server role:
Add-ClusterFileServerRole -Name "FileServer01" -Storage "Cluster Disk 1" -StaticAddress "192.168.1.102"
Step 7: Configure Failover and Failback Settings
Set the preferred owner for a cluster role:
Set-ClusterOwnerNode -Group "MyClusteredService" -Owners "Node1", "Node2"
Configure automatic failback to the preferred node after it recovers:
$group = Get-ClusterGroup -Name "MyClusteredService"
$group.AutoFailbackType = 1
$group.FailbackWindowStart = 2
$group.FailbackWindowEnd = 4
Step 8: Test Failover
Move a cluster role to another node to test failover manually:
Move-ClusterGroup -Name "MyClusteredService" -Node "Node2"
Verify that the role comes online on Node2. Then move it back:
Move-ClusterGroup -Name "MyClusteredService" -Node "Node1"
Step 9: Monitor Cluster Health
Get-ClusterNode
Get-ClusterGroup
Get-ClusterResource
To check event logs for cluster-related events:
Get-WinEvent -LogName "Microsoft-Windows-FailoverClustering/Operational" -MaxEvents 50
Summary
Windows Server 2016 Failover Clustering provides the high availability needed for mission-critical workloads. By validating your configuration, selecting the correct quorum model, and thoroughly testing failover behavior, you can build a resilient cluster that automatically recovers from node failures with minimal service interruption.