How to Set Up Windows Server 2016 Failover Clustering

Failover Clustering in Windows Server 2016 provides high availability for critical services and applications by allowing multiple servers to work together as a group. If one node fails, another node automatically takes over its workloads, minimizing downtime. This guide walks you through the process of installing and configuring a Windows Server 2016 Failover Cluster.

Prerequisites

Before creating a failover cluster, ensure the following:

  • A minimum of two servers running Windows Server 2016 with identical or compatible hardware.
  • All cluster nodes are joined to the same Active Directory domain.
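  • Shared storage (for example, a Fibre Channel SAN or iSCSI) is accessible to all nodes, unless the cluster will use Storage Spaces Direct, which pools each node's local disks instead.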
  • All nodes have the same version of Windows Server 2016 and current updates installed.
  • Network adapters are configured for cluster communication (dedicated heartbeat network recommended).

Step 1: Install the Failover Clustering Feature

Run the following in an elevated PowerShell session on every cluster node:

Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools

Restart the servers if prompted. Verify the feature is installed:

Get-WindowsFeature -Name Failover-Clustering
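With several nodes, the feature can also be installed from a single management session instead of logging on to each server. The following is a minimal sketch, assuming the node names Node1 and Node2 used later in this guide, that remote management is enabled on each node, and that your account has administrative rights there:

```powershell
# Assumed node names; replace with your own servers.
$nodes = "Node1", "Node2"

foreach ($node in $nodes) {
    # -ComputerName targets the remote server. -Restart is deliberately
    # omitted so that any reboot stays under your control.
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools -ComputerName $node
}

# Confirm the install state on every node.
foreach ($node in $nodes) {
    Get-WindowsFeature -Name Failover-Clustering -ComputerName $node |
        Select-Object @{ Name = "Node"; Expression = { $node } }, Name, InstallState
}
```

Each node should report an InstallState of Installed before you continue to validation.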

Step 2: Validate the Cluster Configuration

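Before creating the cluster, run cluster validation to check hardware, software, and network compatibility. A passing validation report is a requirement for Microsoft support. The Test-Cluster cmdlet runs the same tests as the Validate a Configuration Wizard in Failover Cluster Manager: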

Test-Cluster -Node "Node1", "Node2" -Include "Storage", "Network", "System Configuration", "Inventory"

Review the HTML report generated by the validation. Address any warnings or failures before proceeding.
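To locate the report without searching for it, you can capture the output of Test-Cluster, which is documented as returning a file object for the report it writes. A short sketch, assuming the same node names as above; the variable name $report is only illustrative:

```powershell
# Capture the file object that Test-Cluster returns for its report.
$report = Test-Cluster -Node "Node1", "Node2" -Include "Storage", "Network", "System Configuration", "Inventory"

# Show where the report was saved.
$report.FullName

# Open the report in the default browser (requires a server with Desktop Experience).
Invoke-Item -Path $report.FullName
```

On Server Core, copy the file at the printed path to a workstation and open it there.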

Step 3: Create the Failover Cluster

Once validation passes, create the cluster using PowerShell from any node:

New-Cluster -Name "Cluster01" -Node "Node1", "Node2" -StaticAddress "192.168.1.100" -NoStorage

The -NoStorage switch creates the cluster without adding any eligible shared disks, so you can add storage separately in Step 5. The -StaticAddress parameter assigns the IP address used to manage the cluster.
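Once New-Cluster returns, a quick check can confirm that the cluster exists and that both nodes have joined. A minimal sketch, assuming the cluster name Cluster01 from the command above:

```powershell
# Basic cluster identity.
Get-Cluster -Name "Cluster01" | Format-List Name, Domain

# Every node should report a State of Up.
Get-ClusterNode -Cluster "Cluster01" | Format-Table Name, State

# Networks the cluster detected and the role assigned to each.
Get-ClusterNetwork -Cluster "Cluster01" | Format-Table Name, Role, Address
```

If a node is missing or not Up, resolve that before configuring quorum.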

Step 4: Configure Cluster Quorum

Quorum determines how many votes must be present for the cluster to remain online. A two-node cluster needs a witness to supply a tie-breaking third vote, so configure a file share witness or a cloud witness:

Set-ClusterQuorum -FileShareWitness "\\fileserver\ClusterWitness"

For a cloud witness using Azure:

Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "youraccesskey"

View current quorum configuration:

Get-ClusterQuorum
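The reason a two-node cluster needs a witness comes down to majority arithmetic. The sketch below uses a hypothetical helper, Get-QuorumSummary, which is not a cluster cmdlet. It computes a simple static majority only; Windows Server 2016 also adjusts votes dynamically, which this illustration ignores.

```powershell
# Hypothetical helper for illustration: static majority arithmetic only.
function Get-QuorumSummary {
    param(
        [Parameter(Mandatory)] [int]$NodeCount,
        [switch]$Witness
    )

    # A witness contributes one extra vote.
    $witnessVote = if ($Witness) { 1 } else { 0 }
    $totalVotes = $NodeCount + $witnessVote

    # A majority is more than half of all votes.
    $votesNeeded = [int][math]::Floor($totalVotes / 2) + 1

    [pscustomobject]@{
        TotalVotes        = $totalVotes
        VotesNeeded       = $votesNeeded
        FailuresTolerated = $totalVotes - $votesNeeded
    }
}

Get-QuorumSummary -NodeCount 2
Get-QuorumSummary -NodeCount 2 -Witness
```

Without a witness, two nodes have two votes and need both, so no failure is tolerated. With a witness there are three votes, two are needed, and the cluster can lose one voter.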

Step 5: Add Shared Storage to the Cluster

If storage was not added during cluster creation, add it now. First ensure that the shared disks are visible in Disk Management on all nodes. Then add them to the cluster:

Get-ClusterAvailableDisk | Add-ClusterDisk
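To confirm which disks were added and what the cluster named them, you can list the Physical Disk resources. A minimal sketch; newly added disks normally appear in the Available Storage group with names such as "Cluster Disk 1":

```powershell
# List clustered disks with their state and current owner.
Get-ClusterResource |
    Where-Object { $_.ResourceType -eq "Physical Disk" } |
    Format-Table Name, State, OwnerGroup, OwnerNode
```

The names shown here are the ones to pass to later commands that take a disk name.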

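To use a disk witness instead of a file share or cloud witness, assign one of the clustered disks as the witness. A witness disk is not available to cluster roles, so choose a disk that you do not plan to use for a role such as the file server in Step 6: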

Set-ClusterQuorum -DiskWitness "Cluster Disk 1"

Step 6: Add Cluster Roles

Cluster roles are services or applications that run on the cluster. To add a generic service:

Add-ClusterGenericServiceRole -ServiceName "MyService" -Name "MyClusteredService" -StaticAddress "192.168.1.101"

For a File Server role:

Add-ClusterFileServerRole -Name "FileServer01" -Storage "Cluster Disk 1" -StaticAddress "192.168.1.102"

Step 7: Configure Failover and Failback Settings

Set the preferred owners for a cluster role, listed in order of preference:

Set-ClusterOwnerNode -Group "MyClusteredService" -Owners "Node1", "Node2"

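Configure automatic failback to the preferred node after it recovers. The window values are hours of the day, so this example allows failback only between 02:00 and 04:00:

$group = Get-ClusterGroup -Name "MyClusteredService"
$group.AutoFailbackType = 1     # 0 = prevent failback, 1 = allow failback
$group.FailbackWindowStart = 2  # hour (0-23, local time) at which the failback window opens
$group.FailbackWindowEnd = 4    # hour (0-23, local time) at which the failback window closes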
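Because these properties are set on an object rather than through a dedicated cmdlet, it is worth reading them back from the cluster. A short sketch, assuming the role name MyClusteredService used above:

```powershell
# Preferred owners, in order of preference.
Get-ClusterOwnerNode -Group "MyClusteredService"

# Failback settings as currently stored by the cluster.
Get-ClusterGroup -Name "MyClusteredService" |
    Format-List Name, AutoFailbackType, FailbackWindowStart, FailbackWindowEnd
```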

Step 8: Test Failover

Move a cluster role to another node to confirm that it can run there. This is a planned move rather than a simulated failure:

Move-ClusterGroup -Name "MyClusteredService" -Node "Node2"

Verify that the role comes online on Node2. Then move it back:

Move-ClusterGroup -Name "MyClusteredService" -Node "Node1"
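Checking that the role came online can be scripted rather than done by eye. The sketch below defines a hypothetical helper, Wait-ClusterGroupOnline, which is not a built-in cmdlet. It polls Get-ClusterGroup until the group is online on the expected node or a timeout expires; the timeout and polling interval are arbitrary defaults.

```powershell
# Hypothetical helper: wait until a cluster group is online on a given node.
function Wait-ClusterGroupOnline {
    param(
        [Parameter(Mandatory)] [string]$Name,
        [Parameter(Mandatory)] [string]$Node,
        [int]$TimeoutSeconds = 120,
        [int]$PollSeconds = 5
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $group = Get-ClusterGroup -Name $Name
        # Compare state and owner as strings so the check is easy to read.
        if ("$($group.State)" -eq "Online" -and $group.OwnerNode.Name -eq $Node) {
            return $true
        }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)

    return $false
}
```

For example, after moving the role to Node2, `Wait-ClusterGroupOnline -Name "MyClusteredService" -Node "Node2"` returns $true once the role is online there, or $false if the timeout is reached first.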

Step 9: Monitor Cluster Health

Check the state of the nodes, the cluster groups (roles), and the individual resources:

Get-ClusterNode
Get-ClusterGroup
Get-ClusterResource

To check event logs for cluster-related events:

Get-WinEvent -LogName "Microsoft-Windows-FailoverClustering/Operational" -MaxEvents 50
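The operational log is mostly informational entries, so filtering by severity can make problems easier to spot. A minimal sketch using the -FilterHashtable parameter of Get-WinEvent, where levels 1, 2, and 3 correspond to Critical, Error, and Warning:

```powershell
# Show only critical, error, and warning events from the clustering log.
# -ErrorAction SilentlyContinue suppresses the error Get-WinEvent raises
# when no events match the filter.
Get-WinEvent -FilterHashtable @{
    LogName = "Microsoft-Windows-FailoverClustering/Operational"
    Level   = 1, 2, 3
} -MaxEvents 50 -ErrorAction SilentlyContinue |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -Wrap
```

An empty result means no events at those levels were found in the log.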

Summary

Windows Server 2016 Failover Clustering provides the high availability needed for mission-critical workloads. By validating your configuration, selecting the correct quorum model, and thoroughly testing failover behavior, you can build a resilient cluster that automatically recovers from node failures with minimal service interruption.