How to Set Up Windows Server Failover Cluster (WSFC) with SQL Server on Windows Server 2025

High availability for SQL Server workloads on Windows Server 2025 is most commonly achieved through Windows Server Failover Clustering (WSFC). WSFC provides the underlying cluster infrastructure that SQL Server relies on for two of its primary HA technologies: Failover Cluster Instances (FCI) and Always On Availability Groups (AG). Understanding the distinction between these two approaches — and knowing how to configure WSFC correctly before SQL Server is ever installed — is critical for building a resilient database platform that meets your recovery time objectives. This guide walks through every stage of the process: from shared storage provisioning and cluster creation, through SQL Server installation on cluster nodes, to live failover testing and monitoring with PowerShell.

FCI vs Always On Availability Groups

Before building anything, choose the right technology for your requirements.

A Failover Cluster Instance (FCI) presents a single virtual SQL Server instance that runs on one node at a time. All nodes share the same storage (iSCSI LUN, Fibre Channel, shared VHDX, or Storage Spaces Direct). When the active node fails, the cluster service moves the SQL Server role, including the shared disks, virtual IP, and virtual network name, to a surviving node. Applications reconnect using the virtual server name and typically experience an RTO of 30–120 seconds, depending mostly on how much crash recovery the databases need when they come online on the new node.

An Always On Availability Group (AG) does not use shared storage. Each replica has its own copy of the data, kept in sync by streaming transaction log records from the primary to each secondary over a standard network connection (this is distinct from the separate log shipping feature). AGs support readable secondaries, cross-subnet failover, and near-zero RPO with synchronous commit. WSFC is still required as the underlying cluster framework, but the storage model is completely different. AGs are the preferred choice when you need geographic distribution or readable secondaries; FCIs are preferred when you need a single instance with full feature parity and a shared storage model.

This tutorial focuses on the FCI path.

Prerequisites

  • Two or more Windows Server 2025 (Standard or Datacenter) nodes, each domain-joined
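  • An Active Directory account with permission to create the cluster's computer object (the Cluster Name Object, CNO) in the target OU, or a CNO prestaged by a domain administrator; the cluster service itself runs as Local System and needs no dedicated service account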
  • Static IP addresses on all nodes for both the cluster network and iSCSI/storage network
  • Shared storage accessible to all nodes simultaneously: iSCSI target, Fibre Channel LUN, shared VHDX on a Scale-Out File Server, or Storage Spaces Direct (S2D)
  • SQL Server 2022 installation media
  • All nodes running the same Windows Server 2025 build and patch level
  • Failover Clustering and .NET Framework features available on all nodes
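
The build and patch-level requirement can be checked up front instead of by eye. The following is a minimal sketch, not part of any official tooling: Test-BuildParity is a hypothetical helper name, and the Node and Build property names are assumptions chosen for this example. The commented usage reads CurrentBuild and UBR from the registry of each node over PowerShell remoting.

```powershell
# Hypothetical helper: returns $true when every supplied node reports the same build string.
function Test-BuildParity {
    [CmdletBinding()]
    param(
        # Objects with Node and Build properties (property names are assumptions)
        [Parameter(Mandatory)][object[]]$NodeInfo
    )

    $groups = @($NodeInfo | Group-Object -Property Build)
    foreach ($g in $groups) {
        # With -Verbose, list which nodes report which build
        Write-Verbose "$($g.Name): $($g.Group.Node -join ', ')"
    }
    $groups.Count -eq 1
}

# Collecting the input over remoting (node names as used later in this guide):
# $info = Invoke-Command -ComputerName 'SQL-NODE1', 'SQL-NODE2' -ScriptBlock {
#     $cv = Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion'
#     [pscustomobject]@{ Node = $env:COMPUTERNAME; Build = "$($cv.CurrentBuild).$($cv.UBR)" }
# }
# Test-BuildParity -NodeInfo $info -Verbose
```

A $false result means at least one node needs patching before you run cluster validation.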

Step 1 — Install the Failover Clustering Feature on All Nodes

Run the following on every node that will participate in the cluster. Use PowerShell remoting to push the installation simultaneously.

# Run on each node, or use Invoke-Command to target all nodes at once
$nodes = 'SQL-NODE1', 'SQL-NODE2'

Invoke-Command -ComputerName $nodes -ScriptBlock {
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools -Restart:$false
    Install-WindowsFeature -Name RSAT-Clustering-PowerShell
}

Reboot the nodes after installation if prompted. Verify the feature is present before continuing:

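# Get-WindowsFeature accepts only one computer name, so query the nodes through remoting
Invoke-Command -ComputerName SQL-NODE1, SQL-NODE2 -ScriptBlock {
    Get-WindowsFeature -Name Failover-Clustering
} | Select-Object PSComputerName, Name, InstallState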

Step 2 — Validate Cluster Configuration

Cluster validation is not optional. Microsoft's support policy for failover clusters requires that the complete configuration pass validation, so keep the report. Run validation against all nodes and your shared storage before creating the cluster.

Test-Cluster -Node SQL-NODE1, SQL-NODE2 -Include "Storage","Network","System Configuration","Inventory"

The report is saved as an HTML file, normally in C:\Windows\Cluster\Reports, and Test-Cluster also returns the report file so you can see its exact path. Open it and resolve every failure; warnings do not block cluster creation, but review each one and make sure you understand it. Common failures include mismatched NIC driver versions, the iSCSI initiator not being connected on both nodes, or Windows Firewall blocking cluster communication (TCP and UDP port 3343).

Step 3 — Create the Windows Server Failover Cluster

Create the cluster with a static IP address. This IP is for the cluster management interface, not the SQL virtual server (that comes later).

New-Cluster `
    -Name SQLCLUSTER01 `
    -Node SQL-NODE1, SQL-NODE2 `
    -StaticAddress 10.10.1.50 `
    -NoStorage

The -NoStorage flag prevents New-Cluster from automatically adding every eligible shared disk to the cluster as available storage. You will assign storage manually. After the cluster is created, configure the quorum:

# Use a disk witness (recommended for 2-node clusters with shared storage)
# First, identify the quorum disk: it should be a small LUN (512 MB to 1 GB)
Get-ClusterAvailableDisk

# Add only the quorum disk to the cluster
# (assumes it is the only available LUN of 1 GB or less)
$witness = Get-ClusterAvailableDisk | Where-Object Size -le 1GB | Add-ClusterDisk

# Set witness (the first disk added is normally named "Cluster Disk 1")
Set-ClusterQuorum -DiskWitness $witness.Name

For two-node clusters without a suitable disk, a file share witness on a third server is acceptable:

Set-ClusterQuorum -FileShareWitness \\WITNESS-SERVER\SQLClusterWitness
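
To see why a two-node cluster needs a witness at all, it helps to do the vote arithmetic. The sketch below is illustrative only: Get-QuorumMath is a hypothetical helper, and it models static majority voting (one vote per node plus one for the witness). It deliberately ignores dynamic quorum and dynamic witness, which adjust votes at run time on current Windows Server versions.

```powershell
# Hypothetical helper: static quorum arithmetic only (dynamic quorum is ignored).
function Get-QuorumMath {
    param(
        [Parameter(Mandatory)][ValidateRange(1, 64)][int]$NodeCount,
        [switch]$Witness
    )

    # One vote per node, plus one if a disk or file share witness is configured
    $totalVotes = $NodeCount + [int]$Witness.IsPresent

    # Quorum requires a strict majority of all votes
    $votesNeeded = [int][math]::Floor($totalVotes / 2) + 1

    [pscustomobject]@{
        TotalVotes     = $totalVotes
        VotesNeeded    = $votesNeeded
        VotesCanBeLost = $totalVotes - $votesNeeded
    }
}

Get-QuorumMath -NodeCount 2            # no witness: VotesCanBeLost = 0
Get-QuorumMath -NodeCount 2 -Witness   # with witness: VotesCanBeLost = 1
```

Under this simplified model, two nodes without a witness cannot lose a single vote, while adding the witness lets the cluster keep quorum through the loss of one node.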

Step 4 — Add SQL Data Disks to the Cluster

Add the LUNs that will hold SQL Server data, log, and TempDB files to the cluster as cluster resources. These disks must be visible to both nodes before this step.

# List all disks not yet part of the cluster
Get-ClusterAvailableDisk

# Add all available disks
Get-ClusterAvailableDisk | Add-ClusterDisk

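# Rename cluster disks for clarity. Confirm which LUN each "Cluster Disk N" maps to first
# (for example by comparing sizes); the numbering below is an assumption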
(Get-ClusterResource "Cluster Disk 2").Name = "SQL-DATA-Disk"
(Get-ClusterResource "Cluster Disk 3").Name = "SQL-LOG-Disk"
(Get-ClusterResource "Cluster Disk 4").Name = "SQL-TEMPDB-Disk"

Verify disks are online and owned by the active node:

Get-ClusterResource | Where-Object ResourceType -eq "Physical Disk" |
    Select-Object Name, State, OwnerNode
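
To turn that visual check into a pass/fail result, the logic can be wrapped in a small function. This is a sketch under stated assumptions: Test-ClusterDiskPlacement is a hypothetical name, and it only expects objects exposing Name, State, and OwnerNode, as Get-ClusterResource output does. Taking the objects as a parameter keeps the logic testable away from a cluster.

```powershell
# Hypothetical helper: checks disk objects shaped like Get-ClusterResource output
# (Name, State, OwnerNode properties).
function Test-ClusterDiskPlacement {
    param([Parameter(Mandatory)][object[]]$Disk)

    # Compare as strings so both enum values and plain text work
    $offline = @($Disk | Where-Object { "$($_.State)" -ne 'Online' })
    $owners  = @($Disk | ForEach-Object { "$($_.OwnerNode)" } | Sort-Object -Unique)

    foreach ($d in $offline) {
        Write-Warning "Disk '$($d.Name)' is $($d.State)"
    }
    if ($owners.Count -gt 1) {
        Write-Warning "Disks are split across nodes: $($owners -join ', ')"
    }

    # True only when every disk is online and a single node owns them all
    ($offline.Count -eq 0) -and ($owners.Count -eq 1)
}

# On a cluster node (assumes the FailoverClusters module is available):
# Test-ClusterDiskPlacement -Disk (Get-ClusterResource |
#     Where-Object ResourceType -eq 'Physical Disk')
```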

Step 5 — Install SQL Server as a Failover Cluster Instance

SQL Server FCI installation is a two-phase process. First, run the installer on the primary node and choose "New SQL Server failover cluster installation". Second, run the installer on each additional node and choose "Add node to a SQL Server failover cluster".

Key inputs during the wizard on the first node:

  • SQL Server Network Name: the virtual server name clients will connect to (e.g., SQLVIRTUAL01)
  • IP Address: the virtual IP address for the SQL FCI (e.g., 10.10.1.55)
  • Cluster Disk Selection: select the data, log, and TempDB disks added in Step 4
  • Data Directories: point each to the appropriate cluster disk drive letter
  • Service Accounts: use a low-privilege domain account or a group Managed Service Account (gMSA); a standalone MSA is tied to one computer and cannot be used for an FCI

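You can also automate FCI installation with a configuration file. Save your setup parameters as ConfigurationFile.ini and pass the file to setup:

# First-node FCI installation (run on SQL-NODE1)
.\setup.exe /ConfigurationFile=C:\SQLInstall\ConfigurationFile.ini /IACCEPTSQLSERVERLICENSETERMS

# On SQL-NODE2: add the node to the existing FCI
# Account names and passwords are placeholders. The service accounts must match the ones
# used on the first node; supply real passwords at run time, do not store them in scripts.
.\setup.exe /QS /ACTION=AddNode /INSTANCENAME=MSSQLSERVER `
    /SQLSVCACCOUNT="CORP\svc-sql" /SQLSVCPASSWORD="P@ssw0rd!" `
    /AGTSVCACCOUNT="CORP\svc-sqlagent" /AGTSVCPASSWORD="P@ssw0rd!" `
    /IACCEPTSQLSERVERLICENSETERMS /CONFIRMIPDEPENDENCYCHANGE=TRUE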

After installation, verify the SQL Server role appears in Failover Cluster Manager under Roles.

Step 6 — Test Failover

Test both planned and unplanned failover scenarios before going to production.

# Planned failover — move the SQL role to the other node
Move-ClusterGroup -Name "SQL Server (MSSQLSERVER)" -Node SQL-NODE2

# Check that the role is now online on SQL-NODE2
Get-ClusterGroup -Name "SQL Server (MSSQLSERVER)" | Select-Object Name, State, OwnerNode

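# Verify SQL connectivity using the virtual server name (requires the SqlServer module)
# ActiveNode shows which physical node is currently hosting the instance
Invoke-Sqlcmd -ServerInstance SQLVIRTUAL01 -Query "SELECT @@SERVERNAME AS ServerName, SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS ActiveNode, @@VERSION AS Version"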

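For an unplanned failover simulation, stop the cluster service on the active node:

# Simulate node failure (run from a management workstation)
# Stop-Service has no -ComputerName parameter, so run it on the node through remoting
Invoke-Command -ComputerName SQL-NODE1 -ScriptBlock { Stop-Service -Name ClusSvc -Force }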

Watch the cluster role transition in Failover Cluster Manager. The SQL Server role should come online on the surviving node within the RTO window. Restart the cluster service on the recovered node afterward and verify it rejoins the cluster.
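
To put a number on the outage rather than watching the console, you can poll the virtual server name until it answers again. The following is a rough sketch, not a calibrated benchmark: Measure-Outage is a hypothetical helper, the probe is any script block you supply, and the clock starts when the function is called rather than at the exact moment of failure, so start it immediately after triggering the failover. The result is only as precise as the polling interval plus the probe's own connection timeout.

```powershell
# Hypothetical helper: polls a probe script block until it succeeds or a timeout expires.
function Measure-Outage {
    param(
        # Must return $true when the service is reachable; exceptions count as "still down"
        [Parameter(Mandatory)][scriptblock]$Probe,
        [int]$IntervalMs = 500,
        [int]$TimeoutSec = 300
    )

    $watch = [System.Diagnostics.Stopwatch]::StartNew()
    $attempts = 0
    while ($watch.Elapsed.TotalSeconds -lt $TimeoutSec) {
        $attempts++
        $ok = $false
        try { $ok = [bool](& $Probe) } catch { $ok = $false }
        if ($ok) {
            return [pscustomobject]@{
                Recovered     = $true
                Attempts      = $attempts
                OutageSeconds = [math]::Round($watch.Elapsed.TotalSeconds, 1)
            }
        }
        Start-Sleep -Milliseconds $IntervalMs
    }
    [pscustomobject]@{ Recovered = $false; Attempts = $attempts; OutageSeconds = $null }
}

# Example probe against the FCI (assumes the SqlServer module is installed):
# Measure-Outage -Probe {
#     $null = Invoke-Sqlcmd -ServerInstance SQLVIRTUAL01 -Query 'SELECT 1' `
#         -ConnectionTimeout 2 -ErrorAction Stop
#     $true
# }
```

Record the OutageSeconds value from each test run alongside the failover type so you can compare it with your RTO target.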

Step 7 — Monitor Cluster Resources with PowerShell

Ongoing monitoring ensures you catch degraded states before they become outages.

# View all cluster resources and their states
Get-ClusterResource | Select-Object Name, ResourceType, State, OwnerNode |
    Format-Table -AutoSize

# Check cluster node health
Get-ClusterNode | Select-Object Name, State, NodeWeight

# View cluster events from the last 24 hours
# (filtering inside Get-WinEvent avoids reading the whole log first)
Get-WinEvent -FilterHashtable @{
    LogName   = 'Microsoft-Windows-FailoverClustering/Operational'
    StartTime = (Get-Date).AddHours(-24)
} |
    Select-Object TimeCreated, Id, Message |
    Format-List

# Alert on any resource not in Online state
Get-ClusterResource | Where-Object State -ne "Online" |
    ForEach-Object { Write-Warning "Resource $($_.Name) is in state: $($_.State)" }

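Configure cluster-aware monitoring using Windows Admin Center or System Center Operations Manager for production environments. For email alerts, drive them from the Microsoft-Windows-FailoverClustering/Operational event channel, for example with a scheduled task attached to the relevant event IDs or with your monitoring platform's notification rules.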

RTO Expectations

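SQL FCI RTO typically falls in the 30–120 second range. The primary factors are database crash recovery time (driven by how much log must be redone since the last checkpoint and how many uncommitted transactions must be rolled back), the time to bring the disks online on the new owner node, and how quickly the cluster detects the failure. Detection is governed by the heartbeat settings: nodes exchange heartbeats every SameSubnetDelay milliseconds, and a node is declared down after SameSubnetThreshold consecutive misses. On Windows Server 2016 and later the defaults are a 1-second delay with a threshold of 10 (20 across subnets), which are the values shown below; raise the thresholds conservatively in high-latency environments:

(Get-Cluster).SameSubnetDelay = 1000     # milliseconds between heartbeats
(Get-Cluster).SameSubnetThreshold = 10   # missed heartbeats before the node is considered down
(Get-Cluster).CrossSubnetDelay = 1000
(Get-Cluster).CrossSubnetThreshold = 20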
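
The delay and threshold settings multiply: the time a node can stay silent before the cluster treats it as down is the heartbeat delay times the number of tolerated misses. A minimal sketch of that arithmetic follows; Get-HeartbeatTolerance is a hypothetical helper, not a cluster cmdlet.

```powershell
# Hypothetical helper: converts a delay/threshold pair into a detection window in seconds.
function Get-HeartbeatTolerance {
    param(
        [Parameter(Mandatory)][int]$DelayMs,     # e.g. (Get-Cluster).SameSubnetDelay
        [Parameter(Mandatory)][int]$Threshold    # e.g. (Get-Cluster).SameSubnetThreshold
    )
    # Seconds of missed heartbeats before the node is treated as down
    ($DelayMs * $Threshold) / 1000
}

Get-HeartbeatTolerance -DelayMs 1000 -Threshold 10   # same-subnet values above: 10
Get-HeartbeatTolerance -DelayMs 1000 -Threshold 20   # cross-subnet values above: 20
```

Whatever values you choose, this detection window is a floor on your RTO: disk failover and database recovery only begin after it has elapsed.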

Always On AGs using synchronous commit can achieve near-zero RPO and RTO under 30 seconds when properly configured. They do so without depending on shared storage, at the cost of keeping a full copy of the data on every replica. For most two-site HA/DR scenarios, AGs are the superior choice; for single-site shared storage environments where instance-level HA is required, FCI remains the established solution.

Conclusion

Windows Server Failover Clustering with SQL Server FCI on Windows Server 2025 provides a mature, well-understood high-availability solution backed by decades of production use. By carefully validating your cluster configuration before installation, correctly assigning shared storage resources, and running both planned and simulated failover tests, you build confidence that the system will behave exactly as expected during an actual failure event. Combine ongoing PowerShell-based monitoring with Windows Admin Center dashboards and you have a complete operational picture of your SQL Server cluster at all times. Always document your RTO measurements from testing — they are the baseline against which real incidents are measured.