Affected versions: Windows Server 2022

Table of contents
  1. Symptom & Impact
  2. Environment & Reproduction
  3. Root Cause Analysis
  4. Quick Triage
  5. Step-by-Step Diagnosis
  6. Solution: Primary Fix
  7. Solution: Alternative Approaches
  8. Verification & Acceptance Criteria
  9. Rollback Plan
  10. Prevention & Hardening
  11. Related Errors & Cross-Refs
  12. References & Further Reading

Symptom & Impact

The secondary replica remains in the NOT SYNCHRONIZING or DISCONNECTED state, which weakens the high-availability posture and raises failover risk. Read-only routing and disaster-recovery readiness are also affected.

Environment & Reproduction

Observed after patch drift, endpoint certificate issues, firewall changes, or network interruptions between replicas.

Invoke-Sqlcmd -Query "SELECT replica_server_name,synchronization_health_desc,connected_state_desc FROM sys.dm_hadr_availability_replica_states ars JOIN sys.availability_replicas r ON ars.replica_id=r.replica_id"
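# $SecondaryHost is a placeholder for the partner replica's host name; set it before running
Test-NetConnection -ComputerName $SecondaryHost -Port 5022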

Root Cause Analysis

Root causes include endpoint mismatch, suspended data movement, LSN divergence after a long outage, and service account or SPN authentication failures.

Quick Triage

Assess endpoint reachability, health state, and whether data movement is suspended.

Invoke-Sqlcmd -Query "SELECT database_id,synchronization_state_desc,is_suspended FROM sys.dm_hadr_database_replica_states"
Get-Service MSSQLSERVER
# Filter by provider at the source so the 40 events returned are all SQL Server events
Get-WinEvent -FilterHashtable @{LogName='Application'; ProviderName='MSSQLSERVER'} -MaxEvents 40
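
As a rough sketch of how the triage output could be read, the helper below classifies one row from sys.dm_hadr_database_replica_states. The function name Get-TriageFinding and the labels it returns are hypothetical, and the sample rows are made up so the block runs without a SQL Server instance.

```powershell
# Hypothetical helper: classifies one row that carries the
# synchronization_state_desc and is_suspended columns.
function Get-TriageFinding {
    param([Parameter(Mandatory)] $Row)
    # Compare against $true explicitly so a NULL (DBNull) value is not treated as suspended.
    if ($Row.is_suspended -eq $true) { return 'Suspended' }
    if ($Row.synchronization_state_desc -eq 'NOT SYNCHRONIZING') { return 'NotSynchronizing' }
    return 'OK'
}

# Made-up sample rows; in practice they would come from the Invoke-Sqlcmd query above.
$sampleRows = @(
    [pscustomobject]@{ database_id = 5; synchronization_state_desc = 'NOT SYNCHRONIZING'; is_suspended = $true }
    [pscustomobject]@{ database_id = 6; synchronization_state_desc = 'SYNCHRONIZED'; is_suspended = $false }
)
foreach ($row in $sampleRows) {
    '{0}: {1}' -f $row.database_id, (Get-TriageFinding -Row $row)
}
```

A 'Suspended' finding points at resuming data movement first; 'NotSynchronizing' without suspension points at connectivity or endpoint checks.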

Step-by-Step Diagnosis

Validate endpoint configuration and AG role/state consistency on both primary and secondary nodes.

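# sys.database_mirroring_endpoints has no port column; the port comes from sys.tcp_endpoints
Invoke-Sqlcmd -Query "SELECT e.name,e.state_desc,e.role_desc,t.port FROM sys.database_mirroring_endpoints e JOIN sys.tcp_endpoints t ON e.endpoint_id=t.endpoint_id"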
Invoke-Sqlcmd -Query "SELECT ag.name,ar.replica_server_name,ar.endpoint_url FROM sys.availability_groups ag JOIN sys.availability_replicas ar ON ag.group_id=ar.group_id"
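
One consistency check that can be sketched here is comparing the port in each replica's endpoint_url with the port the mirroring endpoint actually listens on (sys.tcp_endpoints exposes a port column). The function names Get-EndpointUrlPort and Test-EndpointUrlPort and the sample URLs are hypothetical; this is a minimal sketch, not a complete validation.

```powershell
# Hypothetical helper: extracts the port from an endpoint_url such as TCP://host:5022.
function Get-EndpointUrlPort {
    param([Parameter(Mandatory)][string] $EndpointUrl)
    if ($EndpointUrl -match ':(\d+)\s*$') { return [int]$Matches[1] }
    return $null   # no port found at the end of the URL
}

# Returns $true when the URL port equals the port the endpoint listens on.
function Test-EndpointUrlPort {
    param(
        [Parameter(Mandatory)][string] $EndpointUrl,
        [Parameter(Mandatory)][int] $ListenerPort
    )
    return ((Get-EndpointUrlPort -EndpointUrl $EndpointUrl) -eq $ListenerPort)
}

# Made-up sample values (assumptions, not taken from a real system)
Test-EndpointUrlPort -EndpointUrl 'TCP://sql02.example.com:5022' -ListenerPort 5022
Test-EndpointUrlPort -EndpointUrl 'TCP://sql02.example.com:5023' -ListenerPort 5022
```

A mismatch suggests that the availability group metadata and the endpoint definition have drifted apart on that replica.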

Solution: Primary Fix

Resume data movement, correct the endpoint and firewall policy, and restart the SQL Server service only if the endpoint state remains stale.

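# $DatabaseName is a placeholder for the affected database; set it before running
Invoke-Sqlcmd -Query "ALTER DATABASE [$DatabaseName] SET HADR RESUME"
New-NetFirewallRule -DisplayName 'Allow HADR 5022' -Direction Inbound -Protocol TCP -LocalPort 5022 -Action Allow
# Last resort: restart only if the endpoint state remains stale
Restart-Service MSSQLSERVER -Force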
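
When several databases are suspended, the resume statement can be generated per database instead of typed by hand. The sketch below assumes a hypothetical helper New-HadrResumeStatement; it doubles any closing bracket in the name, the same escaping rule T-SQL QUOTENAME applies, so the name is safe inside [ ] delimiters.

```powershell
# Hypothetical helper: builds the resume statement for one database name.
function New-HadrResumeStatement {
    param([Parameter(Mandatory)][string] $DatabaseName)
    # Escape ] as ]] so the name cannot break out of the bracket delimiters.
    $quoted = '[' + $DatabaseName.Replace(']', ']]') + ']'
    return "ALTER DATABASE $quoted SET HADR RESUME"
}

# Made-up sample names; a live run could feed names from a query such as
#   SELECT DB_NAME(database_id) AS name FROM sys.dm_hadr_database_replica_states
#   WHERE is_local = 1 AND is_suspended = 1
'Sales', 'Odd]Name' | ForEach-Object { New-HadrResumeStatement -DatabaseName $_ }
```

Each generated string can then be passed to Invoke-Sqlcmd -Query on the replica where data movement is suspended.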

Solution: Alternative Approaches

If divergence is significant, remove the affected database from the availability group and rejoin it using a fresh chain of full and log backup restores.

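# $AgName and $DatabaseName are placeholders; set them before running. Run on the primary.
Invoke-Sqlcmd -Query "ALTER AVAILABILITY GROUP [$AgName] REMOVE DATABASE [$DatabaseName]"
# On the secondary: restore the full and log backups WITH NORECOVERY
# On the primary: re-add the database; the secondary then joins with ALTER DATABASE ... SET HADR AVAILABILITY GROUP
Invoke-Sqlcmd -Query "ALTER AVAILABILITY GROUP [$AgName] ADD DATABASE [$DatabaseName]"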

Verification & Acceptance Criteria

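All replicas should show HEALTHY, and each database should be SYNCHRONIZED (or SYNCHRONIZING where the replica is asynchronous-commit by design), with log send and redo queue sizes that are stable or shrinking.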

Invoke-Sqlcmd -Query "SELECT database_id,synchronization_state_desc,synchronization_health_desc,log_send_queue_size,redo_queue_size FROM sys.dm_hadr_database_replica_states"
Invoke-Sqlcmd -Query "SELECT replica_server_name,connected_state_desc FROM sys.dm_hadr_availability_replica_states ars JOIN sys.availability_replicas r ON ars.replica_id=r.replica_id"
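
The acceptance criteria above can be expressed as a small pass/fail check. This is a sketch under assumptions: the function name Test-AgDatabaseHealthy and its -AsyncCommit switch are hypothetical, and the sample rows are made up to stand in for output of the first query.

```powershell
# Hypothetical check: $true when a row meets the acceptance criteria.
function Test-AgDatabaseHealthy {
    param(
        [Parameter(Mandatory)] $Row,
        [switch] $AsyncCommit   # set for replicas that are asynchronous-commit by design
    )
    $okStates = @('SYNCHRONIZED')
    if ($AsyncCommit) { $okStates += 'SYNCHRONIZING' }
    return ($Row.synchronization_health_desc -eq 'HEALTHY') -and
           ($okStates -contains $Row.synchronization_state_desc)
}

# Made-up sample rows
$syncRow  = [pscustomobject]@{ synchronization_state_desc = 'SYNCHRONIZED';  synchronization_health_desc = 'HEALTHY' }
$asyncRow = [pscustomobject]@{ synchronization_state_desc = 'SYNCHRONIZING'; synchronization_health_desc = 'HEALTHY' }

Test-AgDatabaseHealthy -Row $syncRow
Test-AgDatabaseHealthy -Row $asyncRow -AsyncCommit
```

Without -AsyncCommit, a SYNCHRONIZING row fails the check, which matches the expectation for synchronous-commit replicas.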

Rollback Plan

If the new changes destabilize the AG, revert the endpoint and firewall modifications and fall back to a known healthy primary-only posture.

Remove-NetFirewallRule -DisplayName 'Allow HADR 5022' -ErrorAction SilentlyContinue
# $DatabaseName is a placeholder for the affected database; set it before running
Invoke-Sqlcmd -Query "ALTER DATABASE [$DatabaseName] SET HADR SUSPEND"

Prevention & Hardening

Enforce configuration drift checks, monitor queue metrics, and validate endpoint cert/SPN health after every patch cycle.

Invoke-Sqlcmd -Query "SELECT GETDATE() AS ts, database_id,log_send_queue_size,redo_queue_size FROM sys.dm_hadr_database_replica_states"
Get-ScheduledTask | Where-Object {$_.TaskName -match 'AG|SQL'}
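
Queue monitoring can be reduced to a threshold check on each sampled row. The sketch below is illustrative only: the function name Get-QueueAlert and the 102400 KB default limits are assumptions, not recommended values (both queue columns are reported in KB).

```powershell
# Hypothetical check: lists which queues on a row exceed their limit.
function Get-QueueAlert {
    param(
        [Parameter(Mandatory)] $Row,
        [long] $SendQueueLimitKB = 102400,   # assumed threshold, tune per environment
        [long] $RedoQueueLimitKB = 102400    # assumed threshold, tune per environment
    )
    # -as yields $null for values that cannot convert (for example a NULL column),
    # and $null never compares as greater than the limit.
    $send = $Row.log_send_queue_size -as [long]
    $redo = $Row.redo_queue_size -as [long]
    $alerts = @()
    if ($send -gt $SendQueueLimitKB) { $alerts += 'log_send_queue' }
    if ($redo -gt $RedoQueueLimitKB) { $alerts += 'redo_queue' }
    return ,$alerts   # leading comma keeps the result an array even when empty
}

# Made-up sample row
$sample = [pscustomobject]@{ database_id = 5; log_send_queue_size = 250000; redo_queue_size = 1200 }
Get-QueueAlert -Row $sample
```

A scheduled task could run the sampling query and raise an alert whenever the returned list is not empty.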

Related Errors & Cross-Refs

This issue is related to endpoint authentication failures, quorum instability, and suspended data movement after storage or network incidents.

References & Further Reading

Microsoft Always On troubleshooting references, endpoint configuration guidance, and SQL Server HA operations best practices.
