Introduction to Disaster Recovery Testing on Windows Server 2019
Disaster recovery (DR) testing is the practice of regularly executing and validating the procedures documented in your disaster recovery plan to ensure they actually work when needed. An untested recovery plan is, at best, an optimistic assumption. Organizations that skip DR testing often discover critical gaps during an actual disaster — missing backups, outdated runbooks, changed firewall rules, expired credentials, or dependencies that were not captured in the DR documentation. Windows Server 2019 environments require systematic DR testing that covers backup restoration, system state recovery, bare metal recovery, Active Directory restoration, application failover, and network reconfiguration. This guide provides a structured framework for DR testing including test types, execution procedures, and documentation practices.
Types of Disaster Recovery Tests
There are five levels of DR testing, progressing from least disruptive to most realistic:
Tabletop Exercise: A discussion-based walkthrough of the DR plan with key stakeholders. No systems are touched. The goal is to identify gaps, ambiguities, and ownership issues in the plan documentation. Walk through a scenario such as: “The primary datacenter has lost power with no ETA for restoration. What do we do in the first 30 minutes? Next 4 hours? Next 24 hours?”
Structured Walkthrough: Each recovery team member independently reviews and validates their section of the DR plan, confirming that procedures, commands, and credentials are current and accurate.
Simulation Test: A partial recovery test where specific recovery procedures are executed against isolated test systems. For example, restore last night’s backup to an isolated test server and verify the application functions correctly.
Parallel Test: Activate the DR environment in parallel with the production environment, verifying that DR systems can handle production workloads without actually diverting traffic.
Full Interruption Test: Actually switch production to the DR environment for a defined window. This is the highest-risk and most disruptive test type, typically reserved for environments where DR validation is a regulatory requirement.
Quarterly Backup Restore Test Procedure
A quarterly file and application restore test should be part of every Windows Server 2019 environment’s DR testing program. The following procedure tests Windows Server Backup restore functionality:
Identify the test window: schedule the test during a maintenance window when production impact is acceptable.
Provision an isolated test server (a VM is ideal) with the same OS version as the source. Keep the test server isolated from the production network throughout the test, allowing access only to the backup target.
Identify the most recent backup version:
wbadmin get versions -backupTarget:\\backupserver\backups
Restore a critical volume to the test server:
wbadmin start recovery -version:05/17/2026-22:00 -items:D: -itemtype:Volume -recoveryTarget:D: -backupTarget:\\backupserver\backups -quiet
After restoration, verify data integrity. Compare file counts between source and restored destination:
(Get-ChildItem "D:\Shares" -Recurse -File).Count
Spot-check specific critical files by opening them in the appropriate application and verifying content is correct and not corrupted. Document the RTO achieved — the time from starting the restore to verified application availability.
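File counts alone cannot reveal a file that was restored with altered content. As a minimal sketch of a stricter check, the function below compares two directory trees by relative path and SHA-256 hash. The function name Compare-RestoredTree and its parameters are hypothetical, and the sketch assumes both the source and the restored tree are reachable from the machine running it.

```powershell
# Sketch only: Compare-RestoredTree is a hypothetical helper, not a built-in cmdlet.
function Compare-RestoredTree {
    param(
        [Parameter(Mandatory)] [string] $SourcePath,
        [Parameter(Mandatory)] [string] $RestoredPath
    )

    # Build a table of relative path -> SHA-256 hash for every file under a root.
    function Get-HashMap([string] $Root) {
        $map = @{}
        $rootFull = (Resolve-Path -LiteralPath $Root).ProviderPath
        Get-ChildItem -LiteralPath $rootFull -Recurse -File | ForEach-Object {
            $relative = $_.FullName.Substring($rootFull.Length).TrimStart('\', '/')
            $map[$relative] = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
        }
        return $map
    }

    $source   = Get-HashMap $SourcePath
    $restored = Get-HashMap $RestoredPath

    # Files present in the source but absent from the restore.
    $missing = @($source.Keys | Where-Object { -not $restored.ContainsKey($_) })
    # Files present in both trees whose content differs.
    $mismatched = @($source.Keys | Where-Object {
        $restored.ContainsKey($_) -and $restored[$_] -ne $source[$_]
    })

    [PSCustomObject]@{
        SourceFiles   = $source.Count
        RestoredFiles = $restored.Count
        Missing       = $missing
        Mismatched    = $mismatched
        Pass          = ($missing.Count -eq 0 -and $mismatched.Count -eq 0)
    }
}
```

Hashing every file is slow on large volumes, so in practice it may be reasonable to run this against a representative subfolder rather than the whole restored volume. Files that changed in production after the backup was taken will show up as mismatches, so compare against a snapshot or export taken at backup time where possible.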
Active Directory Recovery Test
For environments using Active Directory, quarterly AD restore tests are essential. Test AD recovery in an isolated network to avoid conflicting with production. The procedure for a non-authoritative AD restore test:
Deploy an isolated VM with no network connection to production.
Boot from Windows Server 2019 installation media.
At the repair screen, select Troubleshoot > System Image Recovery.
Restore a domain controller backup that includes the system state to the test VM. System Image Recovery works from a full server or bare metal recovery backup, not from a system-state-only backup.
After the restore completes and the server reboots, verify AD DS services start correctly:
Get-Service -Name "NTDS", "DNS", "Kdc", "Netlogon" | Select-Object Name, Status
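Verify Active Directory database integrity. The integrity check runs only while the database is offline, so stop the NTDS service first (or boot into Directory Services Restore Mode) and start it again afterwards: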
ntdsutil "activate instance ntds" "files" "integrity" quit quit
Query AD to verify objects are present and correct:
Get-ADUser -Filter * | Measure-Object | Select-Object Count
Get-ADComputer -Filter * | Measure-Object | Select-Object Count
Compare these counts to known-good production values from a recent Active Directory inventory export.
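The comparison against the baseline can be scripted so that the result is a recorded pass or fail rather than a judgment by eye. The following is a minimal sketch: Compare-AdObjectCount, its parameters, and the 2% default tolerance are all hypothetical choices for illustration, and the baseline values are assumed to come from your own inventory export.

```powershell
# Sketch only: Compare-AdObjectCount and its default tolerance are illustrative assumptions.
function Compare-AdObjectCount {
    param(
        [Parameter(Mandatory)] [hashtable] $Baseline,   # e.g. @{ Users = 1000; Computers = 200 }
        [Parameter(Mandatory)] [hashtable] $Restored,   # counts measured on the restored DC
        [double] $TolerancePercent = 2.0                # allowed drift between backup and inventory
    )

    foreach ($type in $Baseline.Keys) {
        $expected = [int] $Baseline[$type]
        $actual   = 0
        if ($Restored.ContainsKey($type)) { $actual = [int] $Restored[$type] }

        if ($expected -eq 0) {
            # Avoid dividing by zero: any object where none is expected counts as full drift.
            $driftPct = if ($actual -eq 0) { 0.0 } else { 100.0 }
        } else {
            $driftPct = [math]::Round([double][math]::Abs($actual - $expected) / $expected * 100, 2)
        }

        [PSCustomObject]@{
            ObjectType = $type
            Expected   = $expected
            Actual     = $actual
            DriftPct   = $driftPct
            Pass       = ($driftPct -le $TolerancePercent)
        }
    }
}
```

On the restored domain controller, the Restored table could be filled from the queries above, for example with (Get-ADUser -Filter * | Measure-Object).Count for the Users entry. A small tolerance is used instead of an exact match because the backup and the inventory export are rarely taken at the same moment.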
Azure Site Recovery Test Failover
For environments using Azure Site Recovery, perform a test failover monthly:
In the Azure Portal, navigate to the Recovery Services Vault > Replicated Items.
Select a protected server and click Test Failover.
Choose the latest processed recovery point (for minimum RTO testing) and an isolated test Azure VNet.
Click OK.
Monitor the failover job:
Get-AzRecoveryServicesAsrJob -Name "TestFailoverJob-GUID" | Select-Object DisplayName, State, StartTime, EndTime
After the Azure VM is created (typically 5–15 minutes), RDP to it and run application health checks. Verify the application connects to its database and responds to requests. Verify key Windows services are running:
Get-Service | Where-Object { $_.StartType -eq "Automatic" -and $_.Status -ne "Running" } | Select-Object Name, Status, StartType
Record the exact time the VM became available and application health checks passed — this is your actual RTO, which should be compared to the target RTO in your DR plan. After the test, click Cleanup test failover in the Azure Portal to remove the test VM.
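The RTO comparison is simple arithmetic on two recorded timestamps, and scripting it keeps the calculation consistent between tests. This is a minimal sketch that assumes the recovery time is measured from the moment the failover was started to the moment health checks passed. Measure-RecoveryTime and its parameter names are hypothetical.

```powershell
# Sketch only: Measure-RecoveryTime is a hypothetical helper for recording test results.
function Measure-RecoveryTime {
    param(
        [Parameter(Mandatory)] [datetime] $FailoverStarted,
        [Parameter(Mandatory)] [datetime] $HealthChecksPassed,
        [Parameter(Mandatory)] [timespan] $TargetRto
    )

    if ($HealthChecksPassed -lt $FailoverStarted) {
        throw "HealthChecksPassed must not be earlier than FailoverStarted."
    }

    $actual = $HealthChecksPassed - $FailoverStarted

    [PSCustomObject]@{
        ActualRto = $actual
        TargetRto = $TargetRto
        Margin    = $TargetRto - $actual      # negative when the target was missed
        Pass      = ($actual -le $TargetRto)
    }
}

# Example with made-up timestamps: failover started at 22:00, checks passed at 22:47.
Measure-RecoveryTime -FailoverStarted '2026-05-17T22:00:00' `
                     -HealthChecksPassed '2026-05-17T22:47:00' `
                     -TargetRto ([timespan]::FromHours(1))
```

The StartTime value returned by Get-AzRecoveryServicesAsrJob could serve as the start timestamp, while the health check time has to be recorded by whoever runs the checks.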
DR Test Runbook and Automation
A DR runbook is a step-by-step document that any qualified administrator can follow to execute the recovery. Automate verification steps in PowerShell to reduce human error and speed up testing. Create a DR validation script that runs automatically after a restore:
# dr-validation.ps1
# Post-restore validation: checks services, disk space, and connectivity,
# then exits 0 if every check passed or 1 if any check failed.
$TestResults = @()

# Test 1: Check critical services are running
$CriticalServices = @("W3SVC", "MSSQLSERVER", "DNS", "LanmanServer")
foreach ($svc in $CriticalServices) {
    # $status is $null when the service is not installed, which counts as a failure
    $status = (Get-Service -Name $svc -ErrorAction SilentlyContinue).Status
    $TestResults += [PSCustomObject]@{Test="Service: $svc"; Result=$status; Pass=($status -eq "Running")}
}

# Test 2: Check disk space on all volumes
# Drives that report no capacity (for example, an empty optical drive) are skipped to avoid dividing by zero
Get-PSDrive -PSProvider FileSystem | Where-Object { ($_.Used + $_.Free) -gt 0 } | ForEach-Object {
    $pctFree = [math]::Round(($_.Free / ($_.Used + $_.Free)) * 100, 1)
    $TestResults += [PSCustomObject]@{Test="Disk: $($_.Root)"; Result="$pctFree% free"; Pass=($pctFree -gt 10)}
}

# Test 3: Check network connectivity
$NetworkTargets = @("dc01.corp.example.com", "sqlserver01.corp.example.com", "fileserver01.corp.example.com")
foreach ($target in $NetworkTargets) {
    $ping = Test-Connection -ComputerName $target -Count 1 -Quiet
    # if/else is used because the ternary operator requires PowerShell 7 or later
    $result = if ($ping) { "Success" } else { "Failed" }
    $TestResults += [PSCustomObject]@{Test="Ping: $target"; Result=$result; Pass=$ping}
}

# Report
$TestResults | Format-Table Test, Result, Pass -AutoSize
# @() forces an array so that .Count is reliable even when exactly one test fails
$FailedTests = @($TestResults | Where-Object { -not $_.Pass })
if ($FailedTests.Count -gt 0) {
    Write-Host "DR VALIDATION FAILED: $($FailedTests.Count) tests did not pass." -ForegroundColor Red
    exit 1
} else {
    Write-Host "DR VALIDATION PASSED: All $($TestResults.Count) tests passed." -ForegroundColor Green
    exit 0
}
Documenting DR Test Results
Document each DR test with a standardized test report that captures: test date and participants, test type performed (simulation, parallel, etc.), systems tested and recovery methods used, actual RTO achieved vs target RTO, actual RPO (data loss) vs target RPO, each step performed with timestamps, issues discovered during the test, remediation actions assigned and owners, and a pass/fail determination for each test objective.
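A standardized report is easier to keep consistent when it is produced from a fixed structure rather than typed freehand. The sketch below builds a report object covering a subset of the fields listed above and serializes it to JSON. New-DrTestReport and all of its field names are hypothetical, and fields such as per-step timestamps and remediation owners are left out to keep the example short.

```powershell
# Sketch only: New-DrTestReport and its field names are illustrative, not a standard schema.
function New-DrTestReport {
    param(
        [Parameter(Mandatory)] [string]   $TestType,
        [Parameter(Mandatory)] [string[]] $Participants,
        [Parameter(Mandatory)] [string[]] $SystemsTested,
        [Parameter(Mandatory)] [timespan] $ActualRto,
        [Parameter(Mandatory)] [timespan] $TargetRto,
        [Parameter(Mandatory)] [timespan] $ActualRpo,
        [Parameter(Mandatory)] [timespan] $TargetRpo,
        [string[]] $Issues = @(),
        [datetime] $TestDate = (Get-Date)
    )

    [PSCustomObject]@{
        TestDate         = $TestDate.ToString('yyyy-MM-dd')
        TestType         = $TestType
        Participants     = $Participants
        SystemsTested    = $SystemsTested
        ActualRtoMinutes = [math]::Round($ActualRto.TotalMinutes, 1)
        TargetRtoMinutes = [math]::Round($TargetRto.TotalMinutes, 1)
        ActualRpoMinutes = [math]::Round($ActualRpo.TotalMinutes, 1)
        TargetRpoMinutes = [math]::Round($TargetRpo.TotalMinutes, 1)
        RtoMet           = ($ActualRto -le $TargetRto)
        RpoMet           = ($ActualRpo -le $TargetRpo)
        Issues           = $Issues
    }
}

# Example with made-up values: a restore that took 4 hours against a 2-hour target.
$report = New-DrTestReport -TestType 'Simulation' `
    -Participants 'A. Admin', 'B. Backup' `
    -SystemsTested 'fileserver01' `
    -ActualRto ([timespan]::FromHours(4)) -TargetRto ([timespan]::FromHours(2)) `
    -ActualRpo ([timespan]::FromHours(1)) -TargetRpo ([timespan]::FromHours(24)) `
    -Issues 'Restore throughput limited by backup storage' `
    -TestDate '2026-05-17'

$report | ConvertTo-Json -Depth 3
```

JSON is chosen here only because it diffs cleanly in version control and is easy to load back with ConvertFrom-Json. Any structured format your team already uses would serve the same purpose.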
Store DR test reports in a location accessible to all DR team members and auditors — a SharePoint library, a dedicated documentation server, or a version-controlled Git repository. Review test history quarterly to identify recurring issues and trends. Use DR test results to justify infrastructure investments — for example, if backup restoration consistently takes 4 hours against a 2-hour RTO target, the evidence supports investing in faster backup storage or additional DR tooling.
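If reports are stored in a structured form, the quarterly review for recurring issues can be partly automated. As a minimal sketch, the function below counts how often each issue text appears across a set of report objects. Get-RecurringIssue is a hypothetical name, and the sketch assumes each report has an Issues property holding a list of short, consistently worded issue descriptions.

```powershell
# Sketch only: Get-RecurringIssue assumes reports expose an Issues property (a list of strings).
function Get-RecurringIssue {
    param(
        [Parameter(Mandatory)] [object[]] $Reports,
        [int] $MinimumOccurrences = 2
    )

    $Reports |
        ForEach-Object { $_.Issues } |                              # flatten all issues into one stream
        Group-Object |                                              # group identical issue texts
        Where-Object { $_.Count -ge $MinimumOccurrences } |
        Sort-Object Count -Descending |
        Select-Object @{ Name = 'Issue'; Expression = { $_.Name } }, Count
}

# Example with made-up report history.
$history = @(
    [PSCustomObject]@{ TestDate = '2025-11-15'; Issues = @('Expired service account password', 'Slow restore') }
    [PSCustomObject]@{ TestDate = '2026-02-14'; Issues = @('Slow restore') }
    [PSCustomObject]@{ TestDate = '2026-05-17'; Issues = @('Slow restore', 'Firewall rule missing') }
)

Get-RecurringIssue -Reports $history
```

Exact text matching only works if issues are recorded with consistent wording, so a short list of agreed issue categories may be more practical than free text.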
Updating the DR Plan After Each Test
A DR test is only valuable if it results in plan improvements. After each test, hold a brief retrospective meeting to discuss what worked, what did not work, and what was missing from the runbook. Update the DR plan documentation to reflect current infrastructure configuration, correct any commands that failed during the test, update responsible party assignments, and record the verified RTO and RPO. Retest any failed or incomplete recovery procedures at the next test cycle. The DR plan is a living document — treat it as such by committing to update it after every test, major infrastructure change, and application deployment.