Introduction to Reliability Monitor on Windows Server 2022

Reliability Monitor is a built-in Windows Server 2022 diagnostic tool that provides a chronological view of system stability by correlating software installations, updates, application crashes, Windows failures, and other events against a rolling reliability index score. While not a real-time monitoring tool, it is invaluable for post-incident analysis, identifying when a server’s stability degraded, and correlating that degradation with specific changes such as driver installations or Windows updates. This guide covers accessing Reliability Monitor, interpreting its data, extracting reliability history via PowerShell, understanding the underlying RACTask data collection mechanism, and correlating reliability events with system changes for root cause analysis.

Accessing Reliability Monitor

Reliability Monitor can be opened through several paths on Windows Server 2022:

# From Run dialog (Win+R) or PowerShell
perfmon /rel

# Via Control Panel path (if Desktop Experience is installed)
# Control Panel > System and Security > Security and Maintenance > Reliability Monitor

# Via search
# Start > Search > "View reliability history"

# From Computer Management > Performance > Reliability Monitor
compmgmt.msc

Note that Reliability Monitor requires the Desktop Experience feature to be installed. Windows Server Core does not display the GUI. However, the underlying reliability data is still collected on Server Core and can be accessed via PowerShell cmdlets and WMI as described later in this guide.

When Reliability Monitor opens, it displays a stability history chart spanning the last 28 days by default. The vertical axis shows the Reliability Index from 1 (least stable) to 10 (most stable). The horizontal axis shows dates. Below the chart is a categorised event list showing what occurred on each day.

Understanding the Reliability History Chart

The Reliability Index is calculated daily based on the number and severity of failures recorded. The score starts at 10 for a freshly installed system and decreases when failures occur. The index recovers gradually over time if no new failures occur, increasing by approximately 0.1 points per stable day. A score persistently below 6 indicates a system experiencing frequent instability that warrants investigation.

The chart uses coloured icons in rows beneath the timeline to represent different event types:

Application Failures (red X in the Applications row) — these include application crashes (where the application process terminates unexpectedly and a crash dump may be generated), application hang events (where an application stops responding and Windows terminates it after a timeout), and application stop errors.

Windows Failures (red X in the Windows row) — these include Stop errors (BSODs / kernel-mode crashes), Windows Update failures, and boot failures. A BSOD on a Windows Server 2022 system appears here as an Operating System: Unexpected shutdown event. Click the event to see the associated Bug Check code and the dump file path for further analysis with WinDbg.

Miscellaneous Failures (red X in the Other failures row) — hardware errors reported by the kernel, driver failures, and other non-application non-OS failures. Disk errors detected by the storage subsystem often appear here alongside Windows Error Reporting entries.

Warnings (yellow warning triangle) — non-fatal issues such as Windows Update installation warnings, driver installation warnings, and application installations that completed with warnings.

Informational events (blue i icon) — successful application installations, Windows Updates successfully applied, driver installations, and Windows Backup operations. These are critical for correlating when a stability drop began with what changed on that day.

Investigating Application Failures in Detail

Click any red X in the Application Failures row to expand the details panel at the bottom of Reliability Monitor. Each entry shows: the application name, version, fault module name and version, exception code, and the path to the generated crash dump file (if Windows Error Reporting collected a dump).

The fault module is particularly useful — if multiple different applications are crashing with the same fault module (e.g. a common DLL like msvcr140.dll or a third-party driver DLL), it indicates a shared component is the root cause rather than individual application bugs.
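The shared-component check described above can be sketched in PowerShell. This is a minimal illustration, assuming the crash details have already been collected into objects with Application and FaultModule properties; those property names, the function name Get-SharedFaultModule, and the sample data are all hypothetical.

```powershell
# Sketch: list fault modules that appear in crashes of more than one application.
# Input objects are assumed to expose Application and FaultModule properties.
function Get-SharedFaultModule {
    param(
        [Parameter(Mandatory)]
        [object[]]$CrashRecord
    )
    $CrashRecord |
        Group-Object -Property FaultModule |
        ForEach-Object {
            # Distinct applications that crashed in this module
            $apps = @($_.Group.Application | Sort-Object -Unique)
            if ($apps.Count -gt 1) {
                [pscustomobject]@{
                    FaultModule  = $_.Name
                    Applications = $apps
                    CrashCount   = $_.Count
                }
            }
        }
}

# Made-up sample data, as it might be transcribed from the details panel
$sample = @(
    [pscustomobject]@{ Application = 'app1.exe'; FaultModule = 'shared.dll' }
    [pscustomobject]@{ Application = 'app2.exe'; FaultModule = 'shared.dll' }
    [pscustomobject]@{ Application = 'app1.exe'; FaultModule = 'app1core.dll' }
)
Get-SharedFaultModule -CrashRecord $sample
```

A module that appears only under a single application is filtered out, since that case points back at the application itself.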

Click View technical details to see the full Windows Error Reporting (WER) event data including the Module Timestamp and Exception Offset. These values, combined with the application’s symbol files, allow a developer to pinpoint the exact line of code that caused the crash.

Application crash dumps are stored in %LOCALAPPDATA%\CrashDumps for interactive application crashes or in the folder configured under the WER LocalDumps registry key for service crashes:

# Check WER LocalDumps configuration for a specific service
Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\myservice.exe"

# Configure WER to collect full dumps for a specific application
# (-Force on New-ItemProperty overwrites existing values so the commands can be re-run)
$regPath = "HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\myapp.exe"
New-Item -Path $regPath -Force
New-ItemProperty -Path $regPath -Name DumpFolder -Value "C:\CrashDumps" -PropertyType ExpandString -Force
New-ItemProperty -Path $regPath -Name DumpType -Value 2 -PropertyType DWord -Force  # 2 = full dump
New-ItemProperty -Path $regPath -Name DumpCount -Value 10 -PropertyType DWord -Force  # keep at most 10 dumps

Investigating Windows Failures and Stop Errors

A Windows Failure entry in Reliability Monitor with the description Unexpected shutdown or Operating System: Windows stopped working indicates a kernel Stop error (BSOD). Click the entry to see the Stop error code (e.g. 0x0000007E SYSTEM_THREAD_EXCEPTION_NOT_HANDLED) and the path to the memory dump file.

Windows Server 2022 stores crash dumps in C:\Windows\Minidump for small memory dumps or C:\Windows\MEMORY.DMP for complete/kernel memory dumps. Configure the dump type in System Properties > Advanced > Startup and Recovery:

# Check current crash dump settings via PowerShell
# (Get-CimInstance works in both Windows PowerShell 5.1 and PowerShell 7)
$cs = Get-CimInstance -ClassName Win32_OSRecoveryConfiguration
$cs | Select-Object Name, DebugInfoType, ExpandedMiniDumpDirectory, ExpandedDebugFilePath

# DebugInfoType values:
# 0 = None
# 1 = Complete memory dump  
# 2 = Kernel memory dump (recommended for servers)
# 3 = Small memory dump (minidump)
# 7 = Automatic memory dump

# Set to Kernel memory dump via registry (takes effect after the next restart)
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\CrashControl" -Name CrashDumpEnabled -Value 2
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\CrashControl" -Name DumpFile -Value "C:\Windows\MEMORY.DMP"

Analyse the memory dump with WinDbg from the Windows SDK. Open the dump file and run !analyze -v for an automated analysis that identifies the failing module and provides a probable cause summary:

# In the WinDbg command window, after loading the dump file:
# Automated crash analysis
!analyze -v
# List loaded modules
lm
# List all processes at the time of the crash
!process 0 0
# Show information for the current (crashing) thread
!thread

Reading the Reliability Index Score in Context

The reliability index provides a quick health signal but must be interpreted in context. A score of 9.5 with occasional application crashes from a non-critical background process is acceptable on a production server. A score of 3.0 caused by daily Windows Stop errors requires urgent attention regardless of current uptime.

Key patterns to watch for in the Reliability Monitor chart:

A sudden cliff drop — the index drops sharply on a specific date. Examine the information events on that date for software installations, updates, or driver changes that coincide with the drop. This is the most common pattern when a Windows Update or driver installation introduces instability.

A gradual decline — the index drops steadily over days or weeks. This typically indicates a memory leak or disk degradation that causes intermittent failures accumulating over time. Correlate with memory and disk counters from Performance Monitor logs.

A flat low score — the server has been unstable for a long time. Investigate the oldest failure events to find the original cause before addressing the symptom.
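The three patterns above can be approximated with a small classifier over a series of daily index values. This is a rough sketch under stated assumptions: the function name Get-StabilityPattern and the cliff and decline thresholds are invented for illustration, while the low-score threshold of 6 follows the guidance earlier in this guide.

```powershell
# Sketch: classify a series of daily reliability index values (oldest first).
# CliffThreshold and DeclineThreshold are assumed values, not documented ones.
function Get-StabilityPattern {
    param(
        [Parameter(Mandatory)][double[]]$DailyIndex,
        [double]$CliffThreshold = 2.0,    # single-day drop treated as a cliff
        [double]$DeclineThreshold = 1.0,  # total fall treated as a gradual decline
        [double]$LowThreshold = 6.0       # scores below this warrant investigation
    )
    if ($DailyIndex.Count -lt 2) { return 'InsufficientData' }

    # Largest fall between two consecutive days
    $largestDrop = 0.0
    for ($i = 1; $i -lt $DailyIndex.Count; $i++) {
        $drop = $DailyIndex[$i - 1] - $DailyIndex[$i]
        if ($drop -gt $largestDrop) { $largestDrop = $drop }
    }
    $max = ($DailyIndex | Measure-Object -Maximum).Maximum
    $totalDecline = $DailyIndex[0] - $DailyIndex[-1]

    if ($largestDrop -ge $CliffThreshold)    { return 'CliffDrop' }
    if ($max -lt $LowThreshold)              { return 'FlatLow' }
    if ($totalDecline -ge $DeclineThreshold) { return 'GradualDecline' }
    return 'Stable'
}

# Example with made-up values: a sharp fall on the fourth day
Get-StabilityPattern -DailyIndex @(10, 10, 9.8, 6.5, 6.6)
```

The checks run in order of urgency, so a series that both falls sharply and stays low is reported as a cliff drop first.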

Exporting Reliability Data with PowerShell and Get-WinEvent

Reliability Monitor’s data is stored in the Windows Reliability Analysis Component (RAC) which writes events to the Microsoft-Windows-Reliability-Analysis-WinREAT/Operational log. The underlying data is also available through the Win32_ReliabilityRecords WMI class. Use PowerShell to extract and export this data for reporting or automated analysis:

# Get all reliability records from WMI, newest first
$records = Get-CimInstance -ClassName Win32_ReliabilityRecords |
  Sort-Object TimeGenerated -Descending

# Display recent application failures (crashes and hangs).
# Run "$records | Group-Object SourceName" to see which other sources are present.
$records | Where-Object { $_.SourceName -in 'Application Error', 'Application Hang' } |
  Select-Object TimeGenerated, SourceName, Message |
  Format-Table -AutoSize

# Export all reliability records to CSV
$records |
  Select-Object TimeGenerated, ProductName, SourceName, RecordNumber, Message |
  Export-Csv -Path C:\reports\reliability_records.csv -NoTypeInformation

The Win32_ReliabilityStabilityMetrics WMI class provides the daily reliability index scores:

# Get reliability index scores for the past 28 days
# (the class can return several measurements per day, so filter by date rather than by row count)
$cutoff = (Get-Date).AddDays(-28)
Get-CimInstance -ClassName Win32_ReliabilityStabilityMetrics |
  Where-Object { $_.StartMeasurementDate -ge $cutoff } |
  Select-Object @{N="Date";E={$_.StartMeasurementDate}},
                @{N="ReliabilityIndex";E={[math]::Round($_.SystemStabilityIndex, 2)}} |
  Sort-Object Date -Descending |
  Format-Table -AutoSize
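If the query returns more than one measurement per day, it can be useful to collapse the results to one row per day. The following sketch assumes input objects with Date and ReliabilityIndex properties, matching the calculated columns in the query above with the final Format-Table step removed. The function name Get-DailyStabilityIndex and the sample values are hypothetical.

```powershell
# Sketch: reduce stability measurements to one summary row per calendar day.
# Input objects are assumed to have Date ([datetime]) and ReliabilityIndex properties.
function Get-DailyStabilityIndex {
    param(
        [Parameter(Mandatory)]
        [object[]]$Metric
    )
    $Metric |
        Group-Object -Property { $_.Date.Date } |
        ForEach-Object {
            $stats = $_.Group | Measure-Object -Property ReliabilityIndex -Minimum -Average
            [pscustomobject]@{
                Day     = $_.Group[0].Date.Date
                Minimum = [math]::Round($stats.Minimum, 2)
                Average = [math]::Round($stats.Average, 2)
                Samples = $_.Count
            }
        } |
        Sort-Object Day -Descending
}

# Made-up sample measurements
$sample = @(
    [pscustomobject]@{ Date = [datetime]'2024-03-01T08:00:00'; ReliabilityIndex = 9.0 }
    [pscustomobject]@{ Date = [datetime]'2024-03-01T20:00:00'; ReliabilityIndex = 8.0 }
    [pscustomobject]@{ Date = [datetime]'2024-03-02T08:00:00'; ReliabilityIndex = 7.5 }
)
Get-DailyStabilityIndex -Metric $sample
```

Reporting the daily minimum alongside the average keeps a brief dip visible even when the rest of the day was stable.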

For event-log based retrieval using Get-WinEvent, query the Application log for Windows Error Reporting events (Event ID 1001) which correspond to application crashes:

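# Get application crash reports (Windows Error Reporting, Event ID 1001) from the past 7 days
$startDate = (Get-Date).AddDays(-7)
Get-WinEvent -FilterHashtable @{
  LogName = "Application"
  ProviderName = "Windows Error Reporting"
  Id = 1001
  StartTime = $startDate
} -ErrorAction SilentlyContinue | Select-Object TimeCreated, Message | Format-List

# Get unexpected-restart events from the System log (Event ID 41 = Kernel-Power).
# Event 41 is logged after any unclean restart, including power loss, so treat it
# as an indicator of a possible BSOD rather than proof of one.
Get-WinEvent -FilterHashtable @{
  LogName = "System"
  ProviderName = "Microsoft-Windows-Kernel-Power"
  Id = 41
  StartTime = $startDate
} -ErrorAction SilentlyContinue | Select-Object TimeCreated, Message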
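When a failure date is already known from the chart, it can help to narrow the event query to a window around that date instead of a fixed number of days back. The following is a minimal sketch; the function name New-EventWindowFilter and its default 24-hour margins are assumptions made for illustration. It only builds a hashtable using the LogName, Id, StartTime and EndTime keys that Get-WinEvent -FilterHashtable accepts.

```powershell
# Sketch: build a Get-WinEvent filter for a time window around a known failure.
# New-EventWindowFilter and the 24-hour defaults are hypothetical choices.
function New-EventWindowFilter {
    param(
        [Parameter(Mandatory)][datetime]$Around,
        [Parameter(Mandatory)][string]$LogName,
        [Parameter(Mandatory)][int[]]$Id,
        [int]$HoursBefore = 24,
        [int]$HoursAfter = 24
    )
    @{
        LogName   = $LogName
        Id        = $Id
        StartTime = $Around.AddHours(-$HoursBefore)
        EndTime   = $Around.AddHours($HoursAfter)
    }
}

# Example: a filter for Event ID 41 in the System log around a made-up failure time
$filter = New-EventWindowFilter -Around ([datetime]'2024-03-10T12:00:00') -LogName 'System' -Id 41
$filter

# On the server itself the filter would then be passed on, for example:
# Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue
```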

Understanding RACTask in Task Scheduler

Reliability Monitor does not collect data in real time. Instead, it relies on a scheduled task called RACTask (Reliability Analysis Component) that runs once daily to process event logs and calculate the reliability index. This task is located in Task Scheduler under Task Scheduler Library > Microsoft > Windows > RAC.

# View RACTask details
Get-ScheduledTask -TaskPath "\Microsoft\Windows\RAC\" -TaskName "RACTask"

# Check when RACTask last ran and whether it completed successfully
Get-ScheduledTaskInfo -TaskPath "\Microsoft\Windows\RAC\" -TaskName "RACTask" |
  Select-Object LastRunTime, LastTaskResult, NextRunTime

# Run RACTask manually to update Reliability Monitor data immediately
Start-ScheduledTask -TaskPath "\Microsoft\Windows\RAC\" -TaskName "RACTask"

If Reliability Monitor shows a gap in data or displays No reliability data is currently available, verify that RACTask is enabled and has not been disabled by Group Policy. The task requires read access to event logs and write access to the RAC database stored at C:\ProgramData\Microsoft\RAC\StateData\RacWmiDatabase.sdf.
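The LastRunTime and LastTaskResult values returned by Get-ScheduledTaskInfo can be turned into a simple freshness check. The following is a minimal sketch, not a definitive implementation: the function name Test-RacTaskFresh and the 48-hour tolerance are assumptions made for illustration. The only fact it relies on is that a LastTaskResult of 0 means the last run succeeded.

```powershell
# Sketch: decide whether the reliability data is likely to be current.
# Test-RacTaskFresh and the 48-hour default are hypothetical choices for this example.
function Test-RacTaskFresh {
    param(
        [Parameter(Mandatory)][datetime]$LastRunTime,
        [Parameter(Mandatory)][long]$LastTaskResult,
        [datetime]$Now = (Get-Date),
        [int]$MaxAgeHours = 48
    )
    $age = $Now - $LastRunTime
    $succeeded = ($LastTaskResult -eq 0)   # 0 = the last run completed successfully
    [pscustomobject]@{
        Succeeded = $succeeded
        AgeHours  = [math]::Round($age.TotalHours, 1)
        IsFresh   = $succeeded -and ($age.TotalHours -le $MaxAgeHours)
    }
}

# Example with fixed, made-up timestamps
Test-RacTaskFresh -LastRunTime ([datetime]'2024-03-10T02:00:00') `
                  -LastTaskResult 0 `
                  -Now ([datetime]'2024-03-10T14:00:00')
```

On a live server, the two mandatory values would come from the Get-ScheduledTaskInfo query shown earlier in this section.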

If the RACTask database becomes corrupted (a rare occurrence after unclean shutdowns), delete the StateData directory and allow RACTask to rebuild it:

Stop-ScheduledTask -TaskPath "\Microsoft\Windows\RAC\" -TaskName "RACTask"
Remove-Item -Path "C:\ProgramData\Microsoft\RAC\StateData" -Recurse -Force
Start-ScheduledTask -TaskPath "\Microsoft\Windows\RAC\" -TaskName "RACTask"

Using WinSAT for Performance Assessment

Windows System Assessment Tool (WinSAT) measures the performance capability of key hardware components and stores results in XML files at C:\Windows\Performance\WinSAT\DataStore. WinSAT scores are not displayed in Reliability Monitor, but they provide a hardware performance baseline that helps distinguish whether reliability issues stem from hardware degradation or software problems. WinSAT is primarily a client-edition component, so confirm that winsat.exe is present on the server before relying on the commands below.

# Run a full WinSAT assessment (takes 5-10 minutes, do NOT run on production under load)
winsat formal

# Run individual component assessments
winsat cpu -encryption  # CPU performance (a workload switch such as -encryption or -compression is required)
winsat mem         # Memory bandwidth
winsat disk -drive C  # Disk I/O performance
winsat d3d         # DirectX graphics (not relevant for server headless)

# View latest WinSAT results
Get-CimInstance -ClassName Win32_WinSAT |
  Select-Object CPUScore, DiskScore, MemoryScore, GraphicsScore, WinSATAssessmentState

A sudden drop in WinSAT disk score (e.g. from 7.8 to 4.2) between assessments suggests disk degradation, particularly when it coincides with an increase in disk-related entries in Reliability Monitor. This kind of cross-tool correlation makes Reliability Monitor considerably more useful in a diagnostic workflow.
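Comparing two assessments can be sketched as follows. The function name Compare-WinSatScore and the drop threshold of 1.0 are assumptions for illustration, and the scores are passed in as plain hashtables so that the logic does not depend on how they were collected.

```powershell
# Sketch: compare two sets of WinSAT scores and flag components that have dropped.
# Compare-WinSatScore and the 1.0 threshold are hypothetical choices.
function Compare-WinSatScore {
    param(
        [Parameter(Mandatory)][hashtable]$Baseline,
        [Parameter(Mandatory)][hashtable]$Current,
        [double]$DropThreshold = 1.0
    )
    foreach ($name in ($Baseline.Keys | Sort-Object)) {
        # Skip components that are missing from the newer assessment
        if (-not $Current.ContainsKey($name)) { continue }
        $delta = [math]::Round($Current[$name] - $Baseline[$name], 2)
        [pscustomobject]@{
            Component = $name
            Baseline  = $Baseline[$name]
            Current   = $Current[$name]
            Delta     = $delta
            Degraded  = ($delta -le -$DropThreshold)
        }
    }
}

# Example using the disk scores mentioned above; the CPU score is made up
$before = @{ CPUScore = 8.1; DiskScore = 7.8 }
$after  = @{ CPUScore = 8.1; DiskScore = 4.2 }
Compare-WinSatScore -Baseline $before -Current $after
```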

Correlating Reliability Events with System Changes

The most practical use of Reliability Monitor on Windows Server 2022 is change correlation. When a reliability issue is reported, open Reliability Monitor and navigate to the date when the stability index first dropped. Examine the informational events on that day and the day before for:

Windows Update installations — note the KB numbers of updates installed immediately before the stability drop. These can be uninstalled if they are confirmed as the cause:

# List recently installed Windows Updates sorted by date
Get-HotFix | Sort-Object InstalledOn -Descending | Select-Object -First 20

# Uninstall a specific update by KB number
wusa /uninstall /kb:5034441 /quiet /norestart

Driver installations — device driver updates are a common source of BSOD-inducing instability. Identify recently installed drivers and roll them back:

# List drivers whose DriverDate falls in the last 30 days
# (DriverDate is the vendor's release date for the driver, not the date it was installed)
$cutoff = (Get-Date).AddDays(-30)
Get-CimInstance -ClassName Win32_PnPSignedDriver |
  Where-Object { $_.DriverDate -and $_.DriverDate -gt $cutoff } |
  Select-Object DeviceName, DriverVersion, DriverDate |
  Sort-Object DriverDate -Descending

Application installations: third-party software installed via MSI or EXE is logged in Reliability Monitor as an informational event. Cross-reference MSI installations with the Application event log, where the MsiInstaller source records Event ID 11707 for each successful installation, to confirm the product name and the exact installation time.
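The "drop day and the day before" rule can be expressed as a small filter over a combined list of changes. This is a sketch under assumptions: the function name Get-ChangeCandidate, the input shape (objects with Date, Type and Name properties) and the sample entries are hypothetical. In practice the list would be assembled from the update, driver and installation queries shown in this section.

```powershell
# Sketch: pick the changes that fall on the drop day or within the lookback window before it.
# Input objects are assumed to have Date ([datetime]), Type and Name properties.
function Get-ChangeCandidate {
    param(
        [Parameter(Mandatory)][datetime]$DropDate,
        [Parameter(Mandatory)][object[]]$Change,
        [int]$LookbackDays = 1
    )
    $windowStart = $DropDate.Date.AddDays(-$LookbackDays)
    $windowEnd   = $DropDate.Date.AddDays(1)   # exclusive upper bound: end of the drop day
    $Change |
        Where-Object { $_.Date -ge $windowStart -and $_.Date -lt $windowEnd } |
        Sort-Object Date -Descending
}

# Made-up change list
$changes = @(
    [pscustomobject]@{ Date = [datetime]'2024-03-08T09:00:00'; Type = 'Application'; Name = 'Example agent' }
    [pscustomobject]@{ Date = [datetime]'2024-03-09T10:00:00'; Type = 'Driver';      Name = 'Example NIC driver' }
    [pscustomobject]@{ Date = [datetime]'2024-03-10T03:00:00'; Type = 'Update';      Name = 'Example cumulative update' }
    [pscustomobject]@{ Date = [datetime]'2024-03-11T03:00:00'; Type = 'Update';      Name = 'Example later update' }
)
Get-ChangeCandidate -DropDate ([datetime]'2024-03-10') -Change $changes
```

Changes made after the drop day are excluded because they cannot have caused the drop; widening LookbackDays is a reasonable next step if the default window returns nothing.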

By combining the visual change timeline in Reliability Monitor with the PowerShell queries shown throughout this guide, administrators can rapidly identify the root cause of Windows Server 2022 stability issues and implement targeted fixes rather than resorting to broad and disruptive troubleshooting approaches. Reliability Monitor remains one of the most underutilised tools in the Windows Server diagnostic toolkit despite requiring no installation, no agent, and no third-party software.