Introduction to Monitoring Windows Server 2022 with Prometheus
Prometheus is a pull-based metrics system that scrapes HTTP endpoints at configured intervals and stores time-series data for querying and alerting. The windows_exporter (formerly wmi_exporter) is the standard Prometheus exporter for Windows Server, exposing hundreds of metrics covering CPU, memory, disk, network, IIS, services, and more. This guide covers installing windows_exporter as a Windows service, configuring collectors, writing Prometheus scrape configs, building Grafana dashboards, and setting up alerting rules.
Downloading windows_exporter for Windows Server 2022
The windows_exporter releases are hosted on GitHub at https://github.com/prometheus-community/windows_exporter/releases. Download the latest MSI or executable for your architecture (amd64 for most servers):
# Download the MSI installer (replace version number as appropriate)
$version = "0.29.2"
$url = "https://github.com/prometheus-community/windows_exporter/releases/download/v$version/windows_exporter-$version-amd64.msi"
Invoke-WebRequest -Uri $url -OutFile "C:\install\windows_exporter-$version-amd64.msi"
Alternatively, download the standalone EXE if you prefer manual service management:
$url = "https://github.com/prometheus-community/windows_exporter/releases/download/v$version/windows_exporter-$version-amd64.exe"
Invoke-WebRequest -Uri $url -OutFile "C:\exporters\windows_exporter.exe"
Installing windows_exporter as a Windows Service
The MSI installer registers the exporter as a Windows service automatically with sane defaults. Install it with the specific collectors you need passed as a property:
msiexec /i "C:\install\windows_exporter-0.29.2-amd64.msi" `
ENABLED_COLLECTORS="cpu,cs,logical_disk,net,os,service,system,iis,memory,tcp,process" `
LISTEN_PORT=9182 `
/quiet /norestart
This registers the service as windows_exporter. Verify it is running:
Get-Service windows_exporter
# Content is a single string, so split it into lines before taking the first 30
(Invoke-WebRequest -Uri "http://localhost:9182/metrics" -UseBasicParsing).Content -split "`n" | Select-Object -First 30
If using the standalone EXE, register it as a service with NSSM (Non-Sucking Service Manager) or the built-in sc.exe:
# The path contains no spaces, so the whole command line is passed as one quoted value
sc.exe create windows_exporter `
binPath= "C:\exporters\windows_exporter.exe --collectors.enabled=cpu,cs,logical_disk,net,os,service,system,iis --web.listen-address=:9182" `
DisplayName= "Prometheus Windows Exporter" `
start= auto
sc.exe start windows_exporter
Available Collectors and Their Metrics
Collectors are the individual data sources windows_exporter can expose. Each collector maps to a category of Windows metrics:
cpu — Per-core CPU utilization in user, privileged, interrupt, and idle modes. Key metrics: windows_cpu_time_total{mode=~"idle|user|privileged|interrupt|dpc"}
cs — Computer system summary including logical processors and physical memory. Key metrics: windows_cs_logical_processors, windows_cs_physical_memory_bytes
logical_disk — Per-volume disk I/O, free space, and queue length. Key metrics: windows_logical_disk_free_bytes, windows_logical_disk_read_bytes_total, windows_logical_disk_write_bytes_total, windows_logical_disk_avg_read_requests_queued
net — Per-NIC bytes/packets sent and received, errors, discards. Key metrics: windows_net_bytes_total, windows_net_packets_received_errors_total
os — Operating system metrics: uptime, paging file usage, visible memory. Key metrics: windows_os_virtual_memory_free_bytes, windows_os_physical_memory_free_bytes, windows_os_time
service — Windows service state (running/stopped/paused). Key metric: windows_service_state{name="W3SVC",state="running"}
system — System-level counters: context switches, exceptions, processor queue length. Key metric: windows_system_processor_queue_length
iis — IIS site requests, bytes transferred, connections, errors. Key metrics: windows_iis_requests_total, windows_iis_current_connections, windows_iis_receive_bytes_total
memory — Detailed memory breakdown: cache bytes, committed bytes, pool paged/nonpaged. Key metrics: windows_memory_available_bytes, windows_memory_committed_bytes, windows_memory_pool_nonpaged_bytes
tcp — TCP connection states (ESTABLISHED, CLOSE_WAIT, TIME_WAIT, etc.). Key metric: windows_tcp_connection_state_total
process — Per-process CPU, memory, and handle counts. Key metrics: windows_process_cpu_time_total, windows_process_working_set
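After enabling collectors, it can help to confirm that the metrics listed above actually appear in the exporter output. The following is a minimal Python sketch, not part of windows_exporter itself: the function names and the sample text are hypothetical, and the parser only handles the simple `name{labels} value` lines of the Prometheus text format.

```python
def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        end = len(line)
        for sep in ("{", " "):
            idx = line.find(sep)
            if idx != -1:
                end = min(end, idx)
        names.add(line[:end])
    return names


def missing_metrics(exposition_text, expected):
    """Return the expected metric names absent from the scrape, sorted."""
    return sorted(set(expected) - metric_names(exposition_text))


# Hypothetical excerpt of /metrics output, for illustration only.
SAMPLE = """\
# HELP windows_cpu_time_total Time that processor spent in different modes
# TYPE windows_cpu_time_total counter
windows_cpu_time_total{core="0,0",mode="idle"} 12345.6
windows_logical_disk_free_bytes{volume="C:"} 5.0e+10
windows_system_processor_queue_length 0
"""

EXPECTED = [
    "windows_cpu_time_total",
    "windows_logical_disk_free_bytes",
    "windows_iis_requests_total",
]

print(missing_metrics(SAMPLE, EXPECTED))
```

In this sample the iis collector is not enabled, so windows_iis_requests_total is reported as missing. In practice the text would come from the /metrics endpoint rather than a hard-coded string.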
Configuring Collectors and Listen Address
When running the exporter manually or building a startup command, the key flags are:
windows_exporter.exe `
--collectors.enabled="cpu,cs,logical_disk,net,os,service,system,iis,memory,tcp,process" `
--web.listen-address=":9182" `
--log.level=info `
--collector.logical_disk.volume-exclude="^HarddiskVolume" `
--collector.service.services-where="Name='W3SVC' OR Name='MSSQLSERVER' OR Name='wuauserv'"
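The --collector.service.services-where flag accepts a WQL WHERE clause that limits service monitoring to the named services rather than all 200+ Windows services. The --collector.logical_disk.volume-exclude flag takes a regular expression matching the volumes to exclude, here the volumes without a drive letter that appear as HarddiskVolumeN. Flag names have changed between exporter releases, so confirm them with windows_exporter.exe --help for the version you install.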
To update the configuration for an MSI-installed service, edit the service’s image path in the registry:
$regPath = "HKLM:\SYSTEM\CurrentControlSet\Services\windows_exporter"
$currentPath = (Get-ItemProperty $regPath).ImagePath
# Append or modify flags in the ImagePath value, then restart
Set-ItemProperty $regPath -Name ImagePath -Value `
"`"C:\Program Files\windows_exporter\windows_exporter.exe`" --collectors.enabled=`"cpu,cs,logical_disk,net,os,service,system,iis,memory`" --web.listen-address=`":9182`""
Restart-Service windows_exporter
Firewall Rule for Prometheus Scraping
Open port 9182 so the Prometheus server can reach the exporter. Restrict the source IP to the Prometheus server’s IP for security:
# 10.0.1.50 is the Prometheus server IP
New-NetFirewallRule -DisplayName "Prometheus windows_exporter" `
-Direction Inbound `
-Protocol TCP `
-LocalPort 9182 `
-RemoteAddress "10.0.1.50" `
-Action Allow
Prometheus Scrape Configuration for Windows Exporter
On the Prometheus server, add scrape targets for each Windows Server to prometheus.yml. A static configuration for a small number of servers:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'windows_servers'
    static_configs:
      - targets:
          - 'web01.yourdomain.local:9182'
          - 'web02.yourdomain.local:9182'
          - 'app01.yourdomain.local:9182'
          - 'db01.yourdomain.local:9182'
        labels:
          environment: 'production'
          datacenter: 'dc-east'
    scrape_timeout: 10s
    metrics_path: /metrics
For larger environments, use file-based service discovery: write the server lists to a JSON file that Prometheus watches, for example C:\prometheus\targets\windows_servers.json:
[
  {
    "targets": ["web01:9182", "web02:9182", "web03:9182"],
    "labels": { "role": "web", "environment": "production" }
  },
  {
    "targets": ["app01:9182", "app02:9182"],
    "labels": { "role": "app", "environment": "production" }
  }
]
Reference the file in prometheus.yml:
scrape_configs:
  - job_name: 'windows_servers'
    file_sd_configs:
      - files:
          - 'C:\prometheus\targets\windows_servers.json'
        refresh_interval: 5m
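The target file is usually generated rather than edited by hand. The sketch below shows one way to build the same JSON structure from an inventory and write it atomically, so Prometheus never reads a half-written file. The function names and the inventory dictionary are hypothetical; only the "targets" and "labels" keys come from the file format shown above.

```python
import json
import os
import tempfile


def build_target_groups(hosts_by_role, port=9182, environment="production"):
    """Build Prometheus file_sd target groups, one group per role."""
    groups = []
    for role in sorted(hosts_by_role):
        groups.append({
            "targets": [f"{host}:{port}" for host in hosts_by_role[role]],
            "labels": {"role": role, "environment": environment},
        })
    return groups


def write_targets(path, groups):
    """Write to a temporary file first, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(groups, handle, indent=2)
    os.replace(tmp_path, path)  # replaces the old file in a single step


# Hypothetical inventory; in practice this might come from a CMDB or AD query.
INVENTORY = {
    "web": ["web01", "web02", "web03"],
    "app": ["app01", "app02"],
}

groups = build_target_groups(INVENTORY)
print(json.dumps(groups, indent=2))
```

Calling write_targets with the path referenced in prometheus.yml would then publish the list; Prometheus picks up the change on its next file refresh.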
Key Windows Metrics Queries in Prometheus (PromQL)
These PromQL queries give immediate insight into Windows Server health:
CPU utilization percentage per server:
100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100)
Free memory in GB:
windows_os_physical_memory_free_bytes / 1024 / 1024 / 1024
Disk free space percentage per volume:
100 * windows_logical_disk_free_bytes / windows_logical_disk_size_bytes
Disk read/write throughput in MB/s:
rate(windows_logical_disk_read_bytes_total[5m]) / 1024 / 1024
rate(windows_logical_disk_write_bytes_total[5m]) / 1024 / 1024
Network traffic in Mbps per interface:
rate(windows_net_bytes_received_total[5m]) * 8 / 1024 / 1024
rate(windows_net_bytes_sent_total[5m]) * 8 / 1024 / 1024
IIS requests per second:
rate(windows_iis_requests_total[1m])
Server uptime in hours:
(time() - windows_system_system_up_time) / 3600
Whether a specific service is running (1=running, 0=not):
windows_service_state{name="W3SVC",state="running"}
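To make the CPU query above less opaque, the following sketch reproduces its arithmetic by hand for one server. It is a simplification under stated assumptions: the counter values are invented, and the real rate() function also handles counter resets and extrapolation, which this code ignores.

```python
def cpu_utilization_percent(idle_start, idle_end, elapsed_seconds):
    """Mirror 100 - avg(rate(idle)) * 100 for one server.

    idle_start / idle_end map core name -> cumulative idle seconds
    (the value of windows_cpu_time_total{mode="idle"}) at two scrapes.
    """
    # rate(): per-second increase of each core's idle counter.
    idle_rates = [
        (idle_end[core] - idle_start[core]) / elapsed_seconds
        for core in idle_start
    ]
    # avg by (instance): mean idle fraction across all cores.
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Hypothetical counter values for a two-core server, 300 seconds apart.
start = {"0,0": 1000.0, "0,1": 2000.0}
end = {"0,0": 1240.0, "0,1": 2180.0}

print(round(cpu_utilization_percent(start, end, 300), 1))
```

Here one core was idle for 240 of 300 seconds and the other for 180, so the average idle fraction is 0.7 and the reported utilization is 30%.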
Creating Grafana Dashboards for Windows Server
The Grafana dashboard ID 14694 (Windows Node by prometheus-community) provides a ready-made dashboard for windows_exporter metrics. Import it in Grafana by navigating to Dashboards > Import and entering the dashboard ID. For custom dashboards, create panels with these queries:
CPU panel (time-series): Use the query above for CPU utilization, set the unit to “Percent (0-100)”, and add a threshold line at 80%.
Memory panel (gauge):
100 * (1 - (windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes))
Disk usage panel (bar gauge per volume):
100 - (100 * windows_logical_disk_free_bytes{volume!~"HarddiskVolume.*"} / windows_logical_disk_size_bytes{volume!~"HarddiskVolume.*"})
Use the instance label as a variable so dashboard viewers can filter by server. In Grafana, create a variable:
Name: instance
Label: Server
Query: label_values(windows_os_physical_memory_free_bytes, instance)
Then reference $instance in all panel queries, for example windows_os_physical_memory_free_bytes{instance=~"$instance"}, to filter by the selected server.
Prometheus Alerting Rules for Windows Server
Create an alert rules file at C:\prometheus\rules\windows_alerts.yml and load it in prometheus.yml under rule_files:
groups:
  - name: windows_server_alerts
    interval: 30s
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
          description: 'CPU utilization is {{ $value | printf "%.1f" }}% on {{ $labels.instance }} for more than 5 minutes.'
      - alert: CriticalCPUUsage
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100) > 95
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Critical CPU on {{ $labels.instance }}"
          description: 'CPU utilization is {{ $value | printf "%.1f" }}% on {{ $labels.instance }}.'
      - alert: LowDiskSpace
        expr: 100 * windows_logical_disk_free_bytes{volume="C:"} / windows_logical_disk_size_bytes{volume="C:"} < 15
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: 'C: drive has {{ $value | printf "%.1f" }}% free on {{ $labels.instance }}.'
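      # The CriticalDiskSpace threshold, duration, and severity, and the HighMemoryUsage
      # alert name and expression, are assumed values; tune them to your environment.
      - alert: CriticalDiskSpace
        expr: 100 * windows_logical_disk_free_bytes{volume="C:"} / windows_logical_disk_size_bytes{volume="C:"} < 5
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Critical disk space on {{ $labels.instance }}"
          description: 'C: drive has {{ $value | printf "%.1f" }}% free on {{ $labels.instance }}.'
      - alert: HighMemoryUsage
        expr: 100 * (1 - windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: 'Memory usage is {{ $value | printf "%.1f" }}% on {{ $labels.instance }}.'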
      - alert: ServiceDown
        expr: windows_service_state{name=~"W3SVC|MSSQLSERVER|wuauserv",state="running"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.name }} is not running on {{ $labels.instance }}"
          description: "The Windows service {{ $labels.name }} is not in running state on {{ $labels.instance }}."
      - alert: HostDown
        expr: up{job="windows_servers"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Host unreachable: {{ $labels.instance }}"
          description: "Prometheus cannot scrape {{ $labels.instance }}. The exporter may be down or the host unreachable."
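Each rule above combines a threshold expression with a for: duration, which delays the alert until the condition has held continuously. The sketch below simulates that behaviour for a single series. It is a simplified model with hypothetical names and sample values, not Prometheus code, and it ignores features such as keep_firing_for.

```python
def alert_states(samples, threshold, for_seconds):
    """Return a list of (timestamp, state), one entry per evaluation.

    samples: list of (timestamp_seconds, value) in evaluation order.
    States: "inactive", "pending" (condition true, but not yet for long
    enough), "firing" (condition true continuously for for_seconds).
    """
    active_since = None
    states = []
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp  # condition just became true
            held = timestamp - active_since
            state = "firing" if held >= for_seconds else "pending"
        else:
            active_since = None  # any false evaluation resets the timer
            state = "inactive"
        states.append((timestamp, state))
    return states


# Hypothetical CPU percentages evaluated every 60 seconds.
SAMPLES = [(0, 80), (60, 90), (120, 92), (180, 88), (240, 91), (300, 70)]

for timestamp, state in alert_states(SAMPLES, threshold=85, for_seconds=180):
    print(timestamp, state)
```

With a 180-second hold, the condition first becomes true at t=60 and the alert only fires at t=240. A single evaluation below the threshold, as at t=300, returns it to inactive and restarts the timer.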
Reference the rules file in prometheus.yml:
rule_files:
  - 'C:\prometheus\rules\windows_alerts.yml'
Alertmanager Integration
Route alerts to email, Slack, or PagerDuty via Alertmanager. A minimal Alertmanager config that routes critical alerts to PagerDuty, warnings to Slack, and everything else to a default email receiver:
global:
  smtp_smarthost: 'smtp.yourdomain.com:587'
  smtp_from: 'alertmanager@yourdomain.com'  # placeholder sender address
  smtp_require_tls: true
route:
  receiver: 'default'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-critical'
    - match:
        severity: warning
      receiver: 'slack-warnings'
receivers:
  - name: 'default'
    email_configs:
      - to: 'ops-team@yourdomain.com'  # placeholder recipient address
  - name: 'slack-warnings'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#infra-alerts'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
  - name: 'pagerduty-critical'
    pagerduty_configs:
      - routing_key: 'YOUR_PAGERDUTY_ROUTING_KEY'
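The routing tree above can be read as "first matching route wins, otherwise use the default receiver". The following sketch models that rule for the severity label. It is an illustration with hypothetical function and variable names, and it leaves out Alertmanager features such as nested routes, regex matchers, and continue.

```python
def pick_receiver(alert_labels, routes, default_receiver):
    """Return the receiver of the first route whose match labels all equal the alert's."""
    for route in routes:
        if all(alert_labels.get(k) == v for k, v in route["match"].items()):
            return route["receiver"]
    return default_receiver


# Mirrors the routes section of the config above.
ROUTES = [
    {"match": {"severity": "critical"}, "receiver": "pagerduty-critical"},
    {"match": {"severity": "warning"}, "receiver": "slack-warnings"},
]

print(pick_receiver({"alertname": "HostDown", "severity": "critical"}, ROUTES, "default"))
print(pick_receiver({"alertname": "HighCPUUsage", "severity": "warning"}, ROUTES, "default"))
print(pick_receiver({"alertname": "Heartbeat", "severity": "info"}, ROUTES, "default"))
```

An alert whose severity matches neither route, or that has no severity label at all, falls through to the default email receiver.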
Securing windows_exporter with Authentication and TLS
By default windows_exporter listens on HTTP without authentication. In production, enable TLS and basic authentication. Create a config file at C:\exporters\web-config.yml:
tls_server_config:
  cert_file: C:\certs\exporter.crt
  key_file: C:\certs\exporter.key
basic_auth_users:
  prometheus: $2y$12$hashed_bcrypt_password_here
Generate the bcrypt hash for the password using htpasswd or Python:
# Using Python (available on most build machines; requires the bcrypt package: pip install bcrypt)
python -c "import bcrypt; print(bcrypt.hashpw(b'YourPrometheusPass', bcrypt.gensalt(12)).decode())"
Start the exporter with the web config:
windows_exporter.exe `
--collectors.enabled="cpu,cs,logical_disk,net,os,service,system,iis,memory" `
--web.listen-address=":9182" `
--web.config.file="C:\exporters\web-config.yml"
Update the Prometheus scrape config to use TLS and basic auth:
scrape_configs:
  - job_name: 'windows_servers'
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
      insecure_skip_verify: false
    basic_auth:
      username: prometheus
      password: YourPrometheusPass
    static_configs:
      - targets: ['web01.yourdomain.local:9182']
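When debugging a scrape that returns 401, it can help to know exactly what credentials go over the wire. With basic_auth configured, the request carries a standard HTTP Basic Authorization header. The sketch below builds that header value from the username and password in the config above; the function name is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the HTTP Basic Authorization header (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


header_value = basic_auth_header("prometheus", "YourPrometheusPass")
print(header_value)
```

Because the token is only base64-encoded and not encrypted, basic auth should be combined with TLS as configured above.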
For a quick self-signed certificate on the exporter host:
$cert = New-SelfSignedCertificate -DnsName "web01.yourdomain.local" `
-CertStoreLocation Cert:\LocalMachine\My `
-KeyAlgorithm RSA -KeyLength 2048 -NotAfter (Get-Date).AddYears(2)
# Export the certificate to PEM format for windows_exporter
$certPath = "C:\certs\exporter.crt"
$keyPath = "C:\certs\exporter.key"
Export-Certificate -Cert $cert -FilePath "C:\certs\exporter.der" -Type CERT
certutil -encode "C:\certs\exporter.der" $certPath
# Note: this exports only the certificate. The private key must still be written
# to $keyPath in PEM format (for example via Export-PfxCertificate and OpenSSL).
Summary
Prometheus windows_exporter on Windows Server 2022 provides deep, real-time visibility into the performance and health of every node in your infrastructure. By enabling targeted collectors, securing the metrics endpoint with TLS and basic auth, writing precise PromQL queries for Grafana panels, and defining alert rules for CPU, memory, disk, and service state, operations teams gain a complete observability stack. The combination of Prometheus’ pull-based architecture, Grafana’s visualization, and Alertmanager’s routing creates a production monitoring system that scales from a handful of servers to hundreds without per-node licensing costs.