Introduction to Monitoring Windows Server 2022 with Prometheus

Prometheus is a pull-based metrics system that scrapes HTTP endpoints at configured intervals and stores time-series data for querying and alerting. The windows_exporter (formerly wmi_exporter) is the standard Prometheus exporter for Windows Server, exposing hundreds of metrics covering CPU, memory, disk, network, IIS, services, and more. This guide covers installing windows_exporter as a Windows service, configuring collectors, writing Prometheus scrape configs, building Grafana dashboards, and setting up alerting rules.

Downloading windows_exporter for Windows Server 2022

The windows_exporter releases are hosted on GitHub at https://github.com/prometheus-community/windows_exporter/releases. Download the latest MSI or executable for your architecture (amd64 for most servers):

# Download the MSI installer (replace version number as appropriate)
$version = "0.29.2"
$url = "https://github.com/prometheus-community/windows_exporter/releases/download/v$version/windows_exporter-$version-amd64.msi"
Invoke-WebRequest -Uri $url -OutFile "C:\install\windows_exporter-$version-amd64.msi"

Alternatively, download the standalone EXE if you prefer manual service management:

$url = "https://github.com/prometheus-community/windows_exporter/releases/download/v$version/windows_exporter-$version-amd64.exe"
Invoke-WebRequest -Uri $url -OutFile "C:\exporters\windows_exporter.exe"

Installing windows_exporter as a Windows Service

The MSI installer registers the exporter as a Windows service automatically with sane defaults. Install it with the specific collectors you need passed as a property:

msiexec /i "C:\install\windows_exporter-0.29.2-amd64.msi" `
    ENABLED_COLLECTORS="cpu,cs,logical_disk,net,os,service,system,iis,memory,tcp,process" `
    LISTEN_PORT=9182 `
    /quiet /norestart

This registers the service as windows_exporter. Verify it is running:

Get-Service windows_exporter
# Content is a single string, so split it into lines before taking the first 30
(Invoke-WebRequest -Uri "http://localhost:9182/metrics" -UseBasicParsing).Content -split "`n" | Select-Object -First 30
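
To confirm that every enabled collector is actually producing data, the /metrics output can be grouped by collector. The following is a minimal Python sketch, assuming that each metric name starts with windows_<collector>_ as in the examples in this guide. The function name collectors_seen and the SAMPLE text are illustrative and not part of windows_exporter.

```python
def collectors_seen(metrics_text, enabled):
    """Count sample lines per enabled collector using the windows_<collector>_ prefix."""
    counts = {name: 0 for name in enabled}
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or the first space (value).
        metric = line.split("{", 1)[0].split(" ", 1)[0]
        for name in enabled:
            if metric.startswith(f"windows_{name}_"):
                counts[name] += 1
                break
    return counts


# Made-up sample in the Prometheus text exposition format (values are illustrative).
SAMPLE = """\
# HELP windows_cpu_time_total Time that processor spent in different modes
# TYPE windows_cpu_time_total counter
windows_cpu_time_total{core="0,0",mode="idle"} 1234.5
windows_logical_disk_free_bytes{volume="C:"} 5.0e+10
windows_os_physical_memory_free_bytes 8.0e+09
"""

result = collectors_seen(SAMPLE, ["cpu", "logical_disk", "os", "iis"])
print(result)
```

A collector that reports zero samples, such as iis in this sample, is either not enabled or has nothing to report on that host.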

If using the standalone EXE, register it as a service with NSSM (Non-Sucking Service Manager) or the built-in sc.exe:

# Inner quotes are omitted because neither the path nor the flag values contain spaces
sc.exe create windows_exporter `
    binPath= "C:\exporters\windows_exporter.exe --collectors.enabled=cpu,cs,logical_disk,net,os,service,system,iis --web.listen-address=:9182" `
    DisplayName= "Prometheus Windows Exporter" `
    start= auto

sc.exe start windows_exporter

Available Collectors and Their Metrics

Collectors are the individual data sources windows_exporter can expose. Each collector maps to a category of Windows metrics:

cpu — Per-core CPU utilization in user, privileged, interrupt, DPC, and idle modes. Key metrics: windows_cpu_time_total{mode="idle|user|privileged|interrupt|dpc"}

cs — Computer system summary including logical processors and physical memory. Key metrics: windows_cs_logical_processors, windows_cs_physical_memory_bytes

logical_disk — Per-volume disk I/O, free space, and queue length. Key metrics: windows_logical_disk_free_bytes, windows_logical_disk_read_bytes_total, windows_logical_disk_write_bytes_total, windows_logical_disk_avg_read_requests_queued

net — Per-NIC bytes/packets sent and received, errors, discards. Key metrics: windows_net_bytes_total, windows_net_packets_received_errors_total

os — Operating system metrics: uptime, paging file usage, visible memory. Key metrics: windows_os_virtual_memory_free_bytes, windows_os_physical_memory_free_bytes, windows_os_time

service — Windows service state (running/stopped/paused). Key metric: windows_service_state{name="W3SVC",state="running"}

system — System-level counters: context switches, exceptions, processor queue length. Key metric: windows_system_processor_queue_length

iis — IIS site requests, bytes transferred, connections, errors. Key metrics: windows_iis_requests_total, windows_iis_current_connections, windows_iis_receive_bytes_total

memory — Detailed memory breakdown: cache bytes, committed bytes, pool paged/nonpaged. Key metrics: windows_memory_available_bytes, windows_memory_committed_bytes, windows_memory_pool_nonpaged_bytes

tcp — TCP connection states (ESTABLISHED, CLOSE_WAIT, TIME_WAIT, etc.). Key metric: windows_tcp_connections_state_count

process — Per-process CPU, memory, and handle counts. Key metrics: windows_process_cpu_time_total, windows_process_working_set

Configuring Collectors and Listen Address

When running the exporter manually or building a startup command, the key flags are:

windows_exporter.exe `
    --collectors.enabled="cpu,cs,logical_disk,net,os,service,system,iis,memory,tcp,process" `
    --web.listen-address=":9182" `
    --log.level=info `
    --collector.logical_disk.volume-exclude="^HarddiskVolume" `
    --collector.service.services-where="Name='W3SVC' OR Name='MSSQLSERVER' OR Name='wuauserv'"

The --collector.service.services-where flag accepts a WQL WHERE clause to limit service monitoring to specific services rather than all 200+ Windows services. The --collector.logical_disk.volume-exclude flag takes a regex to exclude system volumes.
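
When the list of monitored services comes from an inventory, the WHERE clause can be generated rather than typed by hand. This is a small Python sketch; services_where is a hypothetical helper, and it assumes plain service names without single quotes.

```python
def services_where(names):
    """Join service names into a WQL WHERE clause: Name='A' OR Name='B' ..."""
    if not names:
        raise ValueError("at least one service name is required")
    for name in names:
        # A single quote would end the WQL string literal early, so reject it.
        if "'" in name:
            raise ValueError(f"unsupported character in service name: {name!r}")
    return " OR ".join(f"Name='{name}'" for name in names)


clause = services_where(["W3SVC", "MSSQLSERVER", "wuauserv"])
print(f'--collector.service.services-where="{clause}"')
```

The generated clause for these three names matches the value used in the command above.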

To update the configuration for an MSI-installed service, edit the service’s image path in the registry:

$regPath = "HKLM:\SYSTEM\CurrentControlSet\Services\windows_exporter"
$currentPath = (Get-ItemProperty $regPath).ImagePath

# Append or modify flags in the ImagePath value, then restart
Set-ItemProperty $regPath -Name ImagePath -Value `
    "`"C:\Program Files\windows_exporter\windows_exporter.exe`" --collectors.enabled=`"cpu,cs,logical_disk,net,os,service,system,iis,memory`" --web.listen-address=`":9182`""

Restart-Service windows_exporter

Firewall Rule for Prometheus Scraping

Open port 9182 so the Prometheus server can reach the exporter. Restrict the source IP to the Prometheus server’s IP for security:

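# 10.0.1.50 is the Prometheus server IP. A comment placed after a backtick
# breaks the line continuation, so it is kept on its own line here.
New-NetFirewallRule -DisplayName "Prometheus windows_exporter" `
    -Direction Inbound `
    -Protocol TCP `
    -LocalPort 9182 `
    -RemoteAddress "10.0.1.50" `
    -Action Allow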

Prometheus Scrape Configuration for Windows Exporter

On the Prometheus server, add scrape targets for each Windows Server to prometheus.yml. A static configuration for a small number of servers:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'windows_servers'
    static_configs:
      - targets:
          - 'web01.yourdomain.local:9182'
          - 'web02.yourdomain.local:9182'
          - 'app01.yourdomain.local:9182'
          - 'db01.yourdomain.local:9182'
        labels:
          environment: 'production'
          datacenter: 'dc-east'
    scrape_timeout: 10s
    metrics_path: /metrics

For larger environments, use file-based service discovery: write the server lists to a JSON file that Prometheus watches, for example C:\prometheus\targets\windows_servers.json:

[
  {
    "targets": ["web01:9182","web02:9182","web03:9182"],
    "labels": { "role": "web", "environment": "production" }
  },
  {
    "targets": ["app01:9182","app02:9182"],
    "labels": { "role": "app", "environment": "production" }
  }
]

Reference the file in prometheus.yml:

scrape_configs:
  - job_name: 'windows_servers'
    file_sd_configs:
      - files:
          - 'C:\prometheus\targets\windows_servers.json'
        refresh_interval: 5m
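
The target file for file-based discovery is usually produced by a script rather than maintained by hand. The sketch below builds the same structure in Python from a role-to-hosts mapping. The inventory dictionary and the build_file_sd helper are illustrative assumptions, not part of Prometheus.

```python
import json


def build_file_sd(groups, port=9182, environment="production"):
    """Turn {role: [hosts]} into the list-of-groups structure used by file_sd_configs."""
    entries = []
    for role, hosts in groups.items():
        entries.append({
            "targets": [f"{host}:{port}" for host in hosts],
            "labels": {"role": role, "environment": environment},
        })
    return entries


# Hypothetical inventory; in practice this might come from a CMDB or AD query.
inventory = {
    "web": ["web01", "web02", "web03"],
    "app": ["app01", "app02"],
}

targets = build_file_sd(inventory)
print(json.dumps(targets, indent=2))
```

Writing the output to a temporary file and then renaming it over the watched file is a common way to keep Prometheus from reading a half-written list.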

Key Windows Metrics Queries in Prometheus (PromQL)

These PromQL queries give immediate insight into Windows Server health:

CPU utilization percentage per server:

100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100)
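
The arithmetic behind this query can be sketched in Python. windows_cpu_time_total counts seconds spent in each mode per core, so the per-second increase of the idle counter is the idle fraction of that core. The sketch simplifies what rate() does (it ignores counter resets and extrapolation), and the sample numbers are made up.

```python
def cpu_utilization_percent(idle_start, idle_end, window_seconds):
    """Approximate 100 - avg(rate(idle)) * 100 from per-core idle-second counters."""
    if not idle_start or len(idle_start) != len(idle_end):
        raise ValueError("need matching, non-empty per-core samples")
    # Idle seconds gained per wall-clock second, one value per core (0.0 to 1.0).
    idle_rates = [(end - start) / window_seconds
                  for start, end in zip(idle_start, idle_end)]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two cores over a 300 s window: core 0 was idle for 225 s, core 1 for 150 s.
utilization = cpu_utilization_percent([1000.0, 2000.0], [1225.0, 2150.0], 300)
print(utilization)  # 37.5
```

Averaging across cores is why the query groups by instance: a single saturated core on a many-core server barely moves the result.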

Free memory in GB:

windows_os_physical_memory_free_bytes / 1024 / 1024 / 1024

Disk free space percentage per volume:

100 * windows_logical_disk_free_bytes / windows_logical_disk_size_bytes

Disk read/write throughput in MB/s:

rate(windows_logical_disk_read_bytes_total[5m]) / 1024 / 1024
rate(windows_logical_disk_write_bytes_total[5m]) / 1024 / 1024

Network traffic in Mbps per interface:

rate(windows_net_bytes_received_total[5m]) * 8 / 1024 / 1024
rate(windows_net_bytes_sent_total[5m]) * 8 / 1024 / 1024

IIS requests per second:

rate(windows_iis_requests_total[1m])

Server uptime in hours:

(time() - windows_system_system_up_time) / 3600

Whether a specific service is running (1=running, 0=not):

windows_service_state{name="W3SVC",state="running"}
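
Any of these queries can also be run from a script through the Prometheus HTTP API (the /api/v1/query endpoint). The following Python sketch builds the request URL and reduces an instant-vector response to a mapping of instance to value. The base URL prometheus.example:9090 and the helper names are assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(payload):
    """Map instance label -> float value from an instant-vector API response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


def instant_query(base_url, promql, timeout=10):
    """Run the query against a live Prometheus server and parse the result."""
    with urlopen(query_url(base_url, promql), timeout=timeout) as response:
        return parse_vector(json.load(response))


# Example call (needs a reachable server, so it is left commented out):
# instant_query("http://prometheus.example:9090", 'up{job="windows_servers"}')
```

Note that sample values arrive as strings in the JSON response, which is why parse_vector converts them with float().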

Creating Grafana Dashboards for Windows Server

The Grafana dashboard ID 14694 (Windows Node by prometheus-community) provides a ready-made dashboard for windows_exporter metrics. Import it in Grafana by navigating to Dashboards > Import and entering the dashboard ID. For custom dashboards, create panels with these queries:

CPU panel (time-series): Use the query above for CPU utilization, set the unit to “Percent (0-100)”, and add a threshold line at 80%.

Memory panel (gauge):

100 * (1 - (windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes))

Disk usage panel (bar gauge per volume):

100 - (100 * windows_logical_disk_free_bytes{volume!~"HarddiskVolume.*"} / windows_logical_disk_size_bytes{volume!~"HarddiskVolume.*"})

Use the instance label as a variable so dashboard viewers can filter by server. In Grafana, create a variable:

Name: instance
Label: Server
Query: label_values(windows_os_physical_memory_free_bytes, instance)

Then reference $instance in all panel queries to filter by the selected server, for example windows_os_physical_memory_free_bytes{instance=~"$instance"}.

Prometheus Alerting Rules for Windows Server

Create an alert rules file at C:\prometheus\rules\windows_alerts.yml and load it in prometheus.yml under rule_files:

groups:
  - name: windows_server_alerts
    interval: 30s
    rules:

      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
          description: 'CPU utilization is {{ $value | printf "%.1f" }}% on {{ $labels.instance }} for more than 5 minutes.'

      - alert: CriticalCPUUsage
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[5m])) * 100) > 95
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Critical CPU on {{ $labels.instance }}"
          description: 'CPU utilization is {{ $value | printf "%.1f" }}% on {{ $labels.instance }}.'

      - alert: LowDiskSpace
        expr: 100 * windows_logical_disk_free_bytes{volume="C:"} / windows_logical_disk_size_bytes{volume="C:"} < 15
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: 'C: drive has {{ $value | printf "%.1f" }}% free on {{ $labels.instance }}.'

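      - alert: CriticalDiskSpace
        # Assumed values: the 5% threshold, 2m duration, and annotations are illustrative.
        expr: 100 * windows_logical_disk_free_bytes{volume="C:"} / windows_logical_disk_size_bytes{volume="C:"} < 5
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Critical disk space on {{ $labels.instance }}"
          description: 'C: drive has {{ $value | printf "%.1f" }}% free on {{ $labels.instance }}.'

      - alert: HighMemoryUsage
        expr: 100 * (1 - (windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes)) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: 'Memory usage is {{ $value | printf "%.1f" }}% on {{ $labels.instance }}.'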

      - alert: ServiceDown
        expr: windows_service_state{name=~"W3SVC|MSSQLSERVER|wuauserv",state="running"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.name }} is not running on {{ $labels.instance }}"
          description: "The Windows service {{ $labels.name }} is not in running state on {{ $labels.instance }}."

      - alert: HostDown
        expr: up{job="windows_servers"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Host unreachable: {{ $labels.instance }}"
          description: "Prometheus cannot scrape {{ $labels.instance }}. The exporter may be down or the host unreachable."

Reference the rules file in prometheus.yml:

rule_files:
  # Single quotes keep the backslashes literal; double quotes would treat them as YAML escapes
  - 'C:\prometheus\rules\windows_alerts.yml'

Alertmanager Integration

Route alerts to email, Slack, or PagerDuty via Alertmanager. A minimal Alertmanager config routing critical alerts to PagerDuty, warnings to Slack, and everything else to the default email receiver:

global:
  smtp_smarthost: 'smtp.yourdomain.com:587'
  smtp_from: 'alertmanager@yourdomain.com'  # placeholder sender address
  smtp_require_tls: true

route:
  receiver: 'default'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-critical'
    - match:
        severity: warning
      receiver: 'slack-warnings'

receivers:
  - name: 'default'
    email_configs:
      - to: 'ops-team@yourdomain.com'  # placeholder recipient address

  - name: 'slack-warnings'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#infra-alerts'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'

  - name: 'pagerduty-critical'
    pagerduty_configs:
      - routing_key: 'YOUR_PAGERDUTY_ROUTING_KEY'
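
The routing tree above can be read as "first matching child route wins, otherwise fall back to the top-level receiver". The Python sketch below models only that behavior for this specific config. It is a simplification, since real Alertmanager routing also supports nested routes, regex matchers, and continue.

```python
# Child routes in the same order as in the config above.
ROUTES = [
    ({"severity": "critical"}, "pagerduty-critical"),
    ({"severity": "warning"}, "slack-warnings"),
]


def pick_receiver(labels, routes=ROUTES, default="default"):
    """Return the receiver of the first route whose match labels all equal the alert's."""
    for match, receiver in routes:
        if all(labels.get(key) == value for key, value in match.items()):
            return receiver
    return default


print(pick_receiver({"alertname": "HostDown", "severity": "critical"}))
print(pick_receiver({"alertname": "HighCPUUsage", "severity": "warning"}))
print(pick_receiver({"alertname": "Custom", "severity": "info"}))
```

An alert with no severity label, or with a value other than critical or warning, ends up at the default email receiver.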

Securing windows_exporter with Authentication and TLS

By default, windows_exporter listens on HTTP without authentication. In production, enable TLS and basic authentication. Create a config file at C:\exporters\web-config.yml:

tls_server_config:
  cert_file: C:\certs\exporter.crt
  key_file: C:\certs\exporter.key

basic_auth_users:
  prometheus: $2y$12$hashed_bcrypt_password_here

Generate the bcrypt hash for the password using htpasswd or Python:

# Using Python (requires the third-party bcrypt package: pip install bcrypt)
python -c "import bcrypt; print(bcrypt.hashpw(b'YourPrometheusPass', bcrypt.gensalt(12)).decode())"

Start the exporter with the web config:

windows_exporter.exe `
    --collectors.enabled="cpu,cs,logical_disk,net,os,service,system,iis,memory" `
    --web.listen-address=":9182" `
    --web.config.file="C:\exporters\web-config.yml"

Update the Prometheus scrape config to use TLS and basic auth:

scrape_configs:
  - job_name: 'windows_servers'
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
      insecure_skip_verify: false
    basic_auth:
      username: prometheus
      password: YourPrometheusPass
    static_configs:
      - targets: ['web01.yourdomain.local:9182']
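
Before relying on Prometheus to report scrape failures, it can help to test the secured endpoint by hand. The following Python sketch sends the same HTTP Basic credentials over TLS using only the standard library. The helper names are hypothetical, and the URL, credentials, and CA path in the commented call simply reuse the placeholder values from this guide.

```python
import base64
import ssl
from urllib.request import Request, urlopen


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic auth (base64 of user:password)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def fetch_metrics(url, username, password, ca_file, timeout=10):
    """Fetch /metrics over HTTPS, verifying the server certificate against ca_file."""
    context = ssl.create_default_context(cafile=ca_file)
    request = Request(url, headers=basic_auth_header(username, password))
    with urlopen(request, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (needs a reachable exporter, so it is left commented out):
# text = fetch_metrics("https://web01.yourdomain.local:9182/metrics",
#                      "prometheus", "YourPrometheusPass",
#                      "/etc/prometheus/certs/ca.crt")
```

A certificate verification error here usually means the CA file does not match the certificate the exporter presents, which is the same condition that would make the Prometheus scrape fail.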

For a quick self-signed certificate on the exporter host:

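$cert = New-SelfSignedCertificate -DnsName "web01.yourdomain.local" `
    -CertStoreLocation Cert:\LocalMachine\My `
    -KeyAlgorithm RSA -KeyLength 2048 -NotAfter (Get-Date).AddYears(2)

# Export to PEM format for windows_exporter
$certPath = "C:\certs\exporter.crt"
$keyPath  = "C:\certs\exporter.key"

Export-Certificate -Cert $cert -FilePath "C:\certs\exporter.der" -Type CERT
certutil -encode "C:\certs\exporter.der" $certPath

# Export-Certificate writes only the public certificate. The private key still
# has to be exported separately (for example via Export-PfxCertificate and a
# PEM conversion tool) and saved to $keyPath.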

Summary

Prometheus windows_exporter on Windows Server 2022 provides deep, real-time visibility into the performance and health of every node in your infrastructure. By enabling targeted collectors, securing the metrics endpoint with TLS and basic auth, writing precise PromQL queries for Grafana panels, and defining alert rules for CPU, memory, disk, and service state, operations teams gain a complete observability stack. The combination of Prometheus’ pull-based architecture, Grafana’s visualization, and Alertmanager’s routing creates a production monitoring system that scales from a handful of servers to hundreds without per-node licensing costs.
