Introduction to Data Deduplication on Windows Server 2019

Data Deduplication (Dedup) is a Windows Server 2019 feature that reduces storage consumption by identifying and eliminating duplicate data chunks across files on a volume. Instead of storing multiple copies of identical data, Dedup replaces duplicates with references to a single shared copy kept in a chunk store on the same volume. Windows Server 2019 supports three Dedup usage types: Default (general purpose file servers), HyperV (Virtual Desktop Infrastructure servers whose VHD/VHDX files live on the volume), and Backup (volumes used by virtualized backup applications or as backup repositories). Dedup runs as a post-process background job and is transparent to applications and users: files appear as normal, full-size files even though their data is stored in deduplicated form. Savings depend heavily on the data; commonly quoted figures are roughly 40–80% on file servers and 80–95% on backup repositories.

Supported Volume and Workload Types

Data Deduplication in Windows Server 2019 supports NTFS volumes and, new in this release, ReFS volumes; earlier releases supported NTFS only. The volume must be at least 2 GB in size and must not be a system or boot volume. Dedup is most effective on volumes containing many similar files: user home directories, software repositories, backup storage (VHD/VHDX files), and virtual machine libraries.
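
As a rough way to shortlist volumes against these rules, the sketch below lists lettered volumes that are neither system nor boot and are at least 2 GB, and shows each file system so eligibility can be judged by eye. It assumes the built-in Storage module cmdlets Get-Partition and Get-Volume are available; the 2 GB threshold is simply the figure quoted above.

```powershell
# List volumes that pass the basic checks described above.
# Assumes the built-in Storage module (Get-Partition, Get-Volume).
Get-Partition |
    Where-Object { $_.DriveLetter -and -not $_.IsBoot -and -not $_.IsSystem } |
    Get-Volume |
    Where-Object { $_.Size -ge 2GB } |
    Select-Object DriveLetter, FileSystem, FileSystemLabel,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } }
```

The output is only a shortlist; whether the data on a volume deduplicates well is a separate question.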

Workloads that are poor candidates for deduplication include databases with random-write I/O patterns (SQL Server, Exchange mailbox databases). Their files are always open and constantly changing, so post-process optimization saves little and reads of optimized data add overhead. Volumes holding already-compressed or encrypted data (media files, ZIP archives, encrypted files) are also poor candidates, because the deduplication ratio would be negligible.

Installing the Data Deduplication Feature

Data Deduplication is a Windows Server feature that must be installed before it can be configured. Use PowerShell to install it:

Install-WindowsFeature -Name FS-Data-Deduplication -IncludeManagementTools

Verify the installation:

Get-WindowsFeature -Name FS-Data-Deduplication

No reboot is required. The Dedup PowerShell module (Deduplication) is installed automatically with the feature.
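
Two optional follow-up checks, sketched under the assumption that the feature installed cleanly: confirm the install state and that the module's cmdlets resolve, then estimate savings on a candidate path with DDPEval.exe, the savings evaluation tool that ships with the feature. The path D:\Shares is a hypothetical example, and the tool is meant for data that is not yet deduplicated.

```powershell
# Confirm the feature state; "Installed" is the expected value.
(Get-WindowsFeature -Name FS-Data-Deduplication).InstallState

# List the cmdlets provided by the Deduplication module.
Get-Command -Module Deduplication | Select-Object -ExpandProperty Name

# Estimate potential savings before enabling dedup.
# D:\Shares is a placeholder; point it at a real folder or volume root.
& "$env:SystemRoot\System32\DDPEval.exe" 'D:\Shares'
```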

Enabling Data Deduplication on a Volume

After installing the feature, enable deduplication on a specific volume. The following example enables deduplication on the D: volume with the General Purpose File Server usage type:

Enable-DedupVolume -Volume D: -UsageType Default

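For a volume that stores running virtual machine disks (VHD/VHDX files), typically a Virtual Desktop Infrastructure (VDI) deployment, use the HyperV usage type. It keeps the 3-day minimum file age but also optimizes in-use and partially changed files, so open virtual disks are processed rather than skipped: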

Enable-DedupVolume -Volume D: -UsageType HyperV

For a backup repository volume hosting Windows Server Backup or Veeam backup files:

Enable-DedupVolume -Volume E: -UsageType Backup

The Backup usage type sets the minimum file age to 0 days (backup files are typically written once and not modified afterwards), so eligible files can be deduplicated as soon as the next optimization job runs.
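
To see what a usage type actually changed, one option is to read the policy back right after enabling it. This sketch assumes the E: volume from the example above and selects properties exposed by Get-DedupVolume.

```powershell
# Read back the settings applied by the chosen usage type.
Get-DedupVolume -Volume E: |
    Select-Object Volume, UsageType, Enabled, MinimumFileAgeDays,
        MinimumFileSize, OptimizeInUseFiles, OptimizePartialFiles |
    Format-List
```

Comparing this output across volumes with different usage types is a quick way to check which defaults each type applies on your build.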

Configuring Deduplication Policy Settings

After enabling deduplication, review and adjust the policy for the volume. View current settings:

Get-DedupVolume -Volume D: | Format-List *

Key settings to configure:

MinimumFileAgeDays controls how many days a file must exist before it is eligible for deduplication. The default is 3 days for General Purpose workloads. Use 0 for backup volumes where files are written once, or raise the value on volumes where files keep changing for several days after creation, as in this example:

Set-DedupVolume -Volume D: -MinimumFileAgeDays 5

MinimumFileSize sets the minimum file size (in bytes) eligible for deduplication. Files smaller than this value are not processed. The default is 32768 bytes (32 KB):

Set-DedupVolume -Volume D: -MinimumFileSize 65536

ExcludeFolder allows you to exclude specific folders from deduplication processing:

Set-DedupVolume -Volume D: -ExcludeFolder "D:\NoDedup", "D:\TempFiles"

ExcludeFileType excludes files by extension. For example, to exclude already-compressed media files:

Set-DedupVolume -Volume D: -ExcludeFileType "mp4", "mkv", "zip", "gz"
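
The individual settings above can also be applied in a single call. The following is a minimal sketch using PowerShell splatting; the values are the example values from this section, not recommendations, and the folder names are the same illustrative ones.

```powershell
# Collect the policy in one hashtable so it can be reviewed or reused.
$policy = @{
    Volume             = 'D:'
    MinimumFileAgeDays = 5
    MinimumFileSize    = 65536                         # bytes (64 KB)
    ExcludeFolder      = 'D:\NoDedup', 'D:\TempFiles'  # example folders
    ExcludeFileType    = 'mp4', 'mkv', 'zip', 'gz'
}
Set-DedupVolume @policy

# Confirm what is now in effect.
Get-DedupVolume -Volume D: |
    Select-Object Volume, MinimumFileAgeDays, MinimumFileSize, ExcludeFolder, ExcludeFileType |
    Format-List
```

Keeping the policy in a hashtable makes it easier to apply the same settings to several volumes by changing only the Volume entry.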

Running Deduplication Jobs

Deduplication uses scheduled background jobs to process data. There are four job types:

Optimization: Scans files, deduplicates them, and writes chunks to the chunk store. This is the primary job that produces space savings.

Garbage Collection: Removes chunks from the chunk store that are no longer referenced by any files (run after deleting large numbers of files).

Scrubbing: Validates the chunk store for corruption and repairs corrupted chunks from redundant copies.

Unoptimization: Reverses deduplication and restores files to their full single-file state (used when disabling deduplication before removing the feature).

By default, Optimization runs on a schedule. To start it manually for immediate processing:

Start-DedupJob -Volume D: -Type Optimization

Monitor the running job:

Get-DedupJob
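
For unattended runs, progress can be polled rather than checked by hand. The loop below is a sketch that assumes the D: volume from the earlier examples and an arbitrary 30-second polling interval; it relies on Get-DedupJob returning nothing once no optimization job is queued or running.

```powershell
# Start the job, then report progress until no optimization job remains.
Start-DedupJob -Volume D: -Type Optimization | Out-Null

while ($jobs = Get-DedupJob -Volume D: -Type Optimization -ErrorAction SilentlyContinue) {
    foreach ($job in $jobs) {
        '{0} on {1}: {2} ({3}%)' -f $job.Type, $job.Volume, $job.State, $job.Progress
    }
    Start-Sleep -Seconds 30   # polling interval is arbitrary
}
'No optimization job is queued or running on D:.'
```

If progress output is not needed, adding -Wait to Start-DedupJob blocks until the job finishes and avoids the loop entirely.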

Run garbage collection after large file deletions:

Start-DedupJob -Volume D: -Type GarbageCollection -Full

Run scrubbing to validate chunk store integrity:

Start-DedupJob -Volume D: -Type Scrubbing
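
When both maintenance jobs are run in one window, they can be chained so the second starts only after the first completes. This is a sketch with example throttle values; -Priority and -Memory are optional Start-DedupJob parameters, and the 25 percent memory cap is an assumption to adjust for the server.

```powershell
# Run garbage collection, then scrubbing, one after the other.
# -Wait blocks until each job finishes; -Memory is a percentage of
# system memory. The throttle values here are examples only.
Start-DedupJob -Volume D: -Type GarbageCollection -Full -Priority Low -Memory 25 -Wait
Start-DedupJob -Volume D: -Type Scrubbing -Priority Low -Memory 25 -Wait
```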

Scheduling Deduplication Jobs

Deduplication job schedules are managed through the DedupSchedule configuration. View current schedules:

Get-DedupSchedule

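By default, Windows creates a BackgroundOptimization schedule that runs optimization at low priority in the background, along with weekly garbage collection and scrubbing schedules. Other schedules, such as ThroughputOptimization, which runs at full speed in a defined window, may appear depending on how the server has been configured, so check the Get-DedupSchedule output on your system. Modify the background optimization schedule to run at specific times: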

Set-DedupSchedule -Name BackgroundOptimization -Start 22:00 -DurationHours 6 -Days Monday,Tuesday,Wednesday,Thursday,Friday

Create a custom garbage collection schedule to run weekly:

New-DedupSchedule -Name "WeeklyGC" -Type GarbageCollection -Start 03:00 -DurationHours 4 -Days Sunday
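
Schedules can also carry resource limits. The sketch below creates a weeknight optimization window capped at half the cores and memory; the schedule name NightlyOptimization and every value in it are hypothetical examples, not defaults.

```powershell
# Hypothetical schedule; name, times, and caps are example values.
$schedule = @{
    Name          = 'NightlyOptimization'
    Type          = 'Optimization'
    Start         = '23:00'
    DurationHours = 5
    Days          = 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'
    Cores         = 50        # percent of cores the job may use
    Memory        = 50        # percent of memory the job may use
    Priority      = 'Normal'
}
New-DedupSchedule @schedule

# Review the new schedule with all of its properties.
Get-DedupSchedule -Name NightlyOptimization | Format-List *
```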

Monitoring Deduplication Savings

Check the deduplication statistics for a volume to see how much space is being saved:

Get-DedupStatus -Volume D: | Select-Object Volume, SavingsRate, SavedSpace, OptimizedFilesCount, InPolicyFilesCount | Format-List

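The SavingsRate shows the percentage of space saved (e.g., 65 means 65% savings). SavedSpace shows the absolute amount of reclaimed space in bytes. OptimizedFilesCount is the number of files that have been deduplicated so far, and InPolicyFilesCount is the number of files that currently qualify under the volume's policy; a large gap between the two suggests optimization has not caught up yet.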
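
To make the percentage concrete, the helper below recomputes a savings rate from byte counts. It is a sketch that assumes the rate is saved space divided by the unoptimized size (used space plus saved space); Get-DedupSavingsPercent is a hypothetical function name, not part of the Deduplication module.

```powershell
# Hypothetical helper: savings percent from saved and used bytes.
# Assumes rate = saved / (used + saved), rounded to a whole percent.
function Get-DedupSavingsPercent {
    param(
        [Parameter(Mandatory)] [double] $SavedBytes,
        [Parameter(Mandatory)] [double] $UsedBytes
    )
    $unoptimized = $SavedBytes + $UsedBytes
    if ($unoptimized -eq 0) { return 0 }   # empty volume: nothing to save
    [math]::Round(100 * $SavedBytes / $unoptimized)
}

# 650 GB saved with 350 GB still in use is 65% of 1000 GB.
Get-DedupSavingsPercent -SavedBytes 650GB -UsedBytes 350GB   # 65
```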

For a detailed view of all deduplication volumes across the server:

Get-DedupStatus | Format-Table Volume, SavingsRate, SavedSpace, LastOptimizationTime, LastGarbageCollectionTime -AutoSize
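
Byte counts are hard to read at a glance, so a small report in gigabytes can help. The sketch below also flags volumes whose savings fall under a threshold; the 20 percent figure is an arbitrary example, not a documented guideline.

```powershell
# Example threshold: flag volumes saving less than 20%.
$minimumRate = 20

Get-DedupStatus | ForEach-Object {
    [pscustomobject]@{
        Volume      = $_.Volume
        SavingsRate = $_.SavingsRate
        SavedGB     = [math]::Round($_.SavedSpace / 1GB, 1)
        FreeGB      = [math]::Round($_.FreeSpace / 1GB, 1)
        BelowTarget = $_.SavingsRate -lt $minimumRate
    }
} | Format-Table -AutoSize
```

A volume that stays below the chosen threshold after several optimization passes may simply hold data that does not deduplicate well.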

Disabling and Removing Deduplication

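To stop deduplicating a volume and return its files to their original non-deduplicated state (required before decommissioning a volume or moving data to non-Dedup storage), first disable deduplication. This only stops further optimization; files that are already optimized stay deduplicated: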

Disable-DedupVolume -Volume D:

Then run the Unoptimization job to restore all files from the chunk store:

Start-DedupJob -Volume D: -Type Unoptimization -Full

Unoptimization can take considerable time on large volumes, and the volume needs enough free space to hold every file at its full size. Monitor progress with Get-DedupJob. Do not remove the volume or the feature until the unoptimization job completes, because files that are still deduplicated become unreadable once the chunk store is gone.
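
Because restored files occupy their full size again, one precaution is to compare free space with saved space before starting. The sketch below assumes the D: volume and treats SavedSpace as a rough estimate of the extra room needed; the real requirement may differ.

```powershell
# Rough pre-check: is there room to restore files to full size?
$status = Get-DedupStatus -Volume D:

if ($status.FreeSpace -lt $status.SavedSpace) {
    Write-Warning ('D: has {0:N1} GB free but may need about {1:N1} GB.' -f
        ($status.FreeSpace / 1GB), ($status.SavedSpace / 1GB))
}
else {
    # Stop further optimization, then restore files and wait for completion.
    Disable-DedupVolume -Volume D:
    Start-DedupJob -Volume D: -Type Unoptimization -Wait
}
```

If the warning fires, free up or add capacity first; a job that runs out of space leaves the volume partly deduplicated.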