How to Configure Windows Server 2016 Data Deduplication

Data Deduplication, often shortened to Dedup, is a Windows Server feature that reduces storage consumption by eliminating redundant data on a volume. It identifies duplicate chunks across files and stores only one copy of each unique chunk, replacing duplicates with references to the stored chunk. Windows Server 2016 extended the feature with support for larger volumes (up to 64 TB) and larger files (up to 1 TB), a simplified Backup usage type for virtualized backup applications, and Nano Server support, alongside the Virtual Desktop Infrastructure (VDI) support carried over from Windows Server 2012 R2. This guide covers configuring and managing Data Deduplication on Windows Server 2016.

How Data Deduplication Works

Data Deduplication works as a post-processing step: files are written to the volume normally and optimized later. An optimization job scans files on the volume and breaks them into small, variable-length chunks, typically 32 KB to 128 KB in size. Each chunk is hashed, and identical chunks are stored only once, optionally compressed, in a chunk store under the hidden System Volume Information folder on the volume. The original file data is replaced with a reparse point that references the chunks in the chunk store. Optimization runs as a low-priority background job to minimize the impact on system performance. Rehydration, the process of reconstructing the original file data, happens transparently when applications access deduplicated files.
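
As a rough illustration of the chunk-store idea, the following PowerShell sketch splits data into chunks, hashes each chunk, and keeps one copy per unique hash. It is a teaching aid only: the fixed 4-byte chunk size, the in-memory hashtable, and the helper name Add-FileToStore are assumptions made for this example, whereas the real feature uses variable-size chunks and an on-disk chunk store.

# Illustration only: fixed 4-byte chunks and an in-memory table (assumptions for this sketch)
$sha = [System.Security.Cryptography.SHA256]::Create()
$chunkStore = @{}   # hash -> chunk bytes; each unique chunk is kept once

function Add-FileToStore {
    param([byte[]]$Data, [int]$ChunkSize = 4)
    $pointers = @()
    for ($i = 0; $i -lt $Data.Length; $i += $ChunkSize) {
        $len = [Math]::Min($ChunkSize, $Data.Length - $i)
        $chunk = New-Object byte[] $len
        [Array]::Copy($Data, $i, $chunk, 0, $len)
        $hash = [BitConverter]::ToString($sha.ComputeHash($chunk))
        # Store the chunk only if an identical one is not already present
        if (-not $chunkStore.ContainsKey($hash)) { $chunkStore[$hash] = $chunk }
        $pointers += $hash
    }
    return $pointers   # the "file" is now just a list of chunk references
}

$file = Add-FileToStore -Data ([Text.Encoding]::ASCII.GetBytes('AAAABBBBAAAA'))
$file.Count         # 3 chunk references
$chunkStore.Count   # 2 unique chunks stored

# Rehydrate by following the references back into the store
$bytes = foreach ($h in $file) { $chunkStore[$h] }
[Text.Encoding]::ASCII.GetString([byte[]]$bytes)   # AAAABBBBAAAA

Reading the data back amounts to following the references into the store, which loosely mirrors what happens when an application opens an optimized file.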

Installing Data Deduplication

Data Deduplication is a role service under File and Storage Services. Install it using Server Manager or PowerShell. To install via PowerShell with management tools:

Install-WindowsFeature -Name FS-Data-Deduplication -IncludeManagementTools

Verify the installation was successful:

Get-WindowsFeature -Name FS-Data-Deduplication
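
In a provisioning script it can be convenient to install the feature only when it is missing. A minimal sketch, assuming the Installed property reported by Get-WindowsFeature is a sufficient check:

$feature = Get-WindowsFeature -Name FS-Data-Deduplication
if (-not $feature.Installed) {
    # Install only when the role service is not already present
    Install-WindowsFeature -Name FS-Data-Deduplication -IncludeManagementTools
}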

Enabling Data Deduplication on a Volume

After installation, enable deduplication on a specific volume. Windows Server 2016 provides three usage type profiles: Default (general purpose file server), HyperV (VDI server), and Backup (virtualized backup applications such as Microsoft Data Protection Manager). To enable deduplication for a general file server on the D drive:

Enable-DedupVolume -Volume D: -UsageType Default

For a VDI workload on Hyper-V, which also optimizes in-use (open) VHD and VHDX files:

Enable-DedupVolume -Volume D: -UsageType HyperV

For a virtualized backup target, where files become eligible for optimization without a minimum age:

Enable-DedupVolume -Volume D: -UsageType Backup
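
To confirm which profile a volume ended up with, the settings applied by the usage type can be read back. This sketch assumes volume D: and selects a handful of the properties returned by Get-DedupVolume:

Get-DedupVolume -Volume D: | Format-List Volume, Enabled, UsageType, MinimumFileAgeDays, OptimizeInUseFiles, OptimizePartialFiles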

Configuring Deduplication Settings

Fine-tune deduplication behavior using Set-DedupVolume. The minimum file age in days controls how long a file must exist before it is considered for deduplication. The default is 3 days. To change the minimum file age to 1 day for more aggressive deduplication:

Set-DedupVolume -Volume D: -MinimumFileAgeDays 1

To exclude specific file extensions from deduplication (for example, already-compressed formats):

Set-DedupVolume -Volume D: -ExcludeFileType @("zip","mp4","jpg","png")

To exclude specific folders from deduplication:

Set-DedupVolume -Volume D: -ExcludeFolder @("D:\NoDedup","D:\ExcludedData")

Scheduling Deduplication Jobs

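Deduplication runs as scheduled background jobs. Three job types can be scheduled: Optimization (finds and deduplicates new data), GarbageCollection (reclaims space from chunks that are no longer referenced), and Scrubbing (validates chunk store integrity and repairs corruption where possible). A fourth type, Unoptimization, is only started manually. View existing schedules: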

Get-DedupSchedule

Modify the default optimization schedule to run during off-hours and cap the job at 50 percent of system memory and 50 percent of CPU cores (the Memory and Cores values are percentages):

Set-DedupSchedule -Name BackgroundOptimization -Start "01:00" -DurationHours 4 -Memory 50 -Cores 50
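
An alternative that appears in Microsoft's guidance is to disable the hourly background schedule and create a dedicated off-hours schedule instead. The sketch below assumes a hypothetical schedule name (NightlyOptimization), weekday runs, and a 01:00 start for 4 hours; adjust the values to suit the server:

# Stop the default background optimization from running during the day
Set-DedupSchedule -Name BackgroundOptimization -Enabled $false

# Create a weekday schedule; Memory and Cores are percentages of system resources
New-DedupSchedule -Name "NightlyOptimization" -Type Optimization -Days Monday,Tuesday,Wednesday,Thursday,Friday -Start "01:00" -DurationHours 4 -Memory 50 -Cores 50 -Priority Normal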

Start a garbage collection job manually so that it runs immediately:

Start-DedupJob -Volume D: -Type GarbageCollection -Priority Normal

Start a scrubbing job to check chunk store integrity:

Start-DedupJob -Volume D: -Type Scrubbing

Monitoring Deduplication Status and Savings

Monitor deduplication status and savings using the following commands. To get a summary of deduplication savings on all enabled volumes:

Get-DedupStatus | Select-Object Volume, SavedSpace, OptimizedFilesSavingsRate, OptimizedFilesCount, LastOptimizationTime
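
Byte counts are easier to read when converted to gigabytes. A small sketch using calculated properties; the column names SavedGB and FreeGB are arbitrary labels chosen for this example:

Get-DedupStatus | Select-Object Volume,
    @{ Name = 'SavedGB'; Expression = { [math]::Round($_.SavedSpace / 1GB, 2) } },
    @{ Name = 'FreeGB';  Expression = { [math]::Round($_.FreeSpace / 1GB, 2) } },
    LastOptimizationTime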

To list every deduplication setting for each volume, including the overall savings rate:

Get-DedupVolume | Format-List

To check currently running deduplication jobs:

Get-DedupJob

Running an Immediate Optimization

To trigger an immediate deduplication optimization job rather than waiting for the scheduled job, use Start-DedupJob:

Start-DedupJob -Volume D: -Type Optimization -Priority High

Monitor the job progress:

Get-DedupJob | Select-Object Volume, Type, State, Progress
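
To block a script until the work finishes, one option is to poll Get-DedupJob until no jobs remain for the volume. This is a sketch that assumes volume D: and a 30-second polling interval; Start-DedupJob also accepts a -Wait switch that achieves a similar result for a single job:

# Loop while at least one deduplication job is queued or running on D:
while ($jobs = Get-DedupJob -Volume D: -ErrorAction SilentlyContinue) {
    $jobs | Select-Object Type, State, Progress | Format-Table -AutoSize
    Start-Sleep -Seconds 30
}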

Disabling Deduplication

To turn off deduplication on a volume and restore all data to its original non-deduplicated form, first stop new optimization. Files that are already optimized stay deduplicated and remain accessible:

Disable-DedupVolume -Volume D:

Then rehydrate (un-deduplicate) all files on the volume:

Start-DedupJob -Volume D: -Type Unoptimization

Note that unoptimization requires enough free space on the volume to hold the fully rehydrated data, which is larger than the optimized data by roughly the amount of space deduplication is currently saving. Confirm that this space is available before running the job.
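
A rough pre-check is to compare the volume's free space with the space deduplication is currently saving. The sketch below assumes volume D: and treats SavedSpace as an estimate of the extra room rehydration will need; it is a heuristic, not an exact calculation:

$status = Get-DedupStatus -Volume D:
if ($status.FreeSpace -gt $status.SavedSpace) {
    # Enough headroom by this estimate, so start rehydrating
    Start-DedupJob -Volume D: -Type Unoptimization
} else {
    Write-Warning "Free space may be insufficient to rehydrate D:; free up space or extend the volume first."
}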

VDI Deduplication Configuration

The HyperV usage type is intended for VDI deployments, where deduplication runs against live Hyper-V virtual machine storage. This support was introduced in Windows Server 2012 R2 and carries forward to Windows Server 2016. VDI environments see significant savings because many VMs share common operating system and application data. To view every status property for a Hyper-V volume:

Get-DedupStatus -Volume D: | Select-Object *

Best Practices for Data Deduplication

Do not enable deduplication on system or boot volumes, database volumes (SQL Server, Exchange), or volumes that mostly hold already-compressed data formats. Use the appropriate UsageType for each workload to maximize efficiency. Monitor savings rates regularly; Microsoft's published guidance cites typical savings of roughly 30 to 50 percent for user documents, 50 to 60 percent for general file shares, 70 to 80 percent for software deployment shares, and 80 to 95 percent for virtualization libraries. Schedule optimization jobs during off-peak hours to minimize performance impact. Run garbage collection weekly to reclaim space from deleted or modified files. Keep the deduplication chunk store healthy by running periodic scrubbing jobs. Plan for adequate free space on the volume so that deduplication jobs can complete successfully.

Properly configured Data Deduplication on Windows Server 2016 can dramatically reduce storage costs while maintaining full transparency to applications and users, making it one of the most cost-effective features available in modern Windows Server environments.