Friday, August 16, 2013

Deduplication on ZFS and NetApp

Recently I came across a case where the deduplication ratio for the same data is lower on NetApp than on ZFS. This document may explain why - see the dedup limits for NetApp starting on page 26. Apparently NetApp will silently stop deduplicating data once a specific limit is reached, and that limit varies across models and Data ONTAP versions.

Does anyone have other ideas why dedup might be more effective on ZFS than on NetApp for the same data (assuming the same or a similar block size)?
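One way to sanity-check how dedupable a data set actually is, independent of either filesystem, is to hash it in fixed-size blocks and count duplicates. A minimal Python sketch (the file path argument is a placeholder; real filesystems add alignment, metadata and compression effects on top of this idealised number):

    import hashlib
    import sys

    def dedup_ratio(path, block_size):
        """Total logical blocks divided by unique blocks - the ideal dedup ratio."""
        unique = set()
        total = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                total += 1
                unique.add(hashlib.sha256(block).digest())
        return total / len(unique)

    if __name__ == "__main__":
        for bs in (4096, 8192):
            print("%5d-byte blocks: %.2fx" % (bs, dedup_ratio(sys.argv[1], bs)))

Running this at 4K and 8K over the same data set shows the best either filesystem could achieve at that block size.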

3 comments:

Anonymous said...

I bet the reason is that one can get a reasonably priced system with hundreds of GB of RAM, and there is no way around keeping the dedup tables in memory. So NetApp hits this limit sooner and can't dedup more blocks, as there is no space left in RAM to store the pointers. I'm wondering how much RAM mid- and high-end NetApp heads have onboard.
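For scale: the in-core size of a ZFS dedup table (DDT) entry is commonly quoted at roughly 320 bytes, one entry per unique block, so the RAM needed is easy to estimate. The numbers below are illustrative arithmetic under that assumption, not measurements:

    DDT_ENTRY_BYTES = 320  # commonly cited in-core size of one ZFS DDT entry (assumption)

    def ddt_ram_gib(unique_bytes, block_size):
        """Rough RAM needed to hold the whole dedup table in memory."""
        return unique_bytes // block_size * DDT_ENTRY_BYTES / 2.0 ** 30

    TIB = 2 ** 40
    for bs in (4096, 8192, 65536):
        print("%5d-byte blocks: ~%.0f GiB of DDT per 10 TiB of unique data"
              % (bs, ddt_ram_gib(10 * TIB, bs)))

At 4K blocks the table for 10 TiB of unique data alone would want hundreds of GiB, which is consistent with the RAM argument above.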

Anonymous said...

Did you set the filesystem recordsize to 4K on ZFS? You'll reach better ratios with larger block sizes. NetApp is restricted to 4K.

milek said...

We got a better ratio with both 8K and 4K. Actually, for dedup you get better ratios the smaller the block size is, which is one of the reasons we were surprised to see better ratios on ZFS with 8K than on NetApp with 4K.
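The block-size effect is easy to reproduce: data that repeats at 4K granularity dedups well with 4K blocks but not with 8K blocks, because each larger block also swallows unique neighbouring bytes. A self-contained illustration, again assuming ideal block-aligned dedup:

    import hashlib
    import os

    def ideal_ratio(data, block_size):
        """Logical blocks divided by unique blocks for an in-memory buffer."""
        blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
        return len(blocks) / len({hashlib.sha256(b).digest() for b in blocks})

    # 64 copies of the same 4K page, each followed by a unique random 4K page.
    page = b"\x00" * 4096
    data = b"".join(page + os.urandom(4096) for _ in range(64))

    print("4K blocks: %.2fx" % ideal_ratio(data, 4096))  # ~1.97x: repeated pages dedup
    print("8K blocks: %.2fx" % ideal_ratio(data, 8192))  # 1.00x: every 8K block is unique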