It is widely known that ZFS can compress and deduplicate data. Deduplication works at the pool level: as blocks are written to disk, duplicates are detected and stored only once. As a result, only unique blocks occupy space on disk, and files that contain the same data share them.
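To make the idea of block-level deduplication concrete, here is a minimal sketch in shell. It is only an illustration, not how ZFS is implemented: the `count_blocks` helper is hypothetical, it cuts a file into fixed-size blocks with `split`, and it uses `cksum` as a stand-in for the much stronger checksum ZFS relies on. It prints the total number of blocks followed by the number of distinct ones, and is meant for small, non-empty test files.

```shell
#!/bin/sh
# Illustration only: ZFS deduplicates inside the pool as blocks are written.
# count_blocks is a hypothetical helper, not part of any ZFS tooling.
# Usage:  count_blocks FILE BLOCK_SIZE_IN_BYTES
# Prints: "<total blocks> <unique blocks>"
count_blocks() {
    file=$1
    bs=$2
    workdir=$(mktemp -d) || return 1
    # Cut the file into fixed-size pieces, one piece per output file.
    split -a 6 -b "$bs" "$file" "$workdir/blk."
    # One cksum line per block; keep checksum and size, drop the file name.
    sums=$(cksum "$workdir"/blk.* | awk '{ print $1, $2 }')
    total=$(printf '%s\n' "$sums" | wc -l | tr -d ' ')
    unique=$(printf '%s\n' "$sums" | sort -u | wc -l | tr -d ' ')
    rm -rf "$workdir"
    echo "$total $unique"
}
```

For a 16-byte file made of the 4-byte blocks AAAA, BBBB, AAAA, AAAA, `count_blocks file 4` prints `4 2`: four blocks were written, but only two distinct ones would need to be stored.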
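# Create a mirrored pool named tank (-f forces use of the two disks)
zpool create -f tank mirror xvdb xvdc
# Enable compression; the second command selects lz4 explicitly
zfs set compression=on tank
zfs set compression=lz4 tank
zfs set dedup=on tank  # enable deduplication (shown as "dedup on local" in the zfs get output)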
Now let's copy the same ISO into the pool three times and look at some nice stats:
[testing-zfs ~]# rsync --progress CentOS-7-x86_64-DVD-1503-01.iso /tank/CentOS-7-x86_64-DVD-1503-01.iso
CentOS-7-x86_64-DVD-1503-01.iso
4310695936 100% 111.86MB/s 0:00:36 (xfer#1, to-check=0/1)
sent 4311222237 bytes received 31 bytes 114965927.15 bytes/sec
total size is 4310695936 speedup is 1.00
[testing-zfs ~]# rsync --progress CentOS-7-x86_64-DVD-1503-01.iso /tank/testing/CentOS-7-x86_64-DVD-1503-01.iso
CentOS-7-x86_64-DVD-1503-01.iso
4310695936 100% 152.26MB/s 0:00:27 (xfer#1, to-check=0/1)
sent 4311222237 bytes received 31 bytes 156771718.84 bytes/sec
total size is 4310695936 speedup is 1.00
[testing-zfs ~]# rsync --progress CentOS-7-x86_64-DVD-1503-01.iso /tank/testing2/CentOS-7-x86_64-DVD-1503-01.iso
CentOS-7-x86_64-DVD-1503-01.iso
4310695936 100% 141.80MB/s 0:00:28 (xfer#1, to-check=0/1)
sent 4311222237 bytes received 31 bytes 146143127.73 bytes/sec
total size is 4310695936 speedup is 1.00
[testing-zfs ~]# zpool list
NAME   SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
tank  9.94G  4.57G  5.36G         -   34%  46%  2.99x  ONLINE  -
Although the pool is only ~10G in size and has just 4.57G allocated, du reports about 14G of files stored in it:
[testing-zfs ~]# du -sh /tank/
14G /tank/
[testing-zfs /tank]# du -sh *
4.0G CentOS-7-x86_64-DVD-1503-01.iso
512 ddfile
616M debian-8.0.0-i386-CD-1.iso
4.6G testing
4.6G testing2
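The figures above (9.94G pool size, 4.57G allocated, 14G according to `du`) can be reconciled with a little arithmetic: multiplying ALLOC by the DEDUP ratio gives a rough estimate of how much data the pool would hold without deduplication. The sketch below does that with `awk`. The `logical_estimate` function name is made up for this example, and it assumes the `zpool list` column order shown above with ALLOC expressed in G.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool list` output on stdin and prints, per pool,
# its name and ALLOC multiplied by the DEDUP ratio.
# Assumes the column layout shown above (ALLOC = column 3, DEDUP = column 8)
# and that ALLOC carries a G suffix.
logical_estimate() {
    awk 'NR > 1 {
        alloc = $3; sub(/G$/, "", alloc)    # "4.57G" -> "4.57"
        ratio = $8; sub(/x$/, "", ratio)    # "2.99x" -> "2.99"
        printf "%s %.2fG\n", $1, alloc * ratio
    }'
}
```

Fed with the listing above (`zpool list | logical_estimate`), this prints `tank 13.66G`, which is close to the 14G that `du` reports.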
[testing-zfs ~]# zfs get all
NAME PROPERTY VALUE SOURCE
tank type filesystem -
tank creation Thu Jul 16 11:51 2015 -
tank used 13.7G -
tank available 5.07G -
tank referenced 13.6G -
tank compressratio 1.02x -
tank mounted yes -
tank quota none default
tank reservation none default
tank recordsize 128K default
tank mountpoint /tank default
tank sharenfs off default
tank checksum on default
tank compression lz4 local
tank atime on default
tank devices on default
tank exec on default
tank setuid on default
tank readonly off default
tank zoned off default
tank snapdir hidden default
tank aclinherit restricted default
tank canmount on default
tank xattr on default
tank copies 1 default
tank version 5 -
tank utf8only off -
tank normalization none -
tank casesensitivity sensitive -
tank vscan off default
tank nbmand off default
tank sharesmb off default
tank refquota none default
tank refreservation none default
tank primarycache all default
tank secondarycache all default
tank usedbysnapshots 12K -
tank usedbydataset 13.6G -
tank usedbychildren 14.8M -
tank usedbyrefreservation 0 -
tank logbias latency default
tank dedup on local
tank mlslabel none default
tank sync standard default
tank refcompressratio 1.02x -
tank written 11.8G -
tank logicalused 13.9G -
tank logicalreferenced 13.9G -
tank snapdev hidden default
tank acltype off default
tank context none default
tank fscontext none default
tank defcontext none default
tank rootcontext none default
tank relatime off default
tank redundant_metadata all default
tank overlay off default
tank@tank.snap1 type snapshot -
tank@tank.snap1 creation Thu Jul 16 14:18 2015 -
tank@tank.snap1 used 12K -
tank@tank.snap1 referenced 1.80G -
tank@tank.snap1 compressratio 1.04x -
tank@tank.snap1 devices on default
tank@tank.snap1 exec on default
tank@tank.snap1 setuid on default
tank@tank.snap1 xattr on default
tank@tank.snap1 version 5 -
tank@tank.snap1 utf8only off -
tank@tank.snap1 normalization none -
tank@tank.snap1 casesensitivity sensitive -
tank@tank.snap1 nbmand off default
tank@tank.snap1 primarycache all default
tank@tank.snap1 secondarycache all default
tank@tank.snap1 defer_destroy off -
tank@tank.snap1 userrefs 0 -
tank@tank.snap1 mlslabel none default
tank@tank.snap1 refcompressratio 1.04x -
tank@tank.snap1 written 1.80G -
tank@tank.snap1 clones -
tank@tank.snap1 logicalused 0 -
tank@tank.snap1 logicalreferenced 1.88G -
tank@tank.snap1 acltype off default
tank@tank.snap1 context none default
tank@tank.snap1 fscontext none default
tank@tank.snap1 defcontext none default
tank@tank.snap1 rootcontext none default
[testing-zfs ~]#
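Most of the listing above is unrelated to compression and deduplication. `zfs get` also accepts a comma-separated list of property names instead of `all`, which avoids the noise at the source; when only a saved listing is available, a small filter does the same job. The sketch below is one way to write such a filter. `space_props` is a made-up name and the property selection is just a suggestion.

```shell
#!/bin/sh
# Hypothetical filter: reads `zfs get all` output on stdin and keeps only the
# rows for properties related to space usage, compression and deduplication.
space_props() {
    awk '
        BEGIN {
            n = split("used referenced compressratio compression dedup logicalused", names, " ")
            for (i = 1; i <= n; i++) keep[names[i]] = 1
        }
        # Column 2 is the property name; the header row (PROPERTY) is dropped.
        ($2 in keep) { print }
    '
}
```

Typical use would be `zfs get all tank | space_props`, or directly `zfs get used,logicalused,compressratio,dedup tank` on a live system.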