Re: [Hampshire] Open source network backup with de-dupe.

Author: Adrian Bridgett
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] Open source network backup with de-dupe.
On Thu, Jul 15, 2010 at 21:11:25 +0100 (+0100), Keith Edmunds wrote:
> However, Chris is right: you cannot *know* that two files are the same
> unless you compare them, byte by byte. If hashes are good enough for you,
> just backup the hashes and save lots of time and diskspace!


My understanding on this point is that a hash _is_ in fact good enough,
or rather that the odds of a hash not being good enough are sufficiently
low (cf. corruption on hard disks etc.) that it's irrelevant. For
instance see:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=122945
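
To put a rough number on "sufficiently low", here is a minimal sketch
using the standard birthday-bound approximation. The figures are my own
assumptions for illustration, not taken from the bug report: a store of
one billion unique files or blocks, and hashes that behave like uniformly
random values (so this covers accidental collisions only, not
deliberately constructed ones). The function name is hypothetical.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate chance that any two of num_items distinct inputs
    share a hash, assuming uniformly distributed hash values.

    Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(b+1)).
    """
    exponent = -num_items * (num_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    items = 10**9  # assumed number of unique files/blocks in the backup
    for name, bits in (("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)):
        p = collision_probability(items, bits)
        print(f"{name:8s} ({bits} bits): p ~= {p:.3e}")
```

Even for a 128-bit hash the result is many orders of magnitude below the
chance of an undetected disk or memory error, which is the comparison I
had in mind above.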

Adrian
--
bitcube.co.uk - Expert Linux infrastructure consultancy
Puppet, Debian, Red Hat, Ubuntu, CentOS