rsync alternative? (too many files)

Seth David Schoen schoen at loyalty.org
Mon Mar 7 18:49:20 PST 2011


Tony Godshall writes:

> On Sun, Mar 6, 2011 at 15:19, Seth David Schoen <schoen at loyalty.org> wrote:
> 
> [Seth]
> > If you're sure that the filenames don't contain tabs, you can...
> 
> Hi Seth.
> 
> I must not have expressed myself clearly.
> 
> There are excessive unique files, not duplicate entries in a list of files.
> 
> The files have already been deduplicated in the sense that entries to
> files containing the same content are hardlinks.
> 
> If I were to copy the files to new media without retaining the
> hardlinks, they would take up way more space.

Hi Tony,

My approach copies "unique" files in the sense of inode uniqueness
(not content uniqueness or name uniqueness), so I think it does
address what you want.  I was using find to print inode numbers
as the basis of a solution that preserves hard link structure,
since two names on the same filesystem are hard links to the same
file if and only if they have the same inode number.
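
As a rough sketch of that idea (written in Python rather than with
find, and with function names that are purely illustrative, not
anything from my earlier message), one could group every name under
a directory by its (device, inode) pair, copy the data once per
group, and recreate the remaining names as hard links:

```python
import os
from collections import defaultdict


def group_by_inode(root):
    """Map (device, inode) -> sorted list of paths under root naming that file."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            # Inode numbers are only unique per filesystem, so key on the device too.
            groups[(st.st_dev, st.st_ino)].append(path)
    return {key: sorted(paths) for key, paths in groups.items()}


def copy_plan(root):
    """Return a list of ('copy', src, None) and ('link', first, other) steps.

    Each inode gets exactly one 'copy' step; every additional name for the
    same inode becomes a 'link' step pointing at the first name.
    """
    plan = []
    for paths in group_by_inode(root).values():
        first, rest = paths[0], paths[1:]
        plan.append(("copy", first, None))
        plan.extend(("link", first, other) for other in rest)
    return plan
```

This only builds the plan; actually carrying it out (and whether
holding every path in memory is acceptable for a very large tree)
is left open here.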

The only reason that my advice worries about whitespace in the
filenames is that whitespace can confuse some of the programs that
are being asked to process those filenames, not that the filenames
themselves are being used to identify or distinguish "unique"
files.
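
One way to sidestep the whitespace problem, sketched here under the
assumption that the file list is produced as NUL-terminated
"inode<TAB>path" records (GNU find can emit something like this with
-printf '%i\t%p\0'; the parse_records name is hypothetical), is to
split on NUL and then only on the first tab, so that tabs, spaces,
and newlines inside a filename are left alone:

```python
def parse_records(data: bytes):
    """Parse b'<inode>\\t<path>\\0' records into (inode, path) pairs.

    Paths are kept as bytes, since filenames need not be valid text.
    """
    pairs = []
    for record in data.split(b"\0"):
        if not record:
            continue  # skip the empty piece after the final NUL
        # Split on the first tab only; later tabs belong to the filename.
        inode, path = record.split(b"\t", 1)
        pairs.append((int(inode), path))
    return pairs
```

NUL is a safe record terminator because it is the one byte that
cannot appear in a Unix path.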

-- 
Seth David Schoen <schoen at loyalty.org> | Qué empresa fácil no pensar en
     http://www.loyalty.org/~schoen/   | un tigre, reflexioné.
     http://vitanuova.loyalty.org/     |            -- Borges, El Zahir

