rsync alternative? (too many files)
Rick Moen
rick at linuxmafia.com
Fri Mar 4 17:23:16 PST 2011
Quoting Tony Godshall (tony at godshall.org):
> Anyone know of an rsync alternative or workaround for huge
> batches of files?
rsync's FAQ, under its "out of memory" entry, suggests ameliorating measures:
The usual reason for "out of memory" when running rsync is that you are
transferring a _very_ large number of files. The size of the files
doesn't matter, only the total number of files. If memory is a problem,
first try to use the incremental recursion mode: upgrade both sides to
rsync 3.0.0 or newer and avoid options that disable incremental
recursion (e.g., use --delete-delay instead of --delete-after). If this
is not possible, you can break the rsync run into smaller chunks
operating on individual subdirectories using --relative and/or exclude
rules.
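As a rough sketch of that last workaround (the hostname and paths here
are invented for illustration), you'd loop over top-level
subdirectories so each rsync run builds only a small file list:

  # one rsync run per subdirectory; --relative (-R) recreates the
  # full source path under the destination
  for dir in /data/*/ ; do
      rsync -aH --relative "${dir%/}" backuphost:/backup/
  done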
Or, you could tweak the size of the array of pointers to the file-list
entries (8 MB, last I heard) in rsync to a larger value and recompile.
But maybe incremental recursion is simply getting switched off? Quoting
the rsync manpage:
Some options require rsync to know the full file list, so these
options disable the incremental recursion mode. These include:
--delete-before, --delete-after, --prune-empty-dirs, and
--delay-updates. Because of this, the default delete mode when
you specify --delete is now --delete-during when both ends of
the connection are at least 3.0.0 (use --del or --delete-during
to request this improved deletion mode explicitly). See also
the --delete-delay option that is a better choice than using
--delete-after.
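In other words, if all you need is deletion of extraneous destination
files, something along these lines (host and paths made up) keeps
incremental recursion enabled:

  # --delete-delay finds deletions during the transfer and performs
  # them at the end, without forcing the full up-front file list that
  # --delete-after requires
  rsync -aH --delete-delay /data/ backuphost:/backup/data/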
> In particular I'm looking for the ability to do the
> hardlink-a-tree-then-rsync way of making copies of a complete
> filesystem without duplicating files and without rsync crashing on me
> when the number of files to be transferred gets too big.
I'm not sure I followed the first half of that sentence, so apologies if
I don't 'get' your desired scenario. Speaking generically, if too many
files are making rsync hit 'out of memory in flist_expand' or 'out of
memory in glob_expand' or such, you _could_ switch to caveman methods
for finding then copying files, such as
find . -xdev -type f -print0 | xargs -0
...running cp piped into ssh, or whatever. 'Ware slowness.
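One way to flesh that out (remote host and destination path are
placeholders, and tar stands in for cp, since plain cp can't stream
across ssh) while keeping per-file memory use flat:

  # stream the selected files through tar over ssh; no huge in-memory
  # file list, but also no delta transfer or easy resume
  # (note: -type f means empty directories won't come across)
  find . -xdev -type f -print0 \
    | tar --null -T - -cf - \
    | ssh backuphost 'cd /backup/data && tar -xpf -'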