rsync alternative? (too many files)

Rick Moen rick at linuxmafia.com
Fri Mar 4 17:23:16 PST 2011


Quoting Tony Godshall (tony at godshall.org):

> Anyone know of an rsync alternative or workaround for huge
> batches of files?

rsync's FAQ suggests ameliorating measures:

  out of memory

  The usual reason for "out of memory" when running rsync is that you are
  transferring a _very_ large number of files. The size of the files
  doesn't matter, only the total number of files. If memory is a problem,
  first try to use the incremental recursion mode: upgrade both sides to
  rsync 3.0.0 or newer and avoid options that disable incremental
  recursion (e.g., use --delete-delay instead of --delete-after). If this
  is not possible, you can break the rsync run into smaller chunks
  operating on individual subdirectories using --relative and/or exclude
  rules. 
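
That chunking approach might look something like the following sketch
(paths and hostname are made up; adjust to taste).  The '/./' marker
that --relative honours pins where the preserved part of the path
begins:

  # One rsync run per top-level subdirectory, so each run's file list
  # stays small; --relative preserves layout from the /./ marker on.
  for d in /data/*/ ; do
      rsync -aH --relative "/data/./${d#/data/}" remotehost:/backup/
  done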

Or, you could bump the size of rsync's array of pointers to the
file-list entries (8 MB, last I heard) to something larger and
recompile.

But maybe incremental recursion is simply getting switched off?  Quoting
the rsync manpage:

  Some options require rsync to know the full file list, so these
  options disable the incremental recursion mode.  These include:
  --delete-before, --delete-after, --prune-empty-dirs, and
  --delay-updates.  Because of this, the default delete mode when
  you specify --delete is now --delete-during when both ends of
  the connection are at least 3.0.0 (use --del or --delete-during
  to request this improved deletion mode explicitly).  See also
  the --delete-delay option that is a better choice than using
  --delete-after.
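
So, if your current command line includes one of those options (say,
--delete-after), switching to --delete-delay keeps incremental
recursion alive.  A minimal before-and-after sketch, with made-up
paths:

  # Holds the full file list in memory (incremental recursion disabled):
  rsync -aH --delete-after /srv/data/ remotehost:/backup/data/

  # Incremental-recursion-friendly equivalent:
  rsync -aH --delete-delay /srv/data/ remotehost:/backup/data/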



> In particular I'm looking for the ability to do the
> hardlink-a-tree-then-rsync way of making copies of a complete
> filesystem without duplicating files and without rsync crashing on me
> when the number of files to be transferred gets too big.

I'm not sure I followed the first half of that sentence, so apologies if
I don't 'get' your desired scenario.  Speaking generically, if too many
files are making rsync hit 'out of memory in flist_expand' or 'out of
memory in glob_expand' or such, you _could_ switch to caveman methods
for finding then copying files, such as

find . -xdev -type f -print0 | xargs -0 cp --parents -t /some/dest/
...substituting your real destination for /some/dest/ (the -t form pins
the destination so xargs can batch arguments safely).  'Ware slowness.
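
If the target is remote, the same file list can be streamed through tar
rather than handed to cp.  A rough sketch (hostname and destination
path are hypothetical), using GNU tar's --null/--files-from to consume
find's NUL-delimited output:

  # Stream the NUL-delimited file list into tar, extract on the far end.
  find . -xdev -type f -print0 \
      | tar --null --files-from=- -cf - \
      | ssh remotehost 'tar -xf - -C /backup/dest'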


