Migrating Data and Keeping Data Synchronized with Rsync
Rsync is a powerful tool for moving or syncing data between servers. It is NOT, however, an effective tool for keeping data continuously up to date or for keeping data synchronized in both directions.
How it works
Rsync works by examining the data on the source and comparing it with the destination. At the simplest level, rsync looks at a file's metadata to decide whether the file needs to be transferred. By default this "quick check" compares the file size and the last-modified time stamp; a file that matches on both is skipped.
Optionally, rsync can compare files by computing a checksum of each file on both the source and the destination and then comparing the results (the -c, --checksum switch). This is useful for files whose size and time stamp may stay the same when their content changes, as can happen with some database files. Checksumming reads every file in full on both sides, so it is considerably slower than the default check.
Additionally, on file systems that support hard links, rsync can compare the data on the receiving end of the sync against a different directory and create hard links between the two directories (the --link-dest switch). This means that you can keep a complete backup of a folder structure, then create a new backup in which every file that is unchanged between the two is hard linked to the copy in the previous backup. The result is a complete directory structure for each backup, with only the new and changed files being transferred.
Examples
To copy individual files or groups of files from one directory to another:
rsync -v /path/to/source/file.txt /path/to/destination/directory/
rsync -v /path/to/source/files*.txt /path/to/destination/directory/
To copy an entire directory tree or several directories with a wildcard:
rsync -av /path/to/source/directory /path/to/destination/directory
rsync -av /path/to/source/directory* /path/to/destination/
rsync -av /path/to/source/* /path/to/destination/
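To copy a directory on top of an existing directory to bring it up to date, add the --delete flag. Please note that --delete removes files on the destination that have been removed from the source since the last sync, which is useful when the destination should be a true mirror of the source. Pass the source directory itself (with a trailing slash) rather than a wildcard: --delete only acts on directories that rsync is asked to synchronize, and with /path/to/source/* the shell expands the wildcard into individual entries, so files removed from the top level of the source would not be removed from the destination:
rsync -av --delete /path/to/source/ /path/to/destination/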
Rsync is useful over SSH as well. Here we use SSH for a secure sync between servers:
rsync -av root@hostname_or_IP_of_source:/path/to/source/* /path/to/destination/
To back up against a previous backup directory and hard link the unchanged files, here is an example:
rsync -avH --link-dest=/path/to/previousbackup/ /path/to/source/* /path/to/destination/
Useful Switches
-v, --verbose                 increase verbosity
-q, --quiet                   suppress non-error messages
    --no-motd                 suppress daemon-mode MOTD (see caveat)
-c, --checksum                skip based on checksum, not mod-time & size
-a, --archive                 archive mode; same as -rlptgoD (no -H)
    --no-OPTION               turn off an implied OPTION (e.g. --no-D)
-r, --recursive               recurse into directories
-R, --relative                use relative path names
    --no-implied-dirs         don't send implied dirs with --relative
-b, --backup                  make backups (see --suffix & --backup-dir)
    --backup-dir=DIR          make backups into hierarchy based in DIR
    --suffix=SUFFIX           backup suffix (default ~ w/o --backup-dir)
-u, --update                  skip files that are newer on the receiver
    --inplace                 update destination files in-place
    --append                  append data onto shorter files
-d, --dirs                    transfer directories without recursing
-l, --links                   copy symlinks as symlinks
-L, --copy-links              transform symlink into referent file/dir
    --copy-unsafe-links       only "unsafe" symlinks are transformed
    --safe-links              ignore symlinks that point outside the tree
-k, --copy-dirlinks           transform symlink to dir into referent dir
-K, --keep-dirlinks           treat symlinked dir on receiver as dir
-H, --hard-links              preserve hard links
-p, --perms                   preserve permissions
    --executability           preserve executability
    --chmod=CHMOD             affect file and/or directory permissions
-o, --owner                   preserve owner (super-user only)
-g, --group                   preserve group
    --devices                 preserve device files (super-user only)
    --specials                preserve special files
-D                            same as --devices --specials
-t, --times                   preserve times
-O, --omit-dir-times          omit directories when preserving times
    --super                   receiver attempts super-user activities
-S, --sparse                  handle sparse files efficiently
-n, --dry-run                 show what would have been transferred
-W, --whole-file              copy files whole (without rsync algorithm)
-x, --one-file-system         don't cross filesystem boundaries
-B, --block-size=SIZE         force a fixed checksum block-size
-e, --rsh=COMMAND             specify the remote shell to use
    --rsync-path=PROGRAM      specify the rsync to run on remote machine
    --existing                skip creating new files on receiver
    --ignore-existing         skip updating files that exist on receiver
    --remove-source-files     sender removes synchronized files (non-dir)
    --del                     an alias for --delete-during
    --delete                  delete extraneous files from dest dirs
    --delete-before           receiver deletes before transfer (default)
    --delete-during           receiver deletes during xfer, not before
    --delete-after            receiver deletes after transfer, not before
    --delete-excluded         also delete excluded files from dest dirs
    --ignore-errors           delete even if there are I/O errors
    --force                   force deletion of dirs even if not empty
    --max-delete=NUM          don't delete more than NUM files
    --max-size=SIZE           don't transfer any file larger than SIZE
    --min-size=SIZE           don't transfer any file smaller than SIZE
    --partial                 keep partially transferred files
    --partial-dir=DIR         put a partially transferred file into DIR
    --delay-updates           put all updated files into place at end
-m, --prune-empty-dirs        prune empty directory chains from file-list
    --numeric-ids             don't map uid/gid values by user/group name
    --timeout=TIME            set I/O timeout in seconds
-I, --ignore-times            don't skip files that match size and time
    --size-only               skip files that match in size
    --modify-window=NUM       compare mod-times with reduced accuracy
-T, --temp-dir=DIR            create temporary files in directory DIR
-y, --fuzzy                   find similar file for basis if no dest file
    --compare-dest=DIR        also compare received files relative to DIR
    --copy-dest=DIR           ... and include copies of unchanged files
    --link-dest=DIR           hardlink to files in DIR when unchanged
-z, --compress                compress file data during the transfer
    --compress-level=NUM      explicitly set compression level
-C, --cvs-exclude             auto-ignore files in the same way CVS does
-f, --filter=RULE             add a file-filtering RULE
-F                            same as --filter='dir-merge /.rsync-filter'
                              repeated: --filter='- .rsync-filter'
    --exclude=PATTERN         exclude files matching PATTERN
    --exclude-from=FILE       read exclude patterns from FILE
    --include=PATTERN         don't exclude files matching PATTERN
    --include-from=FILE       read include patterns from FILE
    --files-from=FILE         read list of source-file names from FILE
-0, --from0                   all *from/filter files are delimited by 0s
    --address=ADDRESS         bind address for outgoing socket to daemon
    --port=PORT               specify double-colon alternate port number
    --sockopts=OPTIONS        specify custom TCP options
    --blocking-io             use blocking I/O for the remote shell
    --stats                   give some file-transfer stats
-8, --8-bit-output            leave high-bit chars unescaped in output
-h, --human-readable          output numbers in a human-readable format
    --progress                show progress during transfer
-P                            same as --partial --progress
-i, --itemize-changes         output a change-summary for all updates
    --out-format=FORMAT       output updates using the specified FORMAT
    --log-file=FILE           log what we're doing to the specified FILE
    --log-file-format=FMT     log updates using the specified FMT
    --password-file=FILE      read password from FILE
    --list-only               list the files instead of copying them
    --bwlimit=KBPS            limit I/O bandwidth; KBytes per second
    --write-batch=FILE        write a batched update to FILE
    --only-write-batch=FILE   like --write-batch but w/o updating dest
    --read-batch=FILE         read a batched update from FILE
    --protocol=NUM            force an older protocol version to be used
    --checksum-seed=NUM       set block/file checksum seed (advanced)
-4, --ipv4                    prefer IPv4
-6, --ipv6                    prefer IPv6
-E, --extended-attributes     copy extended attributes, resource forks
    --cache                   disable fcntl(F_NOCACHE)
    --version                 print version number
(-h) --help                   show this help (see below for -h comment)