-*- indented-text -*-
Notes towards a new version of rsync
Martin Pool <mbp@samba.org>, September 2001.
Good things about the current implementation:
- Widely known and adopted.
- Fast/efficient, especially for moderately small sets of files over
slow links (transoceanic or modem).
- Fairly reliable.
- The choice of running over a plain TCP socket or tunneling over
ssh.
- rsync operations are idempotent: you can always run the same
command twice to make sure it worked properly without any fear.
(Are there any exceptions?)
- Small changes to files cause small deltas.
- There is a way to evolve the protocol to some extent.
- rdiff and rsync --write-batch allow generation of standalone patch
sets. rsync+ is pretty cheesy, though. xdelta seems cleaner.
- Process triangle is creative, but seems to provoke OS bugs.
- "Morning-after property": you don't need to know anything on the
local machine about the state of the remote machine, or about
transfers that have been done in the past.
- You can easily push or pull simply by switching the order of
files.
- The "modules" system has some neat features compared to
e.g. Apache's per-directory configuration. In particular, because
you can set a userid and chroot directory, there is strong
protection between different modules. I haven't seen any calls
for a more flexible system.
Bad things about the current implementation:
- Persistent and hard-to-diagnose hang bugs remain
- Protocol is sketchily documented, tied to this implementation, and
hard to modify/extend
- Both the program and the protocol assume a single non-interactive
one-way transfer
- A list of all files is held in memory for the entire transfer,
which cripples scalability to large file trees
- Opening a new socket for every operation causes problems,
especially when running over SSH with password authentication.
- Renamed files are not handled: the old file is removed, and the
new file created from scratch.
- The versioning approach assumes that future versions of the
program know about all previous versions, and will do the right
thing.
- People always get confused about ':' vs '::'
- Error messages can be cryptic.
- Default behaviour is not intuitive: in too many cases rsync will
happily do nothing. Perhaps -a should be the default?
- People get confused by trailing slashes, though it's hard to think
of another reasonable way to make this necessary distinction
between a directory and its contents.
Protocol philosophy:
*The* big difference between rsync and protocols like HTTP, FTP, and
NFS is that their fundamental operations are "read this file",
"delete this file", and "make this directory", whereas rsync's is
"make this directory like this one".
Questionable features:
These are neat, but not necessarily clean or worth preserving.
- The remote rsync can be wrapped by some other program, such as in
tridge's rsync-mail scripts. The general feature of sending and
retrieving mail over rsync is good, but this is perhaps not the
right way to implement it.
Desirable features:
These don't really require architectural changes; they're just
something to keep in mind.
- Synchronize ACLs and extended attributes
- Anonymous servers should be efficient
- Code should be portable to non-UNIX systems
- Should be possible to document the protocol in RFC form
- --dry-run option
- IPv6 support. Pretty straightforward.
- Allow the basis and destination files to be different. For
example, you could use this when you have a CD-ROM and want to
download an updated image onto a hard drive.
- Efficiently interrupt and restart a transfer. We can write a
checkpoint file that says where we're up to in the filesystem.
Alternatively, as long as transfers are idempotent, we can just
restart the whole thing. [NFSv4]
- Scripting support.
- Propagate atimes and do not modify them. This is very ugly on
Unix. It might be better to try to add O_NOATIME to kernels, and
call that.
- Unicode. Probably just use UTF-8 for everything.
- Open authentication system. Can we use PAM? Is SASL an adequate
mapping of PAM to the network, or useful in some other way?
- Resume interrupted transfers without the --partial flag. We need
to leave the temporary file behind, and then know to use it. This
leaves a risk of large temporary files accumulating, which is not
good. Perhaps it should be off by default.
- tcpwrappers support. Should be trivial; can already be done
through tcpd or inetd.
- Socks support built in. It's not clear this is any better than
just linking against the socks library, though.
- When run over SSH, invoke with predictable command-line arguments,
so that people can restrict what commands sshd will run. (Is this
really required?)
- Comparison mode: give a list of which files are new, gone, or
different. Set return code depending on whether anything has
changed.
- Internationalized messages (gettext?)
- Optionally use real regexps rather than globs?
- Show overall progress. Pretty hard to do, especially if we insist
on not scanning the directory tree up front.
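The comparison mode above could be sketched roughly as follows. This
is only an illustration for two local trees, not how rsync does or
would do it: the names compare_trees and exit_status are made up, the
whole tree is scanned up front, files are compared by content rather
than by checksum exchange, and the 0/1 return code is an assumption
borrowed from diff.

```python
import filecmp
import os


def compare_trees(src, dst):
    """Return sorted lists (new, gone, different) of relative file paths.

    "new" exists only under src, "gone" exists only under dst, and
    "different" exists under both but with differing contents.
    """
    def list_files(root):
        found = set()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, root))
        return found

    src_files = list_files(src)
    dst_files = list_files(dst)
    new = sorted(src_files - dst_files)
    gone = sorted(dst_files - src_files)
    different = sorted(
        rel for rel in src_files & dst_files
        # shallow=False forces a byte-by-byte comparison.
        if not filecmp.cmp(os.path.join(src, rel),
                           os.path.join(dst, rel), shallow=False)
    )
    return new, gone, different


def exit_status(new, gone, different):
    """Assumed convention: 0 if the trees match, 1 if anything changed."""
    return 1 if (new or gone or different) else 0
```

A caller would print the three lists and pass exit_status(...) to
sys.exit(), so that scripts can test whether anything has changed.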
Regression testing:
- Support automatic testing.
- Have hard internal timeouts against hangs.
- Be deterministic.
- Measure performance.
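One possible shape for a hard internal timeout is to give every
blocking read an absolute deadline, so that a stalled peer produces an
error instead of a hang. The sketch below assumes a Unix-like system
where select() works on the descriptor; read_with_deadline is a
hypothetical helper, not an existing rsync function.

```python
import os
import select
import time


def read_with_deadline(fd, nbytes, deadline):
    """Read up to nbytes from fd, or raise TimeoutError.

    deadline is an absolute time.monotonic() value, so one deadline can
    be shared by several reads that belong to the same operation.
    """
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        raise TimeoutError("deadline already passed")
    # Wait until fd is readable or the remaining time runs out.
    ready, _, _ = select.select([fd], [], [], remaining)
    if not ready:
        raise TimeoutError("no data before deadline")
    return os.read(fd, nbytes)
```

Using a monotonic clock keeps the timeout independent of wall-clock
adjustments, which also helps tests stay deterministic.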
Hard links:
At the moment, we can recreate hard links, but it's a bit
inefficient: it depends on holding a list of all files in the tree.
Every time we see a file with a linkcount >1, we need to search for
another known name that has the same (fsid,inum) tuple. We could do
that more efficiently by keeping a list of only files with
linkcount>1, and removing files from that list as all their names
become known.
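The more efficient scheme described above might look something like
this. It is a minimal sketch, not the current implementation:
HardLinkTracker and plan_links are hypothetical names, and it assumes
the link count reported by lstat() does not change during the scan.

```python
import os


class HardLinkTracker:
    """Remember only files whose link count is greater than 1."""

    def __init__(self):
        # (st_dev, st_ino) -> [first path seen, names not yet seen]
        self._pending = {}

    def observe(self, path, st):
        """Return an earlier name to hard-link to, or None.

        None means the file has no earlier known name and should be
        handled as ordinary data.
        """
        if st.st_nlink <= 1:
            return None
        key = (st.st_dev, st.st_ino)
        entry = self._pending.get(key)
        if entry is None:
            self._pending[key] = [path, st.st_nlink - 1]
            return None
        first_path = entry[0]
        entry[1] -= 1
        if entry[1] == 0:
            # All names are now known, so forget this inode.
            del self._pending[key]
        return first_path

    def pending(self):
        """Number of inodes that still have unseen names."""
        return len(self._pending)


def plan_links(root):
    """Yield (path, link_target_or_None) for each file name under root."""
    tracker = HardLinkTracker()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            yield path, tracker.observe(path, os.lstat(path))
```

Memory use is then proportional to the number of multiply-linked
inodes still waiting for names, rather than to the size of the tree.
An inode with a name outside the scanned tree never reaches zero and
stays in the table until the end.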
Command-line options:
We have rather a lot at the moment. We might get more if the tool
becomes more flexible. Do we need a .rc or configuration file?
That wouldn't really fit with its pattern of use: cp and tar don't
have them, though ssh does.
Scripting issues:
- Perhaps support multiple scripting languages: candidates include
Perl, Python, Tcl, Scheme (guile?), sh, ...
- Simply running a subprocess and looking at its stdout/exit code
might be sufficient, though it could also be pretty slow if it's
called often.
- There are security issues about running remote code, at least if
it's not running in the user's own account. So we can either
disallow it, or use some kind of sandbox system.
- Python is a good language, but the syntax is not so good for
giving small fragments on the command line.
- Tcl is broken Lisp.
- Lots of sysadmins know Perl, though Perl can give some bizarre or
confusing errors. The built-in stat operators and regexps might
be useful.
- Sadly probably not enough people know Scheme.
- sh is hard to embed.
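The simplest option above, running a subprocess and looking at its
exit code, could be sketched like this. The hook convention is an
assumption made up for illustration: the hook receives the file path
as its last argument, exit status 0 means "transfer", and any failure
to run the hook is treated as "do not transfer". should_transfer is a
hypothetical name.

```python
import subprocess
import sys


def should_transfer(hook_argv, path, timeout=5.0):
    """Ask an external hook whether path should be transferred.

    Runs hook_argv + [path] and returns True only if the hook exits
    with status 0 within the timeout.
    """
    try:
        result = subprocess.run(
            hook_argv + [path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: fail closed if the hook is missing or hangs.
        return False
    return result.returncode == 0


# Example hook: a one-line script that accepts only *.txt files.
EXAMPLE_HOOK = [
    sys.executable, "-c",
    "import sys; sys.exit(0 if sys.argv[1].endswith('.txt') else 1)",
]
```

This starts one process per file, which is exactly the cost the note
above worries about when the hook is called often.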
Scripting hooks:
- Whether to transfer a file
- What basis file to use
- Logging
- Whether to allow transfers (for public servers)
- Authentication
- Locking
- Cache
- Generating backup path/name.
- Post-processing of backups, e.g. to do compression.
- After transfer, before replacement: so that we can spit out a diff
of what was changed, or kick off some kind of reconciliation
process.
VFS:
Rather than talking straight to the filesystem, rsyncd talks through
an internal API. Samba has one. Is it useful?
- Could be a tidy way to implement cached signatures.
- Keep files compressed on disk?
Interactive interface:
- Something like ncFTP, or integration into GNOME-vfs. Probably
hold a single socket connection open.
- Can either call us as a separate process, or as a library.
- The standalone process needs to produce output in a form easily