Lsync - Keeping Your Sanity by Keeping Your Home Directory Synchronised

I haver battled for a number of years on how to keep a laptop and a home directory reasonably syncronised. About a year ago I decided to solve this problem once and for all.

Below I describe Lsync, a script I wrote to do the work for me. Here is the Lsync script for download.

Introduction

WARNING: Command-Line Content

Lsync is a tool to keep your home directory synchronised across multiple systems. While there are a few solutions out there for doing this automagically, such solutions are either still in a research phase, have a price tag, or are designed on a different scale.

This solution is intended to be cheap, and work with most variants of Unix. It is also intended to be used by the user who is doing updates on one or both copies of their own home directory. It should be used regularly, to minimise the work required at each operation, and to reduce the risk of data loss, or insanity.

Lsync is implemented on top of the excellent OSS program rsync.

Pre-Requisites

As configured, Lsync depends on bash, ssh, and a recent version of rsync (exactly how recent I do not know). It can probably be modified to use the (far less secure) rsh/rexec protocol, but I am not going to do this work. So far, I have found it to work on Solaris 9, 10 and Express, as well as Mac OS X 10.3.9. I would fully expect it to work on any variant of Linux.

The user will need to edit the Lsync script to set the values for their "laptop" and "master" hosts.

What It Does

Lsync offers 5 basic functions:

  • Check on what needs to be synced from your "laptop" to your "master"
  • Check on vice-versa
  • Sync from your "laptop" to your "master" (make it so...)
  • Sync vice-versa
  • Edit your "rsync includes" file

The sync/check operations by default are done between the user's home directories on the "laptop" and "master" hosts, but they can instead perform the operation just on a sub-directory of the home directory.

Any operation that modifies data requires further input from the user - either editing of the "rsync includes" file, or entering a "y" to confirm that we really want to sync.

An advantage of using ssh as the remote shell protocol is that the user can leverage ~/.ssh/config to specify a different username to be used on a remote host. For example, I have the username "timc" on my laptop, but have to use the corporate standard "tc35445" on any SWAN host (yuck). The magic for this to happen without work on my behalf is to put this in ~/.ssh/config on my laptop:

	Host	\*.sfbay
	  User     tc35445

Terminology

laptop
A host you designate. This can have a dynamic name, established at run-time as the host you are running Lsync on, or it can be static. If the laptop host is static, you can run Lsync on either the master or the laptop host.
master
A host you designate. This is static.
sender
Host deemed to have the current authoritative version of your home directory.
receiver
Host being synchronised to the sender's version of your home directory.
rsync includes file
A file containing rules for including & excluding files and/or directories to be synchronised. The format is documented in the rsync(1) manual page. The default location for this is ~/.rsync-include.

Performance

Unless you have seen rsync before, you will be surprised how fast it can do it's work. I am currently syncing about 50,000 files, but if the metadata for these is cached on both systems, a full home directory check takes around a minute. If it is the first check of the day, it might take 5 minutes.

In either case, it is fast enough to use at least daily, which means you can easily have your full, up-to-date "working set" with you on your laptop when you are out of the office, but go back to a SunRay when you get in.

Alternatively, if you know you have been working in a sub-directory, just specify that subdirectory, and Lsync limits its update to that directory tree. For example:

	$ Lsync L ~/tools/sh/Lsync

Caveats, Warnings, Disclaimers and Other Fine Print

It is important to understand that I am using the "--delete" option to rsync, which means that rsync will delete files on the receiver that do not exist on the sender. This means if you delete something on one host, it won't come back to haunt you, but it also means you must get sync operations in the correct order with your work activity. For easy identification, rsync tells you whenever it would have deleted or is deleting something by prepending it with "deleting".

Also, all sync operations will act on files and directories at the same level under your home directory on both systems. In other words, you can not use Lsync to copy a directory to a different location on the receiver, leaving the receiver's version of the directory in place.

This is a deliberate decision - Lsync is intended to synchronise, not to replicate.

Examples

When I want to sync my home directory, if I am not sure what I have modified on what directory, I first check:

d-mpk12-65-186 ) Lsync l
-- Lsync: listing what to sync under ~
     from d-mpk12-65-186.SFBay.Sun.COM to paedata.sfbay
building file list ... done
deleting tools/sh/Lsync/tmpfile
./
man/cat1/
man/cat1/rsync.1.gz
tools/sh/Lsync/README

wrote 1016147 bytes  read 28 bytes  16796.28 bytes/sec
total size is 4882980897  speedup is 4805.26
-- Use "Lsync L" to perform sync
d-mpk12-65-186 ) 

Then, noticing that the only thing to delete is something I have deleted on the sender and genuinely do not want any more, I go ahead and "make it so":

d-mpk12-65-186 ) Lsync L
-- Lsync: SYNCING everything under ~
     from d-mpk12-65-186.SFBay.Sun.COM to paedata.sfbay
Enter 'y' to confirm: y
building file list ... 
49676 files to consider
deleting tools/sh/Lsync/tmpfile
./
man/cat1/
man/cat1/rsync.1.gz
       47248 100%  780.49kB/s    0:00:00  (1, 70.8% of 49676)
tools/sh/Lsync/
tools/sh/Lsync/README
       12288 100%  255.32kB/s    0:00:00  (2, 99.4% of 49676)

wrote 1075767 bytes  read 60 bytes  16942.16 bytes/sec
total size is 4882980897  speedup is 4538.82

If I want to double-check, I can now see what might be out of date with my "master":

d-mpk12-65-186 ) Lsync m
-- Lsync: listing what to sync under ~
     from paedata.sfbay to d-mpk12-65-186.SFBay.Sun.COM
receiving file list ... done

wrote 337 bytes  read 1049835 bytes  21653.03 bytes/sec
total size is 4882980897  speedup is 4649.70
-- Use "Lsync M" to perform sync

If I want to just sync a particular directory tree, I can specify this as an argument after the operation letter. This will be a lot faster than examining my whole home directory tree on both hosts. It also ignores my "rsync includes" file.

Any absolute or relative path can be specified, but it must resolve to something below my $HOME:

d-mpk12-65-186 ) pwd
/Users/timc/tools/sh/Lsync
d-mpk12-65-186 ) Lsync l .
-- Lsync: listing what to sync under ~/tools/sh/Lsync
     from d-mpk12-65-186.SFBay.Sun.COM to paedata.sfbay
building file list ... done
./
README

wrote 165 bytes  read 28 bytes  55.14 bytes/sec
total size is 32083  speedup is 166.23
-- Use "Lsync L" to perform sync
d-mpk12-65-186 ) Lsync l ~/tools/
-- Lsync: listing what to sync under ~/tools
     from d-mpk12-65-186.SFBay.Sun.COM to paedata.sfbay
building file list ... done
sh/Lsync/
sh/Lsync/README

wrote 54756 bytes  read 28 bytes  9960.73 bytes/sec
total size is 84506552  speedup is 1542.54
-- Use "Lsync L" to perform sync
d-mpk12-65-186 ) Lsync l /var/tmp
Lsync: can not sync "/var/tmp", as it is not under $HOME

Comments:

I know you mentioned that you were aware of other solutions, but it's only fair to other readers to suggest Unison (http://www.cis.upenn.edu/~bcpierce/unison/) as an alternative which does synchronisation.

Posted by Ceri Davies on October 09, 2006 at 04:48 AM PDT #

Post a Comment:
Comments are closed for this entry.
About

Tim Cook's Weblog The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today