Thursday, February 19, 2009

The Backup Tool

In my previous blog entry I gave an overview of an in-house backup solution which seems to be a good enough replacement for over 90% of the backups currently done by NetBackup in our environment. I promised to show some examples of how it actually works. I can't give you output from a live system, so I will show some from a test one. Let's go through a couple of examples.

Please keep in mind that it is still more of a working prototype than a finished product (and it will most certainly stay that way to some extent).

To list all backups (I ran this on an empty system):
# backup -l

Let's run a backup for the client mk-archive-1.test:

# backup -c mk-archive-1.test
Creating new file system archive-2/backup/mk-archive-1.test
Using generic rules file: /archive-2/conf/standard-os.rsync.rules
Using client rules file: /archive-2/conf/mk-archive-1.test.rsync.rules
Starting rsync
Creating snapshot archive-2/backup/mk-archive-1.test@rsync-2009-02-19_15:11--2009-02-19_15:14
Log file: /archive-2/logs/mk-archive-1.test.rsync.2009-02-19_15:11--2009-02-19_15:14

Above you can see that it uses two config files: a global file describing the includes/excludes applied to all clients, and a second file describing the includes/excludes for that specific client. In many cases you don't need to create the client file yourself - the tool will create an empty one for you.
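To make the two-file lookup concrete, here is a minimal Python sketch. It is not the tool's actual code: the function name rules_files and its parameters are made up for illustration, and only the file naming (standard-os.rsync.rules plus <client>.rsync.rules in one conf directory) is taken from the output above. It assumes a missing client file is simply created empty.

```python
from pathlib import Path


def rules_files(conf_dir, client, generic_name="standard-os.rsync.rules"):
    """Return (generic, client) rules paths for one client.

    Hypothetical helper: names and behaviour are assumptions based on the
    paths printed by the tool, not its real implementation.
    """
    conf = Path(conf_dir)
    generic = conf / generic_name
    client_rules = conf / f"{client}.rsync.rules"
    # Assumption: a missing client file is created empty, so only the
    # generic rules apply until client-specific entries are added.
    client_rules.touch(exist_ok=True)
    return generic, client_rules


if __name__ == "__main__":
    import tempfile

    # Use a throwaway directory instead of the real /archive-2/conf.
    with tempfile.TemporaryDirectory() as conf_dir:
        for path in rules_files(conf_dir, "mk-archive-1.test"):
            print(path)
```

Creating the empty file up front means the rsync invocation can always reference both files without checking whether the client one exists.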

Let's list all our backups then.

# backup -lv
mk-archive-1.test 1.15G 1.15G 1.75x 35 (global)
mk-archive-1.test@rsync-2009-02-19_15:11--2009-02-19_15:14 1.15G 0 1.75x

The snapshot defines a backup, and I put the start and end dates of the backup in its name.
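The naming scheme can be sketched in a few lines of Python. This only mirrors the names visible in the listing above (file system, then "@rsync-", then start and end timestamps separated by "--"); the helper names snapshot_name and parse_snapshot are hypothetical, and the real tool may build its names differently.

```python
from datetime import datetime

# Timestamp layout as it appears in the listing, e.g. 2009-02-19_15:11
STAMP = "%Y-%m-%d_%H:%M"


def snapshot_name(filesystem, start, end):
    """Build '<filesystem>@rsync-<start>--<end>' (illustrative helper)."""
    return f"{filesystem}@rsync-{start.strftime(STAMP)}--{end.strftime(STAMP)}"


def parse_snapshot(name):
    """Split a snapshot name back into (filesystem, start, end)."""
    filesystem, _, label = name.partition("@rsync-")
    start_text, _, end_text = label.partition("--")
    return (
        filesystem,
        datetime.strptime(start_text, STAMP),
        datetime.strptime(end_text, STAMP),
    )


if __name__ == "__main__":
    name = snapshot_name(
        "archive-2/backup/mk-archive-1.test",
        datetime(2009, 2, 19, 15, 11),
        datetime(2009, 2, 19, 15, 14),
    )
    print(name)
    print(parse_snapshot(name))
```

Keeping both timestamps in the name means a listing alone tells you when a backup ran and how long it took, with no separate catalogue to consult.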

If you want to schedule a backup from cron you do not need any verbose output - the "-q" option keeps the tool quiet.

# backup -q -c mk-archive-1.test
# backup -lv
mk-archive-1.test 1.15G 1.16G 1.75x 35 (global)
mk-archive-1.test@rsync-2009-02-19_15:11--2009-02-19_15:14 1.15G 6.63M 1.75x
mk-archive-1.test@rsync-2009-02-19_15:16--2009-02-19_15:16 1.15G 0 1.75x

Now let's change the retention policy for the client to 15 days.

# backup -c mk-archive-1.test -e 15
# backup -lv
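CLIENT NAME REFER USED RATIO RETENTION
mk-archive-1.test 1.15G 1.16G 1.75x 15 (local)
mk-archive-1.test@rsync-2009-02-19_15:11--2009-02-19_15:14 1.15G 6.63M 1.75x
mk-archive-1.test@rsync-2009-02-19_15:16--2009-02-19_15:16 1.15G 0 1.75x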
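The "(global)" and "(local)" markers in the RETENTION column follow a simple fallback rule, which can be sketched as below. The function names are hypothetical and this is not the tool's code. The zero-means-global behaviour comes from the usage text further down; treating an unset value (None) the same way is my assumption.

```python
def effective_retention(client_days, global_days):
    """Return (days, source) where source is 'local' or 'global'."""
    # Per the usage text, zero days means "fall back to the global policy".
    # Treating an unset value (None) the same way is an assumption.
    if not client_days:
        return global_days, "global"
    return client_days, "local"


def retention_column(client_days, global_days):
    """Format the value the way the RETENTION column shows it."""
    days, source = effective_retention(client_days, global_days)
    return f"{days} ({source})"


if __name__ == "__main__":
    print(retention_column(None, 35))  # client without its own policy
    print(retention_column(15, 35))    # client after 'backup -e 15'
```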

To start the expiry process for old backups (not that there is anything to expire on this nearly empty system...):

# backup -E

Expiry started on : 2009-02-19_17:21
Expiry finished on : 2009-02-19_17:21
Global retention policy : 35
Total number of deleted backups : 0
Total number of preserved backups : 0
Log file : /archive-2/logs/retention_2009-02-19_17:21--2009-02-19_17:21
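The expiry decision itself can be sketched as follows. This is a sketch under stated assumptions, not the tool's logic: it assumes a backup's age is measured from the end timestamp in its snapshot name, and the names backup_end and split_expired are hypothetical. It also ignores the safeguard mentioned in the usage text about a client's last successful backup.

```python
from datetime import datetime, timedelta

# Timestamp layout used in snapshot names, e.g. 2009-02-19_15:14
STAMP = "%Y-%m-%d_%H:%M"


def backup_end(snapshot):
    """Extract the end timestamp from '...@rsync-<start>--<end>'."""
    return datetime.strptime(snapshot.rsplit("--", 1)[1], STAMP)


def split_expired(snapshots, retention_days, now):
    """Return (expired, kept) snapshot lists for one client.

    Assumption: a backup expires once its end time is older than
    'retention_days' days before 'now'.
    """
    cutoff = now - timedelta(days=retention_days)
    expired = [s for s in snapshots if backup_end(s) < cutoff]
    kept = [s for s in snapshots if backup_end(s) >= cutoff]
    return expired, kept


if __name__ == "__main__":
    snaps = [
        "mk-archive-1.test@rsync-2009-02-19_15:11--2009-02-19_15:14",
        "mk-archive-1.test@rsync-2009-02-19_15:16--2009-02-19_15:16",
    ]
    expired, kept = split_expired(snaps, 15, datetime(2009, 2, 19, 17, 21))
    print(len(expired), "expired,", len(kept), "kept")
```

Passing the current time in as a parameter keeps the function easy to dry-run, which is roughly what the tool's "-n" simulate option is for.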

You can also expire backups for all clients or for a specific client according to the global and client-specific retention policies, generate reports, list all currently active backups, etc. The current usage information for the tool looks like this:

# backup -h

usage: backup {-c client_name} [-r rsync_destination] [-hvq]
       backup [-lvF]
       backup [-Lv]
       backup {-R date} [-v]
       backup {-E} [-v] [-n] [-c client_name]
       backup {-e days} {-c client_name}
       backup {-D backup_name} [-f]
       backup {-A} {-c client_name} [-n] [-f] [-ff]

This script starts remote client backup using rsync.

-h Show this message
-r Rsync destination. If not specified then it will become Client_name/ALL/
-c Client name (FQDN)
-v Verbose
-q Quiet (no output)
-l list all clients in a backup
   -v will also include all backups for each client
   -vF will list only backups which are marked as FAILED
-e sets a retention policy for a client
   if number of days is zero then client retention policy is set to global
   if client_name is "global" then set a global retention policy
-L list all running backups
   -v more verbose output
   -vv even more verbose output
-R Show report for backups from a specified date ("today" "yesterday" are also allowed)
   -v list failed backups
   -vv list failed and successful backups
-E expire (delete) backups according to a retention policy
   -c client_name expires backup only for specified client
   -v more verbose output
   -n simulate only - do not delete anything
-D deletes specified backup
   -f forces deletion of a backup - this is required to delete a backup if
      there are no more successful backups for the client
-A archive specified client - only one backup is allowed in order to archive client
   -c client_name - valid client name, this option is mandatory
   -n simulate only - do not archive anything
   -f deletes all backup for a client except most recent one and archives client
   -ff archives a client along with all backups
-I Initializes file systems within a pool (currently: archive-1)


In order to immediately start a backup for a given client:

backup -c XXX.yyy.zz
backup -r XXX.yyy.zz/ALL/ -c XXX.yyy.zz

The two commands above do exactly the same thing - the first version is preferred.
The 2nd version is useful when doing backups over an ssh tunnel or via a dedicated backup interface,
when it is required to connect to a different address than the client name. For example, in order
to start a backup for a client XXX.yyy.zz via an ssh tunnel at localhost:5001 issue:

backup -r localhost:5001/ALL/ -c XXX.yyy.zz


backup -E - expire backups according to retention policy
backup -e 30 -c global - sets global retention policy to 30 days
backup -l - list all clients in backup including their retention policy
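The default for "-r" described in the usage text can be expressed as a small sketch. The helper names are hypothetical and this is not the tool's code: it only restates the documented rule that an unspecified destination becomes Client_name/ALL/, and shows how a wrapper script might build the equivalent explicit command line.

```python
def rsync_destination(client, override=None):
    """Return the destination the usage text describes for -r."""
    # Documented rule: without -r the destination is '<client>/ALL/'.
    return override if override else f"{client}/ALL/"


def backup_command(client, override=None):
    """Build the argv for the equivalent explicit 'backup' invocation."""
    return ["backup", "-r", rsync_destination(client, override), "-c", client]


if __name__ == "__main__":
    # Direct connection to the client.
    print(" ".join(backup_command("XXX.yyy.zz")))
    # Same client reached through an ssh tunnel on localhost:5001.
    print(" ".join(backup_command("XXX.yyy.zz", "localhost:5001/ALL/")))
```

Keeping the client name separate from the connection address means the backup is still filed under the client's own name, whichever route rsync takes to reach it.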


Kim Nørgaard said...

This looks very interesting.

Is the source code something you would like to share with the community so that others can benefit from it?

milek said...

Yes, I would like to do it but I first need to get a green light from my employer.

Anonymous said...

It's been awhile, so I'm assuming your employer doesn't want you to release the code? :\

milek said...

Unfortunately I changed employers in the meantime... I will check with them if it would be possible to open source it.

Chris Nagele said...

I would love to use this as well. Please keep us posted if you get permission to open source it.