Subject: Re: wishlist suggestions
From: Frank Smith <fsmith AT hoovers DOT com>
To: todd AT fries DOT net, amanda-users AT amanda DOT org
Date: Thu, 07 Nov 2002 10:09:55 -0600


--On Thursday, November 07, 2002 07:34:32 -0600 "Todd T. Fries" <todd AT fries DOT net> wrote:

The defragmentation I hope I explain properly; if not, ask for a better
explanation.  To need defragmentation, you also need support for multiple
(partial) backups per tape.  It is best done with a lot of holding disk and
a tape silo, but can be done otherwise.  The concept stems from adsm's
model of doing a full backup once, then incrementals forever afterwards.
There is a concept of 'keep only one copy of a file if its most recent
modification is over x days old', 'keep at most x copies of a file if its
modification times fall within the last x days', and 'keep all versions of
files newer than x days'.  This may make more sense with an example
(sketched in code below):

        - keep only one copy of each file if the most recent modification date
                is older than 90 days
        - keep at most 3 copies of a file if the modification times are newer
                than 90 days
                - or -
          keep all files newer than 90 days
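
A rough sketch of that pruning rule in Python (purely illustrative, not
anything adsm ships; the 90-day cutoff and 3-copy limit are just the
example numbers above, and the data layout is invented):

    from datetime import datetime, timedelta

    KEEP_DAYS = 90         # example cutoff from above
    MAX_RECENT_COPIES = 3  # example per-file version limit from above

    def versions_to_keep(versions, now=None):
        """Decide which backed-up versions of one file to retain.
        'versions' is a list of (backup_time, mtime) tuples, newest
        backup first; both fields are datetime objects."""
        if not versions:
            return []
        now = now or datetime.now()
        cutoff = now - timedelta(days=KEEP_DAYS)
        if versions[0][1] < cutoff:
            # most recent modification is over 90 days old:
            # keep only the single newest copy
            return versions[:1]
        # file changed recently: keep up to 3 of the recent versions
        recent = [v for v in versions if v[1] >= cutoff]
        return recent[:MAX_RECENT_COPIES]

So a file untouched since the summer drops to one retained copy, while a
file edited this week keeps its three newest versions (or all of them,
under the alternate rule).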

When you do things like the above, you end up having to go through a tape
that has many files backed up and remove the few files that have fallen
out of the scope of the backup.  The tape gets dumped to the holding area,
the data is manipulated to remove the files the backup system no longer
needs, and the rest is spooled with other data heading to tape.  In
general data gets spooled to tape together, and each tape (or 'volume')
gets fully used each time it is written to.
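
The reclamation pass itself, modeled very loosely in Python (everything
here is invented for illustration; real tape and holding-disk handling
would obviously be more involved):

    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class TapeFile:
        path: str
        mtime: datetime

    def is_expired(f, cutoff_days=90, now=None):
        # stand-in for the retention policy sketched above
        now = now or datetime.now()
        return f.mtime < now - timedelta(days=cutoff_days)

    def reclaim(tape_contents, spool):
        """Dump one mostly-expired tape to the holding area (modeled
        here as a plain list), drop the expired files, and merge the
        rest into the spool of data heading back out to tape."""
        live = [f for f in tape_contents if not is_expired(f)]
        spool.extend(live)   # rejoins the normal stream to tape
        return spool         # the old volume is now free for reuse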

This seems pretty scary, since it means you only have one copy of static
files.  If you get a tape error on that tape, how do you recover?

The indexing mechanism for adsm required several tuning stages.  A site I
used to work for that used adsm had a 500 GB filesystem with a poor
algorithm for spreading out data (usually about 5 subdirs deep to find a
file, and one or two files in each subdir, if any).  It took the indexing
mechanism 22 hours to search for what to back up, then about 20 minutes
to do the actual backup.

It could get time consuming; perhaps the indexes would save time.  But the
concept of doing 'ls -allversions file' and seeing each version of a
specific file that is in the backup set was extremely useful in adsm.
Users who were very uncertain as to when a file was removed could be told
we have a copy from the 10th, the 11th, and the 13th.

This would be a nice feature, and if indexing is available it shouldn't be
too hard to implement (although it would probably be easier to implement
as a stand-alone program to browse the indexes than to add it to
amrecover).
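
A minimal sketch of such a stand-alone browser in Python.  It assumes an
on-disk index layout of gzip-compressed path lists named
<indexdir>/<host>/<disk>/<YYYYMMDD>_<level>.gz, one path per line; that
layout is an assumption here, so verify it against your installation:

    #!/usr/bin/env python
    # Hypothetical 'ls -allversions' over Amanda index files.  The
    # directory layout and file naming below are assumptions; adjust
    # to match your indexdir.

    import gzip, os, sys

    def list_versions(indexdir, host, disk, wanted):
        """Print every dump date/level whose index lists 'wanted'."""
        dumpdir = os.path.join(indexdir, host, disk)
        for name in sorted(os.listdir(dumpdir)):
            if not name.endswith('.gz') or '_' not in name[:-3]:
                continue
            date, level = name[:-3].split('_', 1)
            with gzip.open(os.path.join(dumpdir, name), 'rt') as idx:
                if any(line.rstrip('\n') == wanted for line in idx):
                    print('%s  level %s  %s' % (date, level, wanted))

    if __name__ == '__main__':
        list_versions(*sys.argv[1:5])

Run as, e.g., 'list_versions.py /var/lib/amanda/index myhost /home
/home/user/file', and each matching dump date prints once, giving the
"we have the 10th, the 11th, and the 13th" answer directly.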

Frank

--
Todd Fries .. todd AT fries DOT net

(last updated $ToddFries: signature.p,v 1.2 2002/03/19 15:10:18 todd Exp $)



--
Frank Smith                                                fsmith AT hoovers DOT com
Systems Administrator                                     Voice: 512-374-4673
Hoover's Online                                             Fax: 512-374-4501
