Veritas-bu

[Veritas-bu] Checking to see if millions of files are backed up?

2007-03-26 18:40:59
Subject: [Veritas-bu] Checking to see if millions of files are backed up?
From: jpiszcz.backup at gmail.com (Justin Piszcz)
Date: Mon, 26 Mar 2007 18:40:59 -0400
The main reason for something like this is that you will get about 5
kilobytes per second if you back up a filesystem with a lot of sparse
data.

Think of 500,000 directories and 60,000 files, with the 60,000 files
scattered across the 500,000 directories. 99% of the backup time is
NetBackup traversing the directory hierarchy, which yields about 5 KB/s
on average to the tape drive, causing shoe-shining and other problems.

Justin.
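
The interim requirement quoted below (remove files only if they are
verified as backed up, older than N days, and the server is running low
on space) could be sketched roughly as follows. This is a minimal
illustration, not anything posted in the thread: the function names, the
`backed_up` set of verified paths, and the thresholds are all
hypothetical, and how the set is built from the NetBackup catalog is
outside this sketch. It also does not check how many tape copies exist.

```python
import os
import shutil
import time

SECONDS_PER_DAY = 86400


def files_safe_to_remove(candidates, backed_up, min_age_days, now=None):
    """Return candidates that are verified as backed up AND older than min_age_days.

    `backed_up` is assumed to be a set of paths already confirmed to be
    in the backup catalog (hypothetical input, built elsewhere).
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * SECONDS_PER_DAY
    safe = []
    for path in candidates:
        if path not in backed_up:
            continue  # never remove anything that is not verified
        try:
            mtime = os.stat(path).st_mtime
        except OSError:
            continue  # file vanished or is unreadable; leave it alone
        if mtime < cutoff:
            safe.append(path)
    return safe


def low_on_space(mount_point, min_free_fraction):
    """True when the free fraction of the filesystem is below the threshold."""
    usage = shutil.disk_usage(mount_point)
    return usage.free / usage.total < min_free_fraction
```

A caller would only act on the list from files_safe_to_remove() when
low_on_space() returns True for the filesystem in question, and would
probably want to log the list before deleting anything.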

On 3/26/07, Justin Piszcz <jpiszcz.backup at gmail.com> wrote:
> A nice idea; however, this is actually part of a much larger and more
> complicated system in which certain files have to be kept for certain
> retentions, both on disk and backed up to tape; think tape archival
> with different retention rates.  If I were to change the entire
> architecture behind it, this may be a good solution and is something
> on my plate for the future.  However, in the interim, I just need a
> solution to verify that files have been backed up to tape and remove
> them if they are older than N days when we are running low on space on
> any particular server.
>
> On 3/26/07, Whelan, Patrick <Patrick.Whelan at colt.net> wrote:
> > Why not have a script that runs a backup followed by an archive? Check
> > the error code of the backup; if it is not 0, then don't run the archive.
> > If it is 0, then run the archive, which will automatically delete the
> > files when it completes successfully. It will not delete anything if it
> > fails, even with a 1.
> >
> > Regards,
> >
> > Patrick Whelan
> >
> > -----Original Message-----
> > From: veritas-bu-bounces at mailman.eng.auburn.edu
> > [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Justin
> > Piszcz
> > Sent: 26 March 2007 22:36
> > To: bobbyrjw at comcast.net
> > Cc: Veritas-bu at mailman.eng.auburn.edu
> > Subject: Re: [Veritas-bu] Checking to see if millions of files are
> > backed up?
> >
> >
> > The problem with that is two-fold:
> >
> > 1. We back up multiple copies of the data; therefore, the archive option
> >    will not work.
> > 2. What if a tape has an I/O error halfway through the archive process?
> >    Yikes.
> >
> > Justin.
> >
> > On 3/26/07, Bobby Williams <bobbyrjw at comcast.net> wrote:
> > > Why not set up an archive schedule?  That way, the files can be
> > > archived and NetBackup will ensure that they are on tape before
> > > removing.
> > >
> > >
> > >
> > >
> > > Bobby Williams
> > > 2205 Peterson Drive
> > > Chattanooga, Tennessee  37421
> > > 423-296-8200
> > >
> > > -----Original Message-----
> > > From: veritas-bu-bounces at mailman.eng.auburn.edu
> > > [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Justin
> > > Piszcz
> > > Sent: Monday, March 26, 2007 4:27 PM
> > > To: Veritas-bu at mailman.eng.auburn.edu
> > > Subject: [Veritas-bu] Checking to see if millions of files are backed
> > > up?
> > >
> > > If one is to create a script to ensure that the files on the
> > > filesystem are backed up before removing them, what is the best
> > > data-store model for doing so?
> > >
> > > Obviously, if you have > 1,000,000 files in the catalog and you need
> > > to check each of those, you do not want to run bplist -B -C -R 999999
> > > /path/to/file/1.txt for each file.  Nor do you want to grep
> > > "1" one_gigabyte_catalog.txt for each file, as there is too much
> > > overhead in either case.
> > >
> > > I have a few ideas that involve neither of these, but I was wondering
> > > if anyone out there had already done something similar to this that
> > > was high performance?
> > >
> > > Justin.
> > > _______________________________________________
> > > Veritas-bu maillist  -  Veritas-bu at mailman.eng.auburn.edu
> > > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> > >
> > >
>
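
For the original question at the bottom of the thread (checking millions
of files without one bplist call or one grep per file), one common
approach is to dump the catalog listing once and load it into an
in-memory hash set, so each check is a constant-time lookup. The sketch
below is an illustration only, not something proposed in the thread. It
assumes the listing has already been reduced to one absolute path per
line (for example from a single recursive bplist run); the file format
and the function names are hypothetical.

```python
def load_backed_up_paths(listing_path):
    """Read a catalog listing (assumed: one absolute path per line) into a set."""
    backed_up = set()
    # surrogateescape keeps odd filename bytes from aborting the load
    with open(listing_path, "r", encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            path = line.rstrip("\n")
            if path:  # skip blank lines
                backed_up.add(path)
    return backed_up


def split_by_backup_status(candidates, backed_up):
    """Partition candidate paths into (verified, missing) using set lookups."""
    verified, missing = [], []
    for path in candidates:
        if path in backed_up:
            verified.append(path)
        else:
            missing.append(path)
    return verified, missing
```

The listing is read once, so the cost is one pass over the catalog dump
plus one hash lookup per candidate file, instead of one catalog query or
one full-file scan per candidate. A set of a few million path strings
needs a noticeable amount of memory; if that is a concern, sorting both
lists and merging them is an alternative with the same single-pass
behaviour.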