Subject: [Veritas-bu] Backing up a large amount of small files
From: scurry AT yahoo-inc DOT com (Steve Curry)
Date: Fri, 8 Nov 2002 22:22:13 -0800
Breaking this complex volume down into smaller volumes is (IMHO) the
best method.  This backup problem is compounded by the fact that
restores are twice as painful (or can be!).  I currently back up some
volumes that are 350GB and contain 40 million files (Unix/NetApp Filer).
While testing restores of this data we found it took in *some* cases
4-8X as long to restore as it took to back up.  So if a backup took 10
hours, a restore took 80 hours!!  This... is painful at best when the
SLA reads 12 hours. :(  The best solution I've found so far (and please
chime in if you have better suggestions) is to back up smaller chunks of
this data along the lines of what Rockey Reed describes below.  These
large-volume + small-file backups are also very CPU/disk-intensive
(especially on the client) while Veritas is building the file index
database.
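
Roughly what our split looks like, to make that concrete (a sketch
only -- the volume and directory names here are made up, and Rockey's
wildcard flavor below works just as well):

    NEW_STREAM
    /vol/vol1/users_a_g
    NEW_STREAM
    /vol/vol1/users_h_m
    NEW_STREAM
    /vol/vol1/users_n_z

Each NEW_STREAM kicks off a separate bpbkar stream on the client, so
the directory walk and the index build happen in parallel chunks
instead of one giant serialized pass.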

Creating a snapshot of your target data is also a great idea.  We create
snapshots of our volumes before the backup, back up the snapshot, and
then delete the snapshot afterwards.  This resolves the file-locking
issues you might run into during backup (I have no idea how this works
on MS Windows; I don't touch that stuff).  Ahhhh, I'm 8 shots of
espresso into my day... 
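
Roughly how we glue that together (a sketch, not our production
scripts -- the filer name "toaster", the volume, and the snapshot name
are all made up, and your rsh trust setup may differ).  NetBackup runs
these two client-side scripts around every backup:

    /usr/openv/netbackup/bin/bpstart_notify:
        #!/bin/sh
        # freeze a point-in-time image before the backup starts
        rsh toaster snap create vol1 nbu_backup

    /usr/openv/netbackup/bin/bpend_notify:
        #!/bin/sh
        # drop the snapshot once the backup finishes
        rsh toaster snap delete vol1 nbu_backup

Then point the file list at the .snapshot/nbu_backup directory under
the NFS mount, so the backup reads a frozen image instead of live,
possibly-locked files.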

Have a great weekend all!! ;)

  Steve Curry
  Guardian of Y! Production Data
  Yahoo! Inc.
  
  (w) 408.349.6632
  (m) 408.373.7247
  (e)  scurry AT yahoo-inc DOT com
 

 #-----Original Message-----
 #From: veritas-bu-admin AT mailman.eng.auburn DOT edu
 #[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]
 #On Behalf Of Bruno.Bossier AT comparex DOT be
 #Sent: Tuesday, November 05, 2002 6:04 AM
 #To: Rockey Reed
 #Cc: veritas-bu AT mailman.eng.auburn DOT edu
 #Subject: RE: [Veritas-bu] Backing up a large amount of small files
 #
 #
 #That is probably the best thing to do.
 #
 #Thanks!
 #Bruno
 #
 #From: Rockey Reed <Rockey.Reed@veritas.com>
 #To: "'Bruno.Bossier AT comparex DOT be'" <Bruno.Bossier AT comparex DOT be>
 #Subject: RE: [Veritas-bu] Backing up a large amount of small files
 #Date: 11/05/2002 05:15
 #
 #OK Bruno, let's be honest here.  NBU, as with all backup software
 #applications, has difficulty with large quantities of small files.
 #This is caused by the overhead required to catalog the files, retain
 #metadata of their locations on tape, etcetera.  A recommendation would
 #be to look at breaking the 150GB into something like three 50GB chunks.
 #Do this in the file selection using the * wildcard and streaming the
 #data.  The file list would look something like this:
 #
 #NEW_STREAM
 #/path/a*
 #/path/b*
 #  ...
 #/path/g*
 #NEW_STREAM
 #/path/h*
 #  ...
 #/path/m*
 #
 #and so on, down to /path/z*
 #
 #To write to one drive, set multiplexing in both the STU and the
 #schedule, and set it to the number of streams.
 #
 #HTH.
 #
 #Thanks,
 #Rockey Reed
 #
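
One aside on Rockey's multiplexing tip, from memory -- the storage
unit, policy, and schedule names below are invented, and the exact
options may differ by NBU release, so check the command reference
before trusting my espresso-addled recall:

    # raise maximum multiplexing on the storage unit
    /usr/openv/netbackup/bin/admincmd/bpsturep -label dlt_stu -mpx 6

    # match it with media multiplexing on the schedule
    /usr/openv/netbackup/bin/admincmd/bpplschedrep big_fs_policy full_weekly -mpx 6

Set -mpx to however many NEW_STREAM chunks you carve the file list
into, and the streams interleave onto the one drive instead of
queueing behind each other.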
 #-----Original Message-----
 #From: Bruno.Bossier AT comparex DOT be
 #[mailto:Bruno.Bossier AT comparex DOT be]
 #Sent: Monday, November 04, 2002 9:34 AM
 #To: David A. Chapa
 #Cc: veritas-bu AT mailman.eng.auburn DOT edu
 #Subject: Re: [Veritas-bu] Backing up a large amount of small files
 #
 #
 #
 #All the data is on EMC via a SAN, so RAID-1.
 #
 #From: "David A. Chapa" <david@datastaff.com>
 #To: Bruno.Bossier AT comparex DOT be
 #cc: veritas-bu AT mailman.eng.auburn DOT edu
 #Subject: Re: [Veritas-bu] Backing up a large amount of small files
 #Date: 10/30/2002 18:03
 #
 #Well, I did see something like this at a client, and I always found
 #the best solution to something like this is to just format the dang
 #thing :-)
 #
 #But seriously now...
 #
 #What RAID level, 5?
 #
 #RAID5 is great for reads but takes a penalty on writes; even so, with
 #as many small-file reads as this backup needs, you would definitely
 #hit timeouts.  This is exactly the problem one of my current clients
 #is facing.
 #
 #What about the last time the disk was defragmented?  We found a
 #heavily fragmented disk caused these types of timeouts as well.
 #
 #Another thing we tried was to break up the server backup at the file
 #list level.  This was a bit more labor intensive, but it did work to
 #some degree.
 #
 #HTH
 #
 #David
 #
 #Quoting Bruno.Bossier AT comparex DOT be:
 #
 #> We have a server with a directory which contains over 3 million
 #> files, for a total amount of nearly 150 GB.  We are trying to back
 #> this up in a reasonable amount of time (12 to 14 hours maximum).
 #> OTM is enabled.  We have not succeeded so far.  A lot of time is
 #> lost at the beginning of the backup, when it walks the directory
 #> structure to check all the files; this takes several hours.  The
 #> first thing we did was to set the maximum cache size to 0
 #> (unlimited).  This stopped the errors we saw in the Windows event
 #> log, where OTM would start, abort after 40 minutes, restart again,
 #> and so on.
 #>
 #> Can someone give some suggestions to speed up this backup?
 #>
 #> Thanks,
 #> Bruno
 #>
 #
 #_______________________________________________
 #Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
 #http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu