ADSM-L

Re: [ADSM-L] best backup method for millions of small files?

2009-05-01 11:10:40
Subject: Re: [ADSM-L] best backup method for millions of small files?
From: "Huebner,Andy,FORT WORTH,IT" <Andy.Huebner AT ALCONLABS DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 1 May 2009 10:09:55 -0500
Our 32-bit node backs up large file systems totaling 9.4 million objects.
Only two of the file systems hold more than 1 million objects, and the
biggest holds 6.5 million. I use the disk cache method without any problems.
The server has the /3GB switch set and 4 GB of RAM. This system does not use
journaling and is known to be running near the limits of a 32-bit process.
We are running with resource utilization set to 10.
As for timing, it runs in 5-11 hours while moving a relatively insignificant
amount of data. The faster runs happen when the TSM server is less busy.

It is definitely a balancing act to get it to run at the edge of the RAM
limits and still run fast. We used the information in TSMSTATS.ini to
identify file systems with few files and then excluded them from
memoryefficientbackup. If there were a way to change the order in which the
file systems are backed up, we could probably make it run faster by better
balancing memory usage with the larger file systems.
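
To illustrate the ordering idea above, here is a rough sketch of how file
systems could be spread across parallel sessions so that the largest ones
do not pile up together. It is only the scheduling idea, not something the
client is known to support. The drive letters and the per-file-system
counts (other than the 6.5 million maximum and the 9.4 million total) are
made up for illustration, and balance_filespaces is a hypothetical helper.

```python
import heapq


def balance_filespaces(object_counts, sessions):
    """Greedy sketch: place the largest file systems first, each onto the
    session that currently carries the fewest objects."""
    # Heap entries are (objects assigned so far, session index, names).
    heap = [(0, i, []) for i in range(sessions)]
    heapq.heapify(heap)
    by_size = sorted(object_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in by_size:
        load, idx, names = heapq.heappop(heap)
        names.append(name)
        heapq.heappush(heap, (load + count, idx, names))
    # Return the sessions in index order as (total objects, file systems).
    return [(load, names) for load, idx, names in sorted(heap, key=lambda t: t[1])]


# Hypothetical split of the 9.4 million objects: only the 6.5 million
# figure and the total come from the post above.
counts = {
    "E:": 6_500_000,
    "F:": 1_200_000,
    "G:": 900_000,
    "H:": 500_000,
    "I:": 300_000,
}
for load, names in balance_filespaces(counts, sessions=2):
    print(load, names)
```

With these made-up numbers the big file system ends up alone in one session
while the smaller ones share the other, which is the kind of balance we
cannot currently ask for.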

Andy Huebner
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Steven Harris
Sent: Thursday, April 30, 2009 6:23 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] best backup method for millions of small files?

Hi Norman

Your post worries me, as I'm just implementing an email archive solution
that will depend on Windows journalling to back up some huge repositories.
The particular product fills up "containers" that, once filled, never
change, so the change rate will be low there, but there are also index
files that will change often.

Have you determined whether the memory issue is related to number of files
or number of changes?

Regards

Steve

Steven Harris
TSM Admin, Sydney Australia

From: "Gee, Norman" <Norman.Gee AT LC DOT CA.GOV>
Sent by: "ADSM: Dist Stor Manager" <[email protected]>
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] best backup method for millions of small files?
Date: 01/05/2009 07:12 AM
Please respond to: "ADSM: Dist Stor Manager" <[email protected]>

What options are there when journaling runs out of memory on a 32-bit
Windows server?  I have about 10 million files on one server where the
journal engine runs out of memory. With the memory efficient disk cache
method and resource utilization 5, it runs out of memory; with resource
utilization 4, it runs too long.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Huebner,Andy,FORT WORTH,IT
Sent: Thursday, April 30, 2009 8:16 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: best backup method for millions of small files?

You have a disk array copy of the data; is it located nearby or far away?
Have you also considered a disk array snapshot?
If you perform a journaled file system backup and an image backup, then
you should be able to restore the image and then update the image with
the file system restore.  This might take a long time; I have never
tried it.
What failure are you trying to protect against?  In our case we use the
disk arrays to protect against a data center loss and a corrupt file
system, and a TSM file system backup to protect against the loss of a
file.  Our big ones are in the 10 million file range.  Using a 64-bit
Windows server we can back up the file system in about 6 - 8 hours
without journaling.  We suspect we could get the time down to around 4
hours if the TSM server were not busy backing up 500 other nodes.

To me the important thing is to figure out what you are protecting
against with each thing you do.  Also be sure to ask what the Recovery
Point Objective (RPO) is.  If it is less than 24 hours, then array-based
solutions may be the best choice.  If it is over 24 hours, then TSM may be
the best choice.
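
The reasoning above can be written down as a small lookup plus a rule of
thumb. This is only a sketch of what is described in this message: the
names are hypothetical, and the treatment of an RPO of exactly 24 hours is
an assumption, since the text only covers "less than" and "over".

```python
# Which copy protects against which failure, as described above.
PROTECTION_BY_FAILURE = {
    "data center loss": "disk array copy",
    "corrupt file system": "disk array copy",
    "loss of a file": "TSM file system backup",
}


def suggest_by_rpo(rpo_hours):
    """Rule of thumb from the message; 'may be the best choice', not a guarantee."""
    if rpo_hours < 24:
        return "array-based solution"
    # Assumption: exactly 24 hours is treated like "over 24 hours".
    return "TSM"


print(PROTECTION_BY_FAILURE["loss of a file"])
print(suggest_by_rpo(4))
print(suggest_by_rpo(48))
```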

Andy Huebner

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Mehdi Salehi
Sent: Thursday, April 30, 2009 9:39 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] best backup method for millions of small files?

Hi,
Neither of the two methods you refer to in the user's guide is suitable
for my case. The "Image+normal incremental" method that you emphasized in
your post means taking full image backups, for example, every week. For
the incremental part, one file-based full backup is needed, which is a
nightmare for 20 million files. OK, if I accept the initial incremental
backup time (which might take days), what happens at restore time?

Naturally, the last image backup should be restored first, and it will
take A minutes. Provided that image backups are weekly, the progressive
incremental backups for the week total about 6*20MB=120MB. Now imagine
that 120MB of 15-20K files are to be restored to a filesystem with an
incredibly big file address table, and the system must create an
inode-like entry for each. If this step takes B minutes, the total
restoration time would be A+B. The (A+B)/A ratio is important, and I will
try to measure it and share it with the group.
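
The arithmetic above can be checked with a short sketch. The 6 days, 20MB
per day and 15-20K file sizes come from the text; the function names are
hypothetical, 1MB is assumed to be 1024K, and the A and B values in the
last line are placeholders because neither has been measured yet.

```python
def incremental_file_count(daily_mb, days, file_kb):
    """Number of files of size file_kb in days * daily_mb of incrementals."""
    total_kb = daily_mb * days * 1024  # assumption: 1MB = 1024K
    return total_kb // file_kb


def restore_ratio(a_minutes, b_minutes):
    """Total restore time (image + incrementals) relative to the image alone."""
    return (a_minutes + b_minutes) / a_minutes


# 6 daily incrementals of 20MB each, made of 20K or 15K files.
print(incremental_file_count(20, 6, 20))  # 6144 files
print(incremental_file_count(20, 6, 15))  # 8192 files

# Placeholder timings only: A = 60 minutes, B = 30 minutes.
print(restore_ratio(60, 30))  # 1.5
```

So the weekly incrementals amount to roughly 6,000 to 8,000 small files to
recreate on top of the restored image.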

Steven, your solution is excellent for ordinary filesystems with a
limited number of files. But I think that for millions of files, only
backup/restore methods that do not care how many files exist in the
volume are feasible: something like pure image backup (such as Acronis
incremental image backup) or the method that FastBack exploits.

Your comments are welcome.

Regards,
Mehdi Salehi


This e-mail (including any attachments) is confidential and may be
legally privileged. If you are not an intended recipient or an
authorized representative of an intended recipient, you are prohibited
from using, copying or distributing the information in this e-mail or
its attachments. If you have received this e-mail in error, please
notify the sender immediately by return e-mail and delete all copies of
this message and any attachments.
Thank you.

