ADSM-L

Subject: Re: ADSM versus Arcserve and Backup Exec
From: Peter Gathercole <peter.gathercole AT VIRGIN DOT NET>
Date: Sun, 20 Sep 1998 23:12:07 +0100
ADSM performance is a tricky balancing act. There are many things that affect
the speed of restoring data. Some that immediately spring to mind (example
server commands follow the list):

1. Co-location off. This can lead to many more tape mounts than you might expect.
2. Many small tapes. This again leads to many tape mounts.
3. Too high a mount retention period. This can compound the slowness of the
above two points, especially if you only have a single tape device in the
library.
4. Slow seek time to a particular file on a tape. This is a problem that gets
worse if files are scattered across several tapes. Seeking to a particular spot
on, say, an exabyte tape can take 2 hours or more.
5. Too low a priority on the restore operation vs. the backup and archive
processing. This causes the restore to be pre-empted by any backup that is
running.
6. Slow network performance. The network carries all the information the restore
client uses to compare dates etc. on the files being restored.
7. Tape fragmentation. Use tape reclamation to minimise the number of tapes
holding data, and thus the number of mounts.
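
To give a concrete idea, these are the sort of settings involved, issued from
the administrative client. The pool and device class names are only examples,
and the right values depend on your environment and ADSM server level:

   update stgpool TAPEPOOL collocate=yes reclaim=60
   update devclass TAPECLASS mountretention=2
   query mount

The first command turns co-location on and sets a reclamation threshold for a
tape pool, the second drops the mount retention on a tape device class to two
minutes, and query mount shows what is mounted right now.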

Where I am currently working, we did a full DR test of one of our SP nodes over
FDDI, and recovered the box, plus about 25GB of data, in about 6 hours from
start to finish. We use a 3575 L24 library with four tape drives. The data was a
mixture of normal files backed up with DSMC, together with some Oracle databases
backed up using SQL*Backtrack (please don't ask me about Backtrack, our DBAs
look after that side of things).

So the golden rules to follow if you are backing up GB of data are (a few query
commands for checking your setup follow the list):

1. Go for the fastest network you can afford. If possible, put in a special
network for ADSM (FDDI, ATM, or 100Mb/s Ethernet).
2. Go for a tape technology that provides fast seek time as well as fast backup
time (IBM 3570 tapes are ideal, although 3590, 3595, and possibly DLT are also
suitable).
3. Have more tape drives than you need for regular backup operations, to allow
you to have at least one spare for restore operations.
4. Make sure that you don't have your mount retention set too high.
5. If files are changing frequently on many different client systems, consider
co-location to minimise the number of tape mounts needed for recovery.
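
If you want to check where a server stands against these rules before a big
restore, the usual queries from the administrative client give a quick picture
(the pool name is again just an example):

   query devclass format=detailed
   query stgpool TAPEPOOL format=detailed
   query drive
   query volume stgpool=TAPEPOOL

These show the mount retention and mount limit on the device class, whether
co-location and reclamation are set on the pool, how many drives are defined,
and how many volumes the pool's data is spread across.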

Please note that these are just musings I had when considering the problem in
the bath. I am not saying that there are no problems with ADSM, but looking at
how you spread data across tapes, along the lines I have proposed, is worth
considering.

Peter Gathercole
Open Systems Consultant.

Christo Heuer wrote:

> Hi,
>
> I can agree with Dan regarding poor performance when it comes to
> restoring a great number of small files. It does not matter what platform
> you are on, the performance IS an issue!
> To give you an idea - We had a standalone unix server that we backed
> up (4M/bit T/ring card, not very powerful as far as processing etc. goes).
> We backed up this box - 15Gig of user data consisting of about 300K files,
> between 1Kb and 1.7Meg each.
> The backup took about 16 hours - this is WITH other backups etc. running to
> the ADSM server. In other words the ADSM server was not dedicated to this
> task.
> This is acceptable performance taking the above into consideration, BUT,
> (Why is there always a BUT?), the restore was a different story.
>
> This unix box was moving to a node on our SP/2 - the data was already
> on this node, but a bit outdated so to speak.
> So the only option we had was to do one of two things:
> 1) Wipe all the older data on the SP/2 node and restore the standalone
> unix box onto the SP/2 node
> or
> 2) Use ADSM's intelligence and restore the box using the -ifn (If newer then
> replace).
> BIG mistake!
> The restore ran for two days non-stop - at which point we cancelled it and
> restored via the GUI, breaking the directory structure down a bit more.
> This helped in the sense that we had fewer files in the restore stream and we
> could also make use of parallel sessions.
> In the end it took us about the whole weekend, sitting monitoring the restore
> process and starting new processes. (Not a pretty solution).
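
For anyone wanting to script this rather than drive the GUI, the same splitting
can be done from the command line by starting one dsmc session per top-level
directory (the paths below are purely illustrative):

   dsmc restore "/data/users1/*" -subdir=yes -ifnewer &
   dsmc restore "/data/users2/*" -subdir=yes -ifnewer &
   dsmc restore "/data/users3/*" -subdir=yes -ifnewer &

Each invocation is a separate client session, so the restores can overlap,
subject to the number of tape drives and the server's mount limits.
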
> Now - Where was the bottleneck?
> Who knows - Bad design I think (Sorry IBM)
> The ADSM server: In this case, a powerful MVS mainframe - doing a few I/Os
> and using very few CPU cycles during the whole restore process.
>
> The network: Escon attached to SP/2 node
>
> The ADSM client: (Powerful node - the reason we had to move this
> box to the node was because of performance on the old box)
>
> We monitored the network/CPU usage on the SP/2 side etc., but
> nowhere could we find any pointer to the cause of
> the bad performance.
>
> During the first restore attempt - after running for
> about 24 hours - only 15Meg of user data had physically been transferred.
> The rest of the time seemed to be spent comparing files between
> MVS and the SP/2 - surely this should not take so long?
> Has anyone been able to get better performance from a restore
> when there is such a number of files in the picture?
> Does not matter what platform!
>
> Cheers
> Christo Heuer
>
> >There are so many pieces involved in tuning for a fast restore.  It would be
> >nice to know how fast each piece could theoretically go.  In this case it
> >would be nice if someone (IBM?) supplied a program which could run on many
> >platforms that would read an input stream of file names and sizes and then
> >allocate the file and write random data of the input size.  This would test
> >the speed of the filesystem and be one of the baselines for estimating
> >restore time.  When a restore is going badly, it might help point the blame
> >finger in the right direction.
> >
> >A logical source for the input file would be the output of 'dsmc q backup'.
> >I think it would help us plan for restores of large servers.
> >
> >Another tool for baselining would be to run a restore into the bitbucket,
> >i.e. exercise the server and the network, but not the filesystem.
> >
> >One more would be to have the server do everything it does for a restore but
> >not send it out over the network.
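
A minimal sketch of the file-allocation test described above, as a plain shell
script. The input format ("size-in-KB pathname" pairs, one per line, in a file
called filelist) is an assumption, and you would have to build that list
yourself, for instance from 'dsmc q backup' output:

   #!/bin/sh
   # Create every file named in filelist, filled with throwaway data,
   # to measure raw filesystem allocation and write speed.
   # Zeros are used here instead of random data for simplicity; substitute
   # /dev/urandom if your system has it and compression might skew the result.
   while read size name; do
       dir=`dirname "$name"`
       mkdir -p "$dir"
       dd if=/dev/zero of="$name" bs=1k count="$size" 2>/dev/null
   done < filelist

Timing the script gives a rough upper bound on how fast any restore could
populate that filesystem, independent of ADSM, the server, and the network.
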
> >
> >--
> >-----------------------------------------------------------
> >Bill Colwell
> >C. S. Draper Lab
> >Cambridge, Ma.
> >bcolwell AT draper DOT com
> >-----------------------------------------------------------
> >
> >In <19980914142949.13666.rocketmail AT send103.yahoomail DOT com>, on 
> >09/14/98
> >   at 07:29 AM, Dan Kronstadt <dkronsta AT YAHOO DOT COM> said:
> >
> >>Dave - I would be interested in getting some more info about your restore
> >>tests. We are happy with adsm except for one thing - a restore of a file
> >>server with several 100K files takes TOO LONG! I have yet to hear anyone
> >>say they can restore more than a gig or 2 an hour, when that is made up of
> >>files averaging 50K each. We are still testing, and the bottleneck may be
> >>Netware *allocating* that many files - but other vendors (arcserve, for
> >>example) claim faster restore times. Do you have any info on this kind of a
> >>restore? Large files get restored fine.
> >
> >>Thanks.
> >>Dan Kronstadt
> >>Warner Bros.
> >>dan.kronstadt AT warnerbros DOT com
> >
> >
> >
> >
> >>---Dave Larimer <david.larimer.hnj9 AT STATEFARM DOT COM> wrote:
> >>>
> >>> An alternative suggestion on the use of ADSM: the issue is that you do
> >>> not wish to use ADSM over the network because restores would be too slow.
> >>> If this is correct, I would give you an alternative suggestion.  Given
> >>> that a disaster situation is hopefully few and far between, back up all
> >>> data via ADSM through the network and, in the event of an actual
> >>> disaster, construct a new box at the central site, restore it there and
> >>> ship it to the remote location.  The cost savings from eliminating local
> >>> software, tape library, hardware and labor would be substantial.  In
> >>> addition, when I evaluated Arcserve, Backup Exec, and ADSM, I found the
> >>> following:
> >>> Backup time: (depending on how much data changes from day to day) I found
> >>> that overall ADSM came in first, followed closely by Arcserve and then by
> >>> Backup Exec.
> >>> Restore time: (depending on severity of restore and network connectivity)
> >>> All three products performed about the same, with ADSM having a slight
> >>> edge, due to its strength as file restore software.  In ADSM, the file is
> >>> ready as soon as it is restored.  This may not be the case with the other
> >>> two products.
> >>> Service Support: This is the area where I experienced the most variety.
> >>> With ADSM, I found the most support, followed by Arcserve, and then
> >>> Backup Exec a distant third.  Backup Exec's support fell off sharply
> >>> during off hours.
> >>> Cost savings: ADSM clearly came out ahead here in all categories.
> >>>
> >>> I hope that this helps.
> >>>
> >>> Dave Larimer
> >>> David.Larimer.HNJ9 AT StateFarm DOT com
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> From:
> >>O1=INET00/C=US/A=IBMX400/P=STATEFARM/DD.RFC-822=ADSM-L\@VM.MARIST.EDU > on
> >>09/11/98 04:21:21 PM
> >>> To:   ADSM-L
> >>> cc:
> >>> Subject:  ADSM versus Arcserve and Backup Exec
> >>>
> >>> Help!
> >>>
> >>>      There is a shift going on within our company where many Netware
> >>> servers are being consolidated to larger NT servers.  A large number of
> >>> these Netware (soon to be NT) servers are located in remote offices
> >>> connected to our statewide ATM backbone via T1 lines.  The new NT servers
> >>> in the remote offices will contain approximately 6 - 10 GB of user data.
> >>> In most cases we were not planning on backing up the remote NT machines
> >>> to a central ADSM server because it would take too long to restore an
> >>> entire machine in a disaster recovery scenario.  This means, for the
> >>> remote offices, local tape, probably an IBM 3570 library, would be used
> >>> with the standalone version of ADSM.  We also thought we might back up
> >>> the 3570 storage pools to a central server for disaster protection.
> >>>
> >>>      Our current environment is ADSM for MVS v3 backing up 100 clients,
> >>> all within the datacenter or close by.  Clients are AIX, SUN, HP,
> >>> Windows NT (Lotus Notes servers), and 1 Netware server.  ADSM has been in
> >>> use to back up our UNIX servers for nearly 3 years.  Arcserve is
> >>> currently used to back up the Netware servers using a DAT tape drive
> >>> attached to each server.  We standardized, or at least I thought we did,
> >>> on using ADSM company wide about a year and a half ago.
> >>>
> >>>      Ok, that's the background; on to the problem.  A person from our
> >>> distributed computing group informed me today that they have pretty much
> >>> decided to go with Arcserve or Seagate Backup Exec to back up the remote
> >>> office servers.  This decision was made without my involvement and
> >>> shouldn't have been, but that's a political issue.  The question I have
> >>> for you good people is: has anyone out there done a side-by-side
> >>> comparison of the ADSM single server version versus Arcserve and/or
> >>> Seagate Backup Exec?  Any ammo you can give me that shows ADSM is the
> >>> better choice would be GREATLY appreciated.  It is their feeling that
> >>> ADSM is too slow and not widely used in the industry for backing up
> >>> Windows NT or Netware.
> >>>
> >>>
> >>> Thanks!
> >>> Jeff Connor
> >>> Niagara Mohawk Power Corp.
> >>> Syracuse NY
> >>>
> >