Re: rollforward vs. normal mode

As far as growth projections and justifications for additional TSM
resources, it is pretty simple. TSM grows for two reasons: More clients,
and larger clients. Only you can project how many clients you'll have. I
project the size of those clients by how large a PC hard drive I can buy
for $150 at Office Depot. This has been growing at an annual rate of
about 50% recently. I take this as an average between the extremes of
the guy who traded in a machine with a 10gb drive for a new one with a
150gb drive, and the guy who got an 80GB drive a year ago and is happy
with it for a while. Talk to the people who buy new desktop equipment in
your organization, and find out their plans for the coming year in terms
of how many and how large they are budgeted for. There is your budgetary
justification for adding resources to TSM.
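
A rough sketch of that projection in Python, assuming simple compound
growth at the 50% annual rate mentioned above. The function name and the
80 GB starting size (borrowed from the example drive) are illustrative
assumptions, not measurements from any real server.

```python
def project_client_size(current_gb, annual_growth, years):
    """Return projected per-client size for year 0..years under compound growth."""
    return [current_gb * (1.0 + annual_growth) ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Assumed inputs: an 80 GB client disk growing 50% per year for 3 years.
    for year, size_gb in enumerate(project_client_size(80.0, 0.50, 3)):
        print(f"year {year}: {size_gb:.1f} GB per client")
```

Multiply each year's figure by the number of clients you expect in that
year to get a total to put in front of whoever holds the budget.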

The TSM database grows with the NUMBER of files, but I don't think
average file size has changed much. Huge image and audio files are
offset by the ballooning number of small files that comprise most modern
installed application software, and it comes out in the wash. (Just look
at a TSM client!) Bigger client disks contain more files, not just
larger files.

Another TSM capacity thing to keep your ear to the ground for is any
shift in email software from a mailbox being one single file ("Mail
folders") to a system where each mail item is a single file ("Mail
dirs"). This can increase the number of files by a factor of 100,
but reduce the tape requirements considerably; one simple software
change that can make your TSM Database explode. (This is why I'm working
on splitting my server just now.)

I was recently playing one-upmanship with some colleagues in our
Database area, and it turns out that I've got the biggest and most
active database of any kid on the block - the TSM Database. ("You
expired one million objects last night? Wow!") I've also got one of the
most primitive backup mechanisms, precisely because I cannot use Tivoli
Storage Manager to back it up. When I suggested TDP (Tivoli Data
Protection) for TSM at SHARE in March, the reaction was as though I had
proposed licking your own back with your tongue - a nearly impossible
contortion. (Unless, of course, you're a dog.)

But let's take this database seriously as a major, and critical,
enterprise database, and rethink how we protect it, as it grows beyond
100gb in many shops. Once splitting becomes a rational strategy, that is
a sign of a scalability design problem. In our case of TSM, the
scalability limitation is the Log, especially in rollforward mode. OTOH,
once you split your TSM server into two or more images, you open up
opportunities for backing their databases up into each other. Currently
there is the primitive method involving archived virtual volumes, but
surely we can do better than that. With two server images, TDP for TSM
is no longer a canine contortion, but a perfectly reasonable idea. We
ourselves beat the drum often and loudly, that there is no better way to
back up a database than TSM. Let's follow that advice!

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
              Academic Computing and Communications Center
(Systems Programmer, and TSM Administrator since it was WDSF.)


On Fri, 26 Jul 2002, Kelly J. Lipp wrote:

>Roger,
>
>Thanks.  Excellent.  Ditto.
>
>While on the "How about this, Tivoli" bandwagon: multi-stream TSM database
>backup (and restore, but let's walk before we run...).  The TSM database has
>become much more stable and capable of being larger.  Therefore, we're
>routinely seeing databases approaching 100 GB (once unheard of).  At 40
>GB/hour (typical tape speed) that's 2.5 hours to back up the db.  And only
>one of those hundreds of tape drives is running!  Let's put those tape
>drives to work!
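
As a back-of-the-envelope check on the arithmetic above, here is a small
Python sketch. It assumes backup time scales linearly with the number of
parallel streams, which is an idealization: the multi-stream feature is
only being requested here, and disk rather than tape may turn out to be
the limiting factor. The function name and the four-stream figure are
hypothetical.

```python
def backup_hours(db_gb, gb_per_hour_per_drive, streams=1):
    """Estimate wall-clock hours for a full DB backup with ideal linear scaling."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return db_gb / (gb_per_hour_per_drive * streams)


if __name__ == "__main__":
    # Figures from the message: a 100 GB database at 40 GB/hour per drive.
    print(f"1 stream:  {backup_hours(100, 40):.2f} hours")
    # Hypothetical: four drives working in parallel.
    print(f"4 streams: {backup_hours(100, 40, streams=4):.2f} hours")
```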
>
>Kelly J. Lipp
>Storage Solutions Specialists, Inc.
>PO Box 51313
>Colorado Springs, CO 80949
>lipp AT storsol DOT com or kelly.lipp AT storserver DOT com
>www.storsol.com or www.storserver.com
>(719)531-5926
>Fax: (240)539-7175
>
>
>-----Original Message-----
>From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU]On Behalf Of
>William Rosette
>Sent: Wednesday, July 24, 2002 6:29 AM
>To: ADSM-L AT VM.MARIST DOT EDU
>Subject: Re: rollforward vs. normal mode
>
>
>Hi Roger,
>
>I recently sent this memo to a TSM consultant to get some ammo to acquire
>disk space for a project, any additional ammo such as below and then some
>would be greatly appreciated including your title and experience with TSM
>(need credentials for directors)
>
>
>"Judy,
>
>  I need your recommendation for disk space on SPF2N21 (our TSM server)
>in order to complete the project of the Policy Domain Change.  This is also
>including Month End full backups.
>
>We currently are backing up 688.2 GB incrementally daily.  We are currently
>using 134.4 GB of Database and Recovery Log space (67.2 GB mirrored). I
>suggested a projection of 268.8 GB growth (2x current space).  What I am
>needing from you is how this projection can be justified to my Director.
>The AIX admins said a 1/2 drawer of 36.2 drives is sufficient.(8)  Any
>ideas are greatly appreciated."
>
>
>An additional note is that we are backing up 80 clients with 107 jobs and a
>total 10.3 Terabytes of DASD (capacity, not all being used and backed up,
>just potential)
>
>Thanks Roger,
>Bill Rosette
>Data Center/IS/Papa Johns International
>WWJD
>
>
>Roger Deschner <rogerd AT UIC DOT EDU>
>Sent by: "ADSM: Dist Stor Manager"
>07/24/02 02:38 AM
>Please respond to "ADSM: Dist Stor Manager"
>
>To:       ADSM-L AT VM.MARIST DOT EDU
>cc:
>Subject:  Re: rollforward vs. normal mode
>
>
>
>I've run it both ways. ROLLFORWARD is goodness that lets me sleep better
>at night, but it's expensive goodness, in terms of management effort and
>system performance.
>
>1. Define your log to be 12gb, regardless of how big you think it should
>be. The max in V4+ is 13gb, and you want to leave yourself some room to
>add your PREDEFINED AND PREFORMATTED 1gb emergency log extent for when
>it fills up, as it inevitably will. My log is two 6gb extents which I
>think gives me some flexibility, compared to one 12gb extent. Disks are
>cheap, compared to being awakened in the middle of the night by a full
>log. Just make it 12gb. Period.
>
>2. Look at the statistics from your recent incremental database backups.
>That (plus a safety/growth factor of, say, 25%) is how big your log
>really needs to be, and if that's pushing 12gb, you are going to need to
>run Database Backups more frequently than you are now.
>
>3. Now consider the full backups. They take much longer. Therefore,
>you've got to start them much earlier. If there is one thing you want to
>avoid, it is a TRIGGERED FULL DB BACKUP. That's when my log fills up,
>always. While one could say this means I've got my trigger set too high,
>and they'd technically be right, I like to set it high or else I'll run
>a whole lot more incremental backups than I really need to. (Tivoli: It
>would be really nice to have TWO thresholds we could set in the backup
>trigger, separate for whether or not the next backup will be an
>incremental or a full backup. I'd set a much higher threshold when the
>next backup was to be incremental, compared to full.)
>
>4. In fact, avoid triggered backups in general. It is inevitable that
>when my system gets a good head of steam up doing nice productive work
>like migration and reclamation, that the trigger pops and a tape drive
>gets yanked. And since the system is busy, the backup runs slower. And
>since the system is busy, the log fills up faster. Pop! It's full. So,
>watch how your work flow goes over the course of the day, and try to
>make all backups be scheduled ones, with triggered backups only in an
>emergency. I take the occurrence of a triggered backup as a sign that I
>have not done an adequate job at scheduling things. You don't want to
>resort to Drastic Measures such as shutting down migration, reclamation,
>and expiration, due to your own poor planning.
>
>5. When running in ROLLFORWARD mode, database disk performance becomes a
>much more critical concern. I could go on for a long time about this,
>but basically you want disk arms. RAID-anything won't help database
>performance; you've got to have arms, due to the essentially random I/O
>pattern of the database. More, smaller, slower disk drives are better
>than fewer, larger, faster disk drives, because they allow a higher
>multiprogramming factor and thereby more total server throughput. I
>tried AIX JFS striping briefly, and it ran VERY slowly, so I gave it up.
>
>The bottleneck in TSM Database Backup is disk, not tape, performance.
>When the log is filling up and you are sitting there, on pins and
>needles, watching it fill up at the same time a DB backup is running,
>you want that database backup to run as fast as possible, so it can win
>the race. You can get full drawers of nice 9.1gb IBM 7133-020 SSA disks
>from used equipment dealers for a song just now; my TSM database loves
>'em.
>
>THERE IS A BUG in many ADSM and TSM server versions, dating back to ADSM
>V2 on VM but still present, wherein sometimes a database backup
>completes but the log does not empty out. (Amazing how they can port
>bugs from one platform to another!) The bypass is to bounce the TSM
>server. If you have automated monitoring things, you can watch for this
>situation and get yourself warned, before the log fills up.
>
>6. The log, OTOH, has a much more sequential access pattern, since it is
>a circular queue. All those good performance things your RAID vendors
>love to tell you over expensive lunches, will work on the log, as
>opposed to the database.
>
>7. You'll never get all this higher mathematics right the first time, or
>perhaps a database backup will fail, so plan for the eventuality of a
>log-full server crash. Predefine and preformat a log extent that is
>still small enough to fit under the 13gb limit, and write down clearly
>where it is and how to add it with an EXTEND, how to increase the number
>of incrementals between fulls by +1 for just this one time, how to start
>an incremental backup manually, in the middle of the night, when your
>mind is on other things, and perhaps you've had a drink or two. This
>WILL happen, when you run a big TSM server in rollforward mode.
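
The sizing rule in points 1 and 2 can be sketched as a small Python
check. This is only an illustration: the sample backup sizes are made
up, the function names are hypothetical, and using the largest recent
incremental backup as the baseline is just one possible reading of the
rule.

```python
LOG_LIMIT_GB = 13.0    # maximum log size in V4+, per the post
PLANNED_LOG_GB = 12.0  # recommended size, leaving room for an emergency extent


def required_log_gb(incremental_backup_gb, safety_factor=0.25):
    """Largest recent incremental DB backup plus a safety/growth margin."""
    if not incremental_backup_gb:
        raise ValueError("need at least one backup size")
    return max(incremental_backup_gb) * (1.0 + safety_factor)


def needs_more_frequent_backups(incremental_backup_gb, safety_factor=0.25):
    """True when the required log size would exceed the planned 12 GB log."""
    return required_log_gb(incremental_backup_gb, safety_factor) > PLANNED_LOG_GB


if __name__ == "__main__":
    recent = [6.1, 7.4, 8.0]  # hypothetical incremental backup sizes, in GB
    print(f"required log: {required_log_gb(recent):.1f} GB")
    print(f"emergency extent room: {LOG_LIMIT_GB - PLANNED_LOG_GB:.1f} GB")
    print("back up more often:", needs_more_frequent_backups(recent))
```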
>
>The wall clock time for a Full Database Backup should be considered an
>essential measure of system health. Keep track of it over time. The
>other essential measure is expiration process time; if it exceeds 24
>hours (or whatever interval you start it at), you're in a slow death
>spiral. When these barometers of TSM server health go critical, consider
>giving up ROLLFORWARD mode as an interim measure until you can really
>fix things, such as with a faster server computer, more disks, or a
>server split.
>
>NEVER run in NORMAL mode without total and complete database mirroring,
>or without a tool to check that all mirrors are sync'd from time to
>time. At least you can be protected against disk crashes.
>
>Myself, I'm in NORMAL mode at the present moment, awaiting a somewhat
>painstaking server split so I can go back into ROLLFORWARD mode.
>
>Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu
>( ) ASCII ribbon campaign
> X  against HTML e-mail
>/ \
>
>
>On Wed, 24 Jul 2002, Steve Harris wrote:
>
>>Joni,
>>
>>There's a stat of cumulative log consumption that can be seen with Q LOG
>>F=D, and can be reset with RESET LOGCONSUMPTION
>>
>>I'm not sure how this works in normal mode - I use rollforward - but it
>>might be worth a look.
>>Let us know how it turns out.
>>
>>
>>Steve Harris
>>AIX and TSM Admin
>>Queensland Health, Brisbane Australia
>>
>>>>> joni.moyer AT HIGHMARK DOT COM 23/07/2002 21:57:03 >>>
>>Hello everyone!
>>
>>I was just wondering if anyone has strong opinions about which mode to use
>>for the log: rollforward or normal?  And also, would anyone happen to know
>>how much space I would have to add if I changed to rollforward mode?  My
>>log is 2324 MB and the database is 41832 MB.  I was also wondering if it is
>>a common practice to have the log be 10% the size of the database?  Thanks
>>in advance for any insight!!!
>>
>>
>>Joni Moyer
>>Associate Systems Programmer
>>joni.moyer AT highmark DOT com
>>(717)975-8338
>>
>>
>>