ADSM-L

Re: test for DRM

2002-09-03 21:27:45
Subject: Re: test for DRM
From: "Chetan H. Ravnikar" <Chetan.Ravnikar AT SYNOPSYS DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 3 Sep 2002 15:52:17 -0700
Paul, I appreciate your time,

thanks bowing

On Sat, 31 Aug 2002, Seay, Paul wrote:
> What SUN probably did not tell you was they were probably 32K or 64K blocks
> to get that data rate.  TSM Database uses 4K and most other databases are 4K
> or 8K.  Are you using raw volumes for the disk pools?  If so, talk to SUN to
> find out what the optimal block sizes are.  If you are using a file system,
> same thing.
T3s cannot be used as a JBOD. At a minimum, a volume has to be
created from a few disks, and they support RAID 0, 1 and 5.

What we have on some configurations is simply RAID 1, so that during
writes data is just striped across.

Any idea which storage types are predominantly used out there (RAID or
just a JBOD)?

>
> Be careful how you talk to Tivoli about RAID-5 versus JBOD.  They think of
> RAID-5 as being a bunch of disks you have raided, not a hardware raid
> solution with a high end controller cache.
Good point! I will keep this in mind. I have read that TSM writes data
in variable block sizes of 4K to 64K. With T3 storage this has to be
set to a static value, and I have it set to 64K. Did we go and buy
the wrong storage for TSM storage pools?


> 20MB to 30MB/sec sounds about right for a T3.  The T3 is a midrange device.
> It may not be a good choice for a high write activity workload versus a
> JBOD/Raid-1.  I do not know how smart the T3 is.  The 280 is a pretty good
> little box based on my review of it, so I do not know that is the problem.

The SUN 280 had an issue when we had the recovery logs on the root
(system) disk. The disk got busy because there was contention and a lot of
I/O getting queued, thereby increasing the queue depth! Later we moved the
recovery logs on to an external D2-array on the on-board SCSI-3. Looks
better now.


> You are preaching to the choir on a protected disk pool.  If you lose it you
> have lost the backups for that night which may be unacceptable.
yes :)

Driving large critical backups direct to tape is what we have chosen,
and the remainder goes to disk. But tapes are slower than disks, so we are
still on a quest for faster, safer (insured) backups!


> One thing to consider is large files you may want to send directly to tape.
> You can do this by setting a maximum file size in the primary disk pool.
> TSM will mount up a tape and any time a file from a client exceeds the
> limit, it writes it to tape instead of disk.  But, unless you have very
> reliable tape and create your backup storage pool copies in time to
> recapture the backup if the tape is bad, I do not know if this is an option
> for you.
Where can I read about this maximum file size parameter and its
settings/config? I will still look into this.

The reason: my large-file (Oracle & SAP) backups run off the same
client, and hence we have the same node registered as 2 different nodes
that talk to 2 different domains via 2 different *opt* files.
Managing this within our internal customer base and having them not touch
and change things has been an increasing pain. I am looking for a simple,
straightforward configuration, hence the search.
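A minimal sketch of the size-based routing Paul describes, assuming a disk pool with a maximum file size that overflows to a tape pool. The pool names and the 2048 MB limit are made up for illustration; on a real server this is presumably controlled by the MAXSIZE and NEXTSTGPOOL storage pool parameters, which should be checked in the Administrator's Reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str
    max_size_mb: Optional[float]        # None means no size limit
    next_pool: Optional["Pool"] = None  # where oversized files go next


def choose_pool(first: Pool, file_size_mb: float) -> Optional[Pool]:
    """Walk the pool chain and return the first pool that accepts the file."""
    pool = first
    while pool is not None:
        if pool.max_size_mb is None or file_size_mb <= pool.max_size_mb:
            return pool
        pool = pool.next_pool
    return None  # no pool in the chain accepts a file this large


# Hypothetical hierarchy: disk pool capped at 2048 MB, tape pool unlimited.
tape = Pool("TAPEPOOL", max_size_mb=None)
disk = Pool("DISKPOOL", max_size_mb=2048, next_pool=tape)

if __name__ == "__main__":
    for size in (100, 2048, 5000):
        print(size, "MB ->", choose_pool(disk, size).name)
```

With one hierarchy like this, small files and large Oracle/SAP files from a single node would be split by size alone, which is the appeal over registering the same client twice.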


> Backup Storage Pool commands start where they left off.
Thanks. What is the guarantee here? Is there no way TSM can miss files,
leading to integrity errors!?

We have actually cancelled
*copy stg on-site off-site maxpr=2* because it ran for days and
the next day's backups needed tape-drive resources. But we have also had
processes complete successfully.


> You say your environment is huge.  How many tapes do you have.  How much are
> you trying to backup to these servers.  They may not be the right fit.
We back up between 100 and 150 GB every day, and during weekends we back up
approx. 200 GB. All data is changed data (Oracle). This particular server
has an STK L700 lib with 320 slots and 5 DLT7000 SCSI drives (daisy
chained) installed on 2 33 MHz differential SCSI cards on the SUN 280r. Fast
wide differential SCSI does 40MB/sec and the drives do 5MB/sec.
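A back-of-envelope check of what these figures imply for tape time. This is only a sketch: it assumes 1 GB = 1000 MB, that every drive streams at its rated 5MB/sec in parallel, and that there is no compression, mount or positioning overhead. The 2-drive case is a hypothetical comparison, not a measurement.

```python
def hours_to_write(data_gb: float, drives: int, mb_per_sec_per_drive: float) -> float:
    """Hours needed to write data_gb across `drives` drives running in parallel."""
    total_mb = data_gb * 1000                         # assumption: 1 GB = 1000 MB
    aggregate_mb_per_sec = drives * mb_per_sec_per_drive
    return total_mb / aggregate_mb_per_sec / 3600


if __name__ == "__main__":
    for label, gb in (("weekday low", 100), ("weekday high", 150), ("weekend", 200)):
        print(f"{label}: {hours_to_write(gb, 5, 5):.1f} h on 5 drives")
    # Hypothetical: only 2 drives writing at once.
    print(f"weekend: {hours_to_write(200, 2, 5):.1f} h on 2 drives")
```

Even with all 5 drives streaming, the aggregate is 25MB/sec, which is below the 40MB/sec of a single SCSI card, so under these assumptions the drives rather than the bus set the ceiling.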



> As far as the data integrity issue.  Did you not back something up?  Where
Here is what we did, and *probably should not have done*, under pressure to
check the integrity of off-site tapes. We picked a node (BTW, we have
collocation on) and, with a select command (below), got a list of all
its on-site tapes and marked them destroyed. With the same select command we
got a list of all the off-site tapes (all this on the production server),
had them checked in as private and read-only, and initiated a restore.

The process just came back with the error below. Tivoli level 2, with
all traces open, could not figure out how this could have happened. We did
audit checks on the volumes. The audit checks were successful :(

select statement
==
select volume_name from volumeusage where node_name='TEKTON-SAP' and
stgpool_name='OS_TAPEPOOL_SERVER'
==
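A small sketch of how the volume list from that select could be turned into access-change commands, together with the commands to undo them. It only builds strings and runs nothing. It assumes the server accepts "update volume <name> access=<mode>", which should be verified against the Administrator's Reference for your level. Volume 220417 is taken from the error output in this message; 220418 is a made-up example.

```python
# Access modes this sketch accepts (assumed to match the server's modes).
VALID_ACCESS = {"readwrite", "readonly", "unavailable", "destroyed", "offsite"}


def access_commands(volumes, access):
    """Return one 'update volume' command string per volume name."""
    if access not in VALID_ACCESS:
        raise ValueError(f"unknown access mode: {access}")
    return [f"update volume {name} access={access}" for name in volumes]


# 220417 comes from the error log; 220418 is hypothetical.
onsite = ["220417", "220418"]

if __name__ == "__main__":
    print("# mark on-site volumes destroyed for the test")
    print("\n".join(access_commands(onsite, "destroyed")))
    print("# revert after the test")
    print("\n".join(access_commands(onsite, "readwrite")))
```

Generating the revert commands at the same time as the test commands keeps the two lists in step, so no volume is left marked destroyed on the production server afterwards.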

> you missing some data?  What do you mean?  This is one place TSM really
> shines over the other backup products.

error below
==
08/07/02   14:34:13      ANR1424W Read access denied for volume 220417 -
volume access mode="destroyed".
08/07/02   14:34:13      ANR0548W Retrieve or restore failed for session
31017 for node TEKTON-SAP (HPUX) processing file space
                          /oracle/WD1/sapdata15 63 for file /odsd_17/
odsd.data17 stored as Backup - data integrity error detected.
08/07/02   14:34:16      ANE4035W (Session: 31017, Node: TEKTON-SAP)
 Error processing '/restore/oracle/WD1/sapdata15/odsd_17/odsd.data17':
file currently unavailable on server.

==

>
> -----Original Message-----
> From: Chetan H. Ravnikar [mailto:Chetan.Ravnikar AT SYNOPSYS DOT COM]
> Sent: Saturday, August 31, 2002 12:09 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: test for DRM
>
>
> Hi there and thanks in advance for all your tips and recommendations
>
>
> We have a huge, new, distributed TSM setup, with servers spread across the
> campuses. We recently moved from 3 ADSM 3.1 servers to 9 TSM 4.2.2 servers,
> all direct attached SUN 280r (sol-2.8), SUN T3 and Spectralogic 64K libs
>
> I have a few questions
>
> 1. We have TSM working on Solaris 2.8 with SUN T3 storage for mirrored DB
>    and storage pools. Our performance is nowhere close to SUN's
>    recommended T3 sustained write rate, which is 80MB/sec. Recovery logs
>    are on external D130 disk-packs
>
>    Has anyone seen a setup with SUN, and is this normal? My writes to
> diskpools are at 20 to 30 MB/sec and that is slow. I have a raid5 setup for
> the storage pools.
>    Tivoli suggests JBOD for storage pools rather than raid5!? But how do I
> protect myself from a disk failure on a critical quarter financial backup,
> since the source gets overwritten as soon as they throw the data on to my
> storage pools T3 (primary)?
>
> 2. One such setup has a StorageTek L700 lib and my customer wanted me to
> prove that the tapes from offsite do work.
>
> Tivoli suggests that I do not test DRM on a production system. But I had no
> choice but to at least test for bad media on the primary tapepool, if any, so
> I went ahead and picked *a* node
>
> with a select statement, marked all the tapes destroyed on the primary tape
> pool(for that node), and started a restore of a filesystem. Prior I had a
> bunch of tapes recalled from the off-site pertinent to the same node. Had
> them checked in as private and waited, to see if TSM picks those tapes since
> the onsite were marked destroyed. This process has been rather lengthy and
> tedious and unsuccessful
>
> Has anyone done a simpler test for bad media, to prove that the
> off-site tapes do work? Needless to say, the test I performed came back with
> data integrity errors and my customers are not happy, and even with all
> traces set up, Tivoli was unclear how that happened
>
> (Tivoli claimed, there could be a flaw in my DRM process)
>
> 3. The last question: during a copy storage pools process, if I *cancel* the
> process (since it took days), the next time I start it (manually or via a
> script) does it pick up from where it stopped?
>
>
> thanks for all your responses, forgive me, My knowledge is pretty limited
> and I started learning Tivoli while I started this project
>
> Cheers..
> Chetan
>
