Constant intermittent failures (rc=18) of Domino/Solaris TDP backups

2006-01-31 04:31:04
Subject: Constant intermittent failures (rc=18) of Domino/Solaris TDP backups
From: Zoltan Forray/AC/VCU <zforray AT VCU DOT EDU>
Date: Tue, 31 Jan 2006 09:32:10 -0500
TSM Client:   Solaris
TSM Server: AIX
Domino TDP:

This used to be an "occasional" problem but is now starting to spread
across most of my 32 Solaris-based Domino servers.  There is at least
one failure per day!  I saw an entry in the ADSM-L archives from 2003 with
a similar symptom, but no real solution.

What I am seeing in the Domino log is usually:

01/30/06   20:39:01 Request                       : INCREMENTAL
01/30/06   20:39:01 Database Input List           : *
01/30/06   20:39:01 Number of Buffers             : 8
01/30/06   20:39:01 Buffer Size                   : 8192
01/30/06   20:39:01 Wait for Tape Mounts?         : Yes
01/30/06   20:39:01 Process Subdirectories?       : Yes
01/30/06   20:39:01 TSM Options File              :
01/30/06   20:39:01 TSM Nodename Override         :
01/30/06   20:39:01
01/30/06   21:55:35 Backup of mail/usermail2/jwdaniel.nsf failed.
01/30/06   21:55:35 ACD0200E File (<NULL>) could not be opened for
01/30/06   21:55:51 Backup of mail/usermail2/kcsterner.nsf failed.
01/30/06   21:55:51 ACD0200E File (<NULL>) could not be opened for
01/30/06   21:56:54 Total Domino databases inspected:         3,718
01/30/06   21:56:54 Total Domino databases backed up:         236
01/30/06   21:56:54 Total Domino databases excluded:          0
01/30/06   21:56:54 Total Domino backup objects expired:      0
01/30/06   21:56:54 Throughput rate:                          3,492.96
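
To track how often this happens on each server, the failed databases can be tallied straight from the log. The following is only a minimal sketch in Python: it assumes the log lines keep the "date, time, Backup of <database> failed." shape shown in the excerpt above, and the names `FAILED_RE`, `failed_backups`, and `failures_per_day` are hypothetical, not part of TSM or the TDP.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this post:
#   MM/DD/YY   HH:MM:SS Backup of <database path> failed.
FAILED_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+Backup of (.+?) failed\.\s*$"
)


def failed_backups(lines):
    """Return a (date, time, database) tuple for each failed-backup line."""
    hits = []
    for line in lines:
        match = FAILED_RE.match(line)
        if match:
            hits.append(match.groups())
    return hits


def failures_per_day(lines):
    """Count failed databases per log date."""
    return Counter(date for date, _time, _db in failed_backups(lines))


# Sample lines copied from the excerpt above.
SAMPLE = [
    "01/30/06   20:39:01 Request                       : INCREMENTAL",
    "01/30/06   21:55:35 Backup of mail/usermail2/jwdaniel.nsf failed.",
    "01/30/06   21:55:35 ACD0200E File (<NULL>) could not be opened for",
    "01/30/06   21:55:51 Backup of mail/usermail2/kcsterner.nsf failed.",
    "01/30/06   21:56:54 Total Domino databases backed up:         236",
]

if __name__ == "__main__":
    for date, time, database in failed_backups(SAMPLE):
        print(date, time, database)
    print(dict(failures_per_day(SAMPLE)))
```

Running this over the logs from all of the servers would show whether the same databases fail repeatedly or whether the failures move around.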

Another oddity I am seeing is TDP backups going straight to the "Next
Storage Pool", which is LTO2.  This causes a problem since there aren't
enough drives/tapes during the evening backups to handle this (I have come
in in the morning to see 5 TDP sessions waiting on tape mounts).

This doesn't make sense since there aren't any management classes that go
straight to tape and the storagepool devclass is set to NOLIMIT.   The LZ
is now up to 2TB, more than enough to handle the load of a nightly backup.

I see the patches for the TDP to, which I plan to implement,
however nothing in the README to the Solaris version, addresses anything
like this.  What I thought was curious is the README in the Windows TDP
includes fixes for which to address other problems not covered in
the Solaris version. Is this an oversight, or do none of these problems
occur in the Solaris version of the TDP?

Yes, I have seen the TID (thanks, Richard) about changing the Domino tape
wait/timeout to 60 minutes versus the default of 15 minutes (which I think
would be great, since the operators can't respond within a 15-minute
window if they need to unload tapes from the 3583 to load more scratch
tapes), but the Domino admin says this requires bouncing the Domino server.

Any thoughts/suggestions/ideas on how to deal with this?
