ADSM-L

Backup of primary to copy storage pool on LTO fails

2002-12-20 07:00:56
Subject: Backup of primary to copy storage pool on LTO fails
From: "kurt.beyers AT pandora DOT be" <kurt.beyers AT PANDORA DOT BE>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 20 Dec 2002 12:59:48 +0100
Hi everybody,


I've got the following environment:

TSM server 5.1.1.6 on Windows 2000 SP2 (IBM Netfinity server xSeries 232)
IBM LTO library 3583, 2 LTO drives and 18 slots
The library is connected through two Adaptec 29160 Ultra160 SCSI controllers 
(driver name: Adaptec, version 6.1.530.201, date 5/14/2002). The first 
controller goes to one drive and the other controller to the robot arm and 
the second drive
The IBM LTO device drivers are from IBM Corporation, version 5.0.3.2

I get a failure when I back up the primary storage pool on LTO to the copy 
storage pool. The copy stops when large files (larger than 2 GB) need to be 
transferred from tape to tape. A write error appears in the activity log, the 
copy storage pool tape is set to read-only, and another scratch volume is 
allocated to the copy storage pool. The backup then continues for a while 
until the next large file is encountered. The errors in the activity log are:

12/20/2002 09:55:43   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=WRITE,
                       Error Number=121, CC=0, KEY=00, ASC=00, ASCQ=00,
                       SENSE=**NONE**, Description=An undetermined error has
                       occurred).  Refer to Appendix D in the 'Messages' manual
                       for recommended action.
12/20/2002 09:55:43   ANR1411W Access mode for volume 000020L1 now set to
                       "read-only" due to write error.
12/20/2002 10:04:31   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=LOCATE,
                       Error Number=1104, CC=0, KEY=08, ASC=14, ASCQ=03,
                       SENSE=70.00.08.00.00.00.00.1C.00.00.00.00.14.03.00.00.20-
                       .76.00.00.00.00.00.00.00.00.00.00.00.05.00.00.9A.8D.00.0-
                       0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.-
                       00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00-
                       .00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0-
                       0.00.00.00.00, Description=An undetermined error has
                       occurred).  Refer to Appendix D in the 'Messages' manual
                       for recommended action.
12/20/2002 10:04:31   ANR1411W Access mode for volume 000010L1 now set to
                      "read-only" due to write error.

12/20/2002 10:44:24   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=WRITE,
                       Error Number=121, CC=0, KEY=00, ASC=00, ASCQ=00,
                       SENSE=**NONE**, Description=An undetermined error has
                       occurred).  Refer to Appendix D in the 'Messages' manual
                       for recommended action.
12/20/2002 10:44:24   ANR1411W Access mode for volume LA0014L1 now set to
                       "read-only" due to write error.

12/20/2002 11:11:13   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=WRITE,
                       Error Number=121, CC=0, KEY=00, ASC=00, ASCQ=00,
                       SENSE=**NONE**, Description=An undetermined error has
                       occurred).  Refer to Appendix D in the 'Messages' manual
                       for recommended action.
12/20/2002 11:11:13   ANR1411W Access mode for volume LA0015L1 now set to
                       "read-only" due to write error.
12/20/2002 11:31:06   ANR2017I Administrator ADMIN issued command: QUERY ACTLOG
                       begindate=today-1 begintime=08:00 enddate=today
                       endtime=now search=error
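
For reference, the numeric fields in the ANR8302E entries above can be decoded with a short script. The following is only a rough sketch, not part of TSM: the function name `decode` and the lookup tables are made up for illustration and cover only the codes that appear in this log. It assumes that "Error Number" is the Windows system error code passed back by the driver; the sense key and ASC/ASCQ meanings are the standard SCSI ones.

```python
import re

# Lookup tables limited to the codes seen in the log above.
# Assumption: "Error Number" is the Windows system error code.
WIN32_ERRORS = {
    121: "ERROR_SEM_TIMEOUT (the semaphore timeout period has expired)",
    1104: "ERROR_NO_DATA_DETECTED (no more data is on the tape)",
}
SENSE_KEYS = {0x00: "NO SENSE", 0x08: "BLANK CHECK"}
ASC_ASCQ = {
    (0x00, 0x00): "no additional sense information",
    (0x14, 0x03): "end-of-data not found",
}

# Fields of one ANR8302E entry; \s* absorbs the line wrapping of the log.
PATTERN = re.compile(
    r"ANR8302E I/O error on drive (?P<drive>\S+) .*?"
    r"OP=(?P<op>\w+),\s*Error Number=(?P<errno>\d+),\s*CC=\d+,\s*"
    r"KEY=(?P<key>[0-9A-Fa-f]{2}),\s*ASC=(?P<asc>[0-9A-Fa-f]{2}),\s*"
    r"ASCQ=(?P<ascq>[0-9A-Fa-f]{2})",
    re.DOTALL,
)


def decode(log_text):
    """Return one dict per ANR8302E entry found in log_text."""
    results = []
    for m in PATTERN.finditer(log_text):
        errno = int(m.group("errno"))
        key = int(m.group("key"), 16)
        asc = int(m.group("asc"), 16)
        ascq = int(m.group("ascq"), 16)
        results.append({
            "drive": m.group("drive"),
            "op": m.group("op"),
            "error_number": errno,
            "win32": WIN32_ERRORS.get(errno, "unknown"),
            "sense_key": SENSE_KEYS.get(key, "unknown"),
            "asc_ascq": ASC_ASCQ.get((asc, ascq), "unknown"),
        })
    return results


# Two entries copied from the activity log above (truncated after ASCQ).
sample = """\
12/20/2002 09:55:43   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=WRITE,
                       Error Number=121, CC=0, KEY=00, ASC=00, ASCQ=00,
12/20/2002 10:04:31   ANR8302E I/O error on drive DRIVE1 (mt0.0.0.3) (OP=LOCATE,
                       Error Number=1104, CC=0, KEY=08, ASC=14, ASCQ=03,
"""

for entry in decode(sample):
    print(entry)
```

Read this way, the WRITE failures carry no SCSI sense data at all and only a Windows timeout code, which would be consistent with a timeout on the SCSI bus rather than an error reported by the drive itself.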

In the event viewer of the TSM server, I've got the following error:

Source: adpu160m
Type: Error
Category: None
Event ID: 9
Description: The device, \Device\Scsi\adpu160m2, did not respond within the 
timeout period.

So there is a timeout somewhere during the tape-to-tape copy of the large 
files.

Does anybody know how to solve this problem? How can the timeout be increased? 
The backup of the clients to the disk storage pool is fine, and the migration 
of the disk storage pool to the LTO pool runs without any problems as well. Is 
this a hardware problem or a TSM problem?

Any help/input would be greatly appreciated,

Kurt
