[Veritas-bu] HELP - media and I/O errors

2004-02-17 09:58:19
Subject: [Veritas-bu] HELP - media and I/O errors
From: gdrew AT deathstar DOT org (George Drew)
Date: Tue, 17 Feb 2004 09:58:19 -0500 (EST)
Changing the Media Unmount Delay will have absolutely no effect on this
*at all*. Media unmount delay is the amount of time nbu waits from the
end of a *user* backup to the time nbu issues the unload. In other
words, setting this is not going to affect how long nbu waits for a
rewind to complete (a silly idea anyway, as tape SCSI commands are
synchronous - nbu *must* wait for the unload to complete before it can
do anything else), because nbu doesn't even send the rewind until media
unmount delay expires.
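
To make the timing argument concrete, here is a rough sketch of the sequence
described above. It is only an illustrative model, not NetBackup code: the
function names, the separate rewind and unload steps, and the durations are
all assumptions made up for the example.

```python
# Illustrative model only; not NetBackup code. All names are hypothetical.

def unmount_timeline(backup_end, unmount_delay, rewind_time, unload_time):
    """Return (event, time) pairs, in seconds, for one tape after a backup.

    Assumes the sequence described above: nothing is sent to the drive until
    the media unmount delay expires, and each tape command blocks until done.
    """
    rewind_sent = backup_end + unmount_delay   # the delay only postpones this
    rewind_done = rewind_sent + rewind_time    # synchronous: caller waits
    unload_done = rewind_done + unload_time    # unload follows the rewind
    return [
        ("backup ends", backup_end),
        ("rewind sent", rewind_sent),
        ("rewind done", rewind_done),
        ("unload done", unload_done),
    ]


def rewind_window(timeline):
    """Time the drive is given to rewind; it never involves the delay."""
    times = dict(timeline)
    return times["rewind done"] - times["rewind sent"]


# Same made-up tape (90 s rewind, 20 s unload), two different delay settings.
short_delay = unmount_timeline(0, 180, 90, 20)   # 3-minute delay
long_delay = unmount_timeline(0, 900, 90, 20)    # 15-minute delay
```

Under these assumptions, raising the delay from 3 to 15 minutes moves every
event 12 minutes later, while the time the drive is given to rewind is the
same in both cases.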

There are some issues related to how nbu reacts to finding a tape in a
drive that it didn't put there, and you should ensure that all media
servers are patched to MP6 or FP6.

George

On Mon, 16 Feb 2004, Denis Petrov wrote:

> I had a similar issue on an L80. The STK tech suggested that the issue may be
> related to the NetBackup "Media unmount delay" setting, which at its default
> of 3 minutes is way too short. What is happening is that the tape is getting
> unmounted before it has completely rewound. I had some arguments with my
> co-workers about it (the speed of LTOs vs. DLTs, since these issues never
> came up when we used DLTs), but I was able to confirm the issues with my L80
> logs: some are exactly what the tech suggested and some are similar. In
> addition, the issues do not seem to show up until the LTO tapes hold a
> significant amount of data. Does it take longer to rewind? I say, if
> everything else fails, try changing Media unmount delay to something like
> 15 minutes and see if it makes any difference.
>
> --Denis
>
>   ----- Original Message -----
>   From: Dave Markham
>   To: Sokolowski Ric-ERS004 ; veritas-bu AT mailman.eng.auburn DOT edu
>   Sent: Wednesday, February 11, 2004 9:23 AM
>   Subject: Re: [Veritas-bu] HELP - media and I/O errors
>
>
>   Have the cables connecting the devices been replaced?
>
>   I had a similar problem recently with LTO drives and tried many things. My
>   environment was Solaris and there were some patches to apply (although that
>   doesn't help you, sorry), but the cables were mentioned, plus I found that
>   LTO media has a chip inside it which can be dislodged. If you shake the
>   tapes and they rattle loudly, then most likely they are damaged. This could
>   affect more than one tape if they came from the same batch.
>
>   Just some ideas
>   Dave
>     ----- Original Message -----
>     From: Sokolowski Ric-ERS004
>     To: 'veritas-bu AT mailman.eng.auburn DOT edu'
>     Sent: Wednesday, February 11, 2004 4:28 PM
>     Subject: [Veritas-bu] HELP - media and I/O errors
>
>
>     Our system:
>
>     NB 4.5 MP5
>     master - HP-UX 11.00
>     media - 4 HP-UX 11.00, 1 HP-UX 11.11
>     STK L700 (HP20/700) w/10 HP LTO 1 drives w/SSO
>     5 HP 2/1 FC/SCSI bridges
>     1 Brocade 2800
>
>     We're seeing tons of media-related errors (70% status 86 - media
>     position, 30% status 84 - media write) spread across all drives. Some
>     nights we see no errors, other nights we'll see 50-100 media-related
>     failures. We see the failures when reusing tapes and with brand new
>     tapes. All drives have been cleaned recently. We have had cases open
>     w/Veritas and HP for just over 4 weeks now. Veritas has examined over a
>     month's worth of log files and has determined that the problem is
>     hardware related. HP replaced 3 drives; we saw media failures on these 3
>     new drives the same day they were replaced. HP also replaced the robot
>     controller, the camera, and one of the Fibre bridges. We're not seeing
>     any communication errors on the FC switch. Everything has the latest
>     available firmware. Whenever we get the status 84/86, we see a lot of
>     things like "cannot read from media socket 10", "ioctl (MTREW) failed on
>     media id 402280, drive index 4, I/O error (bptm.c.7197)" and "write error
>     on media id 402280, drive index 4, writing header block, I/O error".
>     Normally, between 2 and 5 drives are downed every night - always with a
>     tape stuck in the drive. Occasionally the system will freeze dozens of
>     tapes because they're seen as "unmountable", which leads to a boatload of
>     status 96 (no media) failures. Our backup success rate has dropped from
>     over 98% to below 80% - management is freaking out. We're grasping at
>     straws here folks, any help would be GREATLY appreciated!
>
>     --
>     Regards,
>     Ric Sokolowski (Ric.Sokolowski AT motorola DOT com)
>     Staff Systems Engineer
>     Phone: (954) 723-6332
>     Pager: 9545530742 AT messaging.nextel DOT com
>     Motorola, Inc.  / CGISS / Enterprise Computing
>     8000 West Sunrise Blvd, MS 22-2F, Plantation, FL 33322
>
>
>
>
