Networker

[Networker] AW: [Networker] Unload retries with NSR6.1.1-LGTpa33799

2002-11-06 08:12:24
Subject: [Networker] AW: [Networker] Unload retries with NSR6.1.1-LGTpa33799
From: Scholz Wolfgang <Wolfgang.Scholz AT FJA DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 6 Nov 2002 14:12:12 +0100
Hi Thierry,

we have been facing exactly the same problem. After several attempts by our
support to tune the load sleep and unload sleep timeouts of the
jukebox, they advised us yesterday to upgrade to 6.1.2 on the backup server
and our storage nodes. We had the problem on weekends, when we mostly run full
backups of our servers. I'll keep you informed if I survive the next weekend
without errors.

Where did you hear about this LGTpa35119 fix? I can't find it anywhere.

thx

wolfgang

-----Original Message-----
From: Faidherbe, Thierry [mailto:Thierry.Faidherbe AT HP DOT COM]
Sent: Tuesday, 5 November 2002 21:32
To: NETWORKER AT listmail.temple DOT edu
Subject: [Networker] Unload retries with NSR6.1.1-LGTpa33799


Dear all,

I opened a case with Legato support but, as usual, I do not
see any end to the case in sight. So I am falling back on
this very powerful list, but this time as a requester ...

I have been facing media verification failures and was advised
by Legato support to install jumbo kit NSR611-LGTpa33799 as THE
solution. OK, the number of media verification failures decreased
a little, but as usual when introducing new software, a new bug has
been introduced that follows the kit: after uninstalling the
kit the bug is gone but my media verification errors are back,
and vice versa. A marvellous deadlock.

The problem I am seeing is that at random times, on different jukeboxes
attached to different OSes (Sun Solaris 7, 8 and HP Tru64 Unix 5), Networker
gets confused about which tapes are loaded and which tapes it is trying
to eject, resulting in a lot of unload retries like

        10/20/02 18:12:17 nsrd: media info: failed unloading
             drive`rd=xxx:/dev/ntape/tape6_d1' to slot '314', error '10'
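
To see which drives and slots are affected most often, messages like the one above can be counted from a saved copy of the daemon log. The following is only a minimal sketch: the regular expression is modeled on the single sample line quoted here, so other releases or message variants may need a different pattern, and the function name `count_unload_failures` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one sample message quoted above (assumption:
# other NetWorker releases may word or quote the message differently).
UNLOAD_FAILURE = re.compile(
    r"failed unloading\s+drive\s*[`'](?P<drive>[^']+)'"
    r"\s+to slot\s+'(?P<slot>\d+)',\s+error\s+'(?P<error>\d+)'"
)

def count_unload_failures(log_text):
    """Count 'failed unloading' messages per (drive, slot) pair."""
    # The message is wrapped over two lines in the sample, so collapse
    # all runs of whitespace before matching.
    flattened = " ".join(log_text.split())
    counts = Counter()
    for match in UNLOAD_FAILURE.finditer(flattened):
        counts[(match.group("drive"), int(match.group("slot")))] += 1
    return counts

sample = """\
        10/20/02 18:12:17 nsrd: media info: failed unloading
             drive`rd=xxx:/dev/ntape/tape6_d1' to slot '314', error '10'
"""

for (drive, slot), n in count_unload_failures(sample).items():
    print(f"{drive} slot {slot}: {n} failed unload(s)")
```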

Looking at the jukebox's inventory, I can see that the tape from slot 314 (to
refer to the sample above) is shown as loaded in both drive
`rd=xxx:/dev/ntape/tape6_d1' and drive `rd=xxx:/dev/ntape/tape7_d1' (in the
same jukebox), with volume name "-", BUT NO TAPE is physically loaded in
either of the 2 devices. Networker, trying to unload the tape, reports
"Read Open Error, I/O Error".
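
The inconsistency described above (one slot recorded as loaded in two drives at once) can be checked for mechanically once the "loaded slot" value of each drive has been collected. This is only an illustrative sketch: the `inventory` dictionary is hand-written sample data, not output from any NetWorker command, the third drive is hypothetical, and `find_conflicting_slots` is a made-up helper name.

```python
from collections import defaultdict

def find_conflicting_slots(loaded):
    """Return {slot: [drives]} for each slot recorded as loaded in more
    than one drive.

    `loaded` maps a drive name to the slot number the software believes
    is loaded there, or None if the drive is recorded as empty.
    """
    drives_by_slot = defaultdict(list)
    for drive, slot in loaded.items():
        if slot is not None:
            drives_by_slot[slot].append(drive)
    # A physical tape can only be in one drive, so any slot claimed by
    # two or more drives indicates an out-of-sync inventory.
    return {
        slot: sorted(drives)
        for slot, drives in drives_by_slot.items()
        if len(drives) > 1
    }

# Sample data mirroring the situation in the message (assumption).
inventory = {
    "rd=xxx:/dev/ntape/tape6_d1": 314,
    "rd=xxx:/dev/ntape/tape7_d1": 314,
    "rd=xxx:/dev/ntape/tape8_d1": None,  # hypothetical empty drive
}

for slot, drives in find_conflicting_slots(inventory).items():
    print(f"slot {slot} is recorded as loaded in: {', '.join(drives)}")
```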

As nsrjb -HE cannot be run while the jukebox is active without losing
the inventory of the tape drives that are busy writing (I got a fix for
that, but the fix is itself buggy), the workaround I found is to edit
the fields "Loaded Barcode", "Loaded Volume" and "Loaded slot" in the
detailed view of the Jukebox control panel after disabling the device, and
to re-enable it afterwards. The final step is then to reinventory the slot.

I am certainly not the only one who has met this behaviour. I have been told
about a fix number LGTpa35119 - "NetWorker cannot recover from an out of sync
Silo/Jukebox event." but cannot find any description of it.

Thanks for your help !

Thierry

The config in short: nsr611 LGTpa33799 on Solaris 8, with 8 storage
nodes running nsr611 LGTpa33799 on Tru64 Unix, 2* ATL P3000, 2* STK L700,
1* Compaq TL894, 1* Compaq TL891. ATLs and Compaqs: DLT7000; STKs: DLT7000
and 9840.
Around 250 clients being backed up, 28 TB a week.

Kind regards - Bien cordialement - Vriendelijke groeten,

Thierry FAIDHERBE

Storage & Server Integration Practice
Tru64 Unix and Legato EBS Consultant
