Re: [Networker] Unload retries with NSR6.1.1-LGTpa33799

2002-11-06 09:14:44
Subject: Re: [Networker] Unload retries with NSR6.1.1-LGTpa33799
From: Dale Mayes <dmayes AT KIMBALL DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 6 Nov 2002 09:14:34 -0500
FYI,

The problem hasn't totally disappeared even in 6.1.2.

Just this past week we had two 9840 tapes fail verification.
I think it was strictly because the server (software not hardware) was too
busy.

Considering we're running an HP N-4000 with two 550MHz CPUs and 4.5GB RAM,
and all it does is provide the databases for Legato, I think it should've
handled it.

The NDMP jukebox lost track of a tape in a drive and was looping trying to
load another tape.

Haven't opened a case...yet.

We're not quite as big as Thierry:

NetWorker server is HP-UX 11 running 6.1.2
10 storage nodes ranging from stand-alone drive, ATL-L200, STK-9730,
STK-9714, STK-9710, STK-9310
DLT4000, DLT7000, DLT8000, STK9840B
205 clients
Also NDMP based jukebox backing up 1.2TB EMC Celerra NAS

Good luck...all they'll tell you is just install 6.1.3 (which is late by the
way).

Dale



-----Original Message-----
From: Robert Maiello [mailto:robert.maiello AT MEDEC DOT COM]
Sent: Wednesday, November 06, 2002 8:51 AM
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Subject: Re: [Networker] AW: [Networker] Unload retries with
NSR6.1.1-LGTpa33799


Legato there, anyone?  Anyone ?

Ok,  I'll jump in with what I know..

LGTpa35119 and LGTpa33909 are listed as problems NetWorker has with
unloading drives.

LGTpa35119 is a subset of LGTpa33799.. which I see in the fixed bug list
of Networker 6.1.2.

I was told there is a hotfix for 35119 and 33909 for 6.1.1, but alas, it
conflicted with my fix for tapes lost during cloning (LGTpa36541); the
conflict is over the nsrmmd binary. 35119 and 33909 are fixed in 6.1.2
though. Hotfixes need to be requested from Legato; I don't think they post
them for everyone.

So, I'm running 6.1.2 with the hotfix for LGTpa36541 and have had good luck;
unload errors and lost tapes (-*) seem to have disappeared.


Robert Maiello
Thomson
Medical Economics




On Wed, 6 Nov 2002 14:12:12 +0100, Scholz Wolfgang
<Wolfgang.Scholz AT FJA DOT COM> wrote:

>hi thierry,
>
>we had been facing exactly the same problem. after several attempts by our
>support to adjust the load sleep and unload sleep timeouts of the
>jukebox, they advised us yesterday to upgrade to 6.1.2 on the backup server
>and our storage nodes. we had the problem on weekends, when we do mostly
>full backups on our servers. i'll keep you informed if i survive the next
>weekend without errors.
>
>where did you hear about this LGTpa35119 fix? i can't find it anywhere.
>
>thx
>
>wolfgang
>
>-----Original Message-----
>From: Faidherbe, Thierry [mailto:Thierry.Faidherbe AT HP DOT COM]
>Sent: Tuesday, November 5, 2002 21:32
>To: NETWORKER AT listmail.temple DOT edu
>Subject: [Networker] Unload retries with NSR6.1.1-LGTpa33799
>
>
>Dear all,
>
>I opened a case with Legato support but, as usual, I am not
>seeing any possible end to the case. So I am falling back on
>this very powerful list, but this time as the requester...
>
>I have been facing media verification failures and was advised
>by Legato support to install jumbo kit NSR611-LGTpa33799 as THE
>solution. OK, the number of media verification failures decreased
>a little, but as usual when introducing new software, a new bug has
>been introduced and follows the kit: after uninstalling the
>kit, the bug is gone but my media verification errors are back,
>and vice versa. A marvellous deadlock.
>
>The problem I am seeing is that at random times, on different jukeboxes
>on different OSes (Sun Solaris 7, 8 and HP Tru64 Unix 5), NetWorker
>gets confused about which tapes are loaded and which tapes it is trying
>to eject, resulting in a lot of unload retries such as:
>
>        10/20/02 18:12:17 nsrd: media info: failed unloading
>             drive`rd=xxx:/dev/ntape/tape6_d1' to slot '314', error '10'
>
>Looking at the jukebox's inventory, I can see that the tape from slot 314
>(to refer to the sample above) is shown as loaded into both drive
>`rd=xxx:/dev/ntape/tape6_d1' and `rd=xxx:/dev/ntape/tape7_d1' (in the same
>jukebox), with volume name "-", BUT NO TAPE is physically loaded into
>either of the 2 devices. NetWorker, trying to unload the tape, reports
>"Read Open Error, I/O Error".
>
>As nsrjb -HE cannot be run while the jukebox is active without losing
>the inventory of the tape drives that are busy writing (I got a fix for
>that, but the fix is itself buggy), the workaround I found is to disable
>the device, edit the fields "Loaded Barcode", "Loaded Volume" and
>"Loaded slot" in the detailed view of the Jukebox control panel, and then
>re-enable the device. The final step is to reinventory the slot.
>
>I am certainly not the only one who has seen this behaviour. I have been
>told about a fix number LGTpa35119 - "NetWorker cannot recover from an out
>of sync Silo/Jukebox event." - but cannot find any description of it.
>
>Thanks for your help !
>
>Thierry
>
>The config in short: nsr611 LGTpa33799 on Solaris 8, with 8 storage
>nodes running nsr611 LGTpa33799 on Tru64 Unix, 2* ATL P3000, 2* STK L700,
>1* Compaq TL894, 1* Compaq TL891. ATLs and Compaqs: DLT7000; STK: DLT7000
>and 9840.
>Around 250 clients being backed up, 28 TB a week.
>
>Kind regards - Bien cordialement - Vriendelijke groeten,
>
>Thierry FAIDHERBE
>
>Storage & Server Integration Practice
>Tru64 Unix and Legato EBS Consultant
>
>--
>Note: To sign off this list, send a "signoff" command via email
>to listserv AT listmail.temple DOT edu or visit the list's Web site at
>http://listmail.temple.edu/archives/networker.html where you can
>also view and post messages to the list.
>=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
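
For anyone wanting to see how often a given drive is hitting these unload
failures, here is a minimal Python sketch that tallies the "failed unloading"
messages per drive, slot and error code. The pattern is modeled only on the
single sample line Thierry quoted, so the exact spacing and quoting in a real
daemon log are assumptions, and the name count_unload_failures is made up for
illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one sample message quoted in this thread; the real
# daemon log may differ in spacing or quoting (assumption).
UNLOAD_FAIL = re.compile(
    r"failed unloading\s+drive\s*[`'](?P<drive>[^']+)'"
    r"\s+to slot\s+'(?P<slot>\d+)',\s+error\s+'(?P<error>\d+)'"
)


def count_unload_failures(log_text):
    """Return a Counter keyed by (drive, slot, error) for each failed unload."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return Counter(
        (m.group("drive"), m.group("slot"), m.group("error"))
        for m in UNLOAD_FAIL.finditer(flat)
    )


# The sample message from the thread, wrapped as it appeared in the mail.
sample = """10/20/02 18:12:17 nsrd: media info: failed unloading
     drive`rd=xxx:/dev/ntape/tape6_d1' to slot '314', error '10'"""

for (drive, slot, error), n in count_unload_failures(sample).items():
    print(f"{drive} slot {slot} error {error}: {n} failure(s)")
```

Feeding it the full text of the log instead of the sample gives a quick view
of which drives and slots keep retrying, which may be useful to attach when
opening a support case.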

