Networker

Re: [Networker] Unload retries with NSR6.1.1-LGTpa33799

2002-11-06 15:35:41
Subject: Re: [Networker] Unload retries with NSR6.1.1-LGTpa33799
From: "Faidherbe, Thierry" <Thierry.Faidherbe AT HP DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 6 Nov 2002 21:35:34 +0100
Dears,

I asked Legato for information about LGTpa35119 and got the following
reply:

Here is the description of LGTpa35119:
'When NetWorker shows a drive as loaded (or unloaded) when the physical
drive is unloaded (or loaded) according to the silo software, all
further backups hang as NetWorker retries the operation to unload (or
load) a tape into that drive. This results in the failure of unattended
backups as queued nsrjbs cannot be serviced.'

The problem shows up on both jukeboxes and silos.

Legato's solution, after I pointed to the above LGTpa (thanks to Frank
Lammers, Legato Support):
If you think you need this fix and it might match the description you
gave yesterday, you have to update to 6.1.2 or later, as we don't have
any binary for Tru64 with LGTpa33799.Build.6 installed.
I can give you the fix on top of 6.1.1, but not on top of
6.1.1-LGTpa33799.
As you know, LGTpa33799 replaces most NetWorker binaries, and it would
not be good practice to put hotfixes on top of other hotfixes when a new
release that includes both fixes, like 6.1.2, is already available.

Many thanks to the CC'ed people: Robert, Dale and Scholz!

Thierry

Kind regards - Bien cordialement - Vriendelijke groeten,

Thierry FAIDHERBE

Storage & Server Integration Practice 
Tru64 Unix and Legato EBS Consultant
                                   
 *********       *********   HEWLETT - PACKARD
 *******    h      *******   1 Rue de l'aeronef/Luchtschipstraat
 ******    h        ******   1140 Bruxelles/Brussel/Brussels
 *****    hhhh  pppp *****   
 *****   h  h  p  p  *****   100/102 Blv de la Woluwe/Woluwedal
 *****  h  h  pppp   *****   1200 Bruxelles/Brussel/Brussels
 ******      p      ******   BELGIUM
 *******    p      *******                              
 *********       *********   Phone :    +32 (0)2  / 729.85.42   
                             Mobile :   +32 (0)498/  94.60.85 *** CHANGED ***
                             Fax :      +32 (0)2  / 729.88.30   
     I  N  V  E  N  T        Email :    thierry.faidherbe AT hp DOT com
                             Internet : http://www.hp.com/
________________________________________________________________________

MOBISTAR SA/NV 

SYSTEM Team Charleroi, Mermoz 2 Phone : +32 (0)2  / 745.75.81  
Avenue Jean Mermoz, 32          Fax :   +32 (0)2  / 745.89.56  
6041 GOSSELIES                  Email : tfhaidhe AT mail.mobistar DOT be
BELGIUM                         Web :   http://www.mobistar.be/
________________________________________________________________________

  


-----Original Message-----
From: Robert Maiello [mailto:robert.maiello AT MEDEC DOT COM] 
Sent: Wednesday, November 06, 2002 5:08 PM
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Subject: Re: [Networker] Unload retries with NSR6.1.1-LGTpa33799


Doh!

I have tapes fail media verification from time to time...one just last
night.   I attributed it to the media (new SDLT tapes) but now I'm not
so sure.

Robert Maiello


On Wed, 6 Nov 2002 09:14:34 -0500, Dale Mayes <dmayes AT KIMBALL DOT COM>
wrote:

>FYI,
>
>The problem hasn't totally disappeared even in 6.1.2.
>
>Just this past week we had 2 9840 tapes fail verification.
>I think it was strictly because the server (software not hardware) was
>too busy.
>
>Considering we're running a HP N-4000 with 2 550MHz CPU's and 4.5GB
>RAM... and all it does is provide the databases for Legato... I think
>it should've handled it.
>
>The NDMP jukebox lost track of a tape in a drive and was looping trying
>to load another tape.
>
>Haven't opened a case...yet.
>
>We're not quite as big as Thierry:
>
>NetWorker server is HP-UX 11 running 6.1.2
>10 storage nodes ranging from stand-alone drive, ATL-L200, STK-9730,
>STK-9714, STK-9710, STK-9310
>DLT4000, DLT7000, DLT8000, STK9840B
>205 clients
>Also NDMP based jukebox backing up 1.2TB EMC Celerra NAS
>
>Good luck...all they'll tell you is just install 6.1.3 (which is late
>by the way).
>
>Dale
>
>
>
>-----Original Message-----
>From: Robert Maiello [mailto:robert.maiello AT MEDEC DOT COM]
>Sent: Wednesday, November 06, 2002 8:51 AM
>To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
>Subject: Re: [Networker] AW: [Networker] Unload retries with
>NSR6.1.1-LGTpa33799
>
>
>Legato there, anyone?  Anyone ?
>
>Ok,  I'll jump in with what I know..
>
>LGTpa35119 and LGTpa33909 are listed as problems Networker has with
>unloading drives.
>
>LGTpa35119 is a subset of LGTpa33799.. which I see in the fixed bug
>list of Networker 6.1.2.
>
>I was told there is a hot fix for 35119 and 33909 for 6.1.1, but alas,
>it conflicted with my fix for tapes lost during cloning (LGTpa36541);
>the nsrmmd binary.   35119 and 33909 are fixed in 6.1.2 though.
>Hotfixes need to be requested from Legato..I don't think they post them
>for everyone.
>
>So, I'm running 6.1.2 with the hotfix for LGTpa36541 and have had good
>luck; unload errors and lost tapes (-*) seem to have disappeared.
>
>
>Robert Maiello
>Thomson
>Medical Economics
>
>
>
>
>On Wed, 6 Nov 2002 14:12:12 +0100, Scholz Wolfgang
><Wolfgang.Scholz AT FJA DOT COM> wrote:
>
>>hi thierry,
>>
>>we had been facing exactly the same problem. after several tries from
>>our support to play around with load sleep and unload sleep timeouts
>>of the jukebox they advised us yesterday to upgrade to 6.1.2 on the
>>backup server and our storage nodes. we had the problem on weekends
>>when we do mostly full backups on our servers. i'll keep u informed if
>>i survive the next weekend without errors.
>>
>>where have u heard about this LGTpa35119 fix ? i can't find it
>>anywhere.
>>
>>thx
>>
>>wolfgang
>>
>>-----Original Message-----
>>From: Faidherbe, Thierry [mailto:Thierry.Faidherbe AT HP DOT COM]
>>Sent: Tuesday, November 5, 2002 21:32
>>To: NETWORKER AT listmail.temple DOT edu
>>Subject: [Networker] Unload retries with NSR6.1.1-LGTpa33799
>>
>>
>>Dears,
>>
>>I opened a case with Legato support but, as usual, I am not
>>seeing any possible end to the case. So I am falling back on
>>this very powerful list, but this time as requestor ...
>>
>>I have been facing media verification failures and was advised
>>by Legato support folks to install jumbo kit NSR611-LGTpa33799 as THE
>>solution. OK, the number of media verification failures decreased
>>a little bit but, as usual when introducing new software, a new bug
>>has been introduced and follows the kit: when I uninstall the
>>kit, the bug is gone but my media verification errors are back again,
>>and vice versa. A marvellous deadlock.
>>
>>The problem I am seeing is that at random times, on different jukeboxes
>>on different OSes (Sun Solaris 7, 8 and HP Tru64 Unix 5), Networker
>>gets confused about the loaded tapes and the tapes it tries
>>to eject, resulting in a lot of unload retries like
>>
>>        10/20/02 18:12:17 nsrd: media info: failed unloading
>>             drive`rd=xxx:/dev/ntape/tape6_d1' to slot '314', error '10'
>>
>>Looking at the jukebox's inventory, I can see that the tape from slot
>>314 (to refer to the above sample) is loaded into drive
>>`rd=xxx:/dev/ntape/tape6_d1'
>>and `rd=xxx:/dev/ntape/tape7_d1' (in the same jukebox),
>>with volume name "-" BUT NO TAPE is physically loaded into either of
>>the 2 devices.
>>Networker, trying to unload the tape, reports "Read Open Error, I/O
>>Error".
>>
>>As nsrjb -HE cannot be run while the jukebox is active without
>>losing the inventory of tape drives that are busy writing (I got a fix
>>for that but the fix is again bugged), the workaround I found is to
>>edit the fields "Loaded Barcode", "Loaded Volume" and "Loaded slot" in
>>the detailed view of the Jukebox control panel after disabling the
>>device, and to re-enable it afterwards.
>>The final step is then to reinventory the slot.
>>
>>I am certainly not the only one who has met that behaviour. I have
>>been told about a fix number LGTpa35119 - "NetWorker cannot recover
>>from an out of sync Silo/Jukebox event." but cannot find any
>>description of it.
>>
>>Thanks for your help !
>>
>>Thierry
>>
>>The config in short: nsr611 LGTpa33799 on Solaris 8, with 8 storage
>>nodes running nsr611 LGTpa33799 on Tru64 Unix, 2* ATL P3000, 2* STK
>>L700, 1* Compaq TL894, 1* Compaq TL891. ATLs and Compaq: DLT7000; STK:
>>DLT7000 and 9840.
>>Around 250 clients being backed up, 28 TB a week.
>>
>>Kind regards - Bien cordialement - Vriendelijke groeten,
>>
>>Thierry FAIDHERBE
>>
>>Storage & Server Integration Practice
>>Tru64 Unix and Legato EBS Consultant
>>
>>--
>>Note: To sign off this list, send a "signoff" command via email
>>to listserv AT listmail.temple DOT edu or visit the list's Web site at
>>http://listmail.temple.edu/archives/networker.html where you can
>>also view and post messages to the list.
>>=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
