Re: [Networker] problems from upgrade from 7.2.2 to 7.4.3

2008-10-27 11:45:20
Subject: Re: [Networker] problems from upgrade from 7.2.2 to 7.4.3
From: David Dulek <ddulek AT FASTENAL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 27 Oct 2008 10:42:01 -0500
Keep an eye on the issue yourself.  EMC support is horrible at actually
letting you know when bug fixes are ready.  In fact, the only time I have
actually gotten a fix from support is when I call my DSM and ask him to
rattle the cages.

On Mon, 2008-10-27 at 10:13 -0400, Joel Fisher wrote:
> Update.
> 
> It appears the "NOT in index" issue was some type of mm database
> corruption.  The tech had me do some deleting in the mm directory and
> then restart the server.  Since that time, I have not had any issues
> pertaining to mounting media.  This also resolved the adv_file unmounting
> issue.
> 
> Owner notification is broken as far as I can tell.  They have a bug
> ticket for it and have added me to the affected list.  We still have
> several clients that are hanging up groups, but I think that's just the
> normal post-upgrade mess to sort through.
> 
> So all in all, I needed a patched nsrd and a "rebuilt" mm database.
> 
> I'll update on the owner notification issue.
> 
> Thanks!
> 
> Joel
> 
> 
> 
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
> On Behalf Of Joel Fisher
> Sent: Thursday, October 23, 2008 4:59 PM
> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
> Subject: Re: [Networker] problems from upgrade from 7.2.2 to 7.4.3
> 
> I'm still trying to figure out the notification thing... my tech's
> suggestion doesn't appear to work, or I've done it wrong.
> 
>  
> 
> I'll update when I've gotten somewhere.
> 
>  
> 
> Thanks!
> 
> 
> 
> Joel
> 
>  
> 
> From: Mike Borkowski [mailto:mikeb AT uwaterloo DOT ca] 
> Sent: Thursday, October 23, 2008 12:04 PM
> To: EMC NetWorker discussion; Joel Fisher
> Subject: Re: [Networker] problems from upgrade from 7.2.2 to 7.4.3
> 
>  
> 
> What did you use as the event name?
> 
> ---
> mike
> 
> Joel Fisher wrote: 
> 
> Hey Paul,
>  
> I have a script that I run about 30 minutes before my next backup window
> starts that stops groups (via nsradmin) that are still running.  I can
> shoot you the script if you want it.
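
A minimal sketch of the kind of script described above; it is not Joel's actual script. Given the start times of groups that are still running, it picks the ones that have run longer than a limit and builds input text for nsradmin. The group names, the 23.5-hour limit, and the `stop now: True` attribute on the NSR group resource are all assumptions for illustration; verify them on your own server and release first.

```python
from datetime import datetime, timedelta


def overdue_groups(running, now, limit):
    """Return the sorted names of groups that started more than `limit` ago."""
    return sorted(name for name, started in running.items() if now - started > limit)


def nsradmin_input(group_names):
    """Build nsradmin input that selects each group and asks it to stop.

    ASSUMPTION: the NSR group resource accepts 'stop now: True'.
    Check this against your NetWorker release before using it.
    """
    lines = []
    for name in group_names:
        lines.append(f". type: NSR group; name: {name}")
        lines.append("update stop now: True")
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical data; in practice the start times would come from the server.
    now = datetime(2008, 10, 24, 21, 30)
    running = {
        "Nightly-Unix": datetime(2008, 10, 23, 22, 0),  # exactly at the limit
        "Nightly-Win": datetime(2008, 10, 23, 21, 0),   # one hour past the limit
    }
    limit = timedelta(hours=23, minutes=30)
    print(nsradmin_input(overdue_groups(running, now, limit)), end="")
```

The generated text could be saved to a file and passed to nsradmin with its `-i` option from cron or a scheduled task shortly before the next backup window.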
>  
> I'm speaking w/ the EMC tech right now, and he said that they've changed
> owner notification so that you have to create a notification and then
> put the name of that notification in the owner notification field.  No
> biggie as long as I can pass a variable to the notification from the
> owner notification field.  Otherwise you'll have to set up a separate
> notification for each unique set of owners.  Or write a script to grab
> the client name and match it to a set of emails in an external file.  The
> latter might not be too bad; I've considered doing this for a while.
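
The external-file idea could be sketched roughly as follows. The file format (one `client: addr1, addr2` entry per line, with `#` comments), the function names, and the fallback address are hypothetical choices made for this example; nothing here is a NetWorker interface.

```python
def load_owner_map(text):
    """Parse lines of the form 'client: addr1, addr2'; '#' starts a comment."""
    owners = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue  # skip blank lines, comments, and malformed entries
        client, addrs = line.split(":", 1)
        owners[client.strip().lower()] = [a.strip() for a in addrs.split(",") if a.strip()]
    return owners


def recipients_for(client, owners, default=()):
    """Look up a client's owners case-insensitively, with a fallback list."""
    return owners.get(client.strip().lower(), list(default))
```

A notification script would read the client name from whatever the server hands it, call `recipients_for(name, owners, default=["backup-admin@example.com"])`, and pass the result to the mailer already in use.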
>  
> Thanks for the response.
>  
> Joel
>  
> 
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
> On Behalf Of Goslin, Paul
> Sent: Thursday, October 23, 2008 10:26 AM
> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
> Subject: Re: [Networker] problems from upgrade from 7.2.2 to 7.4.3
>  
> We recently upgraded from 7.2.2 to 7.4 SP2 and have also experienced the
> issue you refer to in item #4: "savegroups are not finishing and the
> jobs that are just hanging".  When I check in the morning on our
> savegroups, some still show as running, with a few save sets waiting to
> run that are in the 'contacting client' state, but no save sessions are
> running... It will sit like this until I manually abort the group and
> then restart it. If I do nothing, it causes the next day's attempt to
> run the group to abort with 'savegroup still running'... which has
> caused us big problems on the weekends, as we have no operators with
> enough Networker smarts to monitor the groups on a daily basis and take
> corrective action if something does not complete...
>  
> Is there any way to make a savegroup stop after a specific time period?
> Say 23 hours & 30 minutes... If it has not completed, I want it to stop
> so the next day's attempt at it can at least be started...
>  
> I've also experienced some small problems with the 'Owner Notification'
> on a few clients. I have it set up to use BLAT to send e-mail to
> interested parties, but it only seems to work when the group completes
> normally on its own, never when the group has to be manually aborted...
> And it NEVER logs the sending of the e-mail in the daemon.log like it
> used to do ???  What's up with that ??? Why are these events not logged
> as they were before ???
>  
>   
> 
>       -----Original Message-----
>       From: EMC NetWorker discussion 
>       [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On Behalf Of Joel Fisher
>       Sent: Thursday, October 23, 2008 10:02 AM
>       To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
>       Subject: [Networker] problems from upgrade from 7.2.2 to 7.4.3
>        
>       Hey Guys,
>        
>        
>        
>       Last Thursday I upgraded from 7.2.2 to 7.4.3. It has been less 
>       than smooth so far.
>        
>        
>        
>       It initially seemed to go flawlessly, but Monday morning nsrd 
>       crashed and would not stay running.  EMC provided a hotfixed 
>       nsrd that seems to have resolved that problem, but I have 
>       some other less critical problems that I was wondering if you 
>       guys have seen.
>        
>        
>        
>       1)      Adv_file devices keep randomly unmounting.  I've seen in the
>       archives people having issues with RO devices, but in my case 
>       it is any device, RW or RO.  There isn't any message in the 
>       log about the dismount, just that it notifies me if it needs 
>       it mounted.
>        
>       2)      'Owner notification' either doesn't work, or the functionality
>       has changed.  My existing scripts don't work with it.  For 
>       troubleshooting, I've made a very simple script that 
>       basically takes stdin and writes it to a file.  That doesn't 
>       work either.
>        
>       3)      Media that is labeled and previously working will not mount.
>       I'll get a message about "volume xxxxxx(volid xxxxxxxxxxx) 
>       NOT in media index".  But then after a while it will mount, 
>       with no intervention on my part.  This is happening on tapes 
>       within a silo and on my adv_file type devices that keep 
>       unmounting.  May be related to the first problem.
>        
>       4)      Many, not all, savegroups are not finishing and the jobs that
>       are just hanging out are typically index saves.
>        
>       5)      Nsrjb shows empty slots... which is not normal for an ACSLS
>       silo.  It allows me to allocate the "volumes" in those slots, 
>       but the volumes are not actually in the silo.  In previous 
>       releases, a volume could not be allocated to a silo unless it 
>       was physically in the silo.
>       I'm assuming this is a bug, not a design change.
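
For anyone wanting to reproduce the troubleshooting step in item 2, a script that takes stdin and writes it to a file might look something like this. It is only a sketch: the default file name `owner_notification.log` and the function names are made up for the example, and it assumes the server pipes the notification text to the command's standard input.

```python
import sys


def capture(stream, path):
    """Append everything readable from `stream` to `path`; return the character count."""
    data = stream.read()
    with open(path, "a", encoding="utf-8") as out:
        out.write(data)
    return len(data)


def main(argv=None, stream=None):
    """Copy stdin to the file named by the first argument (or a default)."""
    argv = sys.argv if argv is None else argv
    stream = sys.stdin if stream is None else stream
    # Hypothetical default path; pass another as the first argument.
    target = argv[1] if len(argv) > 1 else "owner_notification.log"
    return capture(stream, target)
```

To use it as a standalone command, save it to a file and call `main()` at the bottom. If the target file stays empty after a group completes, the notification command is probably not being invoked at all, which is a different problem from being invoked with unexpected input.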
>        
>        
>        
>       Any assistance would be appreciated.
>        
>        
>        
>       FYI... I do have a case open with EMC to address them.
>        
>        
>        
>       Thanks!
>        
>        
>        
>       Joel
>        
>        
>       To sign off this list, send email to 
>       listserv AT listserv.temple DOT edu and type "signoff networker" in 
>       the body of the email. Please write to 
>       networker-request AT listserv.temple DOT edu if you have any 
>       problems with this list. You can access the archives at 
>       http://listserv.temple.edu/archives/networker.html or via RSS 
>       at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>        
>           
> 
>  
