[Networker] Strange re-running / re-starting backups (7.5.1)

2011-11-17 15:28:30
Subject: [Networker] Strange re-running / re-starting backups (7.5.1)
From: Len Philpot <Len.Philpot AT CLECO DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 17 Nov 2011 14:27:46 -0600
We have a 7.5.1 (Solaris 9) server from which we've migrated all but two of 
its 100+ clients. We're replacing this server with an Avamar 
environment (hence the old Networker version) and these are the only two 
clients left. One of them is a Quantum SNAPserver on which we managed to 
install a Linux client several years ago. It's normally solid, but over 
the past few nights it's been repeating the backup of a certain 
filesystem. It will back up apparently OK, get to what should be about the 
end and then start over again. There will be an entry in daemon.log 
something like this :

    64690 11/17/11 11:40:52  savegrp savegrp:2am_group * <client>:/shares/export  See the file /nsr/tmp/sg/2am_group/sso.<client>.3Qaq0y for output of save command.
    7341 11/17/11 11:40:52  savegrp <client>:/shares/export failed.
    7339 11/17/11 11:40:52  savegrp <client>:/shares/export will retry 5 more time(s)

If I look at sso.<client>.3Qaq0y, it says :

    <client>: /shares/export               level=full,    410 GB 09:40:51  1700 files
    completed savetime=1321516801
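
As a quick sanity check on that savetime value, here is a minimal Python sketch that converts it to a wall-clock time. It assumes the savetime is seconds since the Unix epoch and that the server runs at UTC-6 (taken from the -0600 in the mail header); the function name is just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Value copied from "completed savetime=1321516801" in the sso file.
SAVETIME = 1321516801

# Assumption: the server's local time is UTC-6, per the -0600 mail header.
LOCAL = timezone(timedelta(hours=-6))


def savetime_to_local(savetime, tz=LOCAL):
    """Interpret a savetime as Unix epoch seconds and return an aware datetime."""
    return datetime.fromtimestamp(savetime, tz)


print(savetime_to_local(SAVETIME).isoformat())  # 2011-11-17T02:00:01-06:00
```

Under those assumptions the savetime is 02:00:01 local, i.e. the 2am_group start, and 02:00:01 plus the reported 09:40:51 elapsed lands on the 11:40:52 failure/retry timestamp in daemon.log.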

And if I run mminfo against this client/saveset, it also appears that it 
was successful :

    ssid       client    name            ss created         ss completed       lvl   total   ssflags  volume  fl  group
    3905208188 <client>  /shares/export  11/17/11 02:02:04  11/17/11 11:40:51  full  410 GB  vF       001151  mb  2am_group
    3905208188 <client>  /shares/export  11/17/11 02:02:04  11/17/11 11:40:51  full  410 GB  vF       001264  hb  2am_group
    3905208188 <client>  /shares/export  11/17/11 02:02:04  11/17/11 11:40:51  full  410 GB  vF       001265  mb  2am_group
    3905208188 <client>  /shares/export  11/17/11 02:02:04  11/17/11 11:40:51  full  410 GB  vF       001299  tb  2am_group
    3485812516 <client>  /shares/export  11/17/11 11:40:52  undefined          full    0 KB  vrEiF    001299  ca  2am_group

The last line is the re-started backup that I killed off. But all the 
others show vF (valid, finished) which is, AFAIK, Networkerese for "good 
to go", right?
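
For anyone decoding those flag strings, a small Python sketch follows. The letter meanings are as I recall them from the mminfo man page (ssflags attribute), so treat the table as an assumption and check it against the man page for your release; the helper name is made up for illustration.

```python
# ssflags letters as recalled from the mminfo man page.
# Assumption: verify against the man page for your NetWorker release.
SSFLAGS = {
    "C": "continued",
    "v": "valid",
    "r": "purged (recoverable)",
    "E": "eligible for recycling",
    "N": "NDMP generated",
    "i": "incomplete",
    "R": "raw",
    "P": "snapshot",
    "K": "cover",
    "I": "in progress",
    "F": "finished (ended)",
}


def decode_ssflags(flags):
    """Return a readable meaning for each character of an ssflags string."""
    return [SSFLAGS.get(ch, "unknown flag %r" % ch) for ch in flags]


# The two flag strings that appear in the mminfo output above.
for flags in ("vF", "vrEiF"):
    print(flags, "->", ", ".join(decode_ssflags(flags)))
```

By that reading, vF is valid and finished, while the killed restart's vrEiF additionally carries i (incomplete).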

In fact, killing the group will not stop the backup. I have to actually 
kill the applicable nsrmmd on the tape device and then shut down / restart 
Networker to prevent it from re-launching in two minutes. Otherwise it 
will continue to run (apparently all the way through again).

I'm seeing this only on this client (duh :-) and only this saveset - There 
are several other savesets that complete normally every time. This client 
is in an incremental/weeknight and full-cloned/weekend schedule. It does 
the same thing regardless of whether it's full or incremental. It's as if 
the final communication that the backup is complete isn't taking place, so 
it thinks it has failed and tries again.

Or something.... man I just *LOVE* backups!!! Right up there with root 
canal.... :-\


ANYONE seen anything like this before?

Thanks.
