[Networker] adhoc manual "save" reporting successful completion while failing

2003-03-27 16:10:27
Subject: [Networker] adhoc manual "save" reporting successful completion while failing
From: "O'Brien, Pat" <Pat.Obrien AT CHOICEPOINT DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Thu, 27 Mar 2003 16:10:22 -0500
OS: Tru64 5.1
Legato server: 6.1.1

When I run a normal save -s servername -e expiration_date "file or
list of files" with stdout and stderr redirected to a file, the output
normally looks like the following:
                        /path_1/dir_1/dir_2/dir_3/file_1
                        /path_1/dir_1/dir_2/dir_3/file_2
                        /path_1/dir_1/dir_2/dir_3/file_3
                        /path_1/dir_1/dir_2/dir_3/file_4
                        /path_1/dir_1/dir_2/dir_3/
                        /path_1/dir_1/dir_2/
                        /path_1/dir_1/
                        /path_1/
                        /

                        save: /path_1/dir_1/dir_2/dir_3  1418 MB 00:15:26 9 files

I seem to have stumbled into a known bug, or at least something close to
one; the jury is still out.  It appears that when something goes wrong on a
drive directly attached to the server, the server kills every save running
on that drive.  We have documented this occurring on nightly incrementals,
but ad hoc saves are very frequent for us, and we use them only for data
kept out of the incremental streams because of its size (TB), staleness,
and, usually, multiple copies.

                syslog: NetWorker media: (warning) verification of volume "BQJ997", volid 1850121217 failed, can not read record 7077 of file 65 on sdlt tape BQJ997
                syslog: NetWorker media: (notice) verification of volume "BQJ997", volid 1850121217 failed, volume is being marked as full.
                syslog: NetWorker media: (notice) Save set (2047402497) clienta:/path_1 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
                syslog: NetWorker media: (notice) Save set (2047360513) clienta:/path_2 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
                syslog: NetWorker media: (notice) Save set (2047331329) clientb:/path_1 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
                syslog: NetWorker media: (notice) Save set (2043590657) clientb:/path_2 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed

When the above occurs, the save prints only the following.  Most notably,
the save summary line is missing, yet there are no direct error messages,
and the save set is marked incomplete in the indexes.  (Note: we have seen
this happen only 7-10 times, and only on one of our servers.  I realize the
tape was marked full, but with thousands of tapes that happens regularly.)

                        /path_1/dir_1/dir_2/dir_3/file_1
                        /path_1/dir_1/dir_2/dir_3/file_2
                        /path_1/dir_1/dir_2/dir_3/file_3
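
Since the missing summary line is the only visible symptom in the captured
output, one way to flag these saves is to scan each output file for that
line.  The following is a minimal Python sketch, not a tested tool.  The
function name is hypothetical, and the pattern assumes the summary always has
the "save: path size unit hh:mm:ss N files" shape shown above, with units of
B, KB, MB, GB or TB and no spaces in the path.

```python
import re

# Pattern for the closing summary line of a successful save, e.g.
#   save: /path_1/dir_1/dir_2/dir_3  1418 MB 00:15:26 9 files
# Assumption: the layout is inferred from the single sample in this post.
SUMMARY_RE = re.compile(
    r"save:\s+(?P<path>\S+)\s+(?P<size>\d+(?:\.\d+)?)\s+(?P<unit>[KMGT]?B)\s+"
    r"(?P<elapsed>\d+:\d\d:\d\d)\s+(?P<files>\d+)\s+files?\b"
)


def find_save_summary(output_text):
    """Return the parsed summary of a captured save output, or None if absent."""
    # Collapse all whitespace so a summary wrapped over two lines still matches.
    flat = " ".join(output_text.split())
    match = SUMMARY_RE.search(flat)
    if match is None:
        return None  # no summary line: treat the save as suspect
    return {
        "path": match.group("path"),
        "size": float(match.group("size")),
        "unit": match.group("unit"),
        "elapsed": match.group("elapsed"),
        "files": int(match.group("files")),
    }
```

A result of None only means the output looks like the failing case above; it
says nothing about what the server recorded, so a follow-up check against the
media database is still needed.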

I can programmatically scrub through the stdout files and interrogate the
NetWorker server with mminfo.  The issue is that we launch these backups so
that they fan out across several paths concurrently.  We prefer not to add a
save set label with unique serial numbers, and even restricting the query to
the last day could return multiple hits for a path.  I am looking for ideas
for verifying ad hoc manual saves directly against the server, preferably
with mminfo.
thanks
pmob
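
One possible approach, sketched below in Python, is to record the start and
end time of each save and ask mminfo only for save sets of that client and
path whose savetime falls inside that window, which narrows the "multiple
hits" problem without adding serial numbers.  Everything here is an
assumption to verify against the mminfo man page for your release: the
helper names are hypothetical, the query uses the client, name and savetime
attributes with quoted mm/dd/yyyy timestamps, the report asks for
ssid,sumflags,name, and the letters "a" (aborted) and "i" (in progress) in
sumflags are taken to mark a save set that did not finish.

```python
from datetime import datetime, timedelta


def build_mminfo_args(server, client, path, started, finished, slack_minutes=5):
    """Build an mminfo command line for one save's time window (assumed syntax)."""
    fmt = "%m/%d/%Y %H:%M:%S"  # assumption: mminfo accepts this date format
    lo = (started - timedelta(minutes=slack_minutes)).strftime(fmt)
    hi = (finished + timedelta(minutes=slack_minutes)).strftime(fmt)
    query = f"client={client},name={path},savetime>='{lo}',savetime<='{hi}'"
    # The name column is requested last so a path with spaces survives parsing.
    return ["mminfo", "-s", server, "-q", query, "-r", "ssid,sumflags,name"]


def parse_mminfo_rows(text):
    """Return (ssid, sumflags, name) tuples, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Data rows are assumed to start with a numeric save set ID.
        if len(parts) == 3 and parts[0].isdigit():
            rows.append((parts[0], parts[1], parts[2].strip()))
    return rows


def save_verified(rows):
    """True only if at least one row exists and none is aborted or in progress."""
    if not rows:
        return False  # the server has no record of the save at all
    return all("a" not in flags and "i" not in flags for _, flags, _ in rows)
```

In use, the argument list would be passed to something like
subprocess.run(args, capture_output=True, text=True) and the resulting stdout
handed to parse_mminfo_rows.  Two concurrent saves of the same path from the
same client would still land in the same window, so this narrows the
ambiguity rather than removing it.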
