Subject: [Networker] Help with storage nodes field?
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 18 Sep 2007 20:30:02 -0400
I have a question about listing multiple storage nodes in a client's Storage nodes field. I'm not sure I understand how NetWorker loops through them.

Let's say you have two storage nodes in the list, but only the first snode has a tape for the required backup. What happens if it takes that snode longer than the Save mount timeout value to load and mount the tape?

I assume it locks out the first snode for the Save lockout value (let's say 1 minute), but if there's no available tape on the second storage node either, will it then come back to the first snode and just keep looping back and forth? Or will it stop once it sees that the second storage node doesn't have what it needs and eventually fail, never retrying the first snode?
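For reference, here's roughly what I'd expect the client resource to show in nsradmin; the hostnames below are placeholders, not my real config:

  # nsradmin -s nwserver
  nsradmin> . type: NSR client; name: myclient
  nsradmin> show name; storage nodes
  nsradmin> print
                            name: myclient;
                   storage nodes: snode1, snode2;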

I had a group fail with savegroup completion messages like those shown below. There are also a bunch of these messages in the server's daemon.log file for the various save sets in the group. Only a few save sets completed; most failed before the tape was even mounted.

The client has only one snode listed, but that snode had the required tapes. The given pool doesn't have any devices selected. However, I recently changed the Save mount timeout on the devices for both snodes from 30 to 2, and the Save lockout from 0 to 1 (a rough nsradmin sketch of that change follows the error messages below). The tape was not mounted before the group started, so it had to be mounted first.

The server lists both snodes, with nsrserverhost as the third, but as far as the group is concerned, shouldn't that only come into play when the server backs up the client's index? I'm not sure why it's complaining about opening a session with the server when the client only has the two snodes listed. In any case, I never saw these problems when the values were at their defaults of 30 and 0, but I would like the client(s) to be able to use either storage node, which is why I changed the values.

--- Unsuccessful Save Sets ---
* client:/path 1 retry attempted
* client:/path save: error, no matching devices; check storage nodes, devices or pools
* client:/path Cannot open save session with server
* client:/path 1 retry attempted
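
In case it matters, this is roughly how I made the change in nsradmin, and how I could revert it; the device name is a placeholder:

  # nsradmin -s nwserver
  nsradmin> . type: NSR device; name: rd=snode1:/dev/nst0
  nsradmin> show name; save mount timeout; save lockout
  nsradmin> print
                            name: rd=snode1:/dev/nst0;
              save mount timeout: 2;
                    save lockout: 1;
  nsradmin> update save mount timeout: 30; save lockout: 0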

This group has cloning enabled. The server logs show that the required tape was mounted about 3 minutes after the group started, but by that time most of the save sets had already generated these error messages, so only a couple of save sets completed. Cloning then worked, but obviously only for the few save sets that succeeded. Two other groups backed up successfully but failed with similar messages when they started the actual clone process. None of these groups runs at the same time; typically there is no overlap.

It seems the Save mount timeout and Save lockout values are probably not what they should be, but I'm not sure what to change them to. I don't want things waiting too long.
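To double-check which save sets actually made it to tape before the errors started, I've been running something along these lines; the group name and date are placeholders:

  # list save sets written for the group that night
  mminfo -q "group=mygroup,savetime>=09/18/2007" \
         -r "client,name,sumsize,ssflags,volume"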

George

--
George Sinclair - NOAA/NESDIS/National Oceanographic Data Center
SSMC3 4th Floor Rm 4145       | Voice: (301) 713-3284 x210
1315 East West Highway        | Fax:   (301) 713-3301
Silver Spring, MD 20910-3282  | Web Site:  http://www.nodc.noaa.gov/
- Any opinions expressed in this message are NOT those of the US Govt. -
