I have a question on listing multiple storage nodes in the Storage nodes
field. Not sure I understand how NW loops through them.
Let's say you have two storage nodes in the list, but only the first
snode has a tape for the required backup. What will happen if it takes
the snode longer than the save mount timeout value to load and mount the
tape?
I assume that it locks the first snode for the Save lockout value (let's
say 1 minute), but then if there's no available tape on the second
storage node, will it then come back to the first snode and then just
keep looping back and forth? Or will it stop once it sees that the
second storage node doesn't have what it needs and eventually fail,
never retrying the first snode?
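
To make the two possibilities concrete, here is a toy model of what I'm asking. It is only a sketch of my question, not NetWorker's actual logic: the names (Snode, simulate, revisit) are made up, the numbers are illustrative, and it assumes a mount request keeps running after the timeout expires, which I have not confirmed.

```python
# Toy model only. This is NOT NetWorker's real logic; every name here
# (Snode, simulate, revisit) is hypothetical and exists just to frame the question.
from dataclasses import dataclass


@dataclass
class Snode:
    name: str
    has_tape: bool
    mount_minutes: float  # time to load and mount the tape (ignored if has_tape is False)


def simulate(snodes, mount_timeout, lockout, revisit, max_passes=100):
    """Walk the storage node list and return (outcome, minutes_elapsed).

    revisit=False: one pass over the list, then give up.
    revisit=True: keep cycling, waiting out each lockout before retrying.
    Assumes a mount, once requested, keeps going after the timeout expires.
    """
    clock = 0.0
    ready_at = {}      # snode name -> time its tape finishes mounting
    locked_until = {}  # snode name -> time its lockout expires
    for _ in range(max_passes):
        for node in snodes:
            if not node.has_tape:
                continue  # no usable tape here, move on to the next snode
            clock = max(clock, locked_until.get(node.name, 0.0))
            if node.name not in ready_at:
                ready_at[node.name] = clock + node.mount_minutes
            if ready_at[node.name] <= clock + mount_timeout:
                return "saved on " + node.name, max(clock, ready_at[node.name])
            clock += mount_timeout                     # timed out waiting
            locked_until[node.name] = clock + lockout  # snode is skipped until then
        if not revisit:
            break
    return "failed", clock


nodes = [Snode("snode1", True, 3.0), Snode("snode2", False, 0.0)]
print(simulate(nodes, mount_timeout=2, lockout=1, revisit=False))
print(simulate(nodes, mount_timeout=2, lockout=1, revisit=True))
```

With a 3-minute mount, a 2-minute timeout and a 1-minute lockout, the single-pass variant gives up at the 2-minute mark, while the looping variant comes back to the first snode after the lockout and finds the tape mounted at 3 minutes. Which of these (if either) matches what NW really does is exactly what I'm trying to find out.
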
I had a group fail with savegroup completion messages like those shown
below. The server's daemon.log file also has a number of these messages
for the various save sets in the group. Only a few save sets completed;
most failed before the tape was even mounted. The client has
only one snode listed, but that snode had the required tapes. The given
pool doesn't have any devices selected. However, I recently changed the
Save mount timeout on the devices for both snodes from 30 to 2, and the
Save lockout from 0 to 1. The tape was not mounted before the
group started, so it had to be mounted first. The server lists both
snodes, with nsrserverhost as the third snode, but as far as the group is
concerned, shouldn't that only come into play when the server backs up
the client's index? I'm not sure why it's complaining about opening a
session with the server when the client only has the two snodes listed.
Anyway, I had never seen these problems when the values were at their
defaults of 30 and 0, but I would like the client(s) to be able to use
either storage node, so that's why I changed the values.
--- Unsuccessful Save Sets ---
* client:/path 1 retry attempted
* client:/path save: error, no matching devices; check storage nodes, devices or pools
* client:/path Cannot open save session with server
* client:/path 1 retry attempted
This group has cloning enabled. The server logs show that the required
tape was mounted about 3 minutes after the group started, but by that
time, most of the save sets had already generated these error messages,
so only a couple of save sets completed. Cloning then worked, but obviously
only for those few save sets that succeeded. I also had two other groups
whose backups succeeded but which then failed with similar messages when
they started the actual clone process. None of these groups runs at the
same time; typically, there is no overlap. It seems that the save mount
timeout and lockout values are probably not what they should be, but I'm
not sure what to change them to. I don't want things waiting too long.
George
--
George Sinclair - NOAA/NESDIS/National Oceanographic Data Center
SSMC3 4th Floor Rm 4145 | Voice: (301) 713-3284 x210
1315 East West Highway | Fax: (301) 713-3301
Silver Spring, MD 20910-3282 | Web Site: http://www.nodc.noaa.gov/
- Any opinions expressed in this message are NOT those of the US Govt. -