Hello,
We are still having problems with Legato 7.1.2 and the 'Default' pool.
On some days Legato does not finish the backup and asks for tapes in the
'Default' pool.
We have come to the conclusion that this happens when Legato has
connectivity problems with the clients (for various reasons: network,
DNS, etc.).
Today only one machine in the group had not finished its backup, so I
stopped the group. I used 'nsradmin -s machine -v1 -p390113' to check
connectivity with the daemon running on the client, and the connection was
much slower than with other clients. I logged in to the machine and
restarted NetWorker (/etc/init.d/networker stop, then start). After that,
nsradmin connected faster than before. I restarted the backup and it
finished properly, without asking for tapes in the 'Default' pool.
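
Since DNS is one of the suspects, a rough way to compare clients is to time
name resolution from the backup server. The following is only a minimal
sketch under that assumption: the function names and the 1-second threshold
are hypothetical, and it checks name lookups only, not the NetWorker daemon
that nsradmin talks to.

```python
import socket
import time

def time_lookup(host, resolver=socket.gethostbyname):
    """Return (ok, seconds, result) for a single name lookup."""
    start = time.perf_counter()
    try:
        result = resolver(host)
        ok = True
    except OSError as exc:
        # socket.gaierror and socket.herror are subclasses of OSError.
        result = str(exc)
        ok = False
    return ok, time.perf_counter() - start, result

def slow_or_failed(hosts, threshold=1.0, resolver=socket.gethostbyname):
    """List hosts whose lookup failed or took longer than `threshold` seconds.

    The threshold is an arbitrary assumption; tune it to the site.
    """
    flagged = []
    for host in hosts:
        ok, seconds, _result = time_lookup(host, resolver)
        if not ok or seconds > threshold:
            flagged.append(host)
    return flagged
```

Calling slow_or_failed() with the client names of the group before the
scheduled start would show which clients are worth a closer look; passing
resolver=socket.gethostbyaddr with client addresses would exercise reverse
lookups in the same way.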
I would like to know why Legato asks for tapes in the 'Default' pool:
a) when it has 4 tapes that can be used (the 2 in the drives, one at 94% and
the other at 70%, and 2 that can be recycled)?
Name       Barcode  Written  Used  Mode   Expiration  Location  Pool
LNR115     LNR115   167 GB   full         20/11/2004  QUALSTAR  LCFIB Normal
LNR116     LNR116   169 GB   full         20/11/2004  QUALSTAR  LCFIB Normal
LNR117     LNR117   168 GB   full         20/11/2004  QUALSTAR  LCFIB Normal
LNR118     LNR118   150 GB   94%   appen  23/11/2004  QUALSTAR  LCFIB Normal
LNR119(R)  LNR119   166 GB   full  recyc  expired     QUALSTAR  LCFIB Normal
LNR120     LNR120   112 GB   70%   appen  23/11/2004  QUALSTAR  LCFIB Normal
LNR121(R)  LNR121   168 GB   full  recyc  expired     QUALSTAR  LCFIB Normal
b) only when it has problems with some machine? (We first noticed the
requests for 'Default' tapes some days ago, when we had DNS problems and
Legato could not contact or resolve many machine names.)
>>> Before aborting the group
...
media waiting event: Waiting for 1 writable volumes to backup pool 'Default' tape(s) on server
media event cleared: Waiting for 1 writable volumes to backup pool 'Default' tape(s) on server
savegroup alert: LCFIB-Normal aborted, total 60 client(s), 0 Hostname(s) Unresolved, 1 Failed, 59 Succeeded. (machine1 Failed)
>>> After restarting the client 'machine1' and restarting the group
savegroup info: restarting LCFIB-Normal (with 60 client(s))
machine1:/data01 saving to pool 'LCFIB Normal' (LNR118)
...
server:index:machine1 saving to pool 'LCFIB Normal' (LNR118)
server:index:machine1 done saving to pool 'LCFIB Normal' (LNR118) 116 MB
server:index:server saving to pool 'LCFIB Normal' (LNR118)
server:index:server done saving to pool 'LCFIB Normal' (LNR118) 16 MB
server:bootstrap saving to pool 'LCFIB Normal' (LNR118)
server:bootstrap done saving to pool 'LCFIB Normal' (LNR118) 6241 KB
savegroup notice: LCFIB-Normal completed, total 60 client(s), 0 Hostname(s) Unresolved, 0 Failed, 60 Succeeded.
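
To spot these requests without reading the whole log, the messages could be
filtered for any pool other than the expected one. This is only a sketch: the
two patterns are assumptions based on the message wording in the excerpts
above, and the function name is hypothetical.

```python
import re

# Patterns assumed from the message wording in the excerpts above.
WAITING = re.compile(r"Waiting for \d+ writable volumes? to backup pool '([^']+)'")
SAVING = re.compile(r"saving to pool '([^']+)'")

def unexpected_pool_events(lines, expected_pool):
    """Return the log lines that mention a pool other than `expected_pool`."""
    hits = []
    for line in lines:
        match = WAITING.search(line) or SAVING.search(line)
        if match and match.group(1) != expected_pool:
            hits.append(line.strip())
    return hits

SAMPLE = [
    "media waiting event: Waiting for 1 writable volumes to backup pool 'Default' tape(s) on server",
    "machine1:/data01 saving to pool 'LCFIB Normal' (LNR118)",
    "server:bootstrap saving to pool 'LCFIB Normal' (LNR118)",
]

for event in unexpected_pool_events(SAMPLE, "LCFIB Normal"):
    print(event)
```

With the sample lines, only the 'Default' media waiting event is printed. Run
against the real log, it would show on which days the 'Default' requests
appear, which can then be compared with the days of DNS or network trouble.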
Any idea what the problem is?
Thank you very much!
_________________________________________________________________
o o o Manel Rodero | LCFIB - UPC
o o o Systems Manager & HelpDesk | Campus Nord - Modul B6
o o o Laboratori de Calcul | Jordi Girona, 1-3
U P C Facultat Informatica Barcelona | 08034 Barcelona (Spain)
|
Mail : manel AT fib.upc DOT es | Tel: +00 34 93 401 6940
Web : http://www.fib.upc.es/~manel | Fax: +00 34 93 401 7040
_________________________________________________________________