[Veritas-bu] NBU + NetWare problem

2004-10-27 04:03:10
Subject: [Veritas-bu] NBU + NetWare problem
From: marek.gorka AT sun DOT com (Marek Gorka)
Date: Wed, 27 Oct 2004 10:03:10 +0200
Hello,

I would like to discuss a problem with backing up a NetWare client.

We have several servers giving us very similar problems, but I'll focus on 
one of them to keep things simple.

<> FACTS - Backup Server

Backup server running Veritas NetBackup 4.5 MP3 (Master+Media) + patches:

- 112407-05
- 115803-01
- 113487-05
- 112408-01

System platform is Sun Solaris 8.

Library: 2*L25 (stacked) + 2*LTO tape devices.

Backup server backs up several clients: two NetWare, three Windows NT+2000, 
two Solaris boxes.

Server platform and clients other than NetWare run perfectly okay.

<> FACTS - Novell NetWare

Server platform is NetWare 5.1 with Service Pack 5. The server runs only a 
directory synchronization application + BorderManager.

According to the docs and to the current environment, we use the following 
NetWare parameters:

- Maximum directory cache buffers 10000
- Minimum directory cache buffers 4000
- Maximum packet receive buffers 30000
- Minimum packet receive buffers 2000
- Maximum concurrent disk cache writes 2000
- Maximum concurrent directory cache writes 75
- Maximum physical receive packet size 4224

(Across the environment some parameters may vary a little between NetWare 
servers, but not too much).

<> FACTS - Backup Type

We do a TARGET backup for the following targets:

- NDS  (directory)
- SYS (filesystem)
- DANE (filesystem)

We use OTM with the OTM file on DANE:\. Here is the config excerpt on the 
client side:

Cache_file = dane:otm
Cache_Control = 0
Cache_Size_Init = 700
Cache_Size_Max = 3172
[...]
Use = yes

There is always more space on DANE than the Cache_Size_Max parameter.
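
As an illustration of the space check described above, here is a minimal Python sketch that parses the config excerpt and compares Cache_Size_Max with the free space on the volume. The function names and the free-space figure are hypothetical, and the sketch assumes Cache_Size_Max is expressed in megabytes, which the excerpt does not state.

```python
def parse_otm_config(text):
    """Parse 'Key = value' lines into a dict; lines without '=' are skipped."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # blank lines and elisions such as '[...]'
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip()
    return settings


def cache_fits(settings, free_mb):
    """Return True if free space exceeds the configured maximum cache size.

    Assumption: Cache_Size_Max and free_mb use the same unit (MB here).
    """
    return free_mb > int(settings["cache_size_max"])


excerpt = """\
Cache_file = dane:otm
Cache_Control = 0
Cache_Size_Init = 700
Cache_Size_Max = 3172
[...]
Use = yes
"""

settings = parse_otm_config(excerpt)
# free_mb is a made-up figure for illustration only.
print(settings["cache_file"], cache_fits(settings, free_mb=5000))
```

The free-space value would have to come from the NetWare server itself; the sketch only shows the comparison.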

The NBU modules we use are:

tsands.nlm
tsa500.nlm
tsaproxy.nlm
otmload
bpcd
bpsrv

<> FACTS - The problem itself

When we power on the NetWare box, the backup runs okay for all targets for 
several days. After 5-7 days something happens.

We encounter the following problems (probably with the same root cause):

- never-ending backup sessions: A backup session starts for a target. The 
server mounts and positions the tape. The Activity Monitor shows that 
everything is going fine and gives us "Begin Writing", and then nothing 
happens. No data flows from the client to the server (checked with the 
'snoop' utility). The NBU client on the NetWare side is running and port 
13782 is open. There are no network problems. The session might last forever 
if we don't stop it manually (150 error code). This happens mostly for the 
filesystem targets, DANE + SYS, and very rarely for NDS (directory). This 
problem usually comes with another:

- OTM file is not deleted after backup session: it still exists on the DANE 
filesystem. We are unable to delete the file manually. We have to stop the 
NBU client. And this gives us another problem:

- NetWare server hangs: if we try to unload NBU modules, the console hangs. 
After a while the whole server hangs. The only solution is to reboot it 
using the 'reset' button on the server front panel.
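
The "port 13782 is open" check mentioned in the first item can be reproduced from the backup server with a short Python sketch like the one below. The function name is hypothetical. Note that a successful connection only shows that something is listening on the port; as described above, the port can be open while no backup data flows.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `port_is_open("netware-client.example.com", 13782)` would test the client's listener, where the host name is a placeholder for the real NetWare server.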

Once the NetWare server has been rebooted, it runs okay for a few days 
(usually 5 or 6) and then gives us the same problem again.

Sometimes the 'never-ending' backup session manages to start the data flow 
from the NetWare client to the NBU server. It runs fine for a minute or two, 
then the data suddenly stops flowing from the client. The session stays 
open, and we have to cancel it manually (150 error code).
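
Since a stalled session stays open until someone cancels it, a small watchdog could flag it earlier. The Python sketch below is only an illustration: it assumes that periodic (timestamp, total bytes transferred) samples for a session are available from some source, for example counted from a packet capture, and the function name and threshold are hypothetical.

```python
def find_stall(samples, max_idle_seconds):
    """Return the time of last progress if the transfer then stalled, else None.

    samples: list of (timestamp_seconds, total_bytes) tuples in time order.
    A stall means the byte counter did not grow for more than
    max_idle_seconds after the last time it advanced.
    """
    last_progress_time = None
    last_bytes = None
    for timestamp, total_bytes in samples:
        if last_bytes is None or total_bytes > last_bytes:
            # First sample, or the counter advanced: record progress.
            last_bytes = total_bytes
            last_progress_time = timestamp
        elif timestamp - last_progress_time > max_idle_seconds:
            return last_progress_time
    return None
```

With samples taken once a minute and a threshold of a few minutes, this would catch both cases described above: a session that never sends data after "Begin Writing" and one that stops after a minute or two.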

----

Does anyone have experience with backing up a NetWare client? I'd be 
grateful for any help, as I have run out of ideas.

-- 
Marek Gorka
Client Solutions
Sun Microsystems Poland
