Veritas-bu

[Veritas-bu] Typical success rates?

From: simon.weaver at astrium.eads.net (WEAVER, Simon)
Date: Fri, 30 Jun 2006 06:45:17 +0100
Exclude the "1"s, as I don't class them as failures :-)
 
 

Regards

Simon Weaver
3rd Line Technical Support
Windows Domain Administrator 

EADS Astrium Limited, B32AA IM (DCS)
Anchorage Road, Portsmouth, PO3 5PU

Email: Simon.Weaver at Astrium-eads.net

-----Original Message-----
From: Wayne T Smith [mailto:wts at maine.edu] 
Sent: 29 June 2006 21:09
To: veritas-bu at mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Typical success rates?


Never have I had a day without failures.  Here's a sample from my past 24
hours (v5.1MP5 backup server ... various clients numbering less than 150 ...
reported errors only) ...



*       machine - status explanation 

*       01 - 41 - This is one of several laptops that are backed up whenever
it is connected, but isn't connected very often.  I wish NetBackup could
poll these machines quietly and back them up when they appear. (about 2
dozen job failures of this type have been omitted from this report) 

*       02 - 1 -  A mailbox could not be enumerated. The Exchange person may
correct these someday. 

*       03 - 54 - bpbrm listen for client timeout during accept from data
listen socket for 60 seconds (will look into this one, especially if it
repeats) 

*       04 - 58 - cannot connect (application does not play well with
NetBackup client - only a few backups are successful) 

*       05 - 1 - cannot open file - in use by another process (will try to
exclude these files because the error appears permanent). 

*       06 - 6 - failed to back up requested files.  This was a CINC Oracle
backup on an idle DB.  Maybe I can adjust the script to force a change in the
DB or to skip the backup when nothing has changed.

*       07 - 6 - same as machine 06. 

*       08 - 54 - timeout connecting to client. NetBackup server was delayed
obtaining a tape drive, causing Oracle/RMAN to give up (I think). 

*       09 - 41 - network connection timed out. This was at very end of
backup ("end writing" in job details). Happens occasionally with this
client. 

*       10 - 1 - Some ".tmp" files in use by another process.  Will add
"*.tmp" to exclude list, but probably at expense of slowing backups?  Also,
unable to export RSM database. 

*       11 - 58 - cannot connect to client. Client machine is spread all
over a table, with HP trying to find what's wrong with it.  Has been down
for several *weeks*.  Have manually extended expiration of existing backups.
Too bad you can't tell NetBackup to keep its last full backups of a client &
policy. 

*       12 - 41 - similar to machines 06 and 09. 

*       13 - 1 - Several "filemaker" files unavailable for backup.  We don't
exclude because sometimes they can be backed up and that's better than none.


*       14 - 41 - Another mysterious network connection timed out at or near
the end of a file system backup, after the job's start was delayed by "busy
resources".

*       15 - 57 - client connection refused.  Similar to machine 04. 

*       16 - 1 - A Windows file, access_log, has a portion locked by another
process.  I cannot fix this without putting client in a policy of its own,
because the file is included for processing by a necessary include list
entry. 

*       17 - 54 - Machine is powered off due to a power outage.  User
doesn't care because he no longer works there and management hasn't decided
what to do yet. 

*       18 - 1 - A few classical "in use" failures (Windows Defender and
perfdata), as well as a "cannot open old TIR file" failure.  Backing up TIR
files seems bogus, but I otherwise don't know how to avert the problem. 

*       19 - 58 - powered off due to same power outage as machine 17.
Machine is going away, but owner might resurrect it or want one more backup.


*       20 - 58 - trying machine 11 backup again. 

*       21 - 25 - cannot execute cmd on client.  No idea why this Exchange
DB CINC backup failed (immediately). A later job worked. 

*       22 - 1 - A relatively new Linux client trying to back up "sparse file
/sys/bus/pci/..." (many).  Will suggest owner exclude. 

*       23 - 1 - same as machine 22. 


So that's about 50 jobs with errors out of about 375, or about 10-15% of
jobs.  It gets better if you don't count status code 1s as failures, worse
if you consider a few clients have many file systems and multi-stream
enabled, and much better if you throw out all the failures that are
"expected"!

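For anyone who wants to redo the arithmetic, here is a rough Python sketch.
The status list is transcribed from the 23 entries above. The 24 omitted
status 41 jobs ("about 2 dozen") and the total of 375 jobs ("about 375") are
round-number assumptions, and the function name is made up for illustration.

```python
# Status codes of the 23 listed entries, in machine order 01..23.
LISTED_STATUSES = [41, 1, 54, 58, 1, 6, 6, 54, 41, 1, 58, 41,
                   1, 41, 57, 1, 54, 1, 58, 58, 25, 1, 1]

OMITTED_STATUS_41 = 24   # assumption: "about 2 dozen" omitted laptop jobs
TOTAL_JOBS = 375         # assumption: "about 375" jobs in the 24 hours


def failure_rate(statuses, total_jobs, ignore=()):
    """Fraction of all jobs that ended with a status not listed in `ignore`."""
    failures = sum(1 for status in statuses if status not in ignore)
    return failures / total_jobs


all_statuses = LISTED_STATUSES + [41] * OMITTED_STATUS_41

# Every non-zero status counted as a failure.
rate_all = failure_rate(all_statuses, TOTAL_JOBS)

# Status 1 (partially successful) not counted as a failure.
rate_without_1 = failure_rate(all_statuses, TOTAL_JOBS, ignore=(1,))

print(f"all errors:       {rate_all:.1%}")
print(f"ignoring status 1: {rate_without_1:.1%}")
```

Under those assumptions both figures land inside the 10-15% range quoted
above, with the second one lower because the eight status 1 jobs drop out.
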
cheers, wayne

Whelan, Patrick wrote, in part,  on 6/29/2006 1:48 PM: 

Do you usually have 100% success in every backup session? If not, what is a
typical success rate?


