[Veritas-bu] Corrupted images on DSSU

2008-10-30 14:18:15
Subject: [Veritas-bu] Corrupted images on DSSU
From: "Bryan S. Leaman" <leaman AT bitbytes DOT com>
To: veritas-bu AT mailman.eng.auburn DOT edu
Date: Thu, 30 Oct 2008 13:50:36 -0400 (EDT)

I'm having a problem with images on a DSSU (disk staging storage unit)
occasionally failing to duplicate/relocate to tape due to corruption.  When
it occurs, the backup job is successful but the duplicate fails with a size
mismatch.  I'm running NBU 6.5.2 on Solaris 10.

The issue seems to happen when the DSSU fills up during a backup and it
has to do some cleanup before resuming.  I usually have several backups
writing to the DSSU concurrently.  I get only a few corrupted images each
week even though I'm backing up 30+ clients to the DSSU daily.  My high
water mark (HWM) is 80%, yet the DSSU almost always fills to 100% before
any cleanup is done.
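
For a rough sense of scale, here is a back-of-the-envelope sketch using the figures from the job log in this post. It only does arithmetic on the reported numbers. It does not model how NetBackup actually triggers staging cleanup, and the function names are made up for illustration.

```python
# Figures taken from the job log in this post (all in KB).
CAPACITY_KB = 138_860_608      # "total capacity" reported by bptm for /dssu1
ESTIMATED_JOB_KB = 20_225_602  # "estimated ... kbytes needed" for this one backup
HWM_FRACTION = 0.80            # high water mark as configured by the poster


def hwm_headroom_kb(capacity_kb: int, hwm_fraction: float) -> int:
    """Space between the high water mark and a completely full volume."""
    return capacity_kb - int(capacity_kb * hwm_fraction)


def jobs_that_fit(headroom_kb: int, job_kb: int) -> int:
    """How many jobs of the given size fit in the headroom (illustrative only)."""
    return headroom_kb // job_kb


headroom = hwm_headroom_kb(CAPACITY_KB, HWM_FRACTION)
print(f"headroom above HWM: {headroom} KB")
print(f"jobs of this size that fit: {jobs_that_fit(headroom, ESTIMATED_JOB_KB)}")
```

With these numbers the headroom above the HWM is about 26 GiB, which is room for only one job of the estimated size. That is at least consistent with several concurrent backups running the volume to 100%.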

For my DSSU filesystem I'm using ZFS RAID-Z with compression.  ZFS is not
reporting any errors so I don't think the physical disk storage is the
problem.

bpverify confirms that the image is bad.  Any ideas?

Here is a log showing a successful backup and the subsequent failed
duplicate:

10/29/2008 19:00:00 - requesting resource mirage-disk-staging-compressed
10/29/2008 19:00:00 - requesting resource
mirage.mydomain.com.NBU_CLIENT.MAXJOBS.rh13.mydomain.com
10/29/2008 19:00:00 - requesting resource
mirage.mydomain.com.NBU_POLICY.MAXJOBS.Prod_Rh13_FS_All
10/29/2008 19:00:01 - granted resource 
mirage.mydomain.com.NBU_CLIENT.MAXJOBS.rh13.mydomain.com
10/29/2008 19:00:01 - granted resource 
mirage.mydomain.com.NBU_POLICY.MAXJOBS.Prod_Rh13_FS_All
10/29/2008 19:00:01 - granted resource 
MediaID=@aaaae;Path=/dssu1;MediaServer=mirage.mydomain.com
10/29/2008 19:00:01 - granted resource  mirage-disk-staging-compressed
10/29/2008 19:00:01 - estimated 20225602 kbytes needed
10/29/2008 19:00:02 - started process bpbrm (pid=7903)
10/29/2008 19:00:02 - connecting
10/29/2008 19:00:02 - connected; connect time: 0:00:00
10/29/2008 19:00:06 - begin writing
10/29/2008 19:10:47 - Warning bptm (pid=7907) storage unit
mirage-disk-staging-compressed is full: processing disk full condition
10/29/2008 19:11:07 - Info bptm (pid=7907) initial volume /dssu1: Kbytes
total capacity: 138860608, used space: 138860608, free space: 0
10/29/2008 19:11:13 - Info bptm (pid=7907) Removed 27271 Kbytes
10/29/2008 19:11:13 - Info bptm (pid=7907) ending volume /dssu1: Kbytes
total capacity: 138860524, used space: 138833337, free space: 27186
10/29/2008 19:11:57 - end writing; write time: 0:11:51
the requested operation was successfully completed (0)

10/30/2008 00:33:49 - Error bpdm (pid=5009) ERR - Error occurred reading
TIR information, expected this frag 23996284 bytes, read this frag
23995904
10/30/2008 00:33:49 - Warning bptm (pid=26621) ERR - Error reading TIR
data, expected 23996284 bytes + 0 GB, received 0 bytes + 23995904 GB.
10/30/2008 00:35:18 - Error bpduplicate (pid=250) host mirage.mydomain.com
backup id rh13.mydomain.com_1225321201 read failed, media read error (85).
10/30/2008 00:35:18 - Error bpduplicate (pid=250) host mirage.mydomain.com
backupid rh13.mydomain.com_1225321201 write failed, media manager - system
error occurred (174).
10/30/2008 00:35:20 - Error bpduplicate (pid=250) Duplicate of backupid
rh13.mydomain.com_1225321201 failed, media manager - system error occurred
(174).
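
To put a number on the size mismatch, the bpdm message can be parsed as in the sketch below. The regular expression is written against this one log line only, so other NetBackup versions may word the message differently. The function name is hypothetical.

```python
import re
from typing import Optional

# The bpdm error from the log above, joined onto one line.
LOG_LINE = (
    "10/30/2008 00:33:49 - Error bpdm (pid=5009) ERR - Error occurred reading "
    "TIR information, expected this frag 23996284 bytes, read this frag 23995904"
)

# Pattern based solely on the message in this post (an assumption, not a spec).
FRAG_RE = re.compile(r"expected this frag (\d+) bytes,\s*read this frag\s*(\d+)")


def tir_shortfall(line: str) -> Optional[int]:
    """Return expected minus read bytes, or None if the line does not match."""
    match = FRAG_RE.search(line)
    if match is None:
        return None
    expected, actual = (int(group) for group in match.groups())
    return expected - actual


print(tir_shortfall(LOG_LINE))
```

For this image the TIR fragment comes up 380 bytes short. The size that was read, 23995904, is an exact multiple of 512 bytes while the expected size is not, which may or may not be significant.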

# bpverify -backupid rh13.mydomain.com_1225321201
Verify started Thu Oct 30 13:41:00 2008
INF - Verifying policy Prod_Rh13_FS_All, schedule Daily_Incremental
(rh13.mydomain.com_1225321201), path "/dssu1", created 10/29/2008
19:00:01.
INF - Verify of policy Prod_Rh13_FS_All, schedule Daily_Incremental
(rh13.mydomain.com_1225321201) failed, media read error.

INF - Status = no images were successfully processed.
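
Since only a few images go bad each week, one option is to verify images before the duplication window. The sketch below is minimal and assumes bpverify is on the PATH and that its output looks like the sample above. The failure markers are copied from this post rather than from any documentation, and the helper names are hypothetical.

```python
import subprocess
from typing import Iterable, List

# Markers copied from the bpverify output shown in this post (assumed, not documented).
FAILURE_MARKERS = ("failed,", "no images were successfully processed")


def output_indicates_failure(output: str) -> bool:
    """True if the bpverify output contains one of the failure markers."""
    return any(marker in output for marker in FAILURE_MARKERS)


def find_bad_images(backup_ids: Iterable[str]) -> List[str]:
    """Run `bpverify -backupid <id>` for each ID and return those that look bad."""
    bad = []
    for backup_id in backup_ids:
        result = subprocess.run(
            ["bpverify", "-backupid", backup_id],
            capture_output=True, text=True, check=False,
        )
        # Treat a non-zero exit status or a failure marker as a bad image.
        if result.returncode != 0 or output_indicates_failure(
            result.stdout + result.stderr
        ):
            bad.append(backup_id)
    return bad
```

For example, find_bad_images(["rh13.mydomain.com_1225321201"]) would be expected to return that backup ID on this system, given the bpverify output above.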

