I'm having a problem with images on a DSSU occasionally failing to
duplicate/relocate to tape due to corruption. When it occurs, the backup
job is successful but the duplication fails with a size mismatch. I'm
running NBU 6.5.2 on Solaris 10.
The issue seems to happen when the DSSU fills up during a backup and
NetBackup has to do some cleanup before resuming. I usually have several
backups running concurrently to the DSSU. I get only a few corrupted
images each week even though I'm backing up 30+ clients to the DSSU
daily. My high water mark (HWM) is 80%, yet usage almost always reaches
100% before any cleanup is done.
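To put the HWM in absolute terms, here is a minimal Python sketch of the
arithmetic, using the total capacity that bptm reports for /dssu1 in the
log below (138860608 KB) and my 80% setting. The function names are my own
and purely illustrative; this is not how NetBackup itself evaluates the
high water mark.

```python
# Values taken from this post: total capacity from the bptm "initial volume"
# log line, and the 80% high water mark configured on the DSSU.
TOTAL_KB = 138860608
HWM_PERCENT = 80


def hwm_threshold_kb(total_kb: int, hwm_percent: int) -> int:
    """Return the used-space level (in KB) that corresponds to the HWM."""
    return total_kb * hwm_percent // 100


def overshoot_kb(used_kb: int, total_kb: int, hwm_percent: int) -> int:
    """Return how far (in KB) used space is above the HWM, or 0 if below."""
    return max(0, used_kb - hwm_threshold_kb(total_kb, hwm_percent))


if __name__ == "__main__":
    threshold = hwm_threshold_kb(TOTAL_KB, HWM_PERCENT)
    # At the time of the disk-full warning, used space equalled total capacity.
    over = overshoot_kb(TOTAL_KB, TOTAL_KB, HWM_PERCENT)
    print(f"HWM threshold: {threshold} KB")
    print(f"Used space above HWM at disk full: {over} KB")
```

In other words, by the time bptm reported the disk-full condition, the
volume was roughly 27.7 million KB past the point where I expected cleanup
to have started.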
For my DSSU filesystem I'm using ZFS RAID-Z with compression. ZFS is not
reporting any errors, so I don't think the physical disk storage is the
problem.
bpverify confirms that the image is bad. Any ideas?
Here is a log showing a successful backup and the subsequent failed
duplicate:
10/29/2008 19:00:00 - requesting resource mirage-disk-staging-compressed
10/29/2008 19:00:00 - requesting resource
mirage.mydomain.com.NBU_CLIENT.MAXJOBS.rh13.mydomain.com
10/29/2008 19:00:00 - requesting resource
mirage.mydomain.com.NBU_POLICY.MAXJOBS.Prod_Rh13_FS_All
10/29/2008 19:00:01 - granted resource
mirage.mydomain.com.NBU_CLIENT.MAXJOBS.rh13.mydomain.com
10/29/2008 19:00:01 - granted resource
mirage.mydomain.com.NBU_POLICY.MAXJOBS.Prod_Rh13_FS_All
10/29/2008 19:00:01 - granted resource
MediaID=@aaaae;Path=/dssu1;MediaServer=mirage.mydomain.com
10/29/2008 19:00:01 - granted resource mirage-disk-staging-compressed
10/29/2008 19:00:01 - estimated 20225602 kbytes needed
10/29/2008 19:00:02 - started process bpbrm (pid=7903)
10/29/2008 19:00:02 - connecting
10/29/2008 19:00:02 - connected; connect time: 0:00:00
10/29/2008 19:00:06 - begin writing
10/29/2008 19:10:47 - Warning bptm (pid=7907) storage unit
mirage-disk-staging-compressed is full: processing disk full condition
10/29/2008 19:11:07 - Info bptm (pid=7907) initial volume /dssu1: Kbytes
total capacity: 138860608, used space: 138860608, free space: 0
10/29/2008 19:11:13 - Info bptm (pid=7907) Removed 27271 Kbytes
10/29/2008 19:11:13 - Info bptm (pid=7907) ending volume /dssu1: Kbytes
total capacity: 138860524, used space: 138833337, free space: 27186
10/29/2008 19:11:57 - end writing; write time: 0:11:51
the requested operation was successfully completed (0)
10/30/2008 00:33:49 - Error bpdm (pid=5009) ERR - Error occurred reading
TIR information, expected this frag 23996284 bytes, read this frag 23995904
10/30/2008 00:33:49 - Warning bptm (pid=26621) ERR - Error reading TIR
data, expected 23996284 bytes + 0 GB, received 0 bytes + 23995904 GB.
10/30/2008 00:35:18 - Error bpduplicate (pid=250) host mirage.mydomain.com
backup id rh13.mydomain.com_1225321201 read failed, media read error (85).
10/30/2008 00:35:18 - Error bpduplicate (pid=250) host mirage.mydomain.com
backupid rh13.mydomain.com_1225321201 write failed, media manager - system
error occurred (174).
10/30/2008 00:35:20 - Error bpduplicate (pid=250) Duplicate of backupid
rh13.mydomain.com_1225321201 failed, media manager - system error occurred
(174).
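As a quick sanity check on the numbers in the log above, here is a small
Python sketch of my own arithmetic (not NetBackup output). It compares the
expected and actual TIR fragment sizes, and compares the free space left
after cleanup with the job's size estimate. Note that the estimate covers
the whole backup, part of which had already been written when the disk
filled, so the second figure is only a rough indication.

```python
# All values are copied from the job log in this post.
EXPECTED_FRAG_BYTES = 23996284    # "expected this frag" (bpdm)
READ_FRAG_BYTES = 23995904        # "read this frag" (bpdm)
ESTIMATED_KB = 20225602           # "estimated ... kbytes needed"
FREE_AFTER_CLEANUP_KB = 27186     # free space on /dssu1 after cleanup


def missing_bytes(expected: int, actual: int) -> int:
    """Return how many bytes short the fragment on disk is."""
    return expected - actual


def cleanup_coverage(free_kb: int, estimated_kb: int) -> float:
    """Return free space after cleanup as a fraction of the job estimate."""
    return free_kb / estimated_kb


if __name__ == "__main__":
    short = missing_bytes(EXPECTED_FRAG_BYTES, READ_FRAG_BYTES)
    coverage = cleanup_coverage(FREE_AFTER_CLEANUP_KB, ESTIMATED_KB)
    print(f"TIR fragment is short by {short} bytes")
    print(f"Free space after cleanup is {coverage:.4%} of the job estimate")
```

The TIR fragment on disk is 380 bytes shorter than the catalog expects,
and the cleanup left free space equal to only a small fraction of a
percent of what the job was estimated to need.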
# bpverify -backupid rh13.mydomain.com_1225321201
Verify started Thu Oct 30 13:41:00 2008
INF - Verifying policy Prod_Rh13_FS_All, schedule Daily_Incremental
(rh13.mydomain.com_1225321201), path "/dssu1", created 10/29/2008
19:00:01.
INF - Verify of policy Prod_Rh13_FS_All, schedule Daily_Incremental
(rh13.mydomain.com_1225321201) failed, media read error.
INF - Status = no images were successfully processed.
_______________________________________________
Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu