Re: [Veritas-bu] Disk Staging Flush

2010-07-01 17:48:25
Subject: Re: [Veritas-bu] Disk Staging Flush
From: Nate Sanders <sandersn AT dmotorworks DOT com>
To: "VERITAS-BU AT MAILMAN.ENG.AUBURN DOT EDU" <VERITAS-BU AT mailman.eng.auburn DOT edu>
Date: Thu, 1 Jul 2010 16:48:14 -0500
Has the job just gotten so large that it cannot write to disk staging
(700GB) any more? I know some of our Oracle Full jobs go straight to
tape because they are too large. But this is an incremental OS backup.
There is no way it's that large.

> Try 1
> PROCESS 1278017363 19536 bpdm
> PROCESS 1278017364 19533 bpbrm
> CONNECT 1278017364
> CONNECTED 1278017364
> DISKMOUNT 1278017364 NONE
> BEGIN_WRITING 1278017364
> LOG 1278017967 16 bpdm 19536 cannot write image to disk, No space left
> on device
> LOG 1278017967 16 bpbrm 19533 from client backup1: ERR - bpbkar
> exiting because backup is aborting
> END_WRITING 1278017971
> Started 1278017360
> KbPerSec 6709
> Kilobytes 1658368
> Files 57000
> ActivePid 19514
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 19533
> Ended 1278017971
> Started 1278017360
> KbPerSec 6709
> Kilobytes 1658368
> Files 57000
> ActivePid 19514
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 19533
> Ended 1278017971
> LOG 1278017971 16 bpsched 29500 backup of client backup1 exited with
> status 84 (media write error)
> Try 2
> PROCESS 1278017975 25253 bpdm
> PROCESS 1278017976 25250 bpbrm
> CONNECT 1278017976
> CONNECTED 1278017976
> DISKMOUNT 1278017976 NONE
> BEGIN_WRITING 1278017976
> LOG 1278018377 16 bpdm 25253 cannot write image to disk, attempted
> write of 262144 bytes, system wrote 77824
> LOG 1278018377 16 bpbrm 25250 from client backup1: ERR - bpbkar
> exiting because backup is aborting
> END_WRITING 1278018380
> Started 1278017971
> KbPerSec 9291
> Kilobytes 728064
> Files 4000
> ActivePid 25243
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 25250
> Ended 1278018380
> LOG 1278018380 16 bpsched 29500 suspending further backup attempts for
> client backup1, policy prod_unix_day, schedule Daily_Incr because
> it has exceeded the configured number of tries
> LOG 1278018380 16 bpsched 29500 backup of client backup1 exited with
> status 84 (media write error)
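
The summary fields in the Try 1 log above already bear on the size question: the job had written well under 2 GB when it failed. Below is a minimal Python sketch of that arithmetic. It assumes the Kilobytes field counts units of 1024 bytes and that Started/Ended are Unix epoch seconds; both are assumptions, and parse_try and summarize are hypothetical helper names, not NetBackup tools.

```python
from datetime import datetime, timezone

# Summary fields copied from the Try 1 log of client backup1.
TRY1 = """\
> Started 1278017360
> Kilobytes 1658368
> Files 57000
> Status 84
> Ended 1278017971
"""

def parse_try(text):
    """Collect 'Key <integer>' lines from a quoted job-log excerpt."""
    fields = {}
    for line in text.splitlines():
        parts = line.lstrip("> ").split()
        if len(parts) == 2 and parts[1].isdigit():
            fields[parts[0]] = int(parts[1])
    return fields

def summarize(fields):
    """Return (elapsed seconds, GiB written, start time in UTC)."""
    elapsed = fields["Ended"] - fields["Started"]
    # Assumption: 1 "Kilobyte" in the log is 1024 bytes.
    gib = fields["Kilobytes"] / (1024 * 1024)
    started = datetime.fromtimestamp(fields["Started"], tz=timezone.utc)
    return elapsed, gib, started

if __name__ == "__main__":
    elapsed, gib, started = summarize(parse_try(TRY1))
    print(f"started {started.isoformat()}, ran {elapsed}s, wrote {gib:.2f} GiB")
```

If those assumptions hold, the attempt died after roughly 1.6 GiB, so the failure points at how little space was left on the DSSU rather than at the size of the incremental.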


This is from another client (the previous log was the server trying to
back up itself):


> PROCESS 1278017576 20833 bpdm
> PROCESS 1278017577 20832 bpbrm
> CONNECT 1278017577
> CONNECTED 1278017577
> DISKMOUNT 1278017577 NONE
> BEGIN_WRITING 1278017577
> LOG 1278017966 16 bpdm 20833 cannot write image to disk, attempted
> write of 262144 bytes, system wrote 77824
> END_WRITING 1278017966
> ----- SNIP -----
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 20832
> Ended 1278017967
> LOG 1278017967 16 bpsched 29500 backup of client new-cleaner02 exited
> with status 84 (media write error)
> Try 2
> ----- SNIP -----
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 25207
> Ended 1278018377
> LOG 1278018377 16 bpsched 29500 suspending further backup attempts for
> client new-cleaner02, policy prod_unix_day, schedule Daily_Incr because
> it has exceeded the configured number of tries
> LOG 1278018377 16 bpsched 29500 backup of client new-cleaner02 exited
> with status 84 (media write error)

A flush had been run on this DSSU just 15 minutes before and no other
jobs had written to it. I also verified there were no orphaned images on
it. It was 100% full, but nothing showed that the data wasn't
locked.
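
Since bpdm reports "No space left on device", it may be worth checking the DSSU file system from the OS side as well. The same error is also returned when a file system runs out of inodes, or when only root-reserved blocks remain, so a disk can refuse writes for reasons a plain size check does not show. A rough Python sketch follows; space_report is a hypothetical helper, and the path passed to it is a placeholder for wherever the DSSU is actually mounted.

```python
import os

def space_report(path):
    """Report block and inode headroom for the file system holding `path`."""
    st = os.statvfs(path)
    return {
        "total_bytes": st.f_blocks * st.f_frsize,
        # Free blocks, including any reserved for root.
        "free_bytes": st.f_bfree * st.f_frsize,
        # Blocks an unprivileged process can actually use.
        "avail_bytes": st.f_bavail * st.f_frsize,
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }

if __name__ == "__main__":
    # "/" is only a stand-in; substitute the DSSU mount point.
    for key, value in space_report("/").items():
        print(f"{key:>13}: {value}")
```

Comparing avail_bytes with the 262144-byte writes in the log, and glancing at inodes_free, should show whether the disk is full in the ordinary sense or for one of the other reasons.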


On 07/01/2010 04:35 PM, Martin, Jonathan wrote:
> Give us the output of the job log.
>
> -Jonathan
>
> -----Original Message-----
> From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Nate
> Sanders
> Sent: Thursday, July 01, 2010 5:29 PM
> Cc: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: Re: [Veritas-bu] Disk Staging Flush
>
> So the problem looks to be bigger than that. I'm still getting write
> failures and I even tried switching to a different physical disk for the
> DSSU. Out of 10 jobs, the exact same 4 jobs are failing every time.
>
>
> On 07/01/2010 01:03 PM, Nate Sanders wrote:
>
>> Looks like 99% of the disk space was orphaned files. Thank you kind
>> sir!
>>
>> On 07/01/2010 12:45 PM, Martin, Jonathan wrote:
>>
>> Run an "ls" on the directory where the images are. You should see
>> something like:
>>
>> |Image Identifier|C#|F#|Backup Time|.img
>> Server_1234567890_C1_F1_1234567890.img
>> Server_1234567890_C1_F2_1234567890.img
>> Server_1234567890_C1_F3_1234567890.img
>> Server_1234567890_C1_F4_1234567890.img
>> Server_1234567890_C1_F5_1234567890.img
>> Server_1234567890_C1_F6_1234567890.img
>> Server_1234567890_C1_F7_1234567890.img
>>
>> The C# is copy number (generally 1)
>> The F# is fragment number (#1 thru however big the image is.)
>>
>> If you see that you have images missing fragment #1 or with gaps in the
>> numbers, then you probably have partial images (the known issue with 5.1
>> DSSUs).
>>
>> Run bpimagelist -L -backupid <ImageIdentifier> to query the catalog. If
>> the catalog does not recognize these files on the DSSU, then they are
>> orphaned. You can safely delete them.
>>
>> --
>> Nate Sanders            Digital Motorworks
>> System Administrator      (512) 692 - 1038

-- 
Nate Sanders            Digital Motorworks
System Administrator      (512) 692 - 1038
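
The fragment check Jonathan describes in the quoted message (look for images missing fragment #1 or with gaps in the F numbers) can be sketched in Python as below. The file name pattern is taken from his example listing; find_partial_images is a hypothetical helper, and the sample names are made up for illustration. Treat it as a starting point, not a NetBackup utility.

```python
import re
from collections import defaultdict

# <ImageIdentifier>_C<copy>_F<fragment>_<backup time>.img, per the listing above.
IMG_RE = re.compile(
    r"^(?P<image>.+)_C(?P<copy>\d+)_F(?P<frag>\d+)_(?P<btime>\d+)\.img$"
)

def find_partial_images(filenames):
    """Map (image id, copy number) to the fragment numbers that are missing."""
    frags = defaultdict(set)
    for name in filenames:
        m = IMG_RE.match(name)
        if m:  # ignore anything that is not an image fragment
            frags[(m["image"], int(m["copy"]))].add(int(m["frag"]))
    suspects = {}
    for key, have in frags.items():
        missing = sorted(set(range(1, max(have) + 1)) - have)
        if missing:
            suspects[key] = missing
    return suspects

if __name__ == "__main__":
    # Made-up directory listing; in practice pass os.listdir(<DSSU path>).
    sample = [
        "Server_1234567890_C1_F1_1234567890.img",
        "Server_1234567890_C1_F2_1234567890.img",
        "Other_1234567891_C1_F2_1234567891.img",
        "Other_1234567891_C1_F4_1234567891.img",
    ]
    for (image, copy), missing in find_partial_images(sample).items():
        print(f"{image} copy {copy}: missing fragments {missing}")
```

This only flags gaps below the highest fragment present; it cannot tell whether trailing fragments are missing. Anything it flags should still be checked against the catalog with bpimagelist -L -backupid <ImageIdentifier> before deleting.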



