Veritas-bu

Re: [Veritas-bu] Disk Staging Flush

2010-07-01 18:08:38
Subject: Re: [Veritas-bu] Disk Staging Flush
From: "Martin, Jonathan" <JMARTI05 AT intersil DOT com>
To: "Nate Sanders" <sandersn AT dmotorworks DOT com>, "VERITAS-BU AT MAILMAN.ENG.AUBURN DOT EDU" <VERITAS-BU AT mailman.eng.auburn DOT edu>
Date: Thu, 1 Jul 2010 18:08:26 -0400
I write 1.5TB+ images to DSSUs all the time, so I don't think the images
are too big. I do use a maximum fragment size of 10GB (10240MB). Do you
have the bptm and/or bpdm log from the media server with the DSSU? (Not
sure if bpdm was around in 5.1?)  Is this one DSSU or all of them?

-Jonathan

-----Original Message-----
From: Nate Sanders [mailto:sandersn AT dmotorworks DOT com] 
Sent: Thursday, July 01, 2010 5:48 PM
To: VERITAS-BU AT MAILMAN.ENG.AUBURN DOT EDU
Cc: Martin, Jonathan
Subject: Re: [Veritas-bu] Disk Staging Flush

Has the job just gotten so large that it cannot write to disk staging
(700GB) any more? I know some of our Oracle Full jobs go straight to
tape because they are too large. But this is an incremental OS backup..
There is no way it's that large.

> Try 1
> PROCESS 1278017363 19536 bpdm
> PROCESS 1278017364 19533 bpbrm
> CONNECT 1278017364
> CONNECTED 1278017364
> DISKMOUNT 1278017364 NONE
> BEGIN_WRITING 1278017364
> LOG 1278017967 16 bpdm 19536 cannot write image to disk, No space left
> on device
> LOG 1278017967 16 bpbrm 19533 from client backup1: ERR - bpbkar
> exiting because backup is aborting
> END_WRITING 1278017971
> Started 1278017360
> KbPerSec 6709
> Kilobytes 1658368
> Files 57000
> ActivePid 19514
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 19533
> Ended 1278017971
> Started 1278017360
> KbPerSec 6709
> Kilobytes 1658368
> Files 57000
> ActivePid 19514
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 19533
> Ended 1278017971
> LOG 1278017971 16 bpsched 29500 backup of client backup1 exited with
> status 84 (media write error)
> Try 2
> PROCESS 1278017975 25253 bpdm
> PROCESS 1278017976 25250 bpbrm
> CONNECT 1278017976
> CONNECTED 1278017976
> DISKMOUNT 1278017976 NONE
> BEGIN_WRITING 1278017976
> LOG 1278018377 16 bpdm 25253 cannot write image to disk, attempted
> write of 262144 bytes, system wrote 77824
> LOG 1278018377 16 bpbrm 25250 from client backup1: ERR - bpbkar
> exiting because backup is aborting
> END_WRITING 1278018380
> Started 1278017971
> KbPerSec 9291
> Kilobytes 728064
> Files 4000
> ActivePid 25243
> RqstPid 19502
> MainPid 29500
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 25250
> Ended 1278018380
> LOG 1278018380 16 bpsched 29500 suspending further backup attempts for
> client backup1, policy prod_unix_day, schedule Daily_Incr because it
> has exceeded the configured number of tries
> LOG 1278018380 16 bpsched 29500 backup of client backup1 exited with
> status 84 (media write error)


This is from another client (the previous one was the server trying to
back up itself):


> PROCESS 1278017576 20833 bpdm
> PROCESS 1278017577 20832 bpbrm
> CONNECT 1278017577
> CONNECTED 1278017577
> DISKMOUNT 1278017577 NONE
> BEGIN_WRITING 1278017577
> LOG 1278017966 16 bpdm 20833 cannot write image to disk, attempted
> write of 262144 bytes, system wrote 77824
> END_WRITING 1278017966
> ----- SNIP -----
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 20832
> Ended 1278017967
> LOG 1278017967 16 bpsched 29500 backup of client new-cleaner02 exited
> with status 84 (media write error)
> Try 2
> ----- SNIP -----
> Status 84
> DestStorageUnit dssu1
> DestMediaServer backup1.xxxxxxx.xxx
> NumTapesToEject 25207
> Ended 1278018377
> LOG 1278018377 16 bpsched 29500 suspending further backup attempts for
> client new-cleaner02, policy prod_unix_day, schedule Daily_Incr because
> it has exceeded the configured number of tries
> LOG 1278018377 16 bpsched 29500 backup of client new-cleaner02 exited
> with status 84 (media write error)

A flush had been run on this DSSU just 15 minutes before and no other
jobs had written to it. I also verified there were no orphaned images on
it. It was 100% full, but nothing showed that the data wasn't
locked.
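
For anyone lining these failures up against the time of the flush, here
is a minimal Python sketch that pulls the LOG lines out of a pasted job
log and converts the epoch timestamps to UTC. The field layout (epoch,
a numeric code, process name, PID, message) is inferred only from the
excerpts above, and the names parse_log_lines and sample are made up
for illustration. Wrapped continuation lines are not rejoined.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the pasted excerpts only:
#   LOG <epoch> <numeric code> <process> <pid> <message>
LOG_RE = re.compile(r"^LOG (\d+) (\d+) (\S+) (\d+) (.*)$")


def parse_log_lines(text):
    """Yield (utc_time, code, process, pid, message) for each LOG line."""
    for line in text.splitlines():
        # Drop mail quoting ("> ") in case the log was copied from a reply.
        line = line.lstrip("> ").rstrip()
        m = LOG_RE.match(line)
        if m:
            epoch, code, proc, pid, msg = m.groups()
            when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
            yield when, int(code), proc, int(pid), msg


# Sample lines taken from the first try above (wrapped line rejoined by hand).
sample = """\
> BEGIN_WRITING 1278017364
> LOG 1278017967 16 bpdm 19536 cannot write image to disk, No space left on device
> END_WRITING 1278017971
"""

for when, code, proc, pid, msg in parse_log_lines(sample):
    print(when.isoformat(), proc, pid, msg)
```

The times come out in UTC, so convert to the media server's local time
zone before comparing them with when the flush ran.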


On 07/01/2010 04:35 PM, Martin, Jonathan wrote:
> Give us the output of the job log.
>
> -Jonathan
>
> -----Original Message-----
> From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Nate
> Sanders
> Sent: Thursday, July 01, 2010 5:29 PM
> Cc: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: Re: [Veritas-bu] Disk Staging Flush
>
> So the problem looks to be bigger than that. I'm still getting write
> failures and I even tried switching to a different physical disk for the
> DSSU. Out of 10 jobs, the exact same 4 jobs are failing every time.
>
>
> On 07/01/2010 01:03 PM, Nate Sanders wrote:
>   
>> Looks like 99% of the disk space was orphaned files. Thank you kind
>> sir!
>>
>> On 07/01/2010 12:45 PM, Martin, Jonathan wrote:
>>
>> Run an "ls" on the directory where the images are. You should see
>> something like:
>>
>> |Image Identifier|C#|F#|Backup Time|.img
>> Server_1234567890_C1_F1_1234567890.img
>> Server_1234567890_C1_F2_1234567890.img
>> Server_1234567890_C1_F3_1234567890.img
>> Server_1234567890_C1_F4_1234567890.img
>> Server_1234567890_C1_F5_1234567890.img
>> Server_1234567890_C1_F6_1234567890.img
>> Server_1234567890_C1_F7_1234567890.img
>>
>> The C# is copy number (generally 1)
>> The F# is fragment number (#1 thru however big the image is.)
>>
>> If you see that you have images missing fragment #1 or with gaps in the
>> numbers, then you probably have partial images (the known issue with 5.1
>> DSSUs).
>>
>> Run bpimagelist -L -backupid <ImageIdentifier> to query the catalog. If
>> the catalog does not recognize these files on the DSSU, then they are
>> orphaned. You can safely delete them.
>>
>>
>>
>>
>> --
>> Nate Sanders            Digital Motorworks
>> System Administrator      (512) 692 - 1038
>>
>   
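
The fragment check described in the quoted message above (look for
images missing fragment #1 or with gaps in the numbering) could be
scripted roughly as follows. This is only a sketch: the file name
pattern is assumed from the example listing in that message, and
find_partial_images is a made-up helper name. It only flags candidates.
It does not query the catalog, so bpimagelist still has to confirm that
an image is orphaned before anything is deleted, and it cannot detect
fragments missing from the end of an image.

```python
import re
from collections import defaultdict

# Assumed pattern: <ImageIdentifier>_C<copy>_F<fragment>_<backup time>.img
IMG_RE = re.compile(
    r"^(?P<image>.+)_C(?P<copy>\d+)_F(?P<frag>\d+)_(?P<time>\d+)\.img$"
)


def find_partial_images(filenames):
    """Return {(image_id, copy): sorted list of missing fragment numbers}."""
    frags = defaultdict(set)
    for name in filenames:
        m = IMG_RE.match(name)
        if m:
            frags[(m["image"], int(m["copy"]))].add(int(m["frag"]))

    partial = {}
    for key, seen in frags.items():
        # Every fragment from 1 up to the highest one present should exist.
        missing = sorted(set(range(1, max(seen) + 1)) - seen)
        if missing:
            partial[key] = missing
    return partial


# Made-up file names in the same shape as the example listing.
names = [
    "Server_1234567890_C1_F1_1234567890.img",
    "Server_1234567890_C1_F2_1234567890.img",
    "Server_1234567890_C1_F4_1234567890.img",  # F3 is missing
    "Other_1234567891_C1_F2_1234567891.img",   # F1 is missing
]

for (image, copy), missing in sorted(find_partial_images(names).items()):
    print(image, "copy", copy, "missing fragments:", missing)
```

To run it against a real DSSU, pass os.listdir() of the image directory
instead of the sample list.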

-- 
Nate Sanders            Digital Motorworks
System Administrator      (512) 692 - 1038




_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
