Veritas-bu

Subject: [Veritas-bu] Backups slow to a crawl
From: dave.markham AT fjserv DOT net (Dave Markham)
Date: Wed, 23 Mar 2005 15:44:46 +0000
Jeff McCombs wrote:

>Gurus,
>
>    NB 5.0 MP4, single combination media/master server, Solaris 9. Overland
>Neo 2000 26-slot 2 drive DLT.
>
>    I'm noticing that for some reason or another, all of my client backups
>have slowed to a _crawl_. A _cumulative_ (!) backup of local disk on a Sun
>V100 is taking somewhere on the order of 2 hours at this point, and with
>over 40 systems, I'm consistently blowing past my backup window.
>
>    I'm not quite sure what's going on here, but as I sit and watch the
>output from 'iostat', I'm noticing that rmt/1 (the 2nd drive in the Neo) is
>fluctuating between 100% busy with kw/s close to zero, and 1-15% busy
>and kw/s up into the 1000's.
>
>    rmt/0 seems to be fine, kw/s sits consistently up in the 1.8-2K range,
>while busy is anywhere from 2% - 30% on average. My other disks aren't
>working hard, CPU isn't loaded and I've got plenty of memory.
>
>    The policy I'm using allows for multiple datastreams, no limits on jobs,
>and most schedules allow for an MPX of 2. I'm backing up ALL_LOCAL_DRIVES on
>all clients, and I'm not using any NEW_STREAM directives. I'm not seeing any
>errors on the media either.
>
>    Can anyone shed some light on what might be happening here? Am I looking
>at a drive that might be having some problems, or am I barking up the wrong
>tree, and it's something else entirely?
>
>    A small sample of iostat output covering the affected devices is below.
>
>Sample (extra disks removed from output):
>root@backup(pts/1):~# iostat -nx 1 100
>                    extended device statistics
>    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>    0.0    4.1    0.0  252.2  0.0  0.0    0.0    5.9   0   2 rmt/0
>    0.0    4.6    0.0  278.4  0.0  0.1    0.0   27.3   0  12 rmt/1
>
>    0.0    4.1    0.0  252.3  0.0  0.0    0.0    5.9   0   2 rmt/0
>    0.0    4.6    0.0  278.4  0.0  0.1    0.0   27.3   0  12 rmt/1
>
>    0.0   33.0    0.0 2076.4  0.0  0.2    0.0    5.8   0  19 rmt/0
>    0.0    2.0    0.0  125.8  0.0  1.0    0.0  490.0   0  98 rmt/1
>
>    0.0   38.0    0.0 2394.0  0.0  0.2    0.0    5.4   0  21 rmt/0
>    0.0    8.0    0.0  504.0  0.0  1.0    0.0  124.9   0 100 rmt/1
>
>    0.0   27.0    0.0 1701.1  0.0  0.2    0.0    6.5   0  17 rmt/0
>    0.0    2.0    0.0  126.0  0.0  1.0    0.0  499.9   0 100 rmt/1
>
>    0.0   33.0    0.0 2078.9  0.0  0.2    0.0    5.3   0  18 rmt/0
>    0.0    0.0    0.0    0.0  0.0  1.0    0.0    0.0   0 100 rmt/1
>
>    0.0   16.0    0.0 1008.0  0.0  0.1    0.0    6.2   0  10 rmt/0
>    0.0   13.0    0.0  819.0  0.0  0.6    0.0   48.4   0  63 rmt/1
>
>    0.0   40.0    0.0 2520.1  0.0  0.2    0.0    5.9   0  24 rmt/0
>    0.0    0.0    0.0    0.0  0.0  1.0    0.0    0.0   0 100 rmt/1
>
>    0.0   33.0    0.0 2078.9  0.0  0.2    0.0    5.3   0  18 rmt/0
>    0.0   10.0    0.0  630.0  0.0  1.0    0.0   99.9   0 100 rmt/1
>
There is a known issue with disk thrashing when the multiple data
streams policy option is used: if two or more streams read from the
same physical disk, the heads seek back and forth between them and
per-stream throughput collapses. The tape drive is then starved of
data and stops and repositions constantly, which matches the rmt/1
pattern you are seeing (100% busy with kw/s near zero).

I never run multiple data streams, as I cannot guarantee that the
files in each stream will be on different disks.

Unless you know every client has one physical disk per entry in the
policy's file list, try turning multiple data streams off. If you do
want to keep streaming, you can lay the streams out yourself, as in
the sketch below.
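
A minimal sketch of a backup selections list using the NEW_STREAM
directive Jeff mentioned (the mount points, and the assumption that
they sit on separate physical disks, are illustrative and not taken
from Jeff's actual setup):

    NEW_STREAM
    /
    /var
    NEW_STREAM
    /export/home

With "Allow multiple data streams" enabled, each NEW_STREAM directive
starts a new stream containing the paths that follow it, so each
stream can be confined to a single spindle instead of seeking against
the others.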

Thoughts,
Dave