[Veritas-bu] Backups slow to a crawl

2005-03-25 11:12:03
Subject: [Veritas-bu] Backups slow to a crawl
From: Bill_Jorgensen AT csgsystems DOT com (Jorgensen, Bill)
Date: Fri, 25 Mar 2005 09:12:03 -0700
Jeff:

Just a thought... I am not sure I have thoroughly read this thread so
forgive me if I rehash stuff.

Are your drives direct-attached via SCSI? If so, have you investigated
SCSI cable problems? If the backup server is a Sun, take a look in
/var/adm/messages. Look for parity errors or statements about a reduced
transfer rate. If you see things like that, suspect the cable. This one
is tough.
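
For anyone repeating this check, here is a minimal sketch of the log
search. The function name and the two search patterns are assumptions
based on the wording above ("parity errors", "reduced transfer rate");
the exact text of the driver messages may differ on your system, so
widen the patterns if nothing turns up.

```shell
# Hypothetical helper: print log lines that mention a parity error or a
# change in transfer rate. Patterns are guesses at the wording, not an
# exhaustive list of SCSI driver messages.
scan_scsi_messages() {
    logfile="${1:-/var/adm/messages}"
    grep -i -E 'parity error|transfer rate' "$logfile"
}

# Usage: scan the current log, or pass a rotated log explicitly.
#   scan_scsi_messages
#   scan_scsi_messages /var/adm/messages.0
```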

Good luck,

Bill

--------------------------------------------------------
     Bill Jorgensen
     CSG Systems, Inc.
     (w) 303.200.3282
     (p) 303.947.9733
--------------------------------------------------------
     UNIX... Spoken with hushed and
     reverent tones.
--------------------------------------------------------

-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Jeff
McCombs
Sent: Friday, March 25, 2005 8:42 AM
To: Veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Backups slow to a crawl

Gang,

    Ok. So I took Darren's suggestion and 'downed' the drive in NBU,
drove out to our facility with a new, unused tape and slapped it into
the drive.

I hopped over to my home directory where I've got a good 5G or so of
data with a good mix of file sizes and types and ran the following:

tar cf - . | compress | dd obs=1024k of=/dev/rmt/1 conv=sync

And watched the output of iostat -xtcn, with samples being taken every
second.

And everything looked good for the first, oh.. 5 minutes or so. But the
longer that the stream to tape ran, the worse the performance started to
get. After 5 minutes I began to see the busy:kw/s ratio drop. Busy went
from 4-10% with kw/s around 3 MB/sec when things were good, to 90-100%
with kw/s of 100-200k/sec. The longer it ran, the worse it got.
Eventually, 6 out of 10 samples were reading 100% busy and a kw/s of 0.
The other 4 samples would range from busy @ 89-99%, with kw/s down into
the sub-50k/sec range.
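
The pattern described above (high busy, almost no kw/s) is easy to miss
when samples scroll past once a second. Here is a rough sketch of a
filter that prints only the bad samples. The function name and the
default thresholds (90% busy, 200 kw/s) are assumptions taken from the
figures in this post. It also assumes 'iostat -xn' style output, with a
header line naming the kw/s and %b columns and the device name in the
last field.

```shell
# Hypothetical filter: read iostat -xn style text on stdin and print the
# samples for one device where %b is high but kw/s is low.
# Column positions come from the header line rather than being hard-coded.
flag_stalls() {
    dev="$1"; min_busy="${2:-90}"; max_kw="${3:-200}"
    awk -v dev="$dev" -v min_busy="$min_busy" -v max_kw="$max_kw" '
        /kw\/s/ && /%b/ {
            # Header line: remember which fields hold kw/s and %b.
            for (i = 1; i <= NF; i++) {
                if ($i == "kw/s") kw = i
                if ($i == "%b")   busy = i
            }
            next
        }
        kw && busy && $NF == dev && $busy + 0 >= min_busy && $kw + 0 <= max_kw {
            printf "%s busy=%s%% kw/s=%s\n", dev, $busy, $kw
        }
    '
}

# Usage while the tar | dd test is running:
#   iostat -xn 1 | flag_stalls rmt/1
```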

I also checked the output of 'iostat -xtcne' during this run, and while
there were soft and hard errors in the counters, these never actually
increased. 'iostat -nE' provided the following:

rmt/0           Soft Errors: 18 Hard Errors: 0 Transport Errors: 0
Vendor: QUANTUM  Product: DLT8000          Revision: 0250 Serial No: ?P
rmt/1           Soft Errors: 56 Hard Errors: 2 Transport Errors: 2
Vendor: QUANTUM  Product: DLT8000          Revision: 0250 Serial No: ?P

Again though, after performing more tests, I couldn't get these counters
to increase.
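
One way to confirm that the counters really stay flat is to snapshot
them before and after a test run and diff the two. This is a small
sketch, assuming the 'iostat -nE' summary lines look like the ones
quoted above. The function name and the snapshot file names are made up
for illustration.

```shell
# Hypothetical helper: reduce iostat -nE style text on stdin to
# "device soft hard transport" lines so two snapshots can be diffed.
error_counts() {
    awk '/Soft Errors:/ && /Hard Errors:/ && /Transport Errors:/ {
        print $1, $4, $7, $10
    }'
}

# Usage:
#   iostat -nE | error_counts > before.txt
#   ... run the tar | dd test ...
#   iostat -nE | error_counts > after.txt
#   diff before.txt after.txt
```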

I did get a response from Veritas. The tech on the phone suggested I
muck with the buffers. Per his instructions, I set NET_BUFFER_SZ to
131072, NUMBER_DATA_BUFFERS to 32, and SIZE_DATA_BUFFERS to 131072.
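
For reference, here is a sketch of how those three values might be
applied. The post doesn't say where the settings live; the paths below
are the customary touch-file locations on a UNIX media server and are an
assumption here, as is the function name. Check them against the
documentation for your release before using this.

```shell
# Sketch only: write the three values quoted above into touch files.
# The default root and the file locations are assumptions; pass a
# different root directory as the first argument to try this out safely.
set_nbu_buffers() {
    root="${1:-/usr/openv/netbackup}"
    mkdir -p "$root/db/config" || return 1
    echo 131072 > "$root/NET_BUFFER_SZ"
    echo 32     > "$root/db/config/NUMBER_DATA_BUFFERS"
    echo 131072 > "$root/db/config/SIZE_DATA_BUFFERS"
    # Buffer memory for one drive = number of buffers * buffer size.
    echo "bytes per drive: $((32 * 131072))"
}
```

With 32 buffers of 131072 bytes each, that works out to 4194304 bytes
(4 MB) of buffer space per drive.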

I ran a full backup of our system dedicated to managing Checkpoint
firewalls (Sun V100, approx 8GB of data, 100 Mb FDX network on the same
3750 switch & VLAN as the backup system), and performance was actually
worse on the first drive! Both drives sat at approximately 512k/sec,
though busy was in the 4-10% range for the duration of the backup.

Aargh. If this was a Windows system, I'd be blaming drivers.. I checked
cables, cleaned and reseated the drives, made sure the SCSI controller
card was seated properly, checked termination.. Guess I'll call Overland
and have them get me a new drive.

Many thanks to those of you who have helped me out already. It's much
appreciated!

-jeff

On 3/24/05 11:14 AM, "Darren Dunham" <ddunham AT taos DOT com> wrote:
> 
> I didn't reply initially because it appeared that you had fixed it.
> 
> I too would be very suspicious of those iostat figures.  To me the
> high busy alongside very low throughput screams drive problems.
> Multiplexing shouldn't be affecting that.
> 
> If at all possible, I'd try to replicate the error by doing some drive
> testing outside of NBU.
> 
> Down the drive, load a scratch tape, then get busy with 'dd' or
> something.  Can you make it behave similarly?  If so, I'd make it my
> number one suspect.

-- 
Jeff McCombs                 |                               NIC, Inc
Systems Administrator        |                  http://www.nicusa.com
jeffm AT nicusa DOT com      |                           NASDAQ: EGOV
Phone: (703) 909-3277        | "NIC - the People Behind eGovernment"
--
    "So we went to Atari and said, 'Hey, we've got this amazing thing,
     even built with some of your parts, and what do you think about
     funding us? Or we'll give it to you. We just want to do it. Pay
     our salary, we'll come work for you.' And they said 'No.' So
     then we went to Hewlett-Packard, and they said 'Hey, we don't
     need you. You haven't got enough college yet.'"
                    - Steve Jobs, cofounder of Apple Computer



_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu

