ADSM-L

Re: [ADSM-L] I'm getting new disk storage.

2009-08-06 18:42:27
Subject: Re: [ADSM-L] I'm getting new disk storage.
From: Kelly Lipp <lipp AT STORSERVER DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 6 Aug 2009 16:40:46 -0600
That is correct: you can lose a couple of storage pool volumes (on a failed 
disk) and still go on.  The operation in progress writing to those volumes will 
stall/fail (I don't know which, but I'm guessing retries probably save your 
butt).

Kelly Lipp
CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777 x7105
www.storserver.com


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Huebner,Andy,FORT WORTH,IT
Sent: Thursday, August 06, 2009 3:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] I'm getting new disk storage.

Thank you for the reply; I should have been more precise with the disk failure 
question. 

Doesn't a disk failure affect the backup run, or are your pools on many 
physical disks so the loss of one is not fatal to the process?

I understand the data loss part; you have a short window of time for a low 
probability set of disk failures.


Thanks again.

Andy Huebner

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Kelly Lipp
Sent: Thursday, August 06, 2009 2:52 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] I'm getting new disk storage.

Andy,

In your case, you probably have a good bit of cache in the controller, your 
RAID sets are relatively small, and you're using FC disks, which are more 
reliable, tolerant and faster than SATA, so all is good.  My data comes from the 
SATA 7.2K world with six or eight drives per RAID5.  Remember, our goal was to 
maximize capacity with these drives rather than performance.  In your case, you 
chose performance as most of your data is being stored long term somewhere else 
(cheaper disk or tape).

Failure in the disk pool is the question.  Since the data is copied relatively 
soon after arrival (both to the copy pool and to the migration pool) it is not 
at risk for very long.  If you do have a failure of a drive you probably have 
the data in the other pools, or still on the client. My rationale is as 
follows: if you lose a cachepool disk and the client loses data then you are in 
trouble. That's two bad things happening to one good person, and what are the 
chances?  If you lose the cachepool then the next time the client backs up the 
data will move again.

With the more reliable SAS/SCSI/FC drives, failures happen very infrequently.  

Are there holes in my rationale?  Of course.  Have I been bitten by them? No.

Long term storage is different.

Kelly Lipp
CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777 x7105
www.storserver.com


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Huebner,Andy,FORT WORTH,IT
Sent: Thursday, August 06, 2009 9:40 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] I'm getting new disk storage.

I must be missing something.  I cannot saturate my 15 FC 10k disks (3x 4+1 
RAID5) with 2 Gb Ethernet and 110 concurrent clients (330+ sessions).  The 
Ethernet, on the other hand, is saturated.
Out of curiosity, what happens with a disk failure?
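
For scale, here is a rough back-of-the-envelope sketch of the numbers above. It assumes the Ethernet figure means 2 Gbit/s of aggregate bandwidth and ignores protocol overhead; the function names are made up for illustration. It only shows how little each data spindle has to absorb when the network is the bottleneck.

```python
def link_mb_per_s(gbit_per_s):
    """Raw link rate in MB/s (decimal), ignoring protocol overhead."""
    return gbit_per_s * 1000 / 8


def data_disks(raid_sets, data_per_set):
    """Data-bearing spindles; 3x 4+1 RAID5 gives 3 * 4."""
    return raid_sets * data_per_set


link = link_mb_per_s(2.0)   # assumption: 2 Gbit/s aggregate Ethernet
disks = data_disks(3, 4)    # three 4+1 RAID5 sets
sessions = 330              # session count quoted in the message

print(f"link ceiling:  {link:.0f} MB/s")
print(f"per data disk: {link / disks:.1f} MB/s")
print(f"per session:   {link / sessions:.2f} MB/s")
```

Under those assumptions the link tops out at 250 MB/s, which is roughly 21 MB/s per data disk, so it is plausible that the disks never become the limit.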

Andy Huebner

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Kelly Lipp
Sent: Thursday, August 06, 2009 10:10 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] I'm getting new disk storage.

It seems the perfect environment for the JBOD I suggested as you do not want 
all those sessions banging a RAID5 array.

Of course workflow is an issue, but moving a TB disk to disk or disk to 
tape is a couple-of-hours process if you work it correctly into your daily 
processing and probably isn't an issue for you.  Ideally you wouldn't need to 
move it again, but you really can't have that many sessions banging RAID5 so 
what else do you do?
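
As a sanity check on the "couple of hours" figure, a minimal sketch of the sustained rate it implies. It assumes a decimal terabyte (1,000,000 MB) and a steady transfer; the function name is hypothetical.

```python
def required_mb_per_s(terabytes, hours):
    """Sustained MB/s needed to move `terabytes` (decimal TB) in `hours`."""
    return terabytes * 1_000_000 / (hours * 3600)


for hours in (2, 3):
    rate = required_mb_per_s(1, hours)
    print(f"1 TB in {hours} h needs about {rate:.0f} MB/s sustained")
```

Two hours works out to roughly 139 MB/s and three hours to roughly 93 MB/s, so the estimate depends on the source pool and the target being able to hold a rate in that range.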

I've arrived at this approach pragmatically and through trial and error.  I 
know that I can easily run fifty or sixty (and perhaps more) sessions to 12 SAS 
15K drives (use two disk pool volumes per drive to saturate each drive).
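
The two-volumes-per-drive layout can be sketched as below. The drive and volume names are purely illustrative (not TSM syntax or real paths), and the even spread of sessions across volumes is an assumption made for the arithmetic, not a statement about how the server places sessions.

```python
def volume_layout(drives, volumes_per_drive=2):
    """Return hypothetical (drive, volume) name pairs, a fixed number per drive."""
    return [
        (f"drive{d:02d}", f"drive{d:02d}_vol{v}")
        for d in range(1, drives + 1)
        for v in range(1, volumes_per_drive + 1)
    ]


layout = volume_layout(12)  # 12 SAS drives, two disk pool volumes each
print(len(layout), "disk pool volumes")
for sessions in (50, 60):
    # Assumes sessions spread evenly, which is an idealization.
    print(f"{sessions} sessions -> {sessions / len(layout):.2f} per volume")
```

With 12 drives that gives 24 volumes, so fifty or sixty sessions average out to between two and three per volume.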

Kelly Lipp
CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777 x7105
www.storserver.com


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Michael Green
Sent: Thursday, August 06, 2009 4:04 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] I'm getting new disk storage.

On Tue, Aug 4, 2009 at 7:41 PM, Kelly Lipp<lipp AT storserver DOT com> wrote:
> You are exactly correct: modeling what will be rather than what is can be 
> tricky.  The problem really boils down to not having enough data to really 
> play with it adequately.
>
> I can tell you from experience on probably 200 TSM servers (Windows 2003 
> based, which is just fine, all of you AIX heads!) that your overall scheme is 
> good.  One addition: I like to run initial backups to SAS drives using a 
> storage pool of device class disk.  Don't protect those drives; simply use 
> JBOD, as you will migrate the data out of them fairly soon, either to tape or 
> file device class disk.  Use RAID5/6 for your SATA drives (we generally use 
> 6-drive RAID5 sets in 12-bay shelves and 8-drive sets in 16-bay shelves) and 
> keep a smallish number of simultaneous backups to pools there, as large 
> numbers of backups thrash the RAID set something awful.
>

Your suggestion of using JBOD instead of RAID for DISKCLASS data is
intriguing... I accept the reasoning behind not protecting these
drives by RAID. But how does it affect the workflow performance-wise?
What is the rationale behind this? Economy (no wasted drives for
parity), performance gains?
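
On the economy side of that question, a small sketch of the capacity that RAID5 gives up to parity for the set sizes mentioned in this thread. It covers capacity only and says nothing about performance; the function name is hypothetical.

```python
def raid5_usable_fraction(drives_in_set):
    """One drive's worth of capacity per RAID5 set goes to parity."""
    if drives_in_set < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drives_in_set - 1) / drives_in_set


# 5 = a 4+1 set, 6 and 8 = the set sizes quoted above; JBOD keeps 100%.
for n in (5, 6, 8):
    print(f"{n}-drive RAID5: {raid5_usable_fraction(n):.1%} usable (JBOD: 100%)")
```

So a 6-drive set keeps about 83% of raw capacity and an 8-drive set 87.5%, which means the parity saving from JBOD is real but modest.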


>
> How much data do you back up today?  How many clients simultaneously? If you 
> want to take this private, give me a call.  I have a pretty good idea how to 
> size this if I have some more information.  Perhaps I can save you a testing 
> step (testing costs money that you could spend on additional storage...).

That particular server backs up 700-1000 GB (~400K+ affected objects)
coming from just under 100 nodes nightly.
Thanks for offering me a phone consultation :) I won't bother you for
the time being, but maybe I will at a later time :)

>
> Thanks,
>
> Kelly Lipp
> CTO
> STORServer, Inc.


This e-mail (including any attachments) is confidential and may be legally 
privileged. If you are not an intended recipient or an authorized 
representative of an intended recipient, you are prohibited from using, copying 
or distributing the information in this e-mail or its attachments. If you have 
received this e-mail in error, please notify the sender immediately by return 
e-mail and delete all copies of this message and any attachments.
Thank you.

