Re: [ADSM-L] Raid 1 vs Raid 5
2010-08-11 10:01:24
After last night I don't think going with no RAID will be an option at
all. I've been trying to get TSM running for the last 3 weeks, and have
had 3 drive failures in my SAN. But I have to admit, our SAN is an old
FastT 200 with 136 GB drives. So we're trying to do this with stone
age equipment.
On 8/10/2010 12:36 AM, Roger Deschner wrote:
RAID5 is fine. But my strategy for handling disk failures is a bit
radical. I developed it a while back when I inherited a bunch of rather
unreliable 36 GB disks, which I only dared to run as RAID1. Some of you
will think I've now completely lost my mind, but this works well.
1. Set all TSM volumes in the array with the failed disk to readonly.
2. Migrate or MOVE DATA as fast as I can to the next stgpool, or even to
the same stgpool. Which I do will depend on what time of day it is and
what part of the daily cycle. Many times, disk failures happen when the
stgpools are nowhere close to full, so this may go very quickly.
3. Disassemble the array.
4. Build a new array, incorporating the spare or replacement disk.
5. Allocate new TSM volumes (I have a script for this) and place it all
back into service.
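
Steps 1 and 2 above could be scripted roughly as follows. This is only a sketch that builds the administrative command strings (UPDATE VOLUME ... ACCESS=READONLY and MOVE DATA ... STGPOOL=...) and does not run them. The volume names and the target pool name are hypothetical, and this is not the script mentioned in step 5.

```python
def failed_array_commands(volumes, target_pool):
    """Return TSM admin commands for steps 1 and 2, in execution order."""
    # Step 1: mark every volume on the degraded array read-only so that
    # no new backup data lands on it.
    readonly = [f"UPDATE VOLUME {vol} ACCESS=READONLY" for vol in volumes]
    # Step 2: drain each volume into the target storage pool.
    move = [f"MOVE DATA {vol} STGPOOL={target_pool}" for vol in volumes]
    return readonly + move


if __name__ == "__main__":
    # Hypothetical volume paths and pool name, for illustration only.
    vols = ["/tsm/array3/vol01.dsm", "/tsm/array3/vol02.dsm"]
    for cmd in failed_array_commands(vols, "TAPEPOOL"):
        print(cmd)
```

All read-only updates are emitted before any MOVE DATA so that nothing new is written to the array while it is being drained.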
I've now beaten a RAID5 resync by several hours, which narrows the
second-failure exposure, avoided its performance penalty, and the data
is much safer because it's been migrated to where it was headed to
anyway. I have found migration to be MUCH faster than a RAID rebuild -
even if the failure happens during the primary backup window. It's
faster regardless of the RAID level - 1, 5, or 10. The reason is that
over the course of a 24-hour day a disk stgpool will statistically
average less than half full. Get rid of that data quickly and you don't
have to endure a RAID resync at all.
The only downside of this procedure is that it requires my active
participation, so if I'm off camping in the mountains, RAID rebuild can
just be allowed to happen with its performance penalty.
Roger Deschner University of Illinois at Chicago rogerd AT uic DOT edu
======I have not lost my mind -- it is backed up on tape somewhere.=====
On Mon, 9 Aug 2010, Orville Lantto wrote:
The biggest factor in using RAID 5, and to a lesser extent RAID 0, is to get
the OS tuning and disk system tuning right. TSM writes 256 kB blocks for
storage pools. RAID 5 will work reasonably well if the stripe size on the disk
system is 256 kB. Also, make sure all OS tuning takes the large blocks into
account. The OS properties of the disk, Fibre card, and possibly the volume
group all have to allow 256 kB blocks to pass through without fragmentation.
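
A quick way to check a proposed array layout against the 256 kB block size is sketched below. It assumes that "stripe size" means the full data stripe (segment size per disk times the number of data disks) and that writes start on a stripe boundary. The disk counts and segment sizes are hypothetical.

```python
KIB = 1024
TSM_BLOCK = 256 * KIB  # storage pool block size quoted in the message


def full_stripe_bytes(segment_bytes, total_disks):
    """Data bytes in one RAID 5 stripe: one disk's worth goes to parity."""
    return segment_bytes * (total_disks - 1)


def is_full_stripe_write(segment_bytes, total_disks, write_bytes=TSM_BLOCK):
    """True if a write covers a whole number of stripes."""
    return write_bytes % full_stripe_bytes(segment_bytes, total_disks) == 0


if __name__ == "__main__":
    # 4 data disks + 1 parity, 64 KiB segments: stripe is 256 KiB.
    print(is_full_stripe_write(64 * KIB, 5))
    # Same disks, 128 KiB segments: stripe is 512 KiB, so a 256 KiB
    # write covers only half a stripe.
    print(is_full_stripe_write(128 * KIB, 5))
```

A write that covers only part of a stripe forces the controller to read old data or parity before it can write the new parity, which is the usual reason for the RAID 5 write penalty.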
Orville Lantto
-----Original Message-----
From: J. Pohlmann <jpohlmann AT SHAW DOT CA>
To: ADSM-L AT VM.MARIST DOT EDU
Sent: Mon, Aug 9, 2010 1:03 pm
Subject: Re: [ADSM-L] Raid 1 vs Raid 5
Another comment - RAID 5 gives you striping, so does RAID 0. Striping is
what gives you disk performance so that you can "feed" multiple tape drives
at a reasonable speed. For example, a TSM server with 4 LTO4 drives has an
achievable tape bandwidth somewhere around 300 MB/sec - your disk needs to
be able to deliver this bandwidth unless you want your tape drives to
slow down (speed match or stop/backhitch).
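
The arithmetic behind that example can be sketched as below. The 300 MB/sec and the 4 drives come from the paragraph above; the disk pool read rates in the usage lines are hypothetical.

```python
def per_drive_rate(total_mb_s, drives):
    """Average rate each drive needs when the total is shared evenly."""
    return total_mb_s / drives


def disk_keeps_up(disk_mb_s, drives, per_drive_mb_s):
    """True if the disk pool can feed every tape drive at the given rate."""
    return disk_mb_s >= drives * per_drive_mb_s


if __name__ == "__main__":
    rate = per_drive_rate(300, 4)        # 75.0 MB/sec per drive
    print(rate)
    print(disk_keeps_up(200, 4, rate))   # False: drives would have to slow down
    print(disk_keeps_up(320, 4, rate))   # True
```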
As for the impact of a drive failure - I also prefer RAID 5. Depending on
the OS platform there is more work when you have to recover file systems.
Joerg Pohlmann
250-585-3711
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Ochs, Duane
Sent: Monday, August 09, 2010 09:37
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Raid 1 vs Raid 5
I use raid-5 for all diskpools.
Although I don't agree with no raid, in some instances it is less of an
issue than others.
A few of my pools use caching for some of our more popular servers that get
restores, as well as for our daily Exchange and DB backups.
I can't think of a single instance where calling a group back and saying
"we need you to resend a couple of servers because a disk died on the
backup server" would have gone over well. I'm not saying that it is a huge
issue, but from the point of view of the end users and upper management,
the idea that we, the retention team, have not protected ourselves from a
disk failure in order to save a TB or so of space would be very difficult
to swallow.
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Kelly Lipp
Sent: Monday, August 09, 2010 11:24 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Raid 1 vs Raid 5
I'll amplify what Skylar said: if your goal for this disk pool is short term
storage then I probably wouldn't use any RAID protection as the data will be
backed up to tape and then migrated to tape again. And as Skylar said,
worst case, the client will send it again if it somehow escapes.
Conserve space: don't RAID...
Kelly J. Lipp
O: 719-531-5574 C: 719-238-5239
kellyjlipp AT yahoo DOT com
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Skylar Thompson
Sent: Monday, August 09, 2010 9:33 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Raid 1 vs Raid 5
Do you have tape in your primary storage hierarchy? If so, remember that
even if part of your disk pool fails, you only lose access to the data that
are on the failed volumes. You can then regenerate that data by either
running another backup from the nodes that had backed up to that volume (if
the backup to the copy pool hasn't happened yet) or from the copy pool. New
backups can continue against the disk pool volumes that are still available,
or can be cut through directly to tape if the entire pool is unavailable.
On 08/09/10 08:23, Dana Holland wrote:
Does anyone have opinions about setting up storage pools as Raid 1 as
opposed to Raid 5? We have a very limited amount of disk space at the
moment and don't know when we'll get approval to buy more. At the time
we first started planning to implement TSM, we purchased what we
thought would be plenty of storage. But, that was 4 years ago - and
our usage has grown. Now, if I choose Raid 1, I barely have enough to
create a primary and copy storage pool for one of our servers. And
that isn't allowing for any growth at all. And I'm not sure how much
additional space incremental backups would take. I know Raid 5 would
give me more storage space, but I've also read that it's harder to
recover from if there's a disk failure (read this on a TSM site
somewhere). So, I'm wondering what some of you are using?
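
The space trade-off in the question can be put in numbers with a small sketch. It uses the standard usable-capacity rules (RAID 1 keeps half the disks, RAID 5 loses one disk to parity) and ignores hot spares and formatting overhead. The disk count is hypothetical; the 136 GB size is borrowed from the reply at the top of the thread.

```python
def usable_gb(level, disks, disk_gb):
    """Usable capacity in GB for a single RAID 1 or RAID 5 group."""
    if level == 1:
        if disks < 2 or disks % 2:
            raise ValueError("RAID 1 needs an even number of disks")
        return disks // 2 * disk_gb   # every disk is mirrored
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb  # one disk's worth of parity
    raise ValueError("only RAID 1 and RAID 5 are handled here")


if __name__ == "__main__":
    # Hypothetical shelf of 6 disks at 136 GB each.
    print(usable_gb(1, 6, 136))  # 408
    print(usable_gb(5, 6, 136))  # 680
```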
__________ Information from ESET NOD32 Antivirus, version of virus
signature database 5352 (20100809) __________
The message was checked by ESET NOD32 Antivirus.
http://www.eset.com
--
-- Skylar Thompson (skylar2 AT u.washington DOT edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S048, (206)-685-7354
-- University of Washington School of Medicine