Bacula-users

Re: [Bacula-users] Time for change

2008-12-18 04:21:11
Subject: Re: [Bacula-users] Time for change
From: Arno Lehmann <al AT its-lehmann DOT de>
To: undisclosed-recipients:;
Date: Thu, 18 Dec 2008 10:15:57 +0100
Hi,

18.12.2008 06:50, Jesper Krogh wrote:
> Alan Brown wrote:
>> Jesper Krogh wrote:
>>>> I'm running spooling on a 4 drive software raid0 quite happily on a 4GB
>>>> 3GHz P4D machine. The limiting factors are disk head seek time(*) when
>>>> running concurrent backups to 2 LTO2 drives and available SATA ports.
>>>> Because of that I'm considering dropping in solid state disks.
>>>>     
>>> I have yet to see a reasonably priced SSD that can deliver 
>>> around 100MB/s both ways at the same time.
>>>   
>> There aren't any mechanical disks which can do it either.
>>
>> Which is why I'm not trying to do that - replacing 4 RAID0 mechanical 
>> disks with 4 SSDs will provide similar sustained throughput to the 
>> mechanical RAID0, but provide _much_ better performance for anything 
>> where the mechanical disks had head seeking involved - such as multiple 
>> simultaneous input/output streams to LTO drives.
>>
>>> http://www.slashgear.com/samsung-64gb-ssd-performance-benchmarks-278717/
>>>
>>>   
>> Make sure you compare apples with something remotely looking like 
>> apples. The ONLY SSDs which are suitable for this kind of use are SLCs, 
>> not MLCs.
> 
> OK. I haven't spent enough time in that area.

In short, MLCs are cheaper but slower. There are claims that they are less 
reliable, i.e. they tolerate fewer rewrite cycles.

>>> I have beefed up my director with sufficient amount of memory and 
>>> mounted it as a "ramdisk" for spooling. That doesn't impose any 
>>> limitations on the 2 LTO3 drives attached.
>>>   
>> How much do you regard as "sufficient"?
>>
>> 100-200GB of RAM and systems capable of addressing that amount of memory 
>> are still far more expensive than a stack of flash drives, else I'd use 
>> them.
> 
> But do you need to spool a complete tape? In order to avoid doing "evil" 
> stuff to your tape drive, much less is sufficient.

Well, Alan probably runs jobs spanning more than one tape. Also, he's 
running jobs concurrently. Now, if he wants to avoid interleaving of 
different jobs, he needs to spool complete jobs, and probably several 
jobs of more than LTO3 capacity - thus the need for a rather large 
spool space.

>> My concern isn't just backup run time.
> 
> So you'd like to spool a complete job? What's your average job size? (Mine 
> is less than 8GB.) If it's larger, we just need a period of despooling 
> (I'd love to have concurrent spooling/despooling in Bacula). Currently I 
> use at most 32GB of spool space, with a job concurrency of 4 and 2 
> tape drives.

I believe Alan is handling data sets a bit larger...
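
To put rough numbers on the spool sizing discussed above, here is a minimal sketch. It assumes that avoiding interleaving means every concurrently running job has to fit in the spool area in full. The function name and the "large site" figures are made up for illustration (they are not Alan's real numbers); 400GB is the native LTO3 capacity.

```python
# Rough spool sizing: to keep jobs from interleaving on tape, every job
# running at the same time must fit in the spool area in full.

def spool_needed_gb(concurrent_jobs: int, job_size_gb: float) -> float:
    """Spool space (GB) needed to hold all concurrent jobs completely."""
    return concurrent_jobs * job_size_gb

LTO3_NATIVE_GB = 400  # native (uncompressed) capacity of one LTO3 tape

# Jesper's figures from this thread: 4 concurrent jobs, under 8GB each.
small = spool_needed_gb(4, 8)

# Hypothetical large site: 4 concurrent jobs, each the size of one LTO3 tape.
large = spool_needed_gb(4, LTO3_NATIVE_GB)

print(f"small jobs:      {small:.0f} GB of spool space")
print(f"tape-sized jobs: {large:.0f} GB of spool space")
```

With Jesper's numbers this matches the 32GB he mentions; with tape-sized jobs the same reasoning lands well beyond what fits in RAM on common hardware.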

> When doing large backups (full+archive) I mostly have one (or two)
> drives in action at the same time while spooling to disk with 2 (or 3)
> threads at the same time. The LTO3 drives far outperform our network
> speed (1Gbit). Transfers to tape range from 60MB/s to 100MB/s (and I
> unfortunately have no idea why they vary that much).
> 
> In total numbers we're around 25TB to disk/month with monthly full and
> daily incrementals.
> 
> As for job run time, it's my impression that spool space only 
> speeds up incremental/differential backups.

Depends - the important thing is that spooling allows multiple 
concurrent jobs with little or no interleaving.

>> Restore times are also important and having a tape read back 1GB, then 
>> seek, then pull back another 1GB (or even 10GB) is a significant 
>> penalty over reading larger blocks when worst-case 75TB+ restores are 
>> considered (25-60 days on 2 drive LTO2, depending on the directory 
>> structures being restored.)
> 
> What's the time-consuming part in this? Seeking on tapes? Neither SSDs 
> nor memory will change that.

Moving tapes, loading and unloading times. Seek times add to that, but 
are not that bad with LTO.
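
For a feeling of how much of those 25-60 days is overhead, here is a back-of-the-envelope sketch. It assumes the native LTO2 rate of 40MB/s per drive, no compression, decimal units, and both drives streaming without pause; the function names are just for illustration.

```python
# Lower bound for a 75TB restore from 2 LTO2 drives streaming at full speed,
# and the effective throughput implied by a 25-60 day restore.
LTO2_NATIVE_MB_S = 40      # native LTO2 transfer rate, no compression
DRIVES = 2
RESTORE_TB = 75
SECONDS_PER_DAY = 86_400

def restore_days(total_tb: float, mb_per_s_total: float) -> float:
    """Days needed to move total_tb at a given aggregate rate (decimal units)."""
    total_mb = total_tb * 1_000_000
    return total_mb / mb_per_s_total / SECONDS_PER_DAY

def effective_mb_s(total_tb: float, days: float) -> float:
    """Aggregate MB/s actually achieved if total_tb takes the given days."""
    return total_tb * 1_000_000 / (days * SECONDS_PER_DAY)

best_case = restore_days(RESTORE_TB, DRIVES * LTO2_NATIVE_MB_S)
print(f"pure streaming: {best_case:.1f} days")

for days in (25, 60):
    rate = effective_mb_s(RESTORE_TB, days)
    print(f"{days} days -> {rate:.1f} MB/s across both drives")
```

Under these assumptions pure streaming would take about 11 days, so a 25-60 day restore means the drives average roughly 14 to 35 MB/s combined. The rest goes to volume handling, seeking and whatever else keeps the drives from streaming.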

> AFAIK the spooling area is only used when 
> going TO tape, not FROM tape.

Right, but if you manage to have all jobs in one continuous block on 
tape, you minimize the number of volumes needed per job, and thus save 
much of the volume handling time.
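
A toy model of that effect, as a sketch only: it is not how Bacula actually lays out blocks, and the job sizes and volume capacity are made-up numbers. It writes three equally sized jobs either round-robin (no spooling) or one after the other (each job fully spooled first) and counts how many volumes each job ends up on.

```python
# Toy model: count the volumes each job touches when blocks are written
# interleaved versus as one continuous run per job.

def volumes_per_job(write_order, volume_blocks):
    """Number of distinct volumes holding blocks of each job.

    write_order lists job names, one entry per block, in tape order."""
    volumes = {}
    for position, job in enumerate(write_order):
        volumes.setdefault(job, set()).add(position // volume_blocks)
    return {job: len(vols) for job, vols in volumes.items()}

def interleaved(jobs):
    """Round-robin: one block per running job at a time (no spooling)."""
    remaining = {job: n for job, n in jobs.items() if n > 0}
    order = []
    while remaining:
        for job in list(remaining):
            order.append(job)
            remaining[job] -= 1
            if remaining[job] == 0:
                del remaining[job]
    return order

def spooled(jobs):
    """Each job is written as one continuous run (whole job spooled)."""
    return [job for job, blocks in jobs.items() for _ in range(blocks)]

jobs = {"A": 6, "B": 6, "C": 6}   # hypothetical job sizes, in blocks
VOLUME_BLOCKS = 6                 # hypothetical volume capacity, in blocks

print("interleaved:", volumes_per_job(interleaved(jobs), VOLUME_BLOCKS))
print("spooled:    ", volumes_per_job(spooled(jobs), VOLUME_BLOCKS))
```

In this model every interleaved job is spread over all three volumes, while each spooled job sits on exactly one, so restoring a single job needs one volume mounted instead of three.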

>>> And spooling doesn't need any form of persistence, so it's fine that it's 
>>> gone after a reboot.
>> Indeed.
>>
>> If it was practical I'd use ramdisks. Right now it's not. Apart from the 
>> cost factor there is very little hardware which can address more than 
>> 128GB of DRAM. There are RAM arrays which are set up to operate as F/O 
>> scsi devices, but these are currently "silly money" as they're marketed 
>> at the world of high end, high cost databases.
> 
> Again, I assume we're talking about spooling space, so try to think 
> about whether you need that much.

I believe Alan thought about it (and I hope my reasoning above is 
correct :-)

>> In 12 months' time that may change, RAM is always falling in price - but 
>> Flash drive pricing is falling faster, performance/durability is rising 
>> at the same time and there isn't the same issue with massive address 
>> ranges as it just looks like more disk, vs having to change out entire 
>> servers at $20k a time if RAM limits are reached.
>>
>> I'm not just looking at the issue of my current setup. Projects are 
>> already pencilled onsite which will increase storage demands by a factor 
>> of 20 from the current size within 12 months and I have to try and be 
>> ready to back that data up.

Now that's gonna be interesting... need any help? ;-)

> Can you give some numbers, so we have a feeling about the sizes you talk 
> about?

Be advised to sit down before you read his reply :-)
Over time, Alan revealed a bit about the amount of data he's storing, 
and I think you can believe him when he claims to be doing 75TB restores...

Arno

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de
