Subject: Re: [Bacula-users] maximum client file size
From: Devin Reade <gdr AT gno DOT org>
To: bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Thu, 21 May 2015 11:51:31 -0600
--On Thursday, May 21, 2015 06:50:31 PM +0200 "Radosław Korzeniewski" 
<radoslaw AT korzeniewski DOT net> wrote:

> Why do you need to use a 500MB volume size? These days it is like
> distributing movies on floppies instead of DVD/BR.

Some of my older deployments had data patterns where incrementals
were typically small but could at times be large, and the total space
available in all pools was ... strained.  By using a single scratch
pool, the incremental/differential pools could normally sit at n
volumes but balloon out to 10n volumes for a while and then shrink
back to n once the retention periods were over.  In that case, a
500MB volume size seemed to be a good compromise in a situation
where total pool size was limited.

And since 500MB volumes never caused me problems, I never moved
away from them.
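
Roughly, that kind of setup looks like this in bacula-dir.conf (a
minimal sketch; the pool names, retention, and limits below are
illustrative, not taken from my actual configuration):

    Pool {
      Name = Incr-Pool
      Pool Type = Backup
      Maximum Volume Bytes = 500M   # small volumes; pool grows in fine steps
      Volume Retention = 30 days
      AutoPrune = yes
      Recycle = yes
      Scratch Pool = Scratch        # borrow volumes from the shared scratch pool
      Recycle Pool = Scratch        # return them there once retention expires
      Label Format = "Incr-"
    }

    # The shared pool that the backup pools draw from and return to:
    Pool {
      Name = Scratch
      Pool Type = Backup
    }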

There's probably a certain amount of historical cruft behind the
500MB number, too.  I've got a script that I've used for years when
I want an impromptu archive of a system that (among other things)
does a dump+split using 500MB volumes.  Although I almost always
archive to removable disks now, that size allowed me to use smaller
hard disks, DVDs, or, if necessary, CDs as the media to hold the
volumes.  Years ago I was writing Bacula volumes to DVDs for
offsites, so the script's volume size probably influenced the Bacula
volume size at the time.
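
The script itself isn't anything magical; stripped down to its core,
the dump+split step looks roughly like this (the paths, filesystem,
and naming here are placeholders, not the real script):

    #!/bin/sh
    # Level-0 dump of a filesystem, split into 500MB chunks so the
    # pieces can land on small disks, DVDs, or CDs as needed.
    FS=/home
    OUT=/mnt/archive/home-$(date +%Y%m%d)
    dump -0 -f - "$FS" | split -b 500M - "$OUT.dump."

    # Reassemble and restore with something like:
    #   cat /mnt/archive/home-YYYYMMDD.dump.* | restore -rf -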

> What benefit do you want to achieve with 500MB volume size?
[...]
> I see a performance problem when I have 500k files in a single directory
> instead of 1. With this kind of setup I have about 10k volumes. All is
> working without a problem.

Do keep in mind that I was not proposing 500MB volumes for the
system in question; it was a comment on previous usage.  However,
for virtual cartridges, even with 500MB volumes, the most you're
going to see on the largest media commonly available (6TB) is
about 12k files per cartridge (6TB / 500MB = ~12,000).

Yes, large numbers of files in a directory can be a performance hit,
but my 512k volume count comment was referring to the catalog, not
files in a single directory.  (Think of the set of volumes being
spread across various cartridges.)

>> My gut is saying to go with 2GB volume sizes, but I'm curious.
>>
> Are you afraid of large files (volumes)?  - it was a joke :)

When it comes to backups and DR, I tend to be conservative.  To
give a serious answer to your joke, it's because of that magical
32-bit barrier (problems with which are, luckily, rarer than they
used to be).  It's kind of like using spaces in filenames on a
UNIX system.  Yes, in theory it shouldn't be a problem.  But you
never know when some SOB is going to be lazy writing a shell
script that can't handle them.

Devin

