Hi!
>> ... Unfortunately, I'm
>> not exaggerating to say that we have users with file profiles that
>> consist of 35million files (or more) with average file sizes less than
>> 10kb. The user might be long-gone, and eligible for deletion, but it
>> takes forever to spider that filesystem, and occasionally causes
>> problems like directory cache thrashing, etc.
>
> "I don't have a solution but I admire the problem". How long would
> bacula take to stat 35M files?

For us: the incremental currently takes 19 hours for 47M files on an
otherwise idle 44-disk hardware RAID10. Yes, this is slowly starting to
become a problem.

The main issues I see are:
- The fileserver doesn't really have enough memory to keep all metadata
cached (that is our own problem; I'm working on migrating to newer
hardware).
- The FD seems to run in a single thread, so it only does one stat at a
time. This means that effectively only one or two disks from the array
are in use at any given time. Is there a way to parallelize the FD?
(Something simpler than one job per user homedir; that would be a
management nightmare.)
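
To illustrate the second point, here is a rough Python sketch of what
issuing stat calls concurrently could look like. It is not how the Bacula
FD works and not something Bacula provides; the names stat_dir and
parallel_stat and the default of 20 workers are made up for the example.
The idea is only that several stat calls in flight at once give the array
a chance to keep more than one or two disks busy.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stat_dir(path):
    """Stat every entry in one directory; return (non_dir_count, subdirs)."""
    count, subdirs = 0, []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    # lstat-style call: do not follow symlinks
                    entry.stat(follow_symlinks=False)
                    if entry.is_dir(follow_symlinks=False):
                        subdirs.append(entry.path)
                    else:
                        count += 1
                except OSError:
                    pass  # entry vanished or is unreadable; skip it
    except OSError:
        pass  # directory vanished or is unreadable; skip it
    return count, subdirs


def parallel_stat(root, workers=20):
    """Walk the tree level by level, scanning each level's directories
    in a thread pool. Returns the number of non-directory entries seen."""
    total = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = [root]
        while level:
            next_level = []
            for count, subdirs in pool.map(stat_dir, level):
                total += count
                next_level.extend(subdirs)
            level = next_level
    return total
```

Threads are enough here because the time is spent waiting on the disks,
not on the CPU. Whether this actually helps depends on the filesystem and
on how much metadata is already cached.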

We had the same issue with an rsync of the same fileset. (Those files are
rsynced to a standby fileserver; Bacula then takes the backup from that
server to avoid burdening the active fileserver twice.) Parallelizing that
over 20 rsync threads reduced the sync time from 12 hours to 2, which is
why I'm wondering whether Bacula could do the same.
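
For reference, one way to spread an rsync over several workers can be
sketched as below. This is an assumption about the general approach, not
the exact script used here: it runs one rsync per top-level directory
(for example per user homedir) with a bounded number running at once. The
helper names and the destination "standby:/home" are hypothetical; the
only rsync option used is -a (archive mode).

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def rsync_commands(src_root, dest):
    """Build one 'rsync -a' command per top-level directory in src_root.

    Files sitting directly in src_root are not covered by this sketch.
    """
    cmds = []
    for name in sorted(os.listdir(src_root)):
        src = os.path.join(src_root, name)
        if os.path.isdir(src):
            # Trailing slashes: copy the directory's contents into dest/name/
            cmds.append(["rsync", "-a", src + "/", f"{dest}/{name}/"])
    return cmds


def run_parallel(cmds, workers=20):
    """Run the commands with at most `workers` in flight; return exit codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, cmds))
```

Usage would be something like
run_parallel(rsync_commands("/home", "standby:/home")), then checking
that every returned exit code is 0.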

Regards,
Marcel
--
Marcel de Boer
Test engineer, Service Routing R&D, IP/Optical Networks
Nokia, Antwerp, Belgium
On Thu, 31 Mar 2016, EXT Dimitri Maziuk wrote:
> On 03/31/2016 12:35 PM, Lloyd Brown wrote:
>
>> ... Unfortunately, I'm
>> not exaggerating to say that we have users with file profiles that
>> consist of 35million files (or more) with average file sizes less than
>> 10kb. The user might be long-gone, and eligible for deletion, but it
>> takes forever to spider that filesystem, and occasionally causes
>> problems like directory cache thrashing, etc.
>
> "I don't have a solution but I admire the problem". How long would
> bacula take to stat 35M files?
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users