Re: [Bacula-users] Restores very slow while selecting files

2017-04-12 03:46:24
Subject: Re: [Bacula-users] Restores very slow while selecting files
From: Francisco Javier Funes Nieto <esencia AT gmail DOT com>
To: Tom Yates <madlists AT teaparty DOT net>
Date: Wed, 12 Apr 2017 09:45:26 +0200
The missing question: which catalog database are you using?

On 12 Apr 2017 at 9:26 a.m., "Tom Yates" <madlists AT teaparty DOT net> wrote:
I've got a fairly big filesystem (3TB, 15M files), part of which I want to
(test) restore.  I know that if the backend DB is slow the
"Building file list" stage can take some time, but I have the database
striped over a five-disc SAS RAID-0, and this step takes only about eight
minutes.

The problems start once I navigate to the directory I want restored
(which admittedly contains the bulk of the files and about half the total
space), and do an "add home".

The current job has been stuck on this step for over 15 hours, now.  When
I strace bacula-dir I see a lot of:

[pid 26711] write(6, "P\0\0\0\3SELECT FilenameId FROM File"..., 84) = 84
[pid 26711] read(6, "\1\0\0\1\1@\0\0\2\3def\6bacula\10Filename\10Fi"..., 16384) = 102
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "m\0\0\0\3SELECT FileId, LStat, MD5 F"..., 113) = 113
[pid 26711] read(6, "\1\0\0\1\0030\0\0\2\3def\6bacula\4File\4File\6F"..., 16384) = 249
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "P\0\0\0\3SELECT FilenameId FROM File"..., 84) = 84
[pid 26711] read(6, "\1\0\0\1\1@\0\0\2\3def\6bacula\10Filename\10Fi"..., 16384) = 102
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "m\0\0\0\3SELECT FileId, LStat, MD5 F"..., 113) = 113
[pid 26711] read(6, "\1\0\0\1\0030\0\0\2\3def\6bacula\4File\4File\6F"..., 16384) = 250
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)

So I presume it's stepping through the built directory tree querying the
database about each of these files.  Problem is that any restore that
takes ~24 hours just to kick off is not making my clients happy.
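
To illustrate why a pattern like the one in the strace can be so slow, here
is a rough sketch, not Bacula's actual code. It uses an in-memory SQLite
database as a stand-in for the real catalog. Only the table names (File,
Filename) and the columns FilenameId, FileId, LStat and MD5 come from the
strace output; the Name column, the WHERE clauses (the strace truncates the
statements) and all function names are assumptions made for illustration. It
compares two statements per file against a single joined statement.

```python
import sqlite3


def build_toy_catalog(n_files=1000):
    """Create a tiny stand-in catalog; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE File (FileId INTEGER PRIMARY KEY, FilenameId INTEGER,
                           LStat TEXT, MD5 TEXT);
        """
    )
    ids = range(1, n_files + 1)
    conn.executemany(
        "INSERT INTO Filename (FilenameId, Name) VALUES (?, ?)",
        [(i, f"file{i}") for i in ids],
    )
    conn.executemany(
        "INSERT INTO File (FileId, FilenameId, LStat, MD5) VALUES (?, ?, ?, ?)",
        [(i, i, "lstat", "md5") for i in ids],
    )
    return conn


def per_file_lookup(conn, names):
    """Two statements per file, mirroring the alternating SELECTs in the trace."""
    rows, statements = [], 0
    for name in names:
        # Assumed lookup: the real WHERE clause is cut off in the strace.
        (filename_id,) = conn.execute(
            "SELECT FilenameId FROM Filename WHERE Name = ?", (name,)
        ).fetchone()
        rows.append(
            conn.execute(
                "SELECT FileId, LStat, MD5 FROM File WHERE FilenameId = ?",
                (filename_id,),
            ).fetchone()
        )
        statements += 2
    return rows, statements


def joined_lookup(conn):
    """One statement for the whole set, letting the database do the matching."""
    rows = conn.execute(
        "SELECT File.FileId, File.LStat, File.MD5 "
        "FROM File JOIN Filename ON File.FilenameId = Filename.FilenameId "
        "ORDER BY File.FileId"
    ).fetchall()
    return rows, 1


if __name__ == "__main__":
    catalog = build_toy_catalog()
    wanted = [f"file{i}" for i in range(1, 1001)]
    slow_rows, slow_count = per_file_lookup(catalog, wanted)
    fast_rows, fast_count = joined_lookup(catalog)
    print(slow_count, "statements versus", fast_count)
```

Against a real networked database each of those statements is a separate
round trip, so with millions of files the per-file variant is dominated by
latency rather than by disc speed.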

The CentOS 6 server has 16GB of memory and does not seem short of it
(negligible swap usage).  We're currently using the CentOS 6 bacula
packages, which are v5.0.0.  I tried building 5.2.13 from source,
upgrading, and running that, but it wasn't noticeably better, so I
downgraded again.  I'm happy to go to a still-later version if there is
reason to think that this step is better optimised in that version.  If
building custom indexes would help, I'm open to that, too.  If I'm doing
something fundamentally stupid, it would be really useful to know!
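
On the question of custom indexes, one way to check whether an index would
be used at all is to compare the query plan before and after creating it.
The sketch below does this on a toy SQLite table, not on a real Bacula
catalog. The PathId column, the index name file_path_name_idx and the query
itself are assumptions made for illustration; on the real catalog the
database's own EXPLAIN facility would be the equivalent check.

```python
import sqlite3


def plan_details(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Hypothetical lookup; the real statement is truncated in the strace output.
QUERY = "SELECT FileId, LStat, MD5 FROM File WHERE PathId = ? AND FilenameId = ?"


def compare_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE File (FileId INTEGER PRIMARY KEY, PathId INTEGER, "
        "FilenameId INTEGER, LStat TEXT, MD5 TEXT)"
    )
    before = plan_details(conn, QUERY, (1, 1))  # no usable index: full scan
    conn.execute("CREATE INDEX file_path_name_idx ON File (PathId, FilenameId)")
    after = plan_details(conn, QUERY, (1, 1))  # index available: indexed search
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("before:", before)
    print("after: ", after)
```

If the plan does not change after the index is created, the index is
unlikely to help that particular statement.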

Apart from "don't restore your home area", does anyone have any advice?
Thanks.


--

    Tom Yates - Teaparty Network Central - +44/0 1223 704038


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users