[Bacula-users] Restores very slow while selecting files

2017-04-12 03:22:55
Subject: [Bacula-users] Restores very slow while selecting files
From: Tom Yates <madlists AT teaparty DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 12 Apr 2017 08:03:32 +0100 (BST)
I've got a fairly big filesystem (3TB, 15M files) of which I want to
(test) restore a part.  I know that if the backend DB is slow the
"Building file list" stage can take some time, but I have the database
striped over a five-disc SAS RAID-0, and this step takes only about
eight minutes.

The problems start once I navigate to the directory I want restored 
(which admittedly contains the bulk of the files and about half the total 
space), and do an "add home".

The current job has been stuck on this step for over 15 hours now.  When
I strace bacula-dir I see a lot of:

[pid 26711] write(6, "P\0\0\0\3SELECT FilenameId FROM File"..., 84) = 84
[pid 26711] read(6, "\1\0\0\1\1@\0\0\2\3def\6bacula\10Filename\10Fi"..., 16384) = 102
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "m\0\0\0\3SELECT FileId, LStat, MD5 F"..., 113) = 113
[pid 26711] read(6, "\1\0\0\1\0030\0\0\2\3def\6bacula\4File\4File\6F"..., 16384) = 249
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "P\0\0\0\3SELECT FilenameId FROM File"..., 84) = 84
[pid 26711] read(6, "\1\0\0\1\1@\0\0\2\3def\6bacula\10Filename\10Fi"..., 16384) = 102
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 26711] write(6, "m\0\0\0\3SELECT FileId, LStat, MD5 F"..., 113) = 113
[pid 26711] read(6, "\1\0\0\1\0030\0\0\2\3def\6bacula\4File\4File\6F"..., 16384) = 250
[pid 26711] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)

So I presume it's stepping through the built directory tree, querying the
database about each of these files in turn.  The problem is that a restore
that takes ~24 hours just to kick off is not making my clients happy.
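
For illustration only, here is a minimal Python sketch of the access
pattern the strace suggests: two small queries per selected file, compared
with one joined query for the whole selection.  Everything in it is an
assumption made for the example: the toy table layout, the index, and the
helper names build_catalog, select_per_file and select_batched are
hypothetical, and it runs against an in-memory SQLite database rather than
the real catalog.  It is not Bacula's code and says nothing about what
bacula-dir can actually be made to do.

```python
import sqlite3


def build_catalog(n_files):
    """Create a toy in-memory catalog (illustrative schema, not Bacula's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
        CREATE TABLE File (FileId INTEGER PRIMARY KEY, FilenameId INTEGER,
                           LStat TEXT, MD5 TEXT);
        CREATE INDEX file_fnid ON File (FilenameId);
    """)
    ids = range(1, n_files + 1)
    conn.executemany("INSERT INTO Filename VALUES (?, ?)",
                     [(i, f"file{i}.txt") for i in ids])
    conn.executemany("INSERT INTO File VALUES (?, ?, ?, ?)",
                     [(i, i, f"lstat{i}", f"md5-{i}") for i in ids])
    conn.commit()
    return conn


def select_per_file(conn, names):
    """Two round trips per file, like the repeating pair in the strace."""
    rows, queries = [], 0
    for name in names:
        (fnid,) = conn.execute(
            "SELECT FilenameId FROM Filename WHERE Name = ?", (name,)
        ).fetchone()
        rows.append(conn.execute(
            "SELECT FileId, LStat, MD5 FROM File WHERE FilenameId = ?", (fnid,)
        ).fetchone())
        queries += 2
    return rows, queries


def select_batched(conn, names):
    """One joined query covering the whole selection."""
    placeholders = ",".join("?" * len(names))
    cur = conn.execute(
        "SELECT f.FileId, f.LStat, f.MD5 FROM File f "
        "JOIN Filename n ON n.FilenameId = f.FilenameId "
        f"WHERE n.Name IN ({placeholders}) ORDER BY f.FileId",
        list(names),
    )
    return cur.fetchall(), 1


if __name__ == "__main__":
    conn = build_catalog(200)
    wanted = [f"file{i}.txt" for i in range(1, 101)]
    per_file_rows, per_file_queries = select_per_file(conn, wanted)
    batched_rows, batched_queries = select_batched(conn, wanted)
    print(per_file_queries, batched_queries, per_file_rows == batched_rows)
```

Both helpers return the same rows; the difference is the number of round
trips, which grows with the number of files in the per-file version.  With
millions of files under the selected directory, even a small per-query
latency adds up, which would be consistent with the hours-long wait.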

The CentOS 6 server has 16GB of memory and does not seem short of it 
(negligible swap usage).  We're currently using the CentOS 6 bacula 
packages, which are v5.0.0.  I tried building 5.2.13 from source, 
upgrading, and running that, but it wasn't noticeably better, so I 
downgraded again.  I'm happy to go to a still-later version if there is 
reason to think that this step is better optimised in that version.  If 
building custom indexes would help, I'm open to that, too.  If I'm doing 
something fundamentally stupid, it would be really useful to know!

Apart from "don't restore your home area", does anyone have any advice? 
Thanks.


-- 

    Tom Yates - Teaparty Network Central - +44/0 1223 704038

