Subject: Re: [Bacula-users] bacula splitting big jobs in to two
From: Richard Fox <rfox AT mbl DOT edu>
To: "Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC]" <uthra.r.rao AT nasa DOT gov>
Date: Wed, 11 Feb 2015 10:01:49 -0500 (EST)
Hi,

On Thu, 29 Jan 2015, Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC] wrote:

> 
> I run bacula 5.2.12 on a RHEL server which is attached to a Tape Library.
> I have two LTO5 tape drives. Since the data on one of my servers has
> grown big, the back-up takes 10-12 days to complete. I would like to
> split this job into two jobs. Has anybody done this kind of a set-up?
> I need some guidance on how to go about it.

I did something like this with our backups, dividing them up to decrease 
the time the backups take. At the time I conceived this system, we had 
about 56TB of data. For historical reasons our backups were made from NFS 
exports of a single central NAS fileserver; because it was proprietary, I 
couldn't access the fileserver directly any other way. (This has changed, 
but I haven't made substantial changes to the procedure yet.)

I have two tape libraries for this, each with two drives, and both support 
partitioning. I partitioned the two libraries so that from the computer's 
perspective I have four libraries.

Our filesystems are composed primarily of user directories and working 
group directories. I calculated the size of each of these directories and 
then used a basic bin-packing algorithm to subdivide them into four 
groups, each assigned to a partition. A preparation script mounts each 
directory as a separate NFS mount in its respective "group" directory. 
The groups were sized such that the ratio of the data to the capacity of 
a given library's partition was more or less the same for all four groups.
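
In case it helps, here is a minimal sketch of the kind of greedy grouping 
I mean. The directory names and sizes are made up, and unlike this sketch 
my real script weights each group by its partition's capacity rather than 
assuming the four partitions are equal:

  #!/usr/bin/env python
  # Greedy "put the next directory into the emptiest group" split into
  # four groups; the sizes would come from something like `du -s` run
  # over each top-level user/group directory.

  dir_sizes = {                      # made-up example numbers, in GB
      "users/alice": 900,
      "users/bob": 450,
      "groups/imaging": 7200,
      "groups/sequencing": 3100,
  }

  NUM_GROUPS = 4
  groups = [{"dirs": [], "total": 0} for _ in range(NUM_GROUPS)]

  # Largest directories first, each into the currently smallest group.
  for path, size in sorted(dir_sizes.items(),
                           key=lambda kv: kv[1], reverse=True):
      target = min(groups, key=lambda g: g["total"])
      target["dirs"].append(path)
      target["total"] += size

  for i, g in enumerate(groups, 1):
      print("group%d: %6d GB  %s" % (i, g["total"], ", ".join(g["dirs"])))

The preparation script then just walks each group's list and NFS-mounts 
every directory under a per-group directory (e.g. /backup/group1 through 
/backup/group4 -- hypothetical paths), so each backup job only sees its 
own share of the data.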

I created four pools, four jobs, four storage devices, etc., all to 
match up with the four library partitions and four groups, essentially 
creating four complete backup jobs.
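
For each group that works out to one set of Storage/Pool/FileSet/Job 
resources in bacula-dir.conf, roughly along these lines (the names, 
address and path below are placeholders, not my actual config):

  # One of four parallel resource sets in bacula-dir.conf
  Storage {
    Name = Lib1-Partition1            # one of the four logical libraries
    Address = backup-sd.example.org
    SDPort = 9103
    Password = "xxxxx"
    Device = "Partition1-Changer"     # autochanger defined on the SD
    Media Type = LTO5-P1              # unique per partition
    Autochanger = yes
  }

  Pool {
    Name = Group1-Pool
    Pool Type = Backup
    Storage = Lib1-Partition1
  }

  FileSet {
    Name = Group1-FileSet
    Include {
      Options { signature = MD5 }
      File = /backup/group1           # where the prep script mounts group 1
    }
  }

  Job {
    Name = Group1-Backup
    Type = Backup
    Client = backup-server-fd
    FileSet = Group1-FileSet
    Pool = Group1-Pool
    Storage = Lib1-Partition1
    Schedule = WeeklyCycle
    Messages = Standard
  }

Because the four jobs point at different Storage resources, they can run 
concurrently against the four partitions (subject to the usual Maximum 
Concurrent Jobs settings).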

When it's time to take new base backups, I repeat the calculation and 
bin-packing, then update the mount points to redistribute the data.

It did cut down the time required to do base backups substantially.

I was also able to remove a set of files that, while occupying only about 
a terabyte, contained tens of millions of tiny files that massively 
slowed down the backup process whenever it hit that spot. Perhaps you 
have similar conditions on your filesystems?

Thanks,
Rich.

-- 
  Rich Fox
  Systems Administrator
  JBPC - Marine Biological Laboratory
  http://www.mbl.edu/jbpc
  508-289-7669 - mbl-at-richfox.org

