Bacula-users

Re: [Bacula-users] Catalogue snapshot utility : any interest?

2010-10-04 08:41:56
From: "James Harper" <james.harper AT bendigoit.com DOT au>
To: "Rory Campbell-Lange" <rory AT campbell-lange DOT net>
Date: Mon, 4 Oct 2010 23:39:39 +1100
> On 04/10/10, James Harper (james.harper AT bendigoit.com DOT au) wrote:
> > >
> > > I have developed a catalogue snapshot facility in Python to snapshot
> > > one job's catalogue and dump it to disk.
> ...
> > How much smaller is the catalogue subset vs the full catalogue?
> 
> Good question.
> 
> I'm not able to answer that question fully at present as I don't have
> enough jobs in my current database to know.
> 
> My current database has the following jobs in it:
> 
>  jobid | jobfiles | jobgigs
> -------+----------+---------
>      1 |  7706717 | 6833.90
>      8 |  3965507 | 4480.83
>      9 |  1273459 |  129.87
>     50 |   646336 |  512.07
>     60 |  7845561 | 6990.67
> 
> A full pg_dump of the catalogue is 2.8G. The output of the catalogue
> snapshot for job 60 is 1.6G. Naturally, the full pg_dump of the whole
> database will continue to grow over time.
> 
> (The job 60 catalogue file compresses to about 300MB with bzip2 -9).
> 
> I'm a little surprised that the proportion of job 60 to the whole is so
> high. Job 60 is similar to job 1, but I don't expect they share much
> information. I'll have to look into that.
> 
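
To put the quoted figures side by side, here is a small sketch of the
arithmetic. It uses only the numbers from the table and the two dump sizes
above; the variable names are mine, and it says nothing about how the
snapshot utility itself works.

```python
# Job figures copied from the table above: jobid -> number of files.
job_files = {
    1: 7706717,
    8: 3965507,
    9: 1273459,
    50: 646336,
    60: 7845561,
}

full_dump_gb = 2.8    # full pg_dump of the catalogue, as quoted
snapshot_gb = 1.6     # catalogue snapshot for job 60, as quoted

# Share of all file records that belong to job 60.
total_files = sum(job_files.values())
file_share = job_files[60] / total_files

# Size of the job 60 snapshot relative to the full dump.
size_share = snapshot_gb / full_dump_gb

print(f"job 60 holds {file_share:.1%} of the file records")
print(f"job 60 snapshot is {size_share:.1%} of the full dump")
```

Job 60 accounts for roughly 37% of the file records, yet its snapshot is
roughly 57% of the full dump, so the snapshot is carrying more than just its
own per-file rows.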

If jobs 60 and 1 are runs of the same backup job, then a lot of their
information is probably shared in the filename table. Even if they are
backups of merely similar servers, they will share a lot of filename data,
and that filename data has to come with the extracted catalogue, so you
might not be saving that much.
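
A toy illustration of what I mean, using Python's sqlite3 module. The two
tables and their column names are a simplified assumption for the sake of
the example, not the real Bacula schema, and the file names are made up.

```python
import sqlite3

# Toy catalogue: each distinct name is stored once in "filename" and
# referenced from the per-job "file" rows. Simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filename (
    filenameid INTEGER PRIMARY KEY,
    name       TEXT UNIQUE
);
CREATE TABLE file (
    fileid     INTEGER PRIMARY KEY,
    jobid      INTEGER,
    filenameid INTEGER REFERENCES filename(filenameid)
);
""")

def record_job(jobid, names):
    """Insert one file row per name, reusing existing filename rows."""
    for name in names:
        conn.execute("INSERT OR IGNORE INTO filename (name) VALUES (?)", (name,))
        (filenameid,) = conn.execute(
            "SELECT filenameid FROM filename WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO file (jobid, filenameid) VALUES (?, ?)",
            (jobid, filenameid),
        )

def filenames_needed(jobid):
    """Count the distinct filename rows a single-job extract must carry."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT filenameid) FROM file WHERE jobid = ?", (jobid,)
    ).fetchone()
    return row[0]

# Two backups of similar machines: most names overlap.
record_job(1, ["passwd", "hosts", "fstab", "motd"])
record_job(60, ["passwd", "hosts", "fstab", "resolv.conf"])

total_names = conn.execute("SELECT COUNT(*) FROM filename").fetchone()[0]
print(total_names, filenames_needed(60))
```

Here the whole catalogue holds five names and the extract for job 60 still
needs four of them, even though job 60 is only half of the file rows.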

James

