
Re: [Bacula-users] query for file sizes in a job

2011-10-07 14:03:21
Subject: Re: [Bacula-users] query for file sizes in a job
From: Stuart McGraw <smcg4191 AT frii DOT com>
To: Bacula-users AT lists.sourceforge DOT net
Date: Fri, 07 Oct 2011 11:30:20 -0600
On 10/06/2011 12:36 PM, Jeff Shanholtz wrote:
> I’m currently tuning my exclude rules and one of the things I 
> want to do is make sure I’m not backing up any massive files
> that don’t need to be backed up. Is there any way to get bacula
> to list file sizes along with the file names since llist doesn’t
> do this?

The file size and other file attributes are stored in 
(pseudo?-)base-64 encoded form in the lstat field of the 
'file' table of the catalog database.
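
To illustrate the encoding, here is a minimal Python sketch of decoding an lstat value outside the database. It is not the C extension described in this message. It assumes the field order written by encode_stat() in the Bacula source (st_size as the 8th space-separated word, st_mtime as the 12th) and Bacula's variable-length base-64 digits with no padding. The function names are chosen for illustration only, and the SAMPLE string is made up for the example rather than taken from a real catalog.

```python
from datetime import datetime, timezone

# Digits Bacula uses for its base-64 numbers: most significant digit first,
# no padding, optional leading '-' for negative values.
B64_DIGITS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
B64_VALUE = {ch: i for i, ch in enumerate(B64_DIGITS)}

# Zero-based positions of the fields of interest (assumed from encode_stat()).
ST_MODE, ST_SIZE, ST_MTIME = 2, 7, 11


def from_base64(word):
    """Decode one space-delimited lstat word into an integer."""
    negative = word.startswith("-")
    value = 0
    for ch in (word[1:] if negative else word):
        value = value * 64 + B64_VALUE[ch]
    return -value if negative else value


def lstat_field(lstat, index):
    """Return the decoded stat field at the given zero-based index."""
    return from_base64(lstat.split()[index])


def lstat_size(lstat):
    """File size in bytes."""
    return lstat_field(lstat, ST_SIZE)


def lstat_mtime(lstat):
    """Modification time as seconds since the Unix epoch."""
    return lstat_field(lstat, ST_MTIME)


# Hypothetical lstat value constructed for this example.
SAMPLE = "P0A BAFc IGk B Po Po A Jro BAA BY BOjm95 BOjm95 BOjm95 A A E"

if __name__ == "__main__":
    print(lstat_size(SAMPLE))  # 39656
    print(datetime.fromtimestamp(lstat_mtime(SAMPLE), tz=timezone.utc))
```

The same two steps (split on spaces, then accumulate six bits per character) are what a database-side function would need to do, whatever language it is written in.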

I ran into the same problem and, since I'm using PostgreSQL
for my catalogs, wrote a little pg extension function in C 
that is called with an lstat value and the index number of 
the stat field wanted.  This is used as a base to define 
some one-line convenience functions like lstat_size(text), 
lstat_mtime(text), etc, which then allows one to define 
views like:

   CREATE VIEW v_files AS (
        SELECT f.fileid,
               f.jobid,
               CASE fileindex WHEN 0 THEN 'X' ELSE ' ' END AS del,
               lstat_size (lstat) AS size,
               TIMESTAMP WITH TIME ZONE 'epoch'
                   + lstat_mtime (lstat) * INTERVAL '1 second' AS mtime,
               p.path || n.name AS filename
        FROM file f
        JOIN path p ON p.pathid=f.pathid
        JOIN filename n ON n.filenameid=f.filenameid);

which generates results like:

SELECT * FROM v_files WHERE ...whatever...;

 fileid  | jobid | del |   size   |         mtime          | filename
---------+-------+-----+----------+------------------------+--------------
 2155605 |  1750 |     |    39656 | 2011-10-06 21:18:17-06 | 
 2155606 |  1750 |     |     4096 | 2011-10-06 21:18:35-06 | /srv/backup/
 2155607 |  1750 | X   |        0 | 2011-10-05 19:59:34-06 | 
 2155571 |  1749 |     | 39553788 | 2011-10-05 21:24:16-06 | 
 2155565 |  1748 |     |    39424 | 2011-10-05 20:24:49-06 | c:/stuart/pmt.xls
 2155566 |  1748 |     |     1365 | 2011-10-05 21:22:42-06 | 
 2155567 |  1748 |     | 45197314 | 2011-10-05 21:23:07-06 | 

I've found it very convenient and will be happy to
pass it on to anyone interested, but I have to add the
disclaimer that this was the first time I've used
C in 20 years, the first time I ever wrote a PG extension
function, and the first time I ever looked at the Bacula
source code, so be warned. :-)
