[Bacula-users] Plans for support block-based dedupe?

2013-01-04 09:45:27
Subject: [Bacula-users] Plans for support block-based dedupe?
From: tonyalbers <bacula-forum AT backupcentral DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 04 Jan 2013 06:42:58 -0800
Spooling is definitely off.

First, using tar to dump the directory three times:

1. Before first tar:

[root@dkarhbus02 download]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 681 B
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 0 B
Volume Unique Data Written: 96 KB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -1.46195E7%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -2.11037458194E9%
[root@dkarhbus02 download]#

2. After first tar:
[root@dkarhbus02 download]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 6 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 32 KB
Volume Unique Data Written: 6 GB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 0.0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -221.98%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -221.98%
[root@dkarhbus02 download]#

3. After second tar:
[root@dkarhbus02 download]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 12.1 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 12.1 GB
Volume Unique Data Written: 96 KB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 100.0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -60.99%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -60.99%
[root@dkarhbus02 download]#

4. After the third tar:
[root@dkarhbus02 download]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 18.1 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 18.1 GB
Volume Unique Data Written: 128 KB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 100.0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -7.33%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -7.33%
[root@dkarhbus02 download]#

Which looks good to me.
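
For anyone who wants to check these figures by script rather than by eye, here is a minimal sketch that parses the `name : value` lines shown above and recomputes the duplicate fraction. The helper names (`parse_size`, `parse_volume_info`, `duplicate_fraction`) are made up for this example and are not part of sdfscli. Two assumptions: "Total Bytes Written" is taken to be duplicate plus unique, and the units are treated as binary (1 KB = 1024 B), neither of which the output itself states.

```python
import re

# Assumption: binary multipliers. The tool's output does not say whether
# "GB" means 10^9 or 2^30 bytes.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a size string such as '12.1 GB' or '96 KB' to bytes."""
    m = re.fullmatch(r"\s*([\d.]+)\s*(B|KB|MB|GB|TB)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(m.group(1)) * UNITS[m.group(2)]


def parse_volume_info(output):
    """Return {field name: raw value} for every 'name : value' line."""
    fields = {}
    for line in output.splitlines():
        name, sep, value = line.rpartition(":")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


def duplicate_fraction(fields):
    """Duplicate bytes / (duplicate + unique bytes); assumed meaning of Dup/Total."""
    dup = parse_size(fields["Volume Duplicate Data Written"])
    unique = parse_size(fields["Volume Unique Data Written"])
    total = dup + unique
    return dup / total if total else 0.0


# Two lines copied from the "After the third tar" output above.
after_third_tar = """\
Volume Duplicate Data Written : 18.1 GB
Volume Unique Data Written: 128 KB
"""

print(f"{duplicate_fraction(parse_volume_info(after_third_tar)):.4%}")
```

With the third tar run the fraction comes out just under 100%, which agrees with the reported Virtual Dedup Rate.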

Now, we delete the tar files and do 3 backups using bacula:

1. After first backup:
[root@dkarhbus02 ~]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 6 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 0 B
Volume Unique Data Written: 6 GB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -321.8%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -321.81%
You have new mail in /var/spool/mail/root
[root@dkarhbus02 ~]#

2. After second backup:
[root@dkarhbus02 bacula]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 12.1 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 0 B
Volume Unique Data Written: 12.1 GB
Volume Data Read : 1.9 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -160.9%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -160.9%
[root@dkarhbus02 bacula]#

3. After the third backup:
[root@dkarhbus02 bacula]# sdfscli -volume-info
Volume Capacity : 1.5 TB
Volume Current Size : 18.1 GB
Volume Max Percentage Full : Unlimited
Volume Duplicate Data Written : 0 B
Volume Unique Data Written: 18.1 GB
Volume Data Read : 2 MB
Volume Virtual Dedup Rate (Dup/Total Bytes Written) : 0%
Volume Real Dedup Rate (DSE Size/Total Bytes Written) : -107.27%
Volume Actual Storage Savings (Unique Blocks Stored/Current Size) : -107.27%
You have new mail in /var/spool/mail/root
[root@dkarhbus02 bacula]#

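So, comparing Volume Duplicate Data Written and Volume Unique Data Written:
with tar, the repeated dumps are counted as duplicate data (18.1 GB after the
third run), while with bacula the duplicate counter stays at 0 B and all
18.1 GB are written as unique data. The data as written by bacula is clearly
not a good candidate for deduplication.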
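
The tests above don't show why the bacula data fails to dedupe. One common explanation for this kind of result with fixed-size block dedupe is that the same payload lands at a different offset relative to the block boundaries on each run, so no block hashes the same twice. The sketch below only illustrates that general effect. It does not model bacula's volume format or how SDFS chunks data, and `BLOCK_SIZE`, `store` and `make_payload` are hypothetical names chosen for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical dedupe block size, chosen for illustration


def store(stream, seen):
    """Split stream into fixed-size blocks; return (duplicate, unique) block counts."""
    dup = unique = 0
    for off in range(0, len(stream), BLOCK_SIZE):
        digest = hashlib.sha256(stream[off:off + BLOCK_SIZE]).digest()
        if digest in seen:
            dup += 1
        else:
            seen.add(digest)
            unique += 1
    return dup, unique


def make_payload(n_blocks):
    """Deterministic payload in which every aligned block is distinct."""
    return b"".join(
        hashlib.sha256(i.to_bytes(4, "big")).digest() * (BLOCK_SIZE // 32)
        for i in range(n_blocks)
    )


payload = make_payload(64)
seen = set()

first = store(payload, seen)                 # first write: everything is new
second = store(payload, seen)                # identical bytes, same alignment
shifted = store(b"X" * 7 + payload, seen)    # same payload, moved by 7 bytes

print("first write   (dup, unique):", first)
print("aligned copy  (dup, unique):", second)
print("shifted copy  (dup, unique):", shifted)
```

The aligned copy is counted entirely as duplicate blocks, while the shifted copy is counted entirely as unique, even though the payload is byte-for-byte the same. Whether this is what happens with the bacula volumes here is an assumption, not something the numbers above prove.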

I haven't got the DXi online right now, but I'll test it there when I do and
report back.

/tony

+----------------------------------------------------------------------
|This was sent by tony.albers AT gmail DOT com via Backup Central.
|Forward SPAM to abuse AT backupcentral DOT com.
+----------------------------------------------------------------------


