Re: [Bacula-users] [Bacula-devel] Virtual backup

2008-09-08 20:31:45
From: Michael Heim <Michael.Heim AT gmx DOT com>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Tue, 09 Sep 2008 02:31:24 +0200
Kern Sibbald wrote:
On Sunday 07 September 2008 03:04:21 Michael Heim wrote:
  
Hi Kern,

can't you use something like data spooling for this?

First step: Make a virtFull to a temp spool file on disk
Second step: Spool this to the destination
    
Yes, interesting idea.  It could possibly help in some cases, but the problem
is that it doesn't scale well enough.  Take an extreme example:
someone has a backup to tape amounting to 10TB.  With this suggestion, it
would be necessary to spool 10TB to disk and then write it to tape.  Someone
with this much data will require a solution that is tape to tape ...

  
Hi Kern,

I think spooling a 10 TB backup should be no problem today for a company with such a requirement. But first, such a big filesystem should be split into several smaller parts to be more flexible for a restore. In the datacenter where I worked for over 7 years, we had a TSM cluster with two SUN L500 tape libraries, each equipped with 8 LTO3 FC drives and about 500 slots. To hold the backup data first (spool + migration), the cluster was connected to two EMC Clariion CX3-40 arrays with 32 TB of SATA capacity each (64 TB total). I think in environments with 10 TB filesystems, a 10-20 TB spool area cannot be a problem.

But filesystems of several TB are very rare. In my opinion only video/multimedia applications can make use of them. For normal file data the average file size is between 50 KB and 300 KB. This means about 3-20 million files per TB.
A normal filesystem with more than about 500-1000 GB of data shouldn't be used, because the time to restore is too long. A real-life example (from my experience):
    A restore of an NTFS filesystem (on an HP DL385G2 Windows 2003 cluster, 2x 2.4 GHz Opteron-DC, 8 GB RAM) with 400 GB and 10 million files would take about 3 days with TSM on an EMC Clariion CX700 with 4 GB cache (measured on a 400 GB slice of a RAID 5 with 8x 300 GB FC 10,000 rpm hard disks, with no other I/O using the same RAID).
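
The "3-20 million files per TB" figure above follows directly from the average file sizes. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes); the function name is only illustrative:

```python
def files_per_tb(avg_file_size_kb: float, tb_bytes: int = 10**12) -> float:
    """Number of files of the given average size that fit in one TB.

    Assumes decimal units: 1 KB = 1000 bytes, 1 TB = 10**12 bytes.
    """
    return tb_bytes / (avg_file_size_kb * 1000)


for size_kb in (50, 300):
    millions = files_per_tb(size_kb) / 1e6
    print(f"{size_kb} KB average -> {millions:.1f} million files per TB")
```

With binary units the counts come out slightly higher, but the order of magnitude is the same.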

For such big filesystems (>1 TB or several million files) a file-by-file restore isn't really feasible for a company with an SLA or a maximum restore time. To handle those filesystems, image backups (a TSM feature that backs up a whole filesystem using its own snapshot method) are used, so the restore of the same filesystem (400 GB, 10 million files) takes only 2 hours, because the whole filesystem image is restored.
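
To put the two restore times side by side, here is a rough calculation of the effective rates they imply. It uses only the figures quoted above (400 GB, 10 million files, about 3 days versus 2 hours) and assumes decimal units; the helper name is made up for illustration:

```python
def restore_rates(size_gb: float, n_files: int, hours: float):
    """Return (MB/s, files/s) for a restore of the given size and duration."""
    seconds = hours * 3600
    mb_per_s = size_gb * 1000 / seconds   # decimal units: 1 GB = 1000 MB
    files_per_s = n_files / seconds
    return mb_per_s, files_per_s


file_level = restore_rates(400, 10_000_000, hours=72)   # about 3 days
image_level = restore_rates(400, 10_000_000, hours=2)   # image restore

print(f"file-by-file: {file_level[0]:.1f} MB/s, {file_level[1]:.0f} files/s")
print(f"image:        {image_level[0]:.1f} MB/s")
print(f"speedup:      {image_level[0] / file_level[0]:.0f}x")
```

The file-by-file restore works out to only about 1.5 MB/s, which suggests the limit is the per-file overhead rather than raw disk or tape throughput.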

A good strategy is to create only filesystems below 1 TB and with fewer than 3-5 million files, because bigger filesystems can't be handled properly with normal hardware. In a normal IT environment with several smaller filesystems, a spool area shouldn't be a problem. The only requirement for such a spool area is very fast concurrent reads and writes, so a RAID 10 (or perhaps SSDs) should be considered.

I think the best suggestion so far has been to add new code in Bacula that 
uses the volume list (list of volumes to be read) to ensure that none of 
those volumes are selected for writing.  That will provide a quite
reasonable means of ensuring there are no deadlocks -- providing the user has 
two drives available.
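
The idea of consulting the read volume list before choosing a write volume can be sketched roughly as follows. This is not Bacula code: the function name, the volume names, and the plain-list representation are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def pick_write_volume(candidates: Iterable[str],
                      read_volumes: Iterable[str]) -> Optional[str]:
    """Return the first candidate volume that the job does not need to read.

    Hypothetical helper: 'candidates' are appendable volumes in the
    destination pool, in preference order; 'read_volumes' is the job's
    volume list (the volumes it must read).
    """
    reserved = set(read_volumes)
    for volume in candidates:
        if volume not in reserved:
            return volume
    return None  # nothing safe to write to; caller must get a new volume


# Vol-0003 is appendable but also has to be read, so it is skipped.
print(pick_write_volume(["Vol-0003", "Vol-0004"], ["Vol-0001", "Vol-0003"]))
```

When every candidate is also on the read list the sketch returns None, which corresponds to the case where a fresh volume has to be labeled or requested from the operator.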

Best regards,

Kern

  
regards
Michael

Kern Sibbald wrote:
    
Hello,

As many of you know Virtual Backup (consolidation, synthetic full, ...)
is a new feature that is implemented in the development trunk and
scheduled to be released later this year.  It essentially copies what
would be a "full current" restore to a new Volume thus creating a
virtual backup that can serve as a Full backup.  This has a lot of
advantages, particularly for sites with full backups that run for a long time
or for remote sites where the time to transmit a full backup is
excessive.

The Virtual Backup feature works much like Migration and Copy.  It reads
from the required Volumes and writes to a Volume specified in the pool as
"Next Pool".  This ensures that the read and write Volumes are different.

Everything seems to work fine with the Virtual Backup.  However, thinking
about longer term operations, it has occurred to me that when you want to
make a second Virtual Backup things will become very complicated.  First,
the Virtual backup will want to read the previous Virtual backup volume,
and then if that Volume is not full, it will want to write to the same
Volume.  Even if the volume is full, you will be in a situation where the
Job will want to read and write to volumes in the same pool.  In all
those cases, there is no guarantee that there will not be a deadlock
situation (actually Bacula currently cancels any job attempting to read
from and write to the same Storage device).
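
The difference between the first and the second Virtual Backup can be illustrated with a toy model. The pool and volume names below are invented, and the check is only a sketch of the situation described above, not of anything Bacula actually does:

```python
from typing import Dict, List


def volumes_in_write_pool(read_volumes: Dict[str, str],
                          write_pool: str) -> List[str]:
    """List the volumes the job must read that live in the pool it writes to.

    'read_volumes' maps a volume name to the pool it belongs to
    (hypothetical representation for this example).
    """
    return sorted(v for v, pool in read_volumes.items() if pool == write_pool)


# First Virtual Backup: reads only from "Backups", writes to its Next Pool.
first = volumes_in_write_pool(
    {"Full-0001": "Backups", "Incr-0001": "Backups"}, "Consolidated")

# Second Virtual Backup: must also read the previous virtual full,
# which sits in the pool the job writes to.
second = volumes_in_write_pool(
    {"Virt-0001": "Consolidated", "Incr-0002": "Backups"}, "Consolidated")

print(first)   # no overlap
print(second)  # the previous virtual full is in the write pool
```

An empty result means the read and write sides cannot collide; any non-empty result is a potential read/write conflict of the kind described.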

I am not 100% sure what to do to resolve this issue.  I suppose one could
run a Migration job to "move" the Virtual Backup back to the Pool from
which it originally came, then the next Virtual Backup would work fine
(the same as the first one), but that sounds a bit kludgy.

If anyone has any suggestions, I would appreciate hearing them.  However,
suggestions that require implementing significant amounts of code or
complex new algorithms such as deadlock detection won't be very helpful
since there is no time left to do such implementations between now and
release time.  In addition, deadlock detection won't help; what we really
need is deadlock resolution, and that is an even more difficult subject.

Best regards,

Kern


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's
challenge Build the coolest Linux based applications with Moblin SDK& 
win great prizes Grand prize is a trip for two to an Open Source event
anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url="">
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users