Bacula-users

Re: [Bacula-users] Should I use data spooling when writing to nfs mounted storage?

2011-03-03 13:43:35
Subject: Re: [Bacula-users] Should I use data spooling when writing to nfs mounted storage?
From: Fabio Napoleoni - ZENIT <fabio AT zenit DOT org>
To: Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>
Date: Thu, 3 Mar 2011 19:40:39 +0100
On 3 Mar 2011, at 17:58, Alan Brown wrote:

> John Drescher wrote:
> 
>> I ran a test, with these results:
>>> 
>>> JobId 7: Spooling data ...
>>> JobId 7: Job write elapsed time = 00:13:07, Transfer rate = 1.295 M Bytes/second
>>> JobId 7: Committing spooled data to Volume "FullVolume-0004". Despooling 1,021,072,888 bytes ...
>>> JobId 7: Despooling elapsed time = 00:02:19, Transfer rate = 7.345 M Bytes/second
>>> JobId 7: Sending spooled attrs to the Director. Despooling 6,816,214 bytes ...
>>> 
>>> There is only a small improvement in performance.
> 
> Spooling or not spooling is very much a case of "it depends".
> 
> There is little point in using spooling if the destination device is a disk 
> drive and you are not using concurrent jobs. (*)
> 
> For an NFS-mounted destination, network delays are your biggest problem.

Thank you for your analysis. Based on it, I don't think NFS overhead is the 
problem: the despooling phase (over the NFS filesystem) runs at 7.3 MB/s, which 
is fine. The first phase (bacula-fd -> bacula-sd), however, runs at only about 
1.3 MB/s, which seems very poor to me.

So the bottleneck is probably the client configuration or something similar. 
What should I check to improve performance?
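
As a sanity check on the two rates in the log, here is a minimal Python sketch that recomputes them from the elapsed times. It assumes the write phase moved the same 1,021,072,888 bytes that were later despooled; the log does not state a byte count for the write phase, so that is an assumption, and the function names are only illustrative.

```python
# Figures copied from the JobId 7 log quoted above.
# Assumption: the write phase moved the same number of bytes that were despooled.
SPOOLED_BYTES = 1_021_072_888


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS elapsed time, as printed in the job log, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rate_mb_per_s(num_bytes: int, elapsed: str) -> float:
    """Average throughput in units of 10**6 bytes per second."""
    return num_bytes / to_seconds(elapsed) / 1e6


write_rate = rate_mb_per_s(SPOOLED_BYTES, "00:13:07")    # bacula-fd -> spool file
despool_rate = rate_mb_per_s(SPOOLED_BYTES, "00:02:19")  # spool file -> NFS volume

print(f"write:   {write_rate:.2f} MB/s")
print(f"despool: {despool_rate:.2f} MB/s")
print(f"ratio:   {despool_rate / write_rate:.1f}x")
```

Both results agree with the logged rates to within rounding, and the despool phase comes out between five and six times faster than the write phase. That is consistent with the slow step being upstream of the NFS write. Since the FileSet below sets compression=GZIP, the bytes counted here are presumably already compressed, so the client's CPU load during the backup may be one thing worth checking.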

This is the FileSet used for that backup:

FileSet {
  Name = "FileSystem Full"
  Include {
    Options {
      compression=GZIP # compress backup
      signature = MD5  # store md5 of saved files
      onefs = no       # cross filesystem boundaries (default is yes); remember to exclude the nfs directory
      accurate = mcs   # attributes to consider when examining changes
      verify = pins5   # attributes to consider when verifying files
      noatime = yes    # don't update file access time
    }
    # whole filesystem
    File = /
  }
  
  Exclude { 
    # this syntax reads the list of files to exclude from a file on the client
    # The given file MUST exist on the client, otherwise the backup will fail
    File = "\\</etc/bacula/exclusions"
  }
  
  Exclude {
    # Standard exclusion
    File = /var/lib/bacula
    File = /var/cache
    File = /var/spool
    File = /dev
    File = /proc
    File = /sys
    File = /mnt
    File = /tmp
    File = /nfs
    File = /.journal
    File = /.fsck
  }
}

and this is my job definition:

JobDefs {
  Name = "Server Filesystem"
  Enabled = yes                    # remove when in production
  Accurate = no                    # handle deleted and moved files
  Type = Backup
  FileSet = "FileSystem Full"      # mandatory
  Schedule = FileSystemWeeklyCycle 
  Messages = Standard              # tune the Standard resource or define a new one
  Pool = Default                   # overridden by the following directives
  Full Backup Pool = FullPool                   # pool used for full backups
  Differential Backup Pool = DifferentialPool   # pool used for differential backups
  Incremental Backup Pool = IncrementalPool     # pool used for incremental backups
  Priority = 10
  Write Bootstrap = "/var/lib/bacula/%c.bsr" # optionally copy this file elsewhere when the backup is done
  Spool Data = yes                 # avoid network congestion due to nfs writes
}

> (*) If you are using concurrent jobs and writing output to disk, spooling 
> will usually help maintain speeds, because writing multiple files to the 
> target disk will result in slowdowns due to head seeking. A single job 
> despooling is more akin to streaming throughput.

OK, I haven't tried concurrent jobs yet, but I'll keep the spooling 
configuration when I do.

--
Fabio Napoleoni - ZENIT
fabio AT zenit DOT org

"Computer Science is no more about computers than astronomy is
about telescopes"

                                                    Edsger W. Dijkstra


