Bacula-users

[Bacula-users] Slow DB-Inserts from Solaris Clients

2009-04-23 05:51:41
Subject: [Bacula-users] Slow DB-Inserts from Solaris Clients
From: Ronald Buder <rbuder AT proficom-ag DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 23 Apr 2009 11:45:43 +0200
Hi List,

I have addressed this before without much of a valuable answer, however we 
seem to have gathered some new info.

Specs of Director, SD, Postgres-DB (all on one box)
- 2 Quad-Core Intel Xeon CPUs, 3Ghz each
- 4GB Ram
- Postgres-DB on a seperate RAID-Array

We're running 64bit Debian Etch on the director, Postgres currently is a 8.3, 
self compiled which came to about a 50% boost as opposed to the 8.1 package 
we had been running up to last week.

We're still running 2.4.4 because we haven't had the chance to upgrade yet. 
Time issues mostly.

We're backing up Windows Clients, Linux, Sparc Solaris (mostly 10), as well as 
a few AIX and HP-UX boxes. All backups are good.

However we have been experiencing problems with the Solaris backups for quite 
a while now. The database operations are extremely slow as opposed to other 
systems.

Here is an example of a troublesome solaris job:

2009-04-14 00:09:55 dss-bacula-dir JobId 141405: Start Backup JobId
141405, Job=TCB-CONT0006_System.2009-04-13_22.00.01.15
2009-04-14 00:09:55 dss-bacula-dir JobId 141405: Using Device "FileStorage1"
2009-04-14 00:09:55 dss-bacula-sd JobId 141405: Volume "00124" previously 
written, moving to end of data.
2009-04-14 00:09:55 dss-bacula-sd JobId 141405: Ready to append to end of 
Volume "00124" size=71455106270
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /var is a different 
filesystem. Will not descend from / into /var
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /export is a different 
filesystem. Will not descend from / into /export
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /etc/svc/volatile is a 
different filesystem. Will not descend from / into /etc/svc/volatile
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /system/contract is a 
different filesystem. Will not descend from / into /system/contract
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /system/object is a 
different filesystem. Will not descend from / into /system/object
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /devices is a different 
filesystem. Will not descend from / into /devices
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /net is a different 
filesystem. Will not descend from / into /net
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /home is a different 
filesystem. Will not descend from / into /home
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /vol is a different 
filesystem. Will not descend from / into /vol
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /var/run is a different 
filesystem. Will not descend from /var into /var/run
2009-04-14 00:22:28 dss-bacula-sd JobId 141405: User defined maximum volume 
capacity 96,636,764,160 exceeded on device "FileStorage1" (/stage0/drive1).
2009-04-14 00:22:28 dss-bacula-sd JobId 141405: End of medium on 
Volume "00124" Bytes=96,636,731,120 Blocks=1,497,973 at 14-Apr-2009 00:22.
2009-04-14 00:22:28 dss-bacula-sd JobId 141405: 3307 Issuing 
autochanger "unload slot 24, drive 1" command.
2009-04-14 00:22:33 dss-bacula-dir JobId 141405: There are no more Jobs 
associated with Volume "00113". Marking it purged.
2009-04-14 00:22:33 dss-bacula-dir JobId 141405: All records pruned from 
Volume "00113"; marking it "Purged"
2009-04-14 00:22:33 dss-bacula-dir JobId 141405: Recycled volume "00113"
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: 3301 Issuing 
autochanger "loaded? drive 1" command.
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: 3302 Autochanger "loaded? 
drive 1", result: nothing loaded.
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: 3304 Issuing autochanger "load 
slot 13, drive 1" command.
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: 3305 Autochanger "load slot 
13, drive 1", status is OK.
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: Recycled volume "00113" on 
device "FileStorage1" (/stage0/drive1), all previous data lost.
2009-04-14 00:22:33 dss-bacula-sd JobId 141405: New volume "00113" mounted on 
device "FileStorage1" (/stage0/drive1) at 14-Apr-2009 00:22.
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /var/stage/sso-a9002/root 
is a different filesystem. Will not descend from /var 
into /var/stage/sso-a9002/root
1970-01-01 01:00:01 TCB-CONT0006 JobId 141405:      /var/stage/nsdp-d9003 is a 
different filesystem. Will not descend from /var into /var/stage/nsdp-d9003
2009-04-14 00:25:32 dss-bacula-sd JobId 141405: Job write elapsed time = 
00:15:32, Transfer rate = 18.29 M bytes/second
2009-04-14 02:14:43 dss-bacula-dir JobId 141405: Bacula dss-bacula-dir 2.4.4 
(28Dec08): 14-Apr-2009 02:14:42
  Build OS:               x86_64-pc-linux-gnu debian 4.0
  JobId:                  141405
  Job:                    TCB-CONT0006_System.2009-04-13_22.00.01.15
  Backup Level:           Full
  Client:                 "TCB-CONT0006" 2.2.8 (26Jan08) 
sparc-sun-solaris2.8,solaris,5.8
  FileSet:                "SolarisFullSet" 2009-04-02 22:00:00
  Pool:                   "Disk" (From Job resource)
  Storage:                "File" (From Pool resource)
  Scheduled time:         13-Apr-2009 22:00:01
  Start time:             14-Apr-2009 00:09:55
  End time:               14-Apr-2009 02:14:42
  Elapsed time:           2 hours 4 mins 47 secs
  Priority:               10
  FD Files Written:       310,276
  SD Files Written:       310,276
  FD Bytes Written:       17,007,096,358 (17.00 GB)
  SD Bytes Written:       17,048,778,647 (17.04 GB)
  Rate:                   2271.6 KB/s
  Software Compression:   None
  VSS:                    no
  Storage Encryption:     no
  Volume name(s):         00124|00113
  Volume Session Id:      92
  Volume Session Time:    1239615605
  Last Volume Bytes:      9,999,385,381 (9.999 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK

2009-04-14 02:14:43 dss-bacula-dir JobId 141405: Begin pruning Jobs.
2009-04-14 02:14:44 dss-bacula-dir JobId 141405: No Jobs found to prune.
2009-04-14 02:14:44 dss-bacula-dir JobId 141405: Begin pruning Files.
2009-04-14 02:14:44 dss-bacula-dir JobId 141405: Pruned Files from 2 Jobs for 
client TCB-CONT0006 from catalog.
2009-04-14 02:14:44 dss-bacula-dir JobId 141405: End auto prune.

As you can see we have a few warning which can safely be ignored. We backed up 
some 310k files and some 17GB.  18.29MB/s transferrate isn't exactly great, 
but enough for our purposes. As you can see from the above log the transfer 
itself took only about 16 minutes (0:09 to 0:25), whereas the insert which 
follows the transfer went for about 110 minutes (2:14).

Here is another example. A Linux job this time:

2009-04-18 10:43:42 dss-bacula-dir JobId 143114: Start Backup JobId 143114, 
Job=IDD-T0013_System.2009-04-18_09.00.01.05
2009-04-18 10:43:42 dss-bacula-dir JobId 143114: Using Device "FileStorage1"
1970-01-01 01:00:01 idd-t0013 JobId 143114:      /var/lib/nfs/rpc_pipefs is a 
different filesystem. Will not descend from / into /var/lib/nfs/rpc_pipefs
1970-01-01 01:00:01 idd-t0013 JobId 143114:      /dev is a different 
filesystem. Will not descend from / into /dev
1970-01-01 01:00:01 idd-t0013 JobId 143114:      /ora00 is a different 
filesystem. Will not descend from / into /ora00
1970-01-01 01:00:01 idd-t0013 JobId 143114:      /boot is a different 
filesystem. Will not descend from / into /boot
1970-01-01 01:00:01 idd-t0013 JobId 143114:      /var/lib/nfs/rpc_pipefs is a 
different filesystem. Will not descend from /var into /var/lib/nfs/rpc_pipefs
2009-04-18 10:47:07 dss-bacula-sd JobId 143114: Job write elapsed time = 
00:03:25, Transfer rate = 23.49 M bytes/second
2009-04-18 10:48:09 dss-bacula-dir JobId 143114: Bacula dss-bacula-dir 2.4.4 
(28Dec08): 18-Apr-2009 10:48:07
  Build OS:               x86_64-pc-linux-gnu debian 4.0
  JobId:                  143114
  Job:                    IDD-T0013_System.2009-04-18_09.00.01.05
  Backup Level:           Full
  Client:                 "IDD-T0013" 2.4.2 (26Jul08) 
i686-redhat-linux-gnu,redhat,
  FileSet:                "LinuxFullSet" 2008-04-23 22:00:00
  Pool:                   "Disk" (From Job resource)
  Storage:                "File" (From Pool resource)
  Scheduled time:         18-Apr-2009 09:00:01
  Start time:             18-Apr-2009 10:43:42
  End time:               18-Apr-2009 10:48:07
  Elapsed time:           4 mins 25 secs
  Priority:               10
  FD Files Written:       285,321
  SD Files Written:       285,321
  FD Bytes Written:       4,775,491,269 (4.775 GB)
  SD Bytes Written:       4,816,494,175 (4.816 GB)
  Rate:                   18020.7 KB/s
  Software Compression:   None
  VSS:                    no
  Storage Encryption:     no
  Volume name(s):         00103
  Volume Session Id:      406
  Volume Session Time:    1239902538
  Last Volume Bytes:      35,997,755,905 (35.99 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK

2009-04-18 10:48:09 dss-bacula-dir JobId 143114: Begin pruning Jobs.
2009-04-18 10:48:22 dss-bacula-dir JobId 143114: No Jobs found to prune.
2009-04-18 10:48:22 dss-bacula-dir JobId 143114: Begin pruning Files.
2009-04-18 10:48:22 dss-bacula-dir JobId 143114: No Files found to prune.
2009-04-18 10:48:23 dss-bacula-dir JobId 143114: End auto prune.

The warnings are about the same. It took about 3.5 minutes to transfer the 
4.8GBs of data. 285k files are just a few less than 310k, however the insert 
was apparently done in about 1 minute.

I really am running out of ideas here. Any input (except rtfm please) is 
appreciated.

Regards

Ronald Buder

-- 
Mit freundlichen Grüßen

Ronald Buder

Tel.: +49(351)440080
Fax: +49(351)4400818
Mobil: +49(179)3218366
Email: rbuder AT proficom-ag DOT de
web: www.proficom-ag.de

profi.com AG business solutions
Firmensitz: Stresemannplatz 3, 01309 Dresden
Büro Berlin: Potsdamer Platz 11, 10785 Berlin

Amtsgericht Dresden, HRB 23438
Vorstand: Heiko Worm, Aufsichtsratsvorsitzender: Friedrich Geise

------------------------------------------------------------------------------
Stay on top of everything new and different, both inside and 
around Java (TM) technology - register by April 22, and save
$200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco.
300 plus technical and hands-on sessions. Register today. 
Use priority code J9JMT32. http://p.sf.net/sfu/p
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
<Prev in Thread] Current Thread [Next in Thread>