Re: [Bacula-users] Failing Windows Backups

2016-04-15 09:17:14
Subject: Re: [Bacula-users] Failing Windows Backups
From: Mingus Dew <shon.stephens AT gmail DOT com>
To: bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Fri, 15 Apr 2016 09:16:26 -0400
I also wanted to include the job log from an example failure:

2016-04-15 04:10:48 mt-back4.director JobId 711684: Start Backup JobId 711684, Job=egv-fdcpss-vm04_Daily_Disk.2016-04-15_04.00.00_26
2016-04-15 04:10:48 mt-back4.director JobId 711684: Using Device "Mentora_Incr_Device-5" to write.
2016-04-15 04:10:49 egv-fdcpss-vm04.fdcclient JobId 711684: shell command: run ClientRunBeforeJob "start /w wbadmin delete systemstatebackup -backuptarget:C: -keepversions:0 -quiet"
2016-04-15 04:10:51 egv-fdcpss-vm04.fdcclient JobId 711684: shell command: run ClientRunBeforeJob "start /w wbadmin start systemstatebackup -backuptarget:C: -quiet"
2016-04-15 04:28:33 mt-back4.director JobId 711684: There are no more Jobs associated with Volume "Mentora_Incr_Disk-13390". Marking it purged.
2016-04-15 04:28:33 mt-back4.director JobId 711684: All records pruned from Volume "Mentora_Incr_Disk-13390"; marking it "Purged"
2016-04-15 04:28:33 mt-back4.director JobId 711684: Recycled volume "Mentora_Incr_Disk-13390"
2016-04-15 04:28:33 mt-back4.storage JobId 711684: Recycled volume "Mentora_Incr_Disk-13390" on device "Mentora_Incr_Device-5" (/mnt/backup1/mentora/bacula/incremental), all previous data lost.
2016-04-15 04:28:34 mt-back4.director JobId 711684: Max Volume jobs=1 exceeded. Marking Volume "Mentora_Incr_Disk-13390" as Used.
2016-04-15 04:28:34 egv-fdcpss-vm04.fdcclient JobId 711684: Generate VSS snapshots. Driver="Win64 VSS", Drive(s)="C"
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "Task Scheduler Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "VSS Metadata Store Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "Performance Counters Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "System Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "ASR Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "IIS Config Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "Registry Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "Shadow Copy Optimization Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "COM+ REGDB Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "BITS Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:39 egv-fdcpss-vm04.fdcclient JobId 711684: VSS Writer (BackupComplete): "WMI Writer", State: 0x1 (VSS_WS_STABLE)
2016-04-15 04:36:44 egv-fdcpss-vm04.fdcclient JobId 711684: Error: lib/bsock.c:382 Socket is terminated=1 on call to client:10.0.1.65:9102
2016-04-15 04:36:45 mt-back4.storage JobId 711684: Elapsed time=00:08:11, Transfer rate=35.40 M Bytes/second
2016-04-15 04:36:44 egv-fdcpss-vm04.fdcclient JobId 711684: shell command: run ClientAfterJob "start /w wbadmin delete systemstatebackup -backuptarget:C: -keepversions:0 -quiet"
2016-04-15 04:36:47 mt-back4.director JobId 711684: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
2016-04-15 04:36:47 mt-back4.director JobId 711684: Error: Bacula mt-back4.director 5.2.13 (19Jan13):
  JobId:                  711684


On Fri, Apr 15, 2016 at 9:12 AM, Mingus Dew <shon.stephens AT gmail DOT com> wrote:
Dear All,
     I'm assuming that setting "Heartbeat Interval = 300" is the same as 5 minutes. I want to be clear on its usage, though: is it meant for cases where a firewall or router is not honoring keepalives?
     
     I'm having issues with backup failures on a mix of Windows clients. These all seem to involve bsock.c timeout errors. Sometimes the jobs complete; mostly the FD finishes writing files to the SD but then loses its connection to the Director, and the job is never written to the database. It has become a hot-button issue for me that I can't seem to resolve.

     I've added the Heartbeat Interval to the Director and FD configs. I've verified that the security team is not seeing any dropped connections in their firewall logs, and the firewall's connection timeout is 4 hours. I've trimmed the FileSet to minimize the data transferred. The systems are all VMs running some version of Windows (2008 or 2012). Other than that, I haven't been able to find anything in common.
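
     For reference, the heartbeat change looks roughly like the sketch below. This is illustrative only: the resource names are taken from the job log, the other directives in each resource are omitted, and the exact placement is an assumption about a typical bacula-dir.conf / bacula-fd.conf layout rather than a copy of my real files. A bare number is interpreted as seconds, so 300 should mean 5 minutes.

```
# bacula-dir.conf on the backup server (sketch; other directives omitted)
Director {
  Name = mt-back4.director
  # heartbeat every 300 seconds (5 minutes)
  Heartbeat Interval = 300
}

# bacula-fd.conf on each Windows client (sketch; other directives omitted)
FileDaemon {
  Name = egv-fdcpss-vm04.fdcclient
  # same interval on the client side
  Heartbeat Interval = 300
}
```

     Both daemons need a restart (or the Director a reload) before the new interval takes effect.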

     I'd appreciate any ideas. The server is Bacula 7.0.5 and the clients are all 5.x.

Yours,
Shon


     
