BackupPC-users

Re: [BackupPC-users] Multiple issues on newly installed 4.1.0.

2017-04-19 10:20:51
Subject: Re: [BackupPC-users] Multiple issues on newly installed 4.1.0.
From: Jens Potthast <jens.potthast AT innovation.uni-bremen DOT de>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Wed, 19 Apr 2017 16:04:16 +0200

Hello,

After upgrading from 3.x to 4.1 and 4.1.1 and converting all backups to 4.x format, I apparently have the same problem: BackupPC_tarExtract is using 100% CPU and the backup seems to be halted. To stop the backup, I need to stop it exactly twice in the Web-Frontend. Don't know why it has to stopped twice.

Manual backup with verbose logging seems to work better. Sometimes backup stalls at 100% CPU, sometimes it works. It seems to work after having the stalled backup stopped on the console with CRTL+C (again, needs two attempts).

Here is my log from a failed attempt, shortened for better readability, my comments in square brackets:

[Thousands of lines like these:]
  same     644       0/0         301 Somefolder/Somefolder/Somefile.one
  same     644       0/0       59177 Somefolder/Somefolder/Somefile.two
  delete   644       0/0       12013 Somefolder/Somefolder/Somefile.three
  delete   644       0/0      103814 Somefolder/Somefolder/Somefile.four
  delete   755       0/0           0 Somefolder/Somefolder

[Then a huge amount like these:]
tarExtract: copyInodes: finished getAll()
tarExtract: copyInodes: finished getAll()
tarExtract: copyInodes: finished getAll()
tarExtract: copyInodes: finished getAll()
tarExtract: copyInodes: finished getAll()

[backup uses 100% CPU with no disk or network IO. Need to cancel with CTRL+C (twice)]

^C^Cexiting after signal INT
__bpc_progress_state__ fail cleanup
BackupFailCleanup: nFilesTotal = 0, type = full, BackupCase = 4, inPlace = 0, lastBkupNum = 1617
Removing prior partial backup #1617
__bpc_progress_state__ delete #1617
cmdSystemOrEval: about to system /usr/share/backuppc/bin/BackupPC_backupDelete -h <backup-host> -n 1617 -l
Xfer PIDs are now 12036,11684
xferPids 12036,11684
BackupPC_backupDelete: removing #1617
__bpc_progress_state__ merge #1617 -> #1614
BackupPC_backupDelete: Merge into backup 1614
Deltas for 1614:
Uncompressed HT:
Compressed HT:
Xfer PIDs are now 12036,12037,11684
xferPids 12036,12037,11684
__bpc_progress_state__ refCnt #1618
bpc_attrib_backwardCompat: WriteOldStyleAttribFile = 0, KeepOldAttribFiles = 0
__bpc_progress_state__ cntUpdate #1618
__bpc_progress_state__ rename #1618
bpc_poolRefFileRead: got 486418 entries (nRead = 524288)
__bpc_progress_state__ sumUpdate 0/128
bpc_poolRefFileRead: got 2 entries (nRead = 41)
bpc_poolRefFileRead: got 8 entries (nRead = 149)
bpc_poolRefFileRead: got 26 entries (nRead = 473)
[...many nearly identical lines omitted]

__bpc_progress_state__ rename total
BackupPC_refCountUpdate: host <backup-host> got 0 errors (took 18 secs)
Xfer PIDs are now 11684
xferPids 11684
Finished BackupPC_refCountUpdate (running time: 18 sec)
Xfer PIDs are now 11684
xferPids 11684
dump failed: aborted by user (signal=INT)


BackupPC 3.x worked flawlessly for many years. Is there anything I can do to help fix this problem?


Regards,

Jens


Am 2017-04-10 06:01, schrieb Craig Barratt:

I'm not sure where to start.  First, I updated the documentation and pushed a change to update the mtime on the backup directories to match the backup ending time.
 
Second, I'm ignoring a few comments since they seem like gratuitous complaining.  Feel free to submit a pull request on git if you really care about those issues and I'll consider them.
 
If your main config file is in /etc/BackupPC/config.pl, then your per-client config.pl files need to be below /etc/BackupPC/pc.  That's why putting your N$ setting in /data/BackupPC/pc/win2k8server/config.pl has no effect.
 
I'm not sure why the N$ backup and SCGI are consuming 100% cpu time.  Are you sure you want to use SCGI?
 
You could get real-time output from the N$ backup by running BackupPC_dump manually with the -v option.  You should also increase $Conf{XferLogLevel} to, eg, 5:
 
    su BackupPCUser
    BackupPC_dump -v win2k8server

Craig
 
 

On Fri, Apr 7, 2017 at 1:28 AM, G.W. Haywood <backup AT jubileegroup.co DOT uk> wrote:
Hello again,

On Wed, 5 Apr 2017, G.W. Haywood wrote:

> ... later I added the 'N$' share to the BackupPC
> configuration.  That was about a week ago.  Last night's backup
> started at 21:00.  After more than 14, hours one process is still
> using 100% of a CPU:
>
> PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 6352 ged       20   0  109364  68124   4568 R 100.0  1.7 449:24.60
> /usr/bin/perl /usr/local/BackupPC/bin/BackupPC_tarExtract -h hostname -s N$
> -f

After letting it run for 36 hours with no sign of it writing to the log:

-rw-r----- 1 ged ged 2580494 Apr  5 02:32 XferLOG.8.z

I tried to stop it:

host07:# >>> ps axufwww | grep Backup
ged      32382  0.0  0.3  58900 12380 ?        S    Mar30   0:01 /usr/bin/perl
/usr/local/BackupPC/bin/BackupPC -d
ged      32383  0.0  0.5  74072 23428 ?        SN   Mar30   0:00  \_
/usr/bin/perl /usr/local/BackupPC/bin/BackupPC_Admin_SCGI
ged      32384  0.0  0.5  78456 24056 ?        SN   Mar30   0:00  |   \_
/usr/bin/perl /usr/local/BackupPC/bin/BackupPC_Admin_SCGI
ged      20756  0.0  0.6  79120 24720 ?        SN   Mar31   0:00  |   \_
/usr/bin/perl /usr/local/BackupPC/bin/BackupPC_Admin_SCGI
ged       4310  0.0  0.5  79624 21432 ?        S    Apr04   0:14  \_
/usr/bin/perl /usr/local/BackupPC/bin/BackupPC_dump hostname
ged       6352 91.8  1.6 109364 68124 ?        R    Apr04 1818:28      \_
/usr/bin/perl /usr/local/BackupPC/bin/BackupPC_tarExtract -h hostname -s N$ -f
host07:# >>> /etc/init.d/backuppc stop
Stopping backuppc:
ok.
host07:# >>> ps axufwww | grep Backup
ged       6352 91.8  1.6 109364 68124 ?        R    Apr04 1818:54 /usr/bin/perl
/usr/local/BackupPC/bin/BackupPC_tarExtract -h hostname -s N$ -f
host07:# >>> ps axufwww | grep Backup
ged       6352 91.8  1.6 109364 68124 ?        R    Apr04 1819:01 /usr/bin/perl
/usr/local/BackupPC/bin/BackupPC_tarExtract -h hostname -s N$ -f

The errant process wouldn't respond to '/etc/init.d/backuppc stop'
so I killed it:

host07:# >>> kill 6352
host07:# >>> ps axufwww | grep Backup
host07:# >>>

Then I restarted BackupPC:

host07:# >>> /etc/init.d/backuppc start
Starting backuppc: ok.

Then I let it run for another day, with similar results.  At 09:00
BackupPC_tarExtract is chewing 100% CPU and XferLOG.9.z was last
written at 00:37 this morning.

There seems to be an infinite loop in there somewhere which I'm
somehow triggering.

Any suggestions for debugging?

--

73,
Ged.

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/