BackupPC-users

Re: [BackupPC-users] BackupPC_trashClean (?) freezes system

2016-07-06 21:22:11
Subject: Re: [BackupPC-users] BackupPC_trashClean (?) freezes system
From: Adam Goryachev <mailinglists AT websitemanagers.com DOT au>
To: backuppc-users AT lists.sourceforge DOT net
Date: Thu, 7 Jul 2016 09:34:58 +1000
On 07/07/16 01:40, Witold Arndt wrote:
> Hi,
>
> On Wednesday, 29 June 2016 20:56:55 CEST Holger Parplies wrote:
>
>> Witold Arndt wrote on 2016-06-27 08:53:40 +0200 [Re: [BackupPC-users]
>> BackupPC_trashClean (?) freezes system]:
>>> On Sunday, 26 June 2016 22:21:45 CEST Adam Goryachev wrote:
>>>> Can you login to the server after it has "hung"? I'm assuming yes since
>>>> you can try to kill the process.
>>>> I'd strongly suggest checking the various logs, starting with dmesg
>>>> Also, check the physical "host" to see what it thinks the status of the
>>>> VM is.
>>> Yep, I can log in to the VM and everything besides BackupPC is running and
>>> instantly responsive. Other processes which use the disk have no problem
>>> reading or writing and iotop shows no hangups.
>> are they using the same file system? Can you show us a 'df -T' and perhaps
>> 'df -i' of your BackupPC VM?
> Yes, everything is on /dev/vda1, storage is on /san:
>
> $ df -T
> Filesystem        Type      1K-blocks      Used  Available Use% Mounted on
> udev              devtmpfs    2013336         4    2013332   1% /dev
> tmpfs             tmpfs        404824       364     404460   1% /run
> /dev/vda1         ext4        4391408   2543372    1601920  62% /
> none              tmpfs             4         0          4   0% /sys/fs/cgroup
> none              tmpfs          5120         0       5120   0% /run/lock
> none              tmpfs       2024120         0    2024120   0% /run/shm
> none              tmpfs        102400         0     102400   0% /run/user
> san:/vol1/storage nfs4     2879636864 188612096 2690922368   7% /san
>
> $ df -i
> Filesystem                 Inodes    IUsed     IFree IUse% Mounted on
> udev                       503334      411    502923    1% /dev
> tmpfs                      506030      333    505697    1% /run
> /dev/vda1                  287424   141971    145453   50% /
> none                       506030        2    506028    1% /sys/fs/cgroup
> none                       506030        5    506025    1% /run/lock
> none                       506030        1    506029    1% /run/shm
> none                       506030        2    506028    1% /run/user
> san:/vol1/storage       182853632 25729421 157124211   15% /san
>   
>>>> Almost every time I've tried to kill a process and seen it turn into a
>>>> zombie, it's because the process was sleeping / waiting for disk IO, and
>>>> it won't die until after the OS decides the disk IO has failed or
>>>> succeeded.
>>> This is consistent with the 85% waiting usage, but there are no errors in
>>> any log (dmesg, syslog, backuppc/log/*) whatsoever.
>>>
>>> I'm a bit lost since there were no configuration changes (besides removal
>>> and addition of backup clients) and this setup has been running since
>>> 04/2014.
>> I would suspect file system corruption. Is the trash directory empty when
>> the freeze occurs? In general, I'd suggest an 'fsck', but with a BackupPC
>> pool that might not work. You *could* try moving the trash directory out of
>> the way and recreating it with the same permissions. This would avoid
>> accessing a problematic file within it, supposing this is causing the
>> problems. Though, normally, I'd expect something in the system log files in
>> case of a file system panic. Well, 'df -T' might tell us more.
> fsck was done already and didn't show any errors. Since I didn't have any
> outages in the last days I'm not sure about the contents of trash/, but I will
> keep an eye on this.

I see, so now you tell us it is an NFS mount point. I suspect you are
seeing an NFS-related issue. To confirm, log in during a freeze and run
'ls /san': if you get the directory listing, NFS is fine; if the command
hangs, NFS is the problem. Once you know that, you can focus on solving
the NFS problem and forget about BackupPC.
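
If you would rather not risk a stuck login shell, the same check can be
wrapped in a time limit. This is only a sketch: probe_dir is a made-up
name, and it assumes GNU timeout from coreutils is installed. On a hard
NFS mount, an ls stuck in uninterruptible sleep may not react to the
signals, in which case the probe itself hangs, which answers the question
just as well.

```shell
#!/bin/sh
# probe_dir DIR [SECONDS]
# Hypothetical helper: returns 0 if listing DIR finishes within SECONDS
# (default 10), and 1 if ls fails or has to be killed.
probe_dir() {
    dir=$1
    limit=${2:-10}
    # timeout sends SIGTERM after $limit seconds; -k 2 follows up with
    # SIGKILL two seconds later if ls is still running.
    if timeout -k 2 "$limit" ls -- "$dir" >/dev/null 2>&1; then
        echo "OK: $dir answered within ${limit}s"
        return 0
    fi
    echo "PROBLEM: $dir failed or did not answer within ${limit}s"
    return 1
}

# Example call during a freeze:
#   probe_dir /san 10
```

Running it from cron every few minutes and logging the output would also
show whether the NFS stalls line up with the BackupPC freezes.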

I suspect you are hitting a performance limit which you didn't reach
before. Try tuning your NFS mount options and/or checking both the NFS
server and the NFS client for relevant statistics, logs, etc. (Hint:
resource exhaustion is happening somewhere.)
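
Before changing any mount options, it helps to see which ones the client
is actually using. A rough sketch, assuming a Linux client where
/proc/mounts is available; nfs_mount_opts is a hypothetical helper name,
and the optional second argument exists only so the function can be tried
against a saved copy of the mounts table.

```shell
#!/bin/sh
# nfs_mount_opts MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: prints the option string of the NFS mount at
# MOUNTPOINT. Prints nothing if MOUNTPOINT is not an NFS mount.
# /proc/mounts fields: device, mount point, fs type, options, dump, pass.
nfs_mount_opts() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ { print $4 }' \
        "${2:-/proc/mounts}"
}

# Example call on the BackupPC VM:
#   nfs_mount_opts /san
```

If nfs-utils (nfs-common on Debian/Ubuntu) is installed, 'nfsstat -m'
prints similar per-mount information and 'nfsstat -c' shows the
client-side call counters.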

You might find it better to have your NFS server empty the trash folder
locally instead of leaving that to the BackupPC server. All you need is a
cron-based script that runs rm -rf on the contents at regular intervals;
then look at the BackupPC script to comment out or disable the trash
cleanup step.
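
Something along these lines could run on the NFS server. Treat it as a
sketch under assumptions: empty_trash is a made-up name, the path you pass
in has to be wherever the BackupPC trash directory lives on the server's
local disk (I don't know your layout), and -mindepth/-delete assume GNU or
BSD find. It uses find rather than rm -rf so that the trash directory
itself stays in place.

```shell
#!/bin/sh
# empty_trash DIR
# Hypothetical helper: deletes everything inside DIR but keeps DIR itself.
# As a guard against typos it refuses any path not ending in "/trash".
empty_trash() {
    dir=$1
    case "$dir" in
        */trash) ;;
        *) echo "refusing: $dir is not a trash directory" >&2
           return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "refusing: $dir does not exist" >&2
        return 1
    fi
    # -xdev stays on one file system; -mindepth 1 spares DIR itself.
    find "$dir" -xdev -mindepth 1 -delete
}

# Example call (placeholder path, adjust to the server's layout):
#   empty_trash /path/to/backuppc/trash
```

A crontab entry for the user that owns the pool, calling a script that
contains the above, is then enough to run it every few minutes.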

PS: Actually, you will probably find it better to install local drives
in the BackupPC server instead of using NFS!


Regards,
Adam

-- 
Adam Goryachev Website Managers www.websitemanagers.com.au

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
