BackupPC-users

Re: [BackupPC-users] Need for better 'timeout' logic

2008-12-07 11:10:59
From: Johan Ehnberg <johan AT ehnberg DOT net>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Sun, 07 Dec 2008 17:49:05 +0200
dan wrote:
>     Specifically, it seems to me that we should distinguish (at least)
>     among the following situations for long dump/restore times
>     1. Large backups/slow links - here...
> 
> 
>     2. Disconnected PC or newly degraded link speed - here it would be
>       nice to have a separate "timeout"...
> 
>     3. Rebooted PC - in this case it may depend on the backup method. For
>       'rsync' you might as well stop ...
> 
>     4. Hung backup - here you want to stop if no "activity" (e.g., no
>       information) is being transferred ...
> 
> 
> maybe the timeout should actually run a script and proceed based on the 
> response.  have the timeout set low and run a script to check the 
> bandwidth usage of that process, then let the timeout period run again 
> and compare the bandwidth.  This would compensate for a large file as 
> there would be bandwidth used for the rsync process and allow a quick 
> termination if there is no network connectivity anymore.
> 
> thoughts?
> 
> 

I've been dealing with this quite a lot with complex setups.

The pragmatic approach is SSH, which has an option that has been helpful:
BatchMode (or just ServerAliveInterval). Technically, it covers problems
2 and 3. Problem 1 should not be an issue (or should be handled only by the
current client timeout), and problem 4 should already be reported in the log
as it is; otherwise it's a bug.
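
As a rough sketch of the keepalive idea, the helper below builds an ssh command line with BatchMode, ServerAliveInterval and ServerAliveCountMax set explicitly. The function name, the default values and the host name are assumptions for illustration, not anything BackupPC ships; the three option names are standard OpenSSH client options.

```python
def ssh_keepalive_args(host, interval=60, count_max=3, batch_mode=True):
    """Build an ssh argv that drops a dead link after roughly
    interval * count_max seconds without a reply from the server."""
    args = ["ssh"]
    if batch_mode:
        # BatchMode=yes disables password prompts, so an unattended
        # run fails instead of waiting for input.
        args += ["-o", "BatchMode=yes"]
    args += [
        # Send a keepalive through the encrypted channel every `interval` seconds.
        "-o", f"ServerAliveInterval={interval}",
        # Disconnect after this many unanswered keepalives.
        "-o", f"ServerAliveCountMax={count_max}",
        host,
    ]
    return args


if __name__ == "__main__":
    # "client.example" is a placeholder host name.
    print(" ".join(ssh_keepalive_args("client.example")))
```

With these illustrative defaults, a disconnected or rebooted client would be noticed after about three minutes, independently of the overall client timeout.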

Also, only SSH is truly designed for connections over the internet. SMB
especially, but also rsync, can't handle them very well, since both rely on
TCP only. In other words, they are best suited for LANs. (In which case you
have access to the hardware and drivers that would cause any problems.)

However, analysing the bandwidth would be a very interesting way to create a
smarter or more universal diagnostic. But it would have to be aware of
things like rsync checksumming times and concurrent connections/server
loads. Something like "if all of these are true, stop":
- No network activity for the link
- Network load (bottlenecks included!) is not at 100%
- Server load is not at 100%
- Client load is not at 100%
- Situation is unchanged after a period of time

This should take into account load balancing priorities, nice levels...
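
The "if all of these are true, stop" rule could be sketched roughly as follows. This is only an illustration under assumptions: the Sample fields, the looks_hung name, the 0.95 "saturated" threshold and the three-sample window are all made up here, and how the samples get collected (byte counters, load averages) is left out entirely. It also ignores priorities and nice levels.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    bytes_transferred: int  # cumulative bytes seen on the backup link
    network_load: float     # 0.0-1.0, bottlenecks included
    server_load: float      # 0.0-1.0
    client_load: float      # 0.0-1.0


def looks_hung(samples, min_samples=3, busy=0.95):
    """Return True only if all conditions held over the last min_samples samples."""
    if len(samples) < min_samples:
        # Not enough history yet to say the situation is unchanged.
        return False
    window = samples[-min_samples:]
    # The counter is cumulative, so equal endpoints mean no traffic in between.
    no_traffic = window[-1].bytes_transferred == window[0].bytes_transferred
    # A saturated network, server or client explains the silence
    # (for example, rsync checksumming), so do not stop in that case.
    nothing_saturated = all(
        s.network_load < busy and s.server_load < busy and s.client_load < busy
        for s in window
    )
    return no_traffic and nothing_saturated


if __name__ == "__main__":
    idle = [Sample(1000, 0.1, 0.2, 0.1)] * 3
    checksumming = [Sample(1000, 0.1, 0.2, 1.0)] * 3
    print(looks_hung(idle), looks_hung(checksumming))
```

Requiring every condition across the whole window is what keeps a long checksumming phase or a busy server from being mistaken for a hung backup.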

I look forward to any insights this discussion may reveal.

Regards,
Johan
