Amanda-Users

amanda client issue, dumps failing

2009-08-12 13:41:32
Subject: amanda client issue, dumps failing
From: Brian Cuttler <brian AT wadsworth DOT org>
To: amanda-users AT wadsworth DOT org
Date: Wed, 12 Aug 2009 13:24:35 -0400
Amanda users,

For the issue below: we see several hundred thousand emails move
through the system each day. ufsdump is failing because, it seems,
too many files come and go. It asks whether to continue but can't
get a reply (I don't know of a way to give it one anyway).

We tried to switch the problem DLE to gtar, but the estimate phase
seems to take hours to run. I haven't set etimeout high enough to get
an estimate yet, and this will push the actual dumps back by hours.

Is there a workaround I can employ, either to get quicker estimates
(ok to assume level 0 is the usage of the partition) or get ufsdump
to work ?
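
To illustrate the "level 0 is the usage of the partition" shortcut: the
space in use on a filesystem can be read from the filesystem statistics
without walking every file. The following is only a minimal Python
sketch of that idea, not an Amanda feature; the helper name
estimate_level0_kb is made up for illustration, and the figure will
overestimate if the DLE excludes anything on the partition.

```python
import shutil


def estimate_level0_kb(mount_point: str) -> int:
    """Return the space in use on the filesystem holding mount_point, in KB.

    Hypothetical helper: it treats "space in use" as the level 0 size,
    which is the assumption floated in the message, not a measured dump size.
    """
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free (bytes)
    return usage.used // 1024


# Example call; on the mail host the argument would be "/usr1".
level0_kb = estimate_level0_kb(".")
```

This returns almost immediately regardless of how many files are on the
partition, which is the point of the shortcut.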

I've recommended we do something to quiesce the system, but our mail god
hasn't seemed to take any interest in that suggestion. Nor do we
currently have a mechanism to snapshot or replicate (rsync, break a
mirror, etc.) the partition.

                                                thanks,

                                                Brian

----- Forwarded message from Brian Cuttler <brian AT wadsworth DOT org> -----

Date: Tue, 11 Aug 2009 11:37:35 -0400
From: Brian Cuttler <brian AT wadsworth DOT org>
To: daver <daver AT wadsworth DOT org>, amanda-users AT amanda DOT org,
        Chris Knight <knight AT wadsworth DOT org>
Cc: Ivan Auger <ivan.auger AT wadsworth DOT org>
Subject: Re: amanda probelm
In-Reply-To: <4A818AA0.5070300 AT wadsworth DOT org>
User-Agent: Mutt/1.4.1i


Reviewing the issue.

Server: Solaris 10 x86, Amanda 2.6.1 (with patches)
Client: Solaris 9, Amanda 2.4.4

The problem performing level 0 dumps is that there are a large
number of files in flux -- it's the mailhost system -- so ufsdump
eventually asks for help, to continue or quit.

There is no help in non-interactive mode, and I don't know if there
is a mechanism to get amanda to respond to ufsdump's query. So the
level 0 of /usr1 usually fails.

Warnings - Dave's suggestion, that the history of the error be more
explicit, is a good one. Can amdump report the last successful level 0,
i.e. the due date, if the current level 0 fails? Put it in the notes
section or something?
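
As a rough sketch of the kind of note being asked for: given a per-DLE
dump history, find the most recent successful level 0 and flag it when
it is older than the dump cycle. This is illustrative Python only. The
DumpRecord layout, the function names, and the sample dates in the
usage comment are all hypothetical; none of this is Amanda's actual
data format or API.

```python
from datetime import date
from typing import NamedTuple, Optional


class DumpRecord(NamedTuple):
    """One dump attempt (hypothetical layout, not Amanda's own format)."""
    dle: str      # e.g. "mailserv:/usr1"
    day: date     # date of the run
    level: int    # dump level attempted
    ok: bool      # True if the dump succeeded


def last_good_level0(history, dle) -> Optional[date]:
    """Return the date of the newest successful level 0 for dle, or None."""
    good = [r.day for r in history if r.dle == dle and r.level == 0 and r.ok]
    return max(good) if good else None


def overdue_note(history, dle, today, dumpcycle_days) -> str:
    """Build a one-line note suitable for a report's notes section."""
    last = last_good_level0(history, dle)
    if last is None:
        return f"{dle}: no successful level 0 on record"
    age = (today - last).days
    status = "OVERDUE" if age > dumpcycle_days else "ok"
    return f"{dle}: last good level 0 on {last.isoformat()} ({age} days ago, {status})"


# Usage with made-up data:
#   history = [DumpRecord("mailserv:/usr1", date(2009, 7, 20), 0, True)]
#   overdue_note(history, "mailserv:/usr1", date(2009, 8, 11), 7)
```

A line like this in the emailed report would surface a repeatedly
failing level 0 without anyone having to read the amdump files.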

Non-solutions - a snapshot of an open file is an open file. All things
being equal, you will get as many open files in a snapshot as in a
live system. This will not resolve the problem.

Workarounds
    1) Quiesce the mail server for a period of time, force a level 0
       of /usr1 during that interval.

    2) Quiesce mail delivery long enough to snapshot/rsync or break
       a mirror. Back up the placid copy.

    3) Q: will a tar of the DLE get a backup when ufsdump cannot?

    4) "It's not really a problem."
       By and large each individual message will be backed up "most"
       of the time: each message is its own file, and if it's not on
       the level 0 it's on almost all of the level 1 and 2 dumps.
       What we will lose are the index files, which would probably
       require a rebuild after a restore anyway, so they are not
       that important to get onto tape.
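
On question 3, the reason a file-level archiver might cope where
ufsdump does not is that it can skip a file that disappears and carry
on. The sketch below shows that idea using Python's tarfile module. It
is an illustration only and says nothing about how gtar on this client
actually behaves; the function name archive_tolerating_churn is
hypothetical, and files that change size while being read are not
handled here.

```python
import os
import tarfile


def archive_tolerating_churn(src_dir: str, tar_path: str) -> list:
    """Archive the regular contents of src_dir into tar_path.

    Files that vanish between the directory listing and the read are
    skipped instead of aborting the run. Returns the skipped paths.
    tar_path should live outside src_dir so the archive does not
    include itself.
    """
    skipped = []
    with tarfile.open(tar_path, "w") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir)
                try:
                    tar.add(path, arcname=arcname, recursive=False)
                except FileNotFoundError:
                    # The message was delivered elsewhere or deleted
                    # after we listed it; note it and keep going.
                    skipped.append(path)
    return skipped
```

The skipped list corresponds to the messages that, per workaround 4,
would be picked up (or be legitimately gone) by the next incremental.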


On Tue, Aug 11, 2009 at 11:13:36AM -0400, daver wrote:
> Brian, checking over all the overdue file systems on curie, I find only
> mailserv:/usr1 is truly overdue.
>
> As you mentioned, this may be due to the system being too active.
> There are NO warnings in this regard, with the exception of a "Can't
> switch to degraded mode for unknown reason" in amdump.
> 
> As we just about never read these amanda files and use the email
> generated by the system to notify us of problems, this would seem to be
> a significant problem with Amanda. I agree that the developers should
> be contacted in this regard.
>
> As to getting /usr1 backed up: if amanda can't do it, perhaps we need to
> consider an alternative, like tar.
> 
> 
> 
---
   Brian R Cuttler                 brian.cuttler AT wadsworth DOT org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773


----- End forwarded message -----





