Amanda-Users

Amanda's response to malfunction of tape library

2008-06-13 12:13:38
Subject: Amanda's response to malfunction of tape library
From: Chris Hoogendyk <hoogendyk AT bio.umass DOT edu>
To: AMANDA users <amanda-users AT amanda DOT org>
Date: Fri, 13 Jun 2008 09:36:36 -0400
We had an intense line of thunderstorms come through on the night of June 10th. It took out power across campus. In our building, log files indicate the power was out from about 23:00 until 00:50, so nearly 2 hours. That was too long for the battery backups to carry everything.

Amanda backups were running when the power dropped. I have no way of knowing whether the tape was being written when the power dropped or was just sitting in the drive. Anyway, amcleanup did its job. The serious problem is with the library. It was improperly powered off, and it now doesn't seem to sense the tape that was in the drive. I have to sort that out. In the meantime, mt and mtx aren't happy.

The puzzling thing is that when Amanda ran starting the evening of June 11, it did not fall back to incrementals only. It got an error on the tape check, but proceeded to do a mix of fulls and incrementals. When I ran amstatus, it showed current dumps that were "dump done" and "wait for writing to tape", along with dumps from the 10th that were "waiting to flush". I happen to have plenty of holding disk space, so it's not a problem. But it's also not what I expected to happen.

The mtx debug was as follows:


mormyrid:/tmp/amanda/server:amanda$ more chg-zd-mtx.20080611224501.debug

chg-zd-mtx: debug 1 pid 998 ruid 555 euid 555: start at Wed Jun 11 22:45:01 2008
22:45:04 Using config file /usr/local/etc/amanda/daily/changer.conf
22:45:04 Arg info:
        $# = 1
        $0 = "/usr/local/libexec/chg-zd-mtx"
        $1 = "-info"
22:45:04 Running: /usr/local/sbin/mtx status
22:45:04 Exit code: 1
        Stderr:
no Data Transfer Element reported
22:45:04 Exit (2) -> <none> no slots available
chg-zd-mtx: pid 1082 finish time Wed Jun 11 22:45:04 2008
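
The condition in the log above (mtx exiting non-zero with "no Data Transfer Element reported") could in principle be caught by a check run before the nightly backup. What follows is only a minimal sketch under stated assumptions: `changer_has_drive` is a hypothetical helper, not part of Amanda or mtx, and it assumes that a healthy library lists at least one "Data Transfer Element" line in its `mtx status` output.

```shell
#!/bin/sh
# Sketch only: classify captured `mtx status` output before starting a run.
# changer_has_drive is a hypothetical helper, not part of Amanda or mtx.
changer_has_drive() {
    # $1: combined stdout and stderr captured from `mtx status`
    case "$1" in
        *"no Data Transfer Element reported"*) return 1 ;;  # the failure logged above
        *"Data Transfer Element"*)             return 0 ;;  # at least one drive is listed
        *)                                     return 1 ;;  # unrecognized output: treat as unusable
    esac
}
```

One way to use it would be to capture the output with `out=$(/usr/local/sbin/mtx status 2>&1)` and then run `changer_has_drive "$out" || echo "tape library needs attention"`. The function only inspects text it is given, so it never touches the changer itself.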



As a follow-up question, is there even a moderately clean way of telling Amanda to cut its losses and terminate if we have UPS software running and get the message that a power outage is impending? I know that terminating ufsdump is more than a little problematic, so there may be a limit to what Amanda can do. But if it could at least terminate taper and either put the library in a static position with mtx or allow me to run a script that would do that, it might save some trouble.


---------------

Chris Hoogendyk

-
  O__  ---- Systems Administrator
 c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogendyk AT bio.umass DOT edu>

---------------
Erdös 4


