Amanda-Users

Re: Wrong amadmin reports and missing data

2006-03-09 11:24:50
Subject: Re: Wrong amadmin reports and missing data
From: "Iulian Topliceanu" <iulian.topliceanu AT net-m DOT de>
To: amanda-users AT amanda DOT org
Date: Thu, 9 Mar 2006 17:19:36 +0100 (CET)
Paul Bijnens wrote:
> On 2006-03-09 13:25, Iulian Topliceanu wrote:
>>
>> I'm using AMANDA server 2.4.5p1 on a RH 9 having DLT tapes and vtapes as
>> well.
>>
>> All the backup clients are Linux machines using ext3 fs.
>>
>> I had two particular problems with a client running CentOS and using
>> tar-1.14 and amanda-client 2.4.5p1.
>
> The CentOS tar-1.14 that I use does work fine AFAIK, and I also use
> Amanda 2.4.5p1 on the production server (and clients).
>
>
>> 1: amadmin (and the mail reports as well) reported that /var/spool/mail/p
>> level 0 dump, was on vtape-7, but for unknown reasons, vtape-7 didn't
>> contain any data regarding /var/spool/mail/p,  it contained
>> other data that had no relevance.
>
> Here you talk about "vtape-7", below you talk about "vtapes-7", plural.
> Typing mistake in the mail, I presume...
Typo, sorry.
>
> "other data"?  amanda backups? or garbage?
It contained other data, not garbage. It contained data that was reported
as having failed to be backed up:

  baal     /data/data0/share1  lev 0  FAILED [dump larger than tape,
31665455 KB, incremental dump also larger than tape]


>
> One other thing: is the /var/spool/mail/p a directory or a plain file?
/var/spool/mail/p is a directory.
>
>>
>> I wasn't able to find /var/spool/mail/p level 0 dump, on any of the
>> vtapes. The backup was inconsistent.
>
> What do you mean with "inconsistent" here?
> Is that entry the only one that is missing, and does the rest seem to
> be OK?
It wasn't the only entry that wasn't OK. I had another 7 DLEs with the
same problem, all of them also reported to be on vtape-7 (singular).
>
>
>>
>> I've checked the history of AMANDA, and /var/spool/mail/p level 0 dump,
>> was reported to be on vtapes-7, and there wasn't any error during the
>> backup procedure.
>
> ... "vtapes-7" ...
>
>
>>
>> The dump definition looks like this:
>>
>> dumpcycle 10 day        # the number of days in the normal dump cycle
>> runspercycle 8  day    # the number of amdump runs in dumpcycle days
>> tapecycle 12 tapes      # the number of tapes in rotation
>
> seems fine.
>
>>
>> Can I still trust AMANDA reports or should I do a manual check?
>
> I trust them.  But you don't have to believe me.
I've trusted them as well, and I will still trust them.
> If you have reason to not trust them, then extra checks are appropriate.
> However, would those extra checks have detected an error that Amanda
> was unable to?  What kind of extra checks did you think of?
Amanda reported that the dumps were successfully written to tape vtape-7,
but a simple look at what vtape-7 contained (using ls) proved the
opposite: there was no sign of /var/spool/mail/p (or of the other DLEs
that had the same problem) on vtape-7.

vtape-7 contained /data/data0/share1 (the directory which, as I said above,
was reported as not backed up successfully).
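
The manual check with ls could be scripted roughly as below. This is only a
sketch: the function name is made up, and it assumes that the dump files in a
vtape directory are named like <number>.<host>.<disk>.<level>, with the
slashes of the disk name replaced by underscores. Verify that against your
own vtape directory before relying on it.

```python
import os
from typing import List


def find_dle_on_vtape(vtape_dir: str, disk: str) -> List[str]:
    """Return the file names in vtape_dir that appear to belong to `disk`.

    ASSUMPTION: files are named <number>.<host>.<disk>.<level> and the
    disk name has every "/" replaced by "_". Adjust if your layout differs.
    """
    mangled = disk.replace("/", "_")
    # Match the mangled name as a whole dot-delimited field, so that
    # /var/spool/mail/p does not also match /var/spool/mail/pa.
    needle = "." + mangled + "."
    return sorted(name for name in os.listdir(vtape_dir) if needle in name)
```

An empty list for /var/spool/mail/p and a non-empty one for
/data/data0/share1 would match what ls showed on vtape-7.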

I should mention that these things didn't occur in the same run. My guess
is that Amanda successfully backed up /var/spool/mail/p (and the other
DLEs) on 24.02 and wrote them to vtape-7, but on 2.03, when this
error occurred:

baal /data/data0/share1 lev 0 FAILED [dump larger than tape, 31665455 KB,
incremental dump also larger than tape]

the dumps were written to vtape-7 again, which is why
/data/data0/share1 appears on vtape-7 instead of /var/spool/mail/p
(and the other DLEs).

But why doesn't amadmin report that? Why doesn't Amanda make a new level
0 dump of /var/spool/mail/p?

I've noticed that 12 vtapes were used in less than 10 days (the dumpcycle
= 10 days), but is that an excuse for missing data?
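
To make the timing concrete, here is a rough sketch of the arithmetic. It is
not Amanda's own logic: the function name is made up, and the year 2006 for
the two dates is an assumption taken from the date of this mail.

```python
from datetime import date


def overwritten_within_dumpcycle(written: date, overwritten: date,
                                 dumpcycle_days: int) -> bool:
    """True if a tape was reused before a full dumpcycle had elapsed.

    In that case the only level 0 of a DLE may be gone before the
    next level 0 was due.
    """
    return (overwritten - written).days < dumpcycle_days


level0_written = date(2006, 2, 24)  # level 0 of /var/spool/mail/p (year assumed)
tape_reused = date(2006, 3, 2)      # vtape-7 written again (year assumed)

gap_days = (tape_reused - level0_written).days
print(gap_days)  # 6
print(overwritten_within_dumpcycle(level0_written, tape_reused, 10))  # True
```

So vtape-7 came around again after 6 days, well inside the 10-day dumpcycle.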
>
>
>>
>> 2: the last level 0 dump of /var/spool/mail/p was on the 24.02, and the
>> next incremental dump took place on the 27.02, so, though the level 0 dump
>> of /var/spool/mail/p was missing, the incremental backup of
>> /var/spool/mail/p should have included all the new mails between 24.02
>> - 27.02.
>>
>> How is it possible for GNUtar to skip files that have a ctime newer than
>> the last level 0 dump?
>
> Do you mean that the level 1 dump did not contain the expected files
> either?  Did it contain anything at all?
The level 1 dump didn't contain *all* of the expected files. Some of the mails
received between the level 0 dump and the level 1 dump were not present.

I should add that the volume /var/spool/mail (which is split into
/var/spool/mail/[a-z]) is 130 GB in size and has ~7 million inodes.
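
One way to cross-check which files a level 1 should have picked up is to list
everything whose ctime is newer than the time of the level 0. The sketch below
is a hypothetical helper, not part of Amanda or GNU tar, and it only looks at
ctime; it says nothing about how tar itself decides what to include.

```python
import os
from typing import List


def files_changed_since(root: str, since_epoch: float) -> List[str]:
    """Return paths under root whose ctime is newer than since_epoch."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # The file vanished between listing and stat, which is
                # likely on a busy mail spool.
                continue
            if st.st_ctime > since_epoch:
                changed.append(path)
    return sorted(changed)
```

The result can then be compared with the file list of the level 1 dump. On a
volume with ~7 million inodes the walk will take a while.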

Iulian Topliceanu