Amanda-Users

Re: Multi-tape span failure

2007-10-31 17:29:35
Subject: Re: Multi-tape span failure
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 31 Oct 2007 17:19:53 -0400
On Wed, Oct 31, 2007 at 01:59:48PM -0500, Tom Hansen wrote:
> Jon LaBadie wrote:
> >On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:
> >  
> >>BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu Linux 6.10, 
> >>configured to back up several large (300 GB+) filesystems spanning 
> >>several tapes.  I have a robot changer, LTO1 tapes (100 GB capacity) and 
> >>I used:
> >>
> >>   tape_splitsize 3Gb
> >>   fallback_splitsize 256m
> >>
> >>[ stuff deleted ]
> >>MY QUESTION:  Is there any way to configure Amanda so that a tape 
> >>error like this would simply move on to the next tape, instead of the 
> >>worst possible action, which is to abort the whole job?
> >>
> >>Short of that, is there any way Amanda could start up from where it left 
> >>off?
> >>
> >>    
> >
> >Short answer - no.  If the backups are in a holding disk they can
> >still be flushed to tape, but resuming an interrupted backup, no.
> >
> >
> >Something in your report is amiss.  If amanda had successfully
> >used 6 tapes, it would have completed backing up and taping
> >one or more of your 300GB DLEs.  There is no reason a failed
> >tape after that would invalidate those backups.  And your
> >report (emailed or available with amreport) would show that.
> >  
> 
> Following is the report.  It clearly says "FAILED" for all 4 filesystems 
> under "FAILURE AND STRANGE DUMP SUMMARY" and sure enough, I could not 
> see any files using "amrecover". (I have done a test using one small 
> filesystem, and amrecover did work in that case, so I'm pretty confident 
> that my setup is good.)
> 
> I did just notice that, at the very bottom, it does not indicate failure 
> for the two filesystems that were complete.  I'm not sure what to make 
> of that.
> 
> Thanks for your comments.  (Oh, and BTW, I was totally wrong about the 
> dump time; it was more like 20 hours.)
> 
> -Tom
> 
> 
> 
> 
> 
> Hostname: waterbase
> Org     : GLWI
> Config  : fullback
> Date    : October 29, 2007
> 
> These dumps were to tapes GLWIBACK-001, GLWIBACK-002, GLWIBACK-003, 
> GLWIBACK-004, GLWIBACK-005, GLWIBACK-006.
> *** A TAPE ERROR OCCURRED: [No more writable valid tape found].
> Some dumps may have been left in the holding disk.
> Run amflush to flush them to tape.
> The next 9 tapes Amanda expects to use are: 9 new tapes.
> 
> FAILURE AND STRANGE DUMP SUMMARY:
>  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [out of tape]
>  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [data write: Broken pipe]
>  waterbase.uwm.edu  /             lev 0  FAILED [can't switch to incremental dump]
>  waterbase.uwm.edu  /media/raid2  lev 0  FAILED [dump to tape failed]
> 
> 
> STATISTICS:
>                          Total       Full      Incr.
>                        --------   --------   --------
> Estimate Time (hrs:min)    1:00
> Run Time (hrs:min)        20:06
> Dump Time (hrs:min)       16:25      16:25       0:00
> Output Size (meg)      690435.5   690435.5        0.0
> Original Size (meg)    690351.3   690351.3        0.0
> Avg Compressed Size (%)     --         --         --
> Filesystems Dumped            2          2          0
> Avg Dump Rate (k/s)     11966.4    11966.4        --
> 
> Tape Time (hrs:min)       16:14      16:14       0:00
> Tape Size (meg)        690435.5   690435.5        0.0
> Tape Used (%)             665.3      665.3        0.0
> Filesystems Taped             2          2          0
> 
> Chunks Taped               3121       3121          0
> Avg Tp Write Rate (k/s) 12093.4    12093.4        --
> 
> USAGE BY TAPE:
>  Label              Time      Size      %    Nb    Nc
>  GLWIBACK-001       3:01 130531776K  122.8     0   498
>  GLWIBACK-002       3:10 135774016K  127.7     0   518
>  GLWIBACK-003       3:01 123874432K  116.5     1   473
>  GLWIBACK-004       3:05 143113152K  134.6     0   546
>  GLWIBACK-005       2:56 124765312K  117.4     0   476
>  GLWIBACK-006       3:38 159734400K  150.3     1   610
> 
> 
> FAILED AND STRANGE DUMP DETAILS:
> 
> /--  waterbase.uwm.edu /media/raid2 lev 0 FAILED [data write: Broken pipe]
> sendbackup: start [waterbase.uwm.edu:/media/raid2 level 0]
> sendbackup: info BACKUP=/bin/tar
> sendbackup: info RECOVER_CMD=/bin/tar -xpGf - ...
> sendbackup: info end
> | gtar: ./mysql_trans/mysql.sock: socket ignored
> \--------
> 
> 
> NOTES:
>  planner: Adding new disk waterbase.uwm.edu:/.
>  planner: Adding new disk waterbase.uwm.edu:/media/raid0.
>  planner: Adding new disk waterbase.uwm.edu:/media/raid1.
>  planner: Adding new disk waterbase.uwm.edu:/media/raid2.
>  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid1.0 in-memory
>  taper: tape GLWIBACK-001 kb 130547712 fm 499 writing file: short write
>  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 130547712kb mark: [writing file: short write]
>  taper: tape GLWIBACK-002 kb 135895488 fm 519 writing file: short write
>  taper: continuing waterbase.uwm.edu:/media/raid1.0 on new tape from 266338304kb mark: [writing file: short write]
>  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid0.0 in-memory
>  taper: tape GLWIBACK-003 kb 124064672 fm 474 writing file: short write
>  taper: continuing waterbase.uwm.edu:/media/raid0.0 on new tape from 21233664kb mark: [writing file: short write]
>  taper: tape GLWIBACK-004 kb 143219328 fm 547 writing file: short write
>  taper: continuing waterbase.uwm.edu:/media/raid0.0 on new tape from 164364288kb mark: [writing file: short write]
>  taper: tape GLWIBACK-005 kb 125018816 fm 477 writing file: short write
>  taper: continuing waterbase.uwm.edu:/media/raid0.0 on new tape from 289144832kb mark: [writing file: short write]
>  taper: mmap failed (Cannot allocate memory): using fallback split size of 262144kb to buffer waterbase.uwm.edu:/media/raid2.0 in-memory
>  taper: tape GLWIBACK-006 kb 159989024 fm 611 writing file: short write
> 
> 
> DUMP SUMMARY:
>                                       DUMPER STATS               TAPER STATS
> HOSTNAME     DISK        L ORIG-KB  OUT-KB  COMP%  MMM:SS   KB/s MMM:SS   KB/s
> -------------------------- ------------------------------------- -------------
> waterbase.uw /           0 FAILED --------------------------------------------
> waterbase.uw -edia/raid0 0 337970560 338011808    --   466:31 12074.1 460:04 12245.2
> waterbase.uw -edia/raid1 0 368949150 368994176    --   518:11 11866.7 514:18 11957.7
> waterbase.uw -edia/raid2 0 FAILED --------------------------------------------
> 
> (brought to you by Amanda version 2.5.2p1)
> 

It appears to me that two DLEs (/media/raid0 and /media/raid1) were
successfully dumped and taped.
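
A quick arithmetic cross-check of the report supports that reading. The
sketch below is not an Amanda tool: the rows are copied by hand from the
DUMP SUMMARY (wrapped lines rejoined, trailing dashes trimmed), and the
helper name summarize is made up for illustration. It assumes that "meg"
in the STATISTICS section means KB / 1024.

```python
# Rows copied from the DUMP SUMMARY in the quoted report (hand-transcribed).
ROWS = [
    "waterbase.uw /           0 FAILED",
    "waterbase.uw -edia/raid0 0 337970560 338011808 -- 466:31 12074.1 460:04 12245.2",
    "waterbase.uw -edia/raid1 0 368949150 368994176 -- 518:11 11866.7 514:18 11957.7",
    "waterbase.uw -edia/raid2 0 FAILED",
]


def summarize(rows):
    """Hypothetical helper: split summary rows into taped and failed DLEs."""
    taped, failed = {}, []
    for row in rows:
        fields = row.split()
        disk = fields[1]
        if fields[3] == "FAILED":
            failed.append(disk)
        else:
            # Columns 4 and 5 are ORIG-KB and OUT-KB.
            taped[disk] = (int(fields[3]), int(fields[4]))
    return taped, failed


taped, failed = summarize(ROWS)

# Assumption: the report's "meg" figures are kilobytes divided by 1024.
orig_meg = round(sum(orig for orig, _ in taped.values()) / 1024, 1)
out_meg = round(sum(out for _, out in taped.values()) / 1024, 1)

print("taped: ", sorted(taped))
print("failed:", failed)
print("Original Size (meg):", orig_meg)
print("Output Size (meg):  ", out_meg)
```

Under that assumption the two sums come out equal to the "Original Size"
and "Output Size" lines in STATISTICS, so everything counted there is
accounted for by raid0 and raid1 alone.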

jl
-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
