BackupPC-users

Re: [BackupPC-users] BackupPC recovery from unreliable disk

2011-12-21 23:44:35
From: hansbkk AT gmail DOT com
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 22 Dec 2011 11:42:49 +0700
I know this doesn't help right now, but next time make sure your storage
platform doesn't depend on the reliability of any single piece of
hardware. Over the long term, there is no such thing.

On the low end I recommend LVM over RAID1 for small systems and RAID6
for bigger ones; high-end environments obviously have their SANs.

Just for future reference...

On Thu, Dec 22, 2011 at 9:50 AM, JP Vossen <jp AT jpsdomain DOT org> wrote:
> I'm running Debian Squeeze stock backuppc-3.1.0-9 on a server and I'm
> getting kernel messages [1] and SMART errors [2] about the WD 2TB SATA
> disk.  Fine, I RMA'd it and have the new one...  Now what?  I know I can
> either 'dd' or start fresh.  But...
>
>
> If I start fresh, I know everything will work and be valid, but I
> lose my historical backups when I wipe the bad disk and RMA it.
>
>
> If I 'ddrescue' BAD --> GOOD, I'll worry about the integrity of the
> BackupPC store.  As I understand it, the incoming files are hashed and
> stored, but the store itself is never checked (true?).  So when I do
> backups, if an incoming file hash matches a file already in the store,
> the incoming file is "de-duped" and dropped.  But what if the file
> actually in the store is corrupt due to the bad disk?
>
> Am I correct?  If so, is there a way to have BackupPC validate that the
> files in the pool actually match their hash and weren't mangled by the disk?
>
>
> Any other solution I'm missing?
>
> Thanks,
> JP
> ___________________________________________
> [1] Example kernel errors:
>
> Security Events for kernel
> =-=-=-=-=-=-=-=-=-=-=-=-=-
> kernel: [4020993.728571] end_request: I/O error, dev sda, sector 81203507
> kernel: [4021009.712952] end_request: I/O error, dev sda, sector 81203507
>
> System Events
> =-=-=-=-=-=-=
> kernel: [4020983.471256] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0
> action 0x0
> kernel: [4020983.471290] ata3.00: BMDMA stat 0x25
> kernel: [4020983.471315] ata3.00: failed command: READ DMA
> kernel: [4020983.471347] ata3.00: cmd
> c8/00:18:33:11:d7/00:00:00:00:00/e4 tag 0 dma 12288 in
> kernel: [4020983.471351]          res
> 51/40:07:33:11:d7/40:00:28:00:00/e4 Emask 0x9 (media error)
> kernel: [4020983.471424] ata3.00: status: { DRDY ERR }
> kernel: [4020983.471446] ata3.00: error: { UNC }
> kernel: [4020983.501157] ata3.00: configured for UDMA/133
>
>
> [2] Example SMART error:
>
> Error 1704 occurred at disk power-on lifetime: 10149 hours (422 days +
> 21 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 40 45 66 01 e0  Error: UNC 64 sectors at LBA = 0x00016645 = 91717
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 40 3f 66 01 e0 08  46d+13:36:50.242  READ DMA
>   ec 00 00 00 00 00 a0 08  46d+13:36:50.233  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 08  46d+13:36:50.225  SET FEATURES [Set transfer
> mode]
>
> ----------------------------|:::======|-------------------------------
> JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
> My Account, My Opinions     |=========|      http://www.jpsdomain.org/
> ----------------------------|=========|-------------------------------
> "Microsoft Tax" = the additional hardware & yearly fees for the add-on
> software required to protect Windows from its own poorly designed and
> implemented self, while the overhead incidentally flattens Moore's Law.
>
> ------------------------------------------------------------------------------
> Write once. Port to many.
> Get the SDK and tools to simplify cross-platform app development. Create
> new or port existing apps to sell to consumers worldwide. Explore the
> Intel AppUpSM program developer opportunity. appdeveloper.intel.com/join
> http://p.sf.net/sfu/intel-appdev
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki:    http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
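
On the quoted question of whether files already in the store are still
intact: a rough first check, independent of BackupPC, is simply to read
every file back and note which ones the disk can no longer deliver. The
sketch below is only that. It does NOT verify BackupPC's pool hashes, so
it cannot catch a file that reads cleanly but has wrong contents. The
function name find_unreadable and the example path are assumptions made
for illustration, not part of BackupPC.

```python
import os

CHUNK = 1024 * 1024  # read in 1 MiB pieces to keep memory use flat


def find_unreadable(root):
    """Return (path, error) pairs for regular files under root that
    cannot be read from start to end (for example, because of a media
    error on the underlying disk)."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip FIFOs, sockets, broken symlinks and other non-regular files.
            if not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while fh.read(CHUNK):
                        pass
            except OSError as exc:
                bad.append((path, str(exc)))
    return bad


# Example use (the path is an assumption; substitute your own store):
#   for path, err in find_unreadable("/var/lib/backuppc/cpool"):
#       print(path, err)
```

Because backups are hardlinked into the pool, pointing this at the pool
directories rather than the whole store avoids reading the same data many
times. Note that os.walk silently skips directories it cannot list, so an
empty result is a good sign but not proof that the store is clean.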
