[Bacula-users] How to recover volumes put into "error" state

2009-02-18 08:31:38
From: Win Htin <win.htin AT gmail DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 18 Feb 2009 08:29:10 -0500
Hi folks,

Is there a way to recover a volume that was put into "error" status?
This happened to two volumes, each from a different pool, in my
production environment.

Bacula Version : 2.2.6 (with 2.2.8 patch)
OS : RHEL4

Following is the first output:

==================== start ========================
18-Feb 06:00 backupserver-dir JobId 8168: BeforeJob: run command
"/etc/bacula/config/make_catalog_backup bacula bacula BACULA
localhost"
18-Feb 06:01 backupserver-dir JobId 8168: Start Backup JobId 8168,
Job=BackupCatalog.2009-02-18_06.00.28
18-Feb 06:01 backupserver-dir JobId 8168: Using Device "LTO4_1"
18-Feb 06:02 backupserver-sd JobId 8168: 3307 Issuing autochanger
"unload slot 3, drive 0" command.
18-Feb 06:03 backupserver-sd JobId 8168: 3304 Issuing autochanger
"load slot 5, drive 0" command.
18-Feb 06:03 backupserver-sd JobId 8168: 3305 Autochanger "load slot
5, drive 0", status is OK.
18-Feb 06:03 backupserver-sd JobId 8168: 3301 Issuing autochanger
"loaded? drive 0" command.
18-Feb 06:03 backupserver-sd JobId 8168: 3302 Autochanger "loaded?
drive 0", result is Slot 5.
18-Feb 06:03 backupserver-sd JobId 8168: Volume "000023" previously
written, moving to end of data.
18-Feb 06:25 backupserver-sd JobId 8168: Error: Unable to position to
end of data on device "LTO4_1" (/dev/nst0): ERR=dev.c:1326 read error
on "LTO4_1" (/dev/nst0). ERR=Input/output error.

18-Feb 06:25 backupserver-sd JobId 8168: Marking Volume "000023" in
Error in Catalog.
18-Feb 06:26 backupserver-dir JobId 8168: Using Volume "000024" from
'Scratch' pool.
18-Feb 06:26 backupserver-sd JobId 8168: 3307 Issuing autochanger
"unload slot 5, drive 0" command.
18-Feb 06:27 backupserver-sd JobId 8168: 3304 Issuing autochanger
"load slot 15, drive 0" command.
18-Feb 06:28 backupserver-sd JobId 8168: 3305 Autochanger "load slot
15, drive 0", status is OK.
18-Feb 06:28 backupserver-sd JobId 8168: 3301 Issuing autochanger
"loaded? drive 0" command.
18-Feb 06:28 backupserver-sd JobId 8168: 3302 Autochanger "loaded?
drive 0", result is Slot 15.
18-Feb 06:28 backupserver-sd JobId 8168: Wrote label to prelabeled
Volume "000024" on device "LTO4_1" (/dev/nst0)
18-Feb 06:28 backupserver-sd JobId 8168: Job write elapsed time =
00:00:35, Transfer rate = 69.95 M bytes/second
18-Feb 06:29 backupserver-dir JobId 8168: Bacula backupserver-dir
2.2.6 (10Nov07): 18-Feb-2009 06:29:03
  Build OS:               x86_64-unknown-linux-gnu redhat Enterprise release
  JobId:                  8168
  Job:                    BackupCatalog.2009-02-18_06.00.28
  Backup Level:           Full
  Client:                 "backupserver-fd" 2.2.6 (10Nov07)
x86_64-unknown-linux-gnu,redhat,Enterprise release
  FileSet:                "Catalog" 2007-12-03 04:00:00
  Pool:                   "Fulls" (From Job resource)
  Storage:                "TS3200_1_DRV1" (From Pool resource)
  Scheduled time:         18-Feb-2009 06:00:00
  Start time:             18-Feb-2009 06:01:53
  End time:               18-Feb-2009 06:29:03
  Elapsed time:           27 mins 10 secs
=================== end ==========================

Following is the second output:
=================== start ==========================
17-Feb 23:00 backupserver-dir JobId 8161: Using Device "LTO4_2"
17-Feb 23:00 backupserver-sd JobId 8161: 3301 Issuing autochanger
"loaded? drive 1" command.
17-Feb 23:00 backupserver-sd JobId 8161: 3302 Autochanger "loaded?
drive 1", result is Slot 12.
17-Feb 23:00 backupserver-sd JobId 8161: Volume "626AAL" previously
written, moving to end of data.
18-Feb 02:27 backupserver-sd JobId 8161: Error: Unable to position to
end of data on device "LTO4_2" (/dev/nst1): ERR=dev.c:1326 read error
on "LTO4_2" (/dev/nst1). ERR=Input/output error.

18-Feb 02:27 backupserver-sd JobId 8161: Marking Volume "626AAL" in
Error in Catalog.
18-Feb 02:28 backupserver-dir JobId 8161: Using Volume "000027" from
'Scratch' pool.
18-Feb 02:28 backupserver-sd JobId 8161: 3307 Issuing autochanger
"unload slot 12, drive 1" command.
18-Feb 02:28 backupserver-sd JobId 8161: 3304 Issuing autochanger
"load slot 13, drive 1" command.
 ==================== end =======================
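
To make the two failures easier to spot, here is a rough sketch (not
part of Bacula) that pulls the "Marking Volume ... in Error" entries
out of job output like the above. The function name and the sample
text are only for illustration; the sample is copied from the two
reports. The pattern allows any whitespace between words because the
mail client wrapped the log lines.

```python
import re

# Job output is wrapped by the mail client, so match any whitespace
# (including newlines) between the words of the message.
ERROR_RE = re.compile(
    r'JobId\s+(\d+):\s+Marking\s+Volume\s+"([^"]+)"\s+in\s+Error\s+in\s+Catalog'
)


def volumes_marked_error(job_output):
    """Return (job_id, volume_name) pairs for volumes marked in Error."""
    return [(int(job), vol) for job, vol in ERROR_RE.findall(job_output)]


# Lines copied from the two job reports above.
sample = """\
18-Feb 06:25 backupserver-sd JobId 8168: Marking Volume "000023" in
Error in Catalog.
18-Feb 02:27 backupserver-sd JobId 8161: Marking Volume "626AAL" in
Error in Catalog.
"""

print(volumes_marked_error(sample))
# prints [(8168, '000023'), (8161, '626AAL')]
```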

I'm not sure whether this is related to the LTO4 firmware update I
did yesterday afternoon. I stopped all Bacula processes during the
firmware update, but there were volumes/tapes in the tape drives at
the time. Then again, the volumes that were in the tape drives were
appended to without errors. The errors happened only when other
volumes had to be loaded.

Thanks in advance for your suggestions.

Win
