Here is an actual [edited] transcript of the problem:
Checking on things yesterday afternoon, before the jobs ran:
Scheduled Jobs:
Level  Type    Pri  Scheduled        Name              Volume
===================================================================================
Full   Backup   10  19-Jun-08 02:00  fservNightlySave  Wednesday-0229
Full   Backup   10  19-Jun-08 02:00  mservNightlySave  Wednesday-0229
Full   Backup   10  19-Jun-08 02:15  BackupCatalog     Wednesday-0229
====
+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
| MediaId | VolumeName     | VolStatus | Enabled | VolBytes       | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         |
+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
|     229 | Wednesday-0229 | Recycle   |       1 | 46,618,951,680 |       49 |  157,680,000 |       1 |    0 |         0 | SDLT      | 2008-05-22 02:58:30 |
|     261 | Wednesday-0010 | Used      |       1 | 80,370,662,400 |       82 |  157,680,000 |       1 |    0 |         0 | SDLT      | 2008-06-12 03:29:58 |
+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
This morning:
stat dir
Running Jobs:
JobId  Level  Name                                  Status
======================================================================
 1887  Full   fservNightlySave.2008-06-19_02.00.12  is waiting on Storage InternalQuantum
 1888  Full   mservNightlySave.2008-06-19_02.00.13  is waiting execution
 1889  Full   BackupCatalog.2008-06-19_02.15.15     is waiting execution
====
After cancelling the jobs:
*mess
19-Jun 08:47 fserv-sd JobId 1887: Fatal error:
     Device "InternalQuantum" with MediaType "SDLT" requested by DIR not found in SD Device resources.
19-Jun 08:47 fserv-dir JobId 1887: Fatal error:
     Storage daemon didn't accept Device "InternalQuantum" because:
     3924 Device "InternalQuantum" not in SD Device resources.
BUT IT IS in the SD Device resources. This is from this particular
system's bacula-sd.conf (an alternate configuration compared with the
one on the other system detailed in my first posting):
# A FreeBSD tape drive
#
Device {
  Name = InternalQuantum
  Description = "DDS-4 for FreeBSD"
  Media Type = SDLT
  Archive Device = /dev/sa0
  AutomaticMount = yes
  AlwaysOpen = yes
  LabelMedia = yes
  Offline On Unmount = no
  Hardware End of Medium = no
  BSF at EOM = yes
  Backward Space Record = no
  Backward Space File = no
  Fast Forward Space File = no
  TWO EOF = yes
}
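
One way to rule out a plain name mismatch is to check mechanically that the
Device name and Media Type the Director asks for appear together in a Device
resource of the SD config file. The sketch below is only an illustration and
is not part of Bacula: parse_devices and has_device are hypothetical helper
names, the parser ignores nested braces and quoted "#" characters, and it
inspects the file text only, not what the running storage daemon has loaded.

```python
import re


def parse_devices(conf_text):
    """Return one {directive: value} dict per Device { ... } block.

    Simplified sketch: assumes no nested braces inside a Device block
    and no '#' or '}' characters inside quoted values.
    """
    devices = []
    pattern = r"\bDevice\s*\{(.*?)\}"
    for block in re.findall(pattern, conf_text, flags=re.S | re.I):
        directives = {}
        for line in block.splitlines():
            line = line.split("#", 1)[0].strip()  # drop trailing comments
            if "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Bacula ignores case and spaces in directive keywords, so
            # "Media Type" and "MediaType" are normalized to one key.
            key = key.replace(" ", "").lower()
            directives[key] = value.strip().strip('"')
        devices.append(directives)
    return devices


def has_device(conf_text, name, media_type):
    """True if some Device block has exactly this Name and Media Type.

    Exact string comparison is a simplification made for this sketch.
    """
    return any(
        d.get("name") == name and d.get("mediatype") == media_type
        for d in parse_devices(conf_text)
    )


# Abbreviated copy of the Device resource shown above.
SAMPLE = """
Device {
  Name = InternalQuantum
  Media Type = SDLT
  Archive Device = /dev/sa0
}
"""

print(has_device(SAMPLE, "InternalQuantum", "SDLT"))
```

For the resource shown above the check succeeds, which is consistent with the
claim that the device is defined. If the file checks out, it may also be worth
confirming that the running storage daemon was started with this file (its -c
option selects the config file, and -t reads the config and exits).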
And yes I have tried the other configuration for FreeBSD tape drives -
same result.
Something is preventing Bacula from properly handling a volume whose
catalog entry is marked Recycle, even when the correct tape is in the
drive. Again, when I erase the tape (mt rewind, then mt weof), I can
run these three jobs in the same order as above without a problem. I
have not yet updated the firmware, but I doubt that is the cause since
the firmware is fairly up to date. The peculiar part is the "waiting on
Storage" status: I have not seen it before, and a Google search turns
up almost nothing.
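
For reference, the erase step mentioned above can be scripted. This is a
minimal sketch, assuming mt is on the PATH and the drive is /dev/sa0 as in the
config; erase_commands and erase_tape are hypothetical names used only for
illustration. It defaults to a dry run because weof makes the existing data on
the tape unreadable. Note that the original report also says the volume had to
be purged and deleted in bconsole before the jobs would run; that part is not
covered here.

```python
import subprocess


def erase_commands(device="/dev/sa0"):
    """Return the two mt invocations used in this thread, in order."""
    return [
        ["mt", "-f", device, "rewind"],  # position at beginning of tape
        ["mt", "-f", device, "weof"],    # write an end-of-file mark there
    ]


def erase_tape(device="/dev/sa0", dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for cmd in erase_commands(device):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if rewind fails, so weof is
            # never issued against a tape in an unknown position.
            subprocess.run(cmd, check=True)


erase_tape()  # dry run: only prints the commands
```

Run this only after the jobs are cancelled and the drive is unmounted from
Bacula, as in the steps described in the original report.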
Kern, Dan, Bacula gurus, please have a look at this ... the problem is
wearing thin for both the users and myself ... and thank you for an
otherwise fantastic product that I have used for years.
**********************************************
Bruno Friedmann wrote:
> All I can tell you is that last week I had a customer who was having a lot
> of trouble with their Quantum SDLT (IBM release).
>
> We got a new 10-pack of tapes, and some of the tapes in it were bad. Even a
> simple operation, such as loading a tape and issuing mt -f /dev/st0 rewind,
> status, or eject, ended in an I/O error: cartridge fault.
> The bad part was that we had to shut down the server, remove the electrical
> power, and then restart.
>
> We updated the server with the latest firmware (there is one for the
> internal LSI SCSI card).
>
> Now a bad tape still loads and reports an I/O error (cartridge fault), but
> the drive keeps working, and an eject command works without a shutdown.
>
> So my advice would be to find the latest HP firmware update CD (which is
> 8.0) and boot from it to double-check whether there is any update.
> It really helps. (I saw this with another customer, whose ProLiant ML570
> had disk trouble last Saturday night.)
>
> Hope this helps you a bit.
>
> RA Cohen wrote:
>
>> I have used Bacula with great success with Compaq Proliant DL380 G3s and
>> the Compaq (Quantum) 20/40 and 40/80 DLT drives. These were mostly
>> external drives plugged into the same SCSI channel as the drives on
>> these machines. I never had problems that left me scratching my head
>> and, incidentally, never ran these drives with SCSI terminators.
>>
>> Now I have had to upgrade several of my customers to the Quantum 160/320
>> SDLT and there have been nothing but problems! OK, I understand it is
>> best practice to add a separate SCSI controller for tape drives in
>> general, and I have done so. I also believe these drives like to be
>> terminated, and I have done that. But what in the world is going on when:
>>
>> 1. Correct tape is in drive, tape marked Recycle in catalog.
>> 2. Stat Dir reports the jobs scheduled to run that night correctly, and
>> directed to the tape in the drive.
>> 3. Night comes - unfortunately I do not have the exact message here -
>> but the key is that the first job is "waiting on device ExternalQuantum"
>> (ExternalQuantum is obviously the drive). The other jobs are just
>> piled up behind it, as expected. If I issue "mount" I eventually get the
>> same "waiting on" message.
>> 4. If I cancel everything, umount the drive, quit bacula, and run:
>>
>> 5. mt -f /dev/sa0 rewind
>> 6. mt -f /dev/sa0 weof
>>
>> Re-enter bconsole and:
>>
>> 7. If I run each job manually (in the identical and correct order
>> because the first job loads and last job unloads) without purging and
>> deleting the volume I am in the same place exactly.
>> 8. If I purge and delete the volume and then manually run each job they
>> all run perfectly.
>>
>> And I also should mention that this seems to be somewhat tape-dependent
>> - sometimes they will recycle and accept the job correctly. But, the
>> tapes that produce the error mentioned on step #3 seem also to be able
>> to accept the jobs when done in the manner of step #8, above, so I
>> cannot conclude these tapes are bad. And I also should note I am having
>> this problem on two somewhat different systems:
>>
>> -Proliant G3 FreeBSD 6.2 server Bacula 2.2.8_2 with drive on separate
>> SCSI controller and terminated. External drive array shelf attached to
>> on-board controller as are the internal drives.
>> -Proliant G3 FreeBSD 6.2 server Bacula 2.2.8_2 drive on same on-board
>> controller as internal drives and not terminated.
>>
>> Here is the drive stuff from bacula-sd.conf:
>>
>> # A FreeBSD tape drive
>> Device {
>>   Name = Quantum160
>>   Media Type = SDLT
>>   Archive Device = /dev/sa0
>>   AutomaticMount = yes
>>   AlwaysOpen = yes
>>   LabelMedia = yes
>>   Offline On Unmount = no
>>   Hardware End of Medium = no
>>   BSF at EOM = no
>>   Backward Space Record = no
>>   Backward Space File = no
>>   Fast Forward Space File = yes
>>   TWO EOF = no
>> }
>>
>> I have tried the alternate configuration in the Bacula docs with same
>> result.
>>
>> I'm stumped - any and all help welcomed and appreciated.
--
Roy A Cohen
Network Advantage LLC
413.330.9568
www.net-vantage.com
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users