Re: [Bacula-users] Issues With Quantum 160/320 SDLT - HP/Compaq Proliant G3

2008-06-19 09:17:40
From: RA Cohen <roy AT net-vantage DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 19 Jun 2008 09:17:21 -0400
Here is an actual [edited] transcript of the problem:

Checking on things yesterday afternoon before the jobs run:

Scheduled Jobs:
Level          Type     Pri  Scheduled          Name               Volume
===================================================================================
Full           Backup    10  19-Jun-08 02:00    fservNightlySave   Wednesday-0229
Full           Backup    10  19-Jun-08 02:00    mservNightlySave   Wednesday-0229
Full           Backup    10  19-Jun-08 02:15    BackupCatalog      Wednesday-0229
====

+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
| MediaId | VolumeName     | VolStatus | Enabled | VolBytes       | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         |
+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
|     229 | Wednesday-0229 | Recycle   |       1 | 46,618,951,680 |       49 |  157,680,000 |       1 |    0 |         0 | SDLT      | 2008-05-22 02:58:30 |
|     261 | Wednesday-0010 | Used      |       1 | 80,370,662,400 |       82 |  157,680,000 |       1 |    0 |         0 | SDLT      | 2008-06-12 03:29:58 |
+---------+----------------+-----------+---------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+

This morning:

stat dir

Running Jobs:
 JobId Level   Name                       Status
======================================================================
  1887 Full    fservNightlySave.2008-06-19_02.00.12 is waiting on Storage InternalQuantum
  1888 Full    mservNightlySave.2008-06-19_02.00.13 is waiting execution
  1889 Full    BackupCatalog.2008-06-19_02.15.15 is waiting execution
====

After cancelling the jobs:

*mess
19-Jun 08:47 fserv-sd JobId 1887: Fatal error:
     Device "InternalQuantum" with MediaType "SDLT" requested by DIR not found in SD Device resources.
19-Jun 08:47 fserv-dir JobId 1887: Fatal error:
     Storage daemon didn't accept Device "InternalQuantum" because:
     3924 Device "InternalQuantum" not in SD Device resources.

BUT IT IS IN SD Device resources - from this particular bacula-sd.conf 
(which is an alternate config when compared to another system detailed 
in my first posting):

# A FreeBSD tape drive
#
Device {
  Name = InternalQuantum
  Description = "DDS-4 for FreeBSD"
  Media Type = SDLT
  Archive Device = /dev/sa0
  AutomaticMount = yes
  AlwaysOpen = yes
  LabelMedia = yes
  Offline On Unmount = no
  Hardware End of Medium = no
  BSF at EOM = yes
  Backward Space Record = no
  Backward Space File = no
  Fast Forward Space File = no
  TWO EOF = yes
}

And yes I have tried the other configuration for FreeBSD tape drives - 
same result.
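
For anyone wanting to rule out a simple name mismatch: the error above 
says the Director asked for Device "InternalQuantum" with MediaType 
"SDLT", so one quick sanity check is to list the Name and Media Type of 
every Device resource in the file the storage daemon reads. What follows 
is only a minimal sketch. It assumes simple one-directive-per-line 
"Key = Value" resources like the one above, the function names 
list_sd_devices and sd_has_device are made up for illustration, and it 
inspects the file on disk only, not whatever a running daemon has loaded.

```shell
#!/bin/sh
# Print "name<TAB>media type" for each Device resource in an SD config file.
# Assumption: one directive per line and no nested braces inside Device { }.
list_sd_devices() {
  awk '
    /^[ \t]*#/ { next }                                  # skip comment lines
    /^[ \t]*Device[ \t]*[{]/ { in_dev = 1; name = ""; media = ""; next }
    in_dev && /^[ \t]*[}]/ { print name "\t" media; in_dev = 0; next }
    in_dev {
      pos = index($0, "=")
      if (pos == 0) next
      # Bacula ignores case and spaces in directive keywords, so normalise.
      key = tolower(substr($0, 1, pos - 1)); gsub(/[ \t]/, "", key)
      val = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", val); gsub(/"/, "", val)
      if (key == "name") name = val
      if (key == "mediatype") media = val
    }
  ' "$1"
}

# Succeed only if the file defines a Device with exactly this name and type.
# $1 = config file, $2 = device name, $3 = media type
sd_has_device() {
  list_sd_devices "$1" | grep -Fqx "$(printf '%s\t%s' "$2" "$3")"
}
```

Usage would be along the lines of 
sd_has_device /usr/local/etc/bacula-sd.conf InternalQuantum SDLT && echo found 
(the path is an assumption; substitute wherever your bacula-sd.conf lives).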

There is something going on that is preventing Bacula from properly 
dealing with a tape whose catalog entry is marked Recycle when the 
correct tape is in the drive. Again, when I erase the tape (mt rewind, 
then mt weof), I can run these three jobs in the same order as above 
without problem. I have not yet updated the firmware, but I doubt that 
is the cause since the firmware is fairly up to date. The peculiar part 
is the "waiting on Storage" status: I have not seen this before, and 
when I Google it almost nothing turns up.
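
For reference, the erase step I keep mentioning is just the two mt 
commands from steps 5 and 6 of my first posting (quoted below). Here is 
a small sketch of them as a script. The names TAPE_DEV, DRY_RUN, run and 
blank_tape are illustrative only, the device defaults to the /dev/sa0 
used above, and by default it only prints what it would do, since 
writing an EOF mark at the start of a tape makes the existing data 
unreadable.

```shell
#!/bin/sh
# Sketch of the manual workaround: rewind, then write an EOF mark at the
# beginning of the tape. With DRY_RUN=1 (the default) nothing is executed.
TAPE_DEV="${TAPE_DEV:-/dev/sa0}"
DRY_RUN="${DRY_RUN:-1}"

# Either print the command or execute it, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Stop at the first failing command so weof is never issued after a
# failed rewind.
blank_tape() {
  run mt -f "$TAPE_DEV" rewind || return 1
  run mt -f "$TAPE_DEV" weof
}
```

Calling blank_tape as-is prints the two commands; setting DRY_RUN=0 
first makes it actually run them, which should only be done with the 
drive unmounted in Bacula.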

Kern, Dan, Bacula gurus, please have a look at this ... the problem is 
wearing thin for both the users and myself ... and thank you for an 
otherwise fantastic product that I have used for years.

**********************************************
Bruno Friedmann wrote:
> All I can tell you is that last week I had a customer having lots of
> trouble with their Quantum SDLT (IBM release).
>
> We got a new 10-pack of tapes, and some of the tapes in it were bad. If
> you tried a simple operation like loading a tape and issuing
> mt -f /dev/st0 rewind (or status, or eject), it ended with an I/O error,
> "Cartridge fault".
> The bad part was that we had to stop the server, remove the electrical
> power, and then restart.
>
> We updated the server with the latest firmware. (There is one for the
> LSI internal SCSI card.)
>
> Now a tape load indicates an I/O error cartridge fault, but the drive
> keeps working.
> An eject command works (without a shutdown).
>
> So my advice would be to find the latest HP firmware update CD (which
> is 8.0) and boot with it, to double-check whether there is any update.
> It really helps. (I saw this on another customer's ML570 Proliant, which
> had disk trouble last Saturday night.)
>
> Hope this helps you a bit.
>
> RA Cohen wrote:
>   
>> I have used Bacula with great success with Compaq Proliant DL380 G3s and 
>> the Compaq (Quantum) 20/40 and 40/80 DLT drives. These were mostly 
>> external drives plugged into the same SCSI channel as the drives on 
>> these machines. I never had problems that left me scratching my head 
>> and, incidentally, never ran these drives with SCSI terminators.
>>
>> Now I have had to upgrade several of my customers to the Quantum 160/320 
>> SDLT and there have been nothing but problems! OK, I understand it is 
>> best practice to add a separate SCSI controller for tape drives in 
>> general, and I have done so. I also believe these drives like to be 
>> terminated, and I have done that. But what in the world is going on when:
>>
>> 1. Correct tape is in drive, tape marked Recycle in catalog.
>> 2. Stat Dir reports the jobs scheduled to run that night correctly, and 
>> directed to the tape in the drive.
>> 3. Night comes - unfortunately I do not have the exact message here - 
>> but the key is the first job is "waiting on device ExternalQuantum" 
>> ([ExternalQuantum] is obviously the drive). The other jobs are also just 
>> piled up behind as expected. If I issue "mount" I get the same "waiting 
>> on" message eventually.
>> 4. If I cancel everything, umount the drive, quit bacula, and run:
>>
>> 5. mt -f /dev/sa0 rewind
>> 6. mt -f  /dev/sa0 weof
>>
>> Re-enter bconsole and:
>>
>> 7. If I run each job manually (in the identical and correct order 
>> because the first job loads and last job unloads) without purging and 
>> deleting the volume I am in the same place exactly.
>> 8. If I purge and delete the volume and then manually run each job they 
>> all run perfectly.
>>
>> And I also should mention that this seems to be somewhat tape-dependent 
>> - sometimes they will recycle and accept the job correctly. But, the 
>> tapes that produce the error mentioned on step #3 seem also to be able 
>> to accept the jobs when done in the manner of step #8, above, so I 
>> cannot conclude these tapes are bad. And I also should note I am having 
>> this problem on two somewhat different systems:
>>
>> -Proliant G3 FreeBSD 6.2 server Bacula 2.2.8_2 with drive on separate 
>> SCSI controller and terminated. External drive array shelf attached to 
>> on-board controller as are the internal drives.
>> -Proliant G3 FreeBSD 6.2 server Bacula 2.2.8_2 drive on same on-board 
>> controller as internal drives and not terminated.
>>
>> Here is the drive stuff from bacula-sd.conf:
>>
>> # A FreeBSD tape drive
>> Device {
>>  Name = Quantum160
>>  Media Type = SDLT
>>  Archive Device = /dev/sa0
>>  AutomaticMount = yes
>>  AlwaysOpen = yes
>>  LabelMedia = yes
>>  Offline On Unmount = no
>>  Hardware End of Medium = no
>>  BSF at EOM = no
>>  Backward Space Record = no
>>  Backward Space File = no
>>  Fast Forward Space File = yes
>>  TWO EOF = no
>> }
>>
>> I have tried the alternate configuration in the Bacula docs with same 
>> result.
>>
>> I'm stumped - any and all help welcomed and appreciated.
>>

-- 
Roy A Cohen
Network Advantage LLC
413.330.9568
www.net-vantage.com

