Bacula-users

Re: [Bacula-users] Unable to position to end of data on device

2013-09-25 05:19:50
Subject: Re: [Bacula-users] Unable to position to end of data on device
From: Deepak <deepak AT palincorporates DOT com>
To: Radosław Korzeniewski <radoslaw AT korzeniewski DOT net>
Date: Wed, 25 Sep 2013 14:46:10 +0530
Quoting Radosław Korzeniewski <radoslaw AT korzeniewski DOT net>:

> Hello,
>
> 2013/9/24 Deepak <deepak AT palincorporates DOT com>:
>> Quoting Radosław Korzeniewski <radoslaw AT korzeniewski DOT net>:
>>
>>> Hello,
>>>
>>> 2013/9/24 Deepak <deepak AT palincorporates DOT com>:
>>>>
>>>>
>>>> I have removed the lin_tape driver, permanently loaded the st driver, and
>>>> rebooted the system. The tape drives now show up as st devices.
>>>>
>>>> [root@bacula ~]# lsscsi | grep tape
>>>> [1:0:2:0]    tape    IBM      ULT3580-TD4      B710  /dev/st0
>>>> [1:0:3:0]    tape    IBM      ULT3580-TD4      B710  /dev/st1
>>>> [1:0:4:0]    tape    IBM      ULT3580-TD4      B710  /dev/st5
>>>> [1:0:5:0]    tape    IBM      ULT3580-TD4      B710  /dev/st7
>>>> [2:0:4:0]    tape    IBM      ULT3580-TD4      B710  /dev/st2
>>>> [2:0:5:0]    tape    IBM      ULT3580-TD4      B710  /dev/st3
>>>> [2:0:6:0]    tape    IBM      ULT3580-TD4      B710  /dev/st4
>>>> [2:0:7:0]    tape    IBM      ULT3580-TD4      B710  /dev/st6
>>>>
>>>>
>>>> I have also updated the device configuration in my bacula-sd file.
>>>>
>>>> The device configuration for the autochanger is below:
>>>>
>>>> #######################################################
>>>> Autochanger {
>>>>   Name = Autochanger
>>>>   Device = Tape-0, Tape-1, Tape-2, Tape-3
>>>>   Changer Command = "/usr/lib64/bacula/mtx-changer %c %o %S %a %d"
>>>>   Changer Device = /dev/changer
>>>
>>>
>>> Is it the right device? For mtx you need to use the SCSI generic device.
>>> Could you show the output of lsscsi -g, please?
>>>
>>
>>
>> Below is the output of the lsscsi -g command:
>>
>> [root@bacula conf.d]# lsscsi -g | grep IBM
>> [0:0:0:0]    disk    IBM-ESXS ST9146852SS      B626  -         /dev/sg0
>> [0:0:1:0]    disk    IBM-ESXS ST9146852SS      B626  -         /dev/sg1
>> [1:0:0:1]    disk    IBM      1818      FAStT  0730  /dev/sdd   /dev/sg12
>> [1:0:0:31]   disk    IBM      Universal Xport  0730  -         /dev/sg13
>> [1:0:1:0]    tape    IBM      ULT3580-TD4      B710  /dev/st4   /dev/sg14
>> [1:0:2:0]    tape    IBM      ULT3580-TD4      B710  /dev/st5   /dev/sg15
>> [1:0:3:0]    tape    IBM      ULT3580-TD4      B710  /dev/st6   /dev/sg16
>> [1:0:4:0]    tape    IBM      ULT3580-TD4      B710  /dev/st7   /dev/sg17
>> [1:0:4:1]    mediumx IBM      03584L32         7440  /dev/sch1  /dev/sg18
>
> That is your autochanger device: /dev/sg18. I recommend using a
> permanent device name from udev.
>
>> [1:0:5:1]    disk    IBM      1818      FAStT  0730  /dev/sde   /dev/sg19
>> [1:0:5:31]   disk    IBM      Universal Xport  0730  -         /dev/sg20
>> [1:0:6:1]    disk    IBM      1818      FAStT  0730  /dev/sdg   /dev/sg23
>> [1:0:6:31]   disk    IBM      Universal Xport  0730  -         /dev/sg24
>> [1:0:7:1]    disk    IBM      1818      FAStT  0730  /dev/sdi   /dev/sg27
>> [1:0:7:31]   disk    IBM      Universal Xport  0730  -         /dev/sg28
>> [2:0:0:1]    disk    IBM      1818      FAStT  0730  /dev/sdb   /dev/sg3
>> [2:0:0:31]   disk    IBM      Universal Xport  0730  -         /dev/sg4
>> [2:0:1:1]    disk    IBM      1818      FAStT  0730  /dev/sdc   /dev/sg5
>> [2:0:1:31]   disk    IBM      Universal Xport  0730  -         /dev/sg6
>> [2:0:2:0]    tape    IBM      ULT3580-TD4      B710  /dev/st0   /dev/sg7
>> [2:0:3:0]    tape    IBM      ULT3580-TD4      B710  /dev/st1   /dev/sg8
>> [2:0:4:0]    tape    IBM      ULT3580-TD4      B710  /dev/st2   /dev/sg9
>> [2:0:4:1]    mediumx IBM      03584L32         7440  /dev/sch0  /dev/sg10
>
> Well, a "second" autochanger device? :) I bet you misconfigured your
> tape library. In most cases it is not the problem.

I think it shows up that way because my blade server has two HBA
cards and I have zoned the tape library to both of them.


>
>> [2:0:5:0]    tape    IBM      ULT3580-TD4      B710  /dev/st3   /dev/sg11
>> [2:0:6:1]    disk    IBM      1818      FAStT  0730  /dev/sdf   /dev/sg21
>> [2:0:6:31]   disk    IBM      Universal Xport  0730  -         /dev/sg22
>> [2:0:7:1]    disk    IBM      1818      FAStT  0730  /dev/sdh   /dev/sg25
>> [2:0:7:31]   disk    IBM      Universal Xport  0730  -         /dev/sg26
>> [root@bacula conf.d]#
>>
>> The tape and mediumx entries belong to the tape library.
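
The changer's SCSI generic node can be picked out of a listing like the one quoted above. The following is only a minimal sketch: the function name changer_sg_devices is made up for illustration, and it assumes the device type is the second column and the sg node is the last column, as in the lsscsi -g output shown in this thread.

```shell
# Print the SCSI generic node (last column) for each medium changer
# found in `lsscsi -g` output supplied on stdin.
# changer_sg_devices is a hypothetical helper name, not part of any package.
changer_sg_devices() {
    awk '$2 == "mediumx" { print $NF }'
}

# Demo with two lines copied from the listing in this thread.
changer_sg_devices <<'EOF'
[1:0:4:0]    tape    IBM      ULT3580-TD4      B710  /dev/st7   /dev/sg17
[1:0:4:1]    mediumx IBM      03584L32         7440  /dev/sch1  /dev/sg18
EOF
# prints /dev/sg18

# On a live system (assumes lsscsi is installed):
#   lsscsi -g | changer_sg_devices
```

Because sg numbering can change between boots, the value found this way is probably best treated as a starting point for locating a persistent udev name rather than as something to hard-code.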
>>
>>
>>
>>
>>>> }
>>>>
>>>> Device {
>>>>   Name = Tape-0
>>>>   Drive Index = 0
>>>>   Media Type = LTO-4
>>>> #  Archive Device = /dev/lin_tape/IBMtape0
>>>>   Archive Device = /dev/st0
>>>>   AutomaticMount = yes;               # when device opened, read it
>>>>   AlwaysOpen = yes;
>>>>   LabelMedia = yes;
>>>>   RemovableMedia = yes;
>>>>   RandomAccess = no;
>>>> #  Maximum File Size = 5GB
>>>>     Hardware End of Medium = No
>>>>     Fast Forward Space File = No
>>>> ## Changer Command = "/usr/lib64/bacula/mtx-changer %c %o %S %a %d"
>>>> ## Changer Device = /dev/sg0
>>>>   AutoChanger = yes
>>>> #  # Enable the Alert command only if you have the mtx package loaded
>>>>  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>>>> ## If you have smartctl, enable this, it has more info than tapeinfo
>>>> ## Alert Command = "sh -c 'smartctl -H -l error %c'"
>>>>   TWO EOF = yes
>>>>   MaximumOpenWait = 600
>>>> }
>>>
>>>
>>> I'm worried about your "Hardware End of Medium = No", "Fast Forward
>>> Space File = No" and "TWO EOF = yes" parameters. Did you test your
>>> configuration with btape? What was the result? This is my Device
>>> resource for IBM LTO-4 drive:
>>>
>>
>> With the btape test, my configuration works fine for two of the tape
>> drives, but it does not work for the other two.
>>
>
> So, test all of them without the additional parameters.
>
> Why did you add these parameters? Do you know what they all mean?

I will test without these parameters once this issue is resolved.

>
>>
>>> Device {
>>>   Name = Drive-0
>>>   Drive Index = 0
>>>   Media Type = LTO-4-IBM
>>>   Archive Device = "/dev/tape/by-path/pci-0000:04:00.0-fc-0x2002000e111354e4:0x0000000000000000-nst"
>>>   AutomaticMount = yes;
>>>   AlwaysOpen = yes;
>>>   RemovableMedia = yes;
>>>   RandomAccess = no;
>>>   AutoChanger = yes
>>>   Maximum File Size = 8GB
>>>   Spool Directory = /var/spool/bacula
>>>   Maximum Spool Size = 800G
>>>   Maximum Job Spool Size = 128G
>>> #  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>>> #  If you have smartctl, enable this, it has more info than tapeinfo
>>> #  Alert Command = "sh -c 'smartctl -H -l error %c'"
>>> }
>>>
>>> It has been working great for a few years now.
>>>
>>>>
>>>> ################################################
>>>>
>>>>
>>>> I have restarted Bacula and started some jobs, but I am getting the
>>>> error below:
>>>>
>>>> ################################################
>>>>
>>>> 2013-09-24 16:33:02backup-dir JobId 37: Start Backup JobId 37,
>>>> Job=CONFIGBackup.2013-09-24_16.33.00_05
>>>> 2013-09-24 16:33:02backup-dir JobId 37: Using Device "Tape-0"
>>>> 2013-09-24 16:33:03backup-sd JobId 37: 3304 Issuing autochanger "load
>>>> slot
>>>> 3, drive 0" command.
>>>> 2013-09-24 16:38:04backup-sd JobId 37: Fatal error: 3992 Bad autochanger
>>>> "load slot 3, drive 0": ERR=Child died from signal 15: Termination.
>>>> Results=Program killed by Bacula (timeout)
>>>
>>>
>>> The autochanger definition is not working. I suspect the problem is a
>>> wrong autochanger device. YMMV.
>>>
>>>>
>>>> 2013-09-24 16:38:04backup-fd JobId 37: Fatal error: job.c:2395 Bad
>>>> response
>>>> to Append Data command. Wanted 3000 OK data
>>>> , got 3903 Error append data
>>>>
>>>> 2013-09-24 16:38:04backup-dir JobId 37: Error: Bacula backup-dir 5.2.12
>>>> (12Sep12):
>>>>   Build OS:               x86_64-unknown-linux-gnu redhat Enterprise
>>>> release
>>>>   JobId:                  37
>>>>   Job:                    CONFIGBackup.2013-09-24_16.33.00_05
>>>>   Backup Level:           Full
>>>>   Client:                 "backup-fd" 5.2.12 (12Sep12)
>>>> x86_64-unknown-linux-gnu,redhat,Enterprise release
>>>>   FileSet:                "CONFIG" 2013-09-23 15:30:25
>>>>   Pool:                   "BACKUP" (From Job resource)
>>>>   Catalog:                "DefaultCatalog" (From Client resource)
>>>>   Storage:                "Autochanger" (From Pool resource)
>>>>   Scheduled time:         24-Sep-2013 16:33:00
>>>>   Start time:             24-Sep-2013 16:33:02
>>>>   End time:               24-Sep-2013 16:38:04
>>>>   Elapsed time:           5 mins 2 secs
>>>>   Priority:               10
>>>>   FD Files Written:       0
>>>>   SD Files Written:       0
>>>>   FD Bytes Written:       0 (0 B)
>>>>   SD Bytes Written:       0 (0 B)
>>>>   Rate:                   0.0 KB/s
>>>>   Software Compression:   None
>>>>   VSS:                    no
>>>>   Encryption:             no
>>>>   Accurate:               yes
>>>>   Volume name(s):
>>>>   Volume Session Id:      1
>>>>   Volume Session Time:    1380020355
>>>>   Last Volume Bytes:      547,871,450,112 (547.8 GB)
>>>>   Non-fatal FD errors:    1
>>>>   SD Errors:              1
>>>>   FD termination status:  Error
>>>>   SD termination status:  Error
>>>>   Termination:            *** Backup Error ***
>>>
>>>
>>>> 24-Sep 17:00 backup-dir JobId 38: Start Backup JobId 38,
>>>> Job=CONFIGBackup.2013-09-24_17.00.00_11
>>>> 24-Sep 17:00 backup-dir JobId 38: Using Device "Tape-0"
>>>> 24-Sep 17:03 backup-sd JobId 38: Warning: Volume "A00042L4" wanted on
>>>> "Tape-0" (/dev/st0) is in use by device "Tape-0" (/dev/st0)
>>>> 24-Sep 17:03 backup-sd JobId 38: Warning: mount.c:217 Open device
>>>> "Tape-0"
>>>> (/dev/st0) Volume "A00042L4" failed: ERR=dev.c:513 Unable to open device
>>>> "Tape-0" (/dev/st0): ERR=No medium found
>>>
>>>
>>> Well, if your autochanger is not working (see the comment above), then "No
>>> medium found" is the expected error. Did you verify which st device
>>> (/dev/st0, /dev/st1, etc.) corresponds to which drive index in your
>>> configuration?
>>>
>>
>> Please tell me how I can verify which device corresponds to which drive
>> index.
>>
>
> I do it with a simple mt -f /dev/[st0|st1|and so on] status with only one
> tape loaded into a drive.
> If a device reports online, it is the device for that drive index. Then I
> move the tape into another drive and repeat.
>
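
The one-tape-at-a-time procedure described above could be scripted roughly as follows. This is a sketch under assumptions: it relies on the Linux mt-st version of mt, which lists ONLINE among the general status bits when a tape is loaded and ready, and the helper names has_tape and find_loaded_drives are invented for illustration.

```shell
# Succeeds if `mt status` output on stdin contains the ONLINE status bit.
# Assumption: Linux mt-st prints ONLINE only when a tape is loaded and ready.
has_tape() {
    grep -qw 'ONLINE'
}

# For each device node given as an argument, report whether a tape is loaded.
# A device that cannot be queried is reported as empty or not ready.
find_loaded_drives() {
    for dev in "$@"; do
        if mt -f "$dev" status 2>/dev/null | has_tape; then
            echo "$dev: tape loaded"
        else
            echo "$dev: empty or not ready"
        fi
    done
}

# Usage: load a single tape into a known drive index with mtx, then run
#   find_loaded_drives /dev/st0 /dev/st1 /dev/st2 /dev/st3
# The one device reported as loaded belongs to that drive index.
# Move the tape to the next drive and repeat.
```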
>>
>>>>
>>>> With mtx and mt the tape drive works fine manually. I have also tested
>>>> the mtx-changer command manually.
>>>>
>>>
>>> Could you show your manual test procedure and output?
>>>
>>> For production setup I use a permanent device name provided by udev,
>>> not /dev/st* or /dev/sg*. It saved me from a lot of trouble.
>>>
>>> I found you are using a lot of tape drives in your library, so I
>>> recommend verifying which device corresponds to which drive index in your
>>> configuration. It is not unusual for /dev/st0 to have drive index = 5.
>>>
>>> best regards
>>> --
>>> Radosław Korzeniewski
>>> radoslaw AT korzeniewski DOT net
>>>
>>
>> You are right, I think this is a naming persistence issue. Can you send
>> me a sample udev rule file to configure this?
>>
>
> In most cases this is not required, because udev already has all the
> rules. Check it with udevadm, like:
> # udevadm info --query=all --name=/dev/st0
>
> best regards
> --
> Radosław Korzeniewski
> radoslaw AT korzeniewski DOT net
>
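
The udevadm query suggested above prints many properties. A small filter like the one below narrows it to the ones that usually matter for persistent naming. This is a sketch only: naming_properties is a made-up helper name, and whether each of these keys is actually set depends on the udev rules installed on the system.

```shell
# Keep only the properties most relevant to persistent device naming.
# Expects KEY=VALUE lines on stdin, as printed by
#   udevadm info --query=property --name=<device>
# Assumption: not every key listed here is populated on every system.
naming_properties() {
    grep -E '^(DEVNAME|DEVLINKS|ID_SERIAL|ID_SCSI_SERIAL|ID_PATH)='
}

# Usage on a live system:
#   udevadm info --query=property --name=/dev/st0 | naming_properties
# Any persistent names udev has already created appear in DEVLINKS.
```

If DEVLINKS already contains a /dev/tape/by-path or /dev/tape/by-id entry for the drive, that name can go straight into the Archive Device directive and no custom rule is needed.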

I got the serial numbers and their respective drive indexes from my IBM
tape library's management console. Thanks for the information.

For naming persistence I have configured the rules below in the
/etc/udev/rules.d/98-st.rules file.


SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX1", SYMLINK+="st/st0"
SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX2", SYMLINK+="st/st1"
SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX3", SYMLINK+="st/st2"
SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX4", SYMLINK+="st/st3"
SUBSYSTEM=="scsi_generic", ID_SCSI_SERIAL=="00000782XXXX0401", SYMLINK+="st/changer"


Then I rebooted my system.

It creates all the symlinks in the specified place. :)

But there is still an issue: all the st symlinks point to the same
device.

[root@bacula st]# ls -lah
total 0
drwxr-xr-x.  2 root root  140 Sep 25 14:37 .
drwxr-xr-x. 20 root root 5.8K Sep 25 14:39 ..
lrwxrwxrwx.  1 root root    7 Sep 25 14:37 changer -> ../sg24
lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st0 -> ../st4
lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st1 -> ../st4
lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st2 -> ../st4
lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st3 -> ../st4
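
A collision like the one in this listing can also be detected mechanically. This is a minimal sketch: shared_targets is a hypothetical helper name, and it compares the literal link targets as printed by readlink without resolving them further.

```shell
# Print every symlink target in the given directory that is used by more
# than one link. No output means each link has its own target.
# shared_targets is an illustrative name, not an existing tool.
shared_targets() {
    for link in "$1"/*; do
        if [ -L "$link" ]; then
            readlink "$link"
        fi
    done | sort | uniq -d
}

# Usage:
#   shared_targets /dev/st
```

When every link resolves to one node, the match keys in the rules are a reasonable place to look. udev matches device properties with the ENV{...} key, so the ID_SCSI_SERIAL comparison may need to be written in that form. That is an untested guess for this setup, not a confirmed diagnosis.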


I am still getting this "Bad autochanger" error.

Regards,
Deepak



