Bacula-users

Re: [Bacula-users] Unable to position to end of data on device

2013-09-25 10:08:02
Subject: Re: [Bacula-users] Unable to position to end of data on device
From: Deepak <deepak AT palincorporates DOT com>
To: Radosław Korzeniewski <radoslaw AT korzeniewski DOT net>
Date: Wed, 25 Sep 2013 19:34:01 +0530
Quoting Radosław Korzeniewski <radoslaw AT korzeniewski DOT net>:

> Hello,
>
> 2013/9/25 Deepak <deepak AT palincorporates DOT com>:
>>>
>>> Well, a "second" autochanger device? :) I bet you misconfigured your
>>> tape library. In most cases it is not the problem.
>>
>>
>> I think it shows up like that because my blade server has two HBA cards,
>> and I have zoned the tape library to both of them.
>
> OK.
>
>>
>>>
>>>> [2:0:5:0]    tape    IBM      ULT3580-TD4      B710  /dev/st3  /dev/sg11
>>>> [2:0:6:1]    disk    IBM      1818      FAStT  0730  /dev/sdf  /dev/sg21
>>>> [2:0:6:31]   disk    IBM      Universal Xport  0730  -         /dev/sg22
>>>> [2:0:7:1]    disk    IBM      1818      FAStT  0730  /dev/sdh  /dev/sg25
>>>> [2:0:7:31]   disk    IBM      Universal Xport  0730  -         /dev/sg26
>>>> [root@bacula conf.d]#
>>>>
>>>> The tape and mediumx devices belong to the tape library.
>>>>
>>>>
>>>>
>>>>
>>>>>> }
>>>>>>
>>>>>> Device {
>>>>>>   Name = Tape-0
>>>>>>   Drive Index = 0
>>>>>>   Media Type = LTO-4
>>>>>> #  Archive Device = /dev/lin_tape/IBMtape0
>>>>>>   Archive Device = /dev/st0
>>>>>>   AutomaticMount = yes;               # when device opened, read it
>>>>>>   AlwaysOpen = yes;
>>>>>>   LabelMedia = yes;
>>>>>>   RemovableMedia = yes;
>>>>>>   RandomAccess = no;
>>>>>> #  Maximum File Size = 5GB
>>>>>>     Hardware End of Medium = No
>>>>>>     Fast Forward Space File = No
>>>>>> ## Changer Command = "/usr/lib64/bacula/mtx-changer %c %o %S %a %d"
>>>>>> ## Changer Device = /dev/sg0
>>>>>>   AutoChanger = yes
>>>>>> #  # Enable the Alert command only if you have the mtx package loaded
>>>>>>  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>>>>>> ## If you have smartctl, enable this, it has more info than tapeinfo
>>>>>> ## Alert Command = "sh -c 'smartctl -H -l error %c'"
>>>>>>   TWO EOF = yes
>>>>>>   MaximumOpenWait = 600
>>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> I'm worried about your "Hardware End of Medium = No", "Fast Forward
>>>>> Space File = No" and "TWO EOF = yes" parameters. Did you test your
>>>>> configuration with btape? What was the result? This is my Device
>>>>> resource for IBM LTO-4 drive:
>>>>>
>>>>
>>>> For two of the tape drives my configuration works fine, including with the
>>>> btape test; for the other two it does not work.
>>>>
>>>
>>> So, check all without additional parameters.
>>>
>>> Why did you add these parameters? Do you know what all these parameters
>>> mean?
>>
>>
>> I will check without these parameters once we have resolved this issue.
>
> What issue would you like to resolve before that?


As I explained in the first mail of this thread: I have a volume that
has already been written to. Some time later, when Bacula tries to
write to it again, it gives the error below.


24-Sep 12:26 backup-dir JobId 30: Start Backup JobId 30, Job=CONFIG.2013-09-24_12.26.00_54
24-Sep 12:26 backup-dir JobId 30: Using Device "Tape-0"
24-Sep 12:27 backup-sd JobId 30: Volume "A00041L4" previously written, moving to end of data.
24-Sep 12:46 backup-sd JobId 30: Error: Unable to position to end of data on device "Tape-0" (/dev/lin_tape/IBMtape0): ERR=dev.c:1208 read error on "Tape-0" (/dev/lin_tape/IBMtape0). ERR=Input/output error.

24-Sep 12:46 backup-sd JobId 30: Marking Volume "A00041L4" in Error in Catalog.



You were saying that all of this is due to the lin_tape driver.
I have now removed the lin_tape driver and am using st only. I
scheduled the same job again, and it gives me the log messages below:



2013-09-25 17:37:02 backup-dir JobId 55: Start Backup JobId 55, Job=CONFIGBackup.2013-09-25_17.37.00_04
2013-09-25 17:37:02 backup-dir JobId 55: Using Device "Tape-0"

2013-09-25 17:37:02 backup-sd JobId 55: Volume "A00041L4" previously written, moving to end of data.




[As you can see, after roughly an hour and a half the job was still
reporting that it was moving to the end of data, so I canceled it
manually.]




2013-09-25 19:03:49 backup-dir JobId 55: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
2013-09-25 19:03:49 backup-dir JobId 55: Error: Director's comm line to SD dropped.
2013-09-25 19:03:49 backup-dir JobId 55: Fatal error: No Job status returned from FD.
2013-09-25 19:03:49 backup-dir JobId 55: Bacula backup-dir 5.2.12 (12Sep12):
   Build OS:               x86_64-unknown-linux-gnu redhat Enterprise release
   JobId:                  55
   Job:                    CONFIGBackup.2013-09-25_17.37.00_04
   Backup Level:           Full
   Client:                 "backup-fd" 5.2.12 (12Sep12) x86_64-unknown-linux-gnu,redhat,Enterprise release
   FileSet:                "CONFIG" 2013-09-23 15:30:25
   Pool:                   "BACKUP" (From Job resource)
   Catalog:                "DefaultCatalog" (From Client resource)
   Storage:                "Autochanger" (From Pool resource)
   Scheduled time:         25-Sep-2013 17:37:00
   Start time:             25-Sep-2013 17:37:02
   End time:               25-Sep-2013 19:03:49
   Elapsed time:           1 hour 26 mins 47 secs
   Priority:               10
   FD Files Written:       0
   SD Files Written:       0
   FD Bytes Written:       0 (0 B)
   SD Bytes Written:       0 (0 B)
   Rate:                   0.0 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Accurate:               yes
   Volume name(s):
   Volume Session Id:      1
   Volume Session Time:    1380110696
   Last Volume Bytes:      730,495,245,312 (730.4 GB)
   Non-fatal FD errors:    2
   SD Errors:              0
   FD termination status:  Error
   SD termination status:  Error
   Termination:            Backup Canceled



I have around 680 GB of data on that volume. Should it really take
that long to move to EOD?
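
For a rough sense of scale on that question, the sketch below works out how long it would take merely to stream the volume's recorded bytes past the head once. It says nothing about how Bacula actually positions the tape; it only gives a reference figure. The byte count is the "Last Volume Bytes" value from the job report above, the 120 MB/s rate is the LTO-4 native (uncompressed) speed and is an assumption about what this drive sustains, and `stream_time` is a hypothetical helper name.

```shell
#!/bin/sh
# Time needed to stream a number of bytes at a fixed rate, printed as "Hh Mm".
# Arguments: $1 = bytes, $2 = bytes per second (both assumptions of the caller).
stream_time() {
    secs=$(( $1 / $2 ))
    printf '%dh %dm\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
}

# 730,495,245,312 bytes ("Last Volume Bytes" in the job report) at
# 120,000,000 bytes/s (assumed LTO-4 native rate).
stream_time 730495245312 120000000
# prints "1h 41m"
```

If positioning ends up traversing the recorded data rather than using the drive's own end-of-data locate, a wait on the order of the 1 hour 26 minutes seen in the job report is therefore not implausible, whereas a hardware locate typically finishes much sooner. With the Storage Daemon stopped, timing `mt -f <device> eod` by hand would show which of the two this drive and driver combination delivers.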

>
>>>>
>>>> You are right, I think this is due to a naming persistence issue. Can you
>>>> send me a sample udev rules file to configure that?
>>>>
>>>
>>> In most cases it is not required and udev already has all rules. Check
>>> it with udevadm, like:
>>> # udevadm info --query=all --name=/dev/st0
>>>
>>
>> I got the serial numbers and their respective indexes from my IBM tape
>> library's management console. Thanks for the information.
>
> Did you check that this corresponds to the mtx drive index? If you are sure, OK.
>
>>
>> For naming persistence I have configured the rules below in the
>> /etc/udev/rules.d/98-st.rules file.
>>
>>
>> SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX1", SYMLINK+="st/st0"
>> SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX2", SYMLINK+="st/st1"
>> SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX3", SYMLINK+="st/st2"
>> SUBSYSTEM=="scsi_tape", ID_SCSI_SERIAL=="000789XXXX4", SYMLINK+="st/st3"
>> SUBSYSTEM=="scsi_generic", ID_SCSI_SERIAL=="00000782XXXX0401", SYMLINK+="st/changer"
>>
>>
>> Then I rebooted my system.
>>
>> It creates all the symlinks in the specified place. :)
>
> TMTOWTDI :) If you are happy with that, OK.
>
>>
>> But there is still an issue: all the symlinks point to a single generic
>> device.
>>
>
> Or, you are not?
>
>> [root@bacula st]# ls -lah
>> total 0
>> drwxr-xr-x.  2 root root  140 Sep 25 14:37 .
>> drwxr-xr-x. 20 root root 5.8K Sep 25 14:39 ..
>> lrwxrwxrwx.  1 root root    7 Sep 25 14:37 changer -> ../sg24
>> lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st0 -> ../st4
>> lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st1 -> ../st4
>> lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st2 -> ../st4
>> lrwxrwxrwx.  1 root root    6 Sep 25 14:23 st3 -> ../st4
>>
>
> The device ../st4 above is a tape drive device, not a generic one. The generic
> one is ../sg24 above.
>
> Check: mtx -f /dev/sg24 status
>
>>
>> and still I am getting this Bad Autochanger error.
>>
>
> I use the standard udev rules for persistent device naming, something like
> this:
>
> # udevadm info --query=all --name=/dev/sg3
> P: /devices/pseudo_0/adapter0/host2/target2:0:0/2:0:0:0/scsi_generic/sg3
> N: sg3
> S: char/21:3
> S: tape/by-id/scsi-350223344ab000000
> E: UDEV_LOG=3
> E: DEVPATH=/devices/pseudo_0/adapter0/host2/target2:0:0/2:0:0:0/scsi_generic/sg3
> E: MAJOR=21
> E: MINOR=3
> E: DEVNAME=/dev/sg3
> E: SUBSYSTEM=scsi_generic
> E: ID_SCSI=1
> E: ID_VENDOR=IBM
> E: ID_VENDOR_ENC=IBM\x20\x20\x20\x20\x20
> E: ID_MODEL=TS3200
> E: ID_MODEL_ENC=TS3200\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
> E: ID_REVISION=0100
> E: ID_TYPE=generic
> E: ID_SERIAL=350223344ab000000
> E: ID_SERIAL_SHORT=50223344ab000000
> E: ID_WWN=0x50223344ab000000
> E: ID_WWN_WITH_EXTENSION=0x50223344ab000000
> E: ID_SCSI_SERIAL=XYZZY_A
> E: DEVLINKS=/dev/char/21:3 /dev/tape/by-id/scsi-350223344ab000000
>
> So, I use /dev/tape/by-id/scsi-350223344ab000000 as the autochanger device.
> The same method works for tape drive devices.
>
> It works for me in many installations across different Linux distributions.
>
> best regards
> --
> Radosław Korzeniewski
> radoslaw AT korzeniewski DOT net
>
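
For the symlink listing quoted above, where st0 through st3 all resolve to ../st4, a small check like the one below can flag such collisions after a rule change. It is only a sketch: `dup_targets` is a hypothetical helper name, and the /dev/st directory in the usage comment is the one created by the rules quoted in this thread.

```shell
#!/bin/sh
# Print every symlink target in a directory that more than one link points to.
# Usage (directory taken from the thread's udev rules): dup_targets /dev/st
dup_targets() {
    dir=$1
    for link in "$dir"/*; do
        # Skip anything that is not a symbolic link.
        [ -L "$link" ] || continue
        # Print the stored link text, e.g. "../st4".
        readlink "$link"
    done | sort | uniq -d
}
```

An empty result means every link has its own target. For the listing above it would print ../st4, and `udevadm info --query=all --name=/dev/st4` then shows which serial number that node actually reports.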


