Bacula-users

Re: [Bacula-users] Dell TL1000 IBM3850-HH7

2017-04-09 16:07:54
Subject: Re: [Bacula-users] Dell TL1000 IBM3850-HH7
From: Jim Richardson <jim AT securit360 DOT com>
To: Kern Sibbald <kern AT sibbald DOT com>, Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>, "Roberts, Ben" <ben.roberts AT gsacapital DOT com>, Simone Caronni <negativo17 AT gmail DOT com>
Date: Sun, 9 Apr 2017 20:06:41 +0000
Good afternoon everyone.

I wanted to update everyone on the solution to my issue.  My HBA arrived last 
week and was installed on Wednesday.  Today I got an opportunity to test the 
configuration with Bacula, and the test was a success.  That said, the HBA was 
not my only change.  I also used ITDT (the IBM Tape Diagnostic Tool) to reset 
all driver options to factory defaults.  I can now even get a successful test 
with the IBM lin_tape drivers installed by adding the options 
"Hardware End of Medium = No" and "Fast Forward Space File = No".  However, I 
am sticking with the basic ST driver for a cleaner configuration; my final 
configuration is bone stock (see below).  I want to thank everyone who helped 
me with this, and I look forward to assisting the community in the future.  I 
am being verbose for the sake of future searches.

# Steps
1. Install non-RAID HBA
2. Install lin_tape & lin_taped
3. Install ITDT
4. Run full set of tests in ITDT
5. Reset to factory default settings in ITDT - specifically, the "Read Past 
   Filemark" setting MUST BE no
     http://www-01.ibm.com/support/docview.wss?uid=ssg1S7002972&aid=1
6. Remove lin_tape & lin_taped
7. rmmod st
8. modprobe st
9. Power cycle library
10. Bacula configuration
11. btape test


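Device {
  Name = Drive-1
  Drive Index = 0                     # first drive in the autochanger
  Media Type = LTO-7
  Archive Device = /dev/nst0          # non-rewinding st device
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;                   # keep the drive open between jobs
  RemovableMedia = yes;
  RandomAccess = no;                  # tape is sequential access
  AutoChanger = yes                   # drive is managed by an autochanger
  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}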
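
For future searchers: the two directives mentioned above for lin_tape can be 
checked mechanically.  The following is only a rough Python sketch and not part 
of Bacula; the names parse_device and missing_lin_tape_options are made up for 
illustration, and the parser assumes simple "Name = value" lines like the ones 
in the Device resource above.

```python
import re

# Directives the post reports as necessary when the IBM lin_tape driver is used.
LIN_TAPE_OPTIONS = {
    "hardwareendofmedium": "no",
    "fastforwardspacefile": "no",
}


def normalize(name):
    """Bacula ignores case and spaces in directive names, so fold both."""
    return re.sub(r"\s+", "", name).lower()


def parse_device(text):
    """Parse simple 'Name = value' lines of one Device resource into a dict."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skips 'Device {', '}', comments and blank lines
        name, value = line.split("=", 1)
        value = value.strip()
        if not value.startswith('"'):
            value = value.split("#", 1)[0]  # drop a trailing comment
        directives[normalize(name)] = value.strip().rstrip(";").strip()
    return directives


def missing_lin_tape_options(directives):
    """Return the lin_tape-related directives that are absent or not 'No'."""
    return sorted(
        name
        for name, wanted in LIN_TAPE_OPTIONS.items()
        if directives.get(name, "").lower() != wanted
    )
```

With the stock ST configuration above both directives are reported as missing, 
which is expected: they were only needed when lin_tape was installed.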

# IBM lin_tape & ITDT
lin_tape-3.0.18-1.x86_64.rpm
lin_taped-3.0.18-rhel7.x86_64.rpm
install_itdt_se_Linuxx86_64_9.1.0.20161006

# uname -r
3.10.0-514.10.2.el7.x86_64

# rpm -qa | grep bacula
bacula-libs-sql-7.4.7-1.el7.centos.x86_64
bacula-storage-7.4.7-1.el7.centos.x86_64
bacula-director-7.4.7-1.el7.centos.x86_64
bacula-common-7.4.7-1.el7.centos.x86_64
bacula-console-7.4.7-1.el7.centos.x86_64
bacula-libs-7.4.7-1.el7.centos.x86_64
bacula-client-7.4.7-1.el7.centos.x86_64

# lsscsi -g
[3:0:1:0]    tape    IBM      ULT3580-HH7      G9Q1  /dev/st0   /dev/sg5
[3:0:1:1]    mediumx IBM      3572-TL          0071  -          /dev/sg6
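
The lsscsi output above is where the device nodes for the Bacula configuration 
come from.  As a hedged illustration, this Python sketch picks them out of 
"lsscsi -g" text.  The function names are hypothetical, and using the changer's 
sg node for the changer device is an assumption (the Autochanger resource is 
not shown in this post).  The st0 -> nst0 mapping is the usual Linux naming for 
the non-rewinding node.

```python
def parse_lsscsi_g(output):
    """Parse 'lsscsi -g' lines; vendor/model may contain spaces, so the
    device nodes are taken from the end of each line."""
    devices = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("["):
            continue
        node = fields[-2]
        devices.append({
            "hctl": fields[0].strip("[]"),
            "type": fields[1],
            "node": None if node == "-" else node,  # '-' means no block/char node
            "sg": fields[-1],
        })
    return devices


def bacula_devices(devices):
    """Return (archive_device, changer_device) candidates."""
    archive = changer = None
    for dev in devices:
        if dev["type"] == "tape" and dev["node"]:
            # Use the non-rewinding variant of the st node.
            archive = dev["node"].replace("/dev/st", "/dev/nst", 1)
        elif dev["type"] == "mediumx":
            changer = dev["sg"]  # assumption: changer addressed via its sg node
    return archive, changer
```

For the two lines shown above this yields /dev/nst0 for the drive and /dev/sg6 
for the changer.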

# dmesg | grep scsi
[335932.085723] st 3:0:0:0: Attached scsi tape st0
[335933.799341] ch 3:0:0:1: Attached scsi changer ch0
[336285.185770] st 3:0:0:0: Attached scsi tape st0
[336399.340252] scsi 3:0:1:0: Sequential-Access IBM      ULT3580-HH7      G9Q1 PQ: 0 ANSI: 6
[336399.340264] scsi 3:0:1:0: SSP: handle(0x0009), sas_addr(0x500169772b7d1011), phy(3), device_name(0x7769015010107d2b)
[336399.340269] scsi 3:0:1:0: SSP: enclosure_logical_id(0x5d4ae52075329f00), slot(4)
[336399.340274] scsi 3:0:1:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(7), cmd_que(1)
[336399.341522] scsi 3:0:1:0: CDB: Mode Sense(6) 1a 00 19 00 40 00
[336399.341547] mpt2sas_cm0:    handle(0x0009), ioc_status(scsi data underrun)(0x0045), smid(1)
[336399.341555] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[336399.344259] scsi 3:0:1:0: TLR Enabled
[336399.373158] st 3:0:1:0: Attached scsi tape st0
[336399.373321] st 3:0:1:0: Attached scsi generic sg5 type 1
[336402.965182] scsi 3:0:1:1: Medium Changer    IBM      3572-TL          0071 PQ: 0 ANSI: 3
[336402.965199] scsi 3:0:1:1: SSP: handle(0x0009), sas_addr(0x500169772b7d1011), phy(3), device_name(0x7769015010107d2b)
[336402.965204] scsi 3:0:1:1: SSP: enclosure_logical_id(0x5d4ae52075329f00), slot(4)
[336402.965211] scsi 3:0:1:1: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(4), cmd_que(1)
[336402.965989] scsi 3:0:1:1: CDB: Mode Sense(6) 1a 00 19 00 40 00
[336402.965999] mpt2sas_cm0:    handle(0x0009), ioc_status(scsi data underrun)(0x0045), smid(1)
[336402.966007] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[336403.256009] ch 3:0:1:1: Attached scsi changer ch0
[336403.256255] ch 3:0:1:1: Attached scsi generic sg6 type 8
[336437.466706] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[336476.004643] mpt2sas_cm0:    handle(0x0009), ioc_status(scsi data underrun)(0x0045), smid(1)
[336476.004650] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[336476.006236] mpt2sas_cm0:    handle(0x0009), ioc_status(scsi data underrun)(0x0045), smid(1)
[336476.006243] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[336631.889098] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[337108.032951] st 3:0:1:0: Attached scsi tape st0
[337368.095820] mpt2sas_cm0:    scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)

# tapeinfo -f /dev/nst0
Product Type: Tape Drive
Vendor ID: 'IBM     '
Product ID: 'ULT3580-HH7     '
Revision: 'G9Q1'
Attached Changer API: No
SerialNumber: '1097000515'
MinBlock: 1
MaxBlock: 8388608
SCSI ID: 1
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: 0x78
Density Code: 0x5c
BlockSize: 0
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0xff
DeCompType: 0xff
BOP: yes
Block Position: 0
Partition 0 Remaining Kbytes: -1
Partition 0 Size in Kbytes: -1
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 3
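
For anyone comparing their own drive against the output above, here is a small 
Python sketch that turns "tapeinfo" text into a dictionary and pulls out a few 
fields worth checking before a btape run.  The function names and the choice 
of fields are illustrative assumptions, not anything Bacula or mtx provides.  
A BlockSize of 0 means the drive is in variable block mode.

```python
def parse_tapeinfo(output):
    """Parse 'Key: value' lines from tapeinfo into a dict of strings."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line
        # Drop the surrounding quotes and padding tapeinfo adds to ID strings.
        info[key.strip()] = value.strip().strip("'").strip()
    return info


def summarize(info):
    """Reduce the parsed output to a few yes/no facts."""
    return {
        "ready": info.get("Ready") == "yes",
        "variable_block": info.get("BlockSize") == "0",
        "compression": info.get("DataCompEnabled") == "yes",
    }
```

Applied to the output above, all three values come out true.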


Jim Richardson

-----Original Message-----
From: Kern Sibbald [mailto:kern AT sibbald DOT com] 
Sent: Saturday, April 1, 2017 10:28 AM
To: Jim Richardson <jim AT securit360 DOT com>; Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>; bacula-users AT lists.sourceforge DOT net
Subject: Re: [Bacula-users] Dell TL1000 IBM3850-HH7



On 03/31/2017 10:27 PM, Jim Richardson wrote:
> Alan,
>
> I certainly understand what you are saying.  My days of coding full time are 
> long over.  I am just a guy trying to get a good opensource community 
> supported solution in place.  Kern will certainly have more pull than I will.
>
> Kern, can you give us the official line on lin_tape support?
I am pretty well booked until the end of the year, mostly backporting Bacula 
Enterprise to the community and preparing to implement Aligned volumes and then 
the Cloud SD plugins for the community.

After that, I am not 100% sure what I will be doing, but high on my list is 
implementing Aligned volumes for tapes (actually VTLs).

The best bet for getting new tape drivers is to convince Bacula Systems that it 
is important.  At the moment, it is not on the roadmap and hardly on the radar 
screen.  I am sure of that because I define (with agreement of the major 
players) the Bacula Systems Roadmap.

By the way, if you use the default Tape Device resource, Bacula will not read 
to find the end of the tape data; it will issue a single ioctl call, and if 
the driver knows how to get there fast, it will do so.

Best regards,
Kern

>
> Jim Richardson
>
> -----Original Message-----
> From: Alan Brown [mailto:ajb2 AT mssl.ucl.ac DOT uk]
> Sent: Friday, March 31, 2017 3:08 PM
> To: bacula-users AT lists.sourceforge DOT net
> Subject: Re: [Bacula-users] Dell TL1000 IBM3850-HH7
>
> On 30/03/17 10:20, Kern Sibbald wrote:
>> Hello Jim,
>>
>> Thanks for the feedback.  See below ...
>>
>>
>> On 03/30/2017 02:13 AM, Jim Richardson wrote:
>>> Team,
>>>
>>> I have an update on my progress.  With the deadline for our new 
>>> environment looming I started writing bash scripts to process the 
>>> tape jobs as I mentioned before.  I started to receive strange 
>>> errors; the tape wouldn't always start after loading. By that I mean 
>>> I would receive an EIO over and over again until it woke up.  Then 
>>> occasionally it would get an EIO when the tape finished, well, 
>>> anything - writing, reading, rewinding, etc.  I started researching 
>>> errors and Kern's last comments about MTSETDRVBUFFER, MT_ST_SYSV, 
>>> and MT_ST_ASYNC_WRITES.  What I found may be the issue.  I have been 
>>> using a RAID controller - a PERC H810 out of the back of a chained 
>>> Dell MD1220.  I found a post explaining that the only supported 
>>> configuration for the Dell TL1000 & IBM 3580 uses a plain (non-RAID) 
>>> SAS HBA; users report strange errors and behaviors otherwise.  I have 
>>> ordered a new 12DNW.  Once it arrives, I will retest.
>> Hmm. Your setup sounds a bit complicated, and that sometimes means 
>> problems for Bacula.  Good luck with your new HBA.
>>> Alan - Sounds like butterflies, roses, and sunshine.
> Indeed, however we're going to have to change soon anyway - support for 
> changers is changing with the new "ch" driver, and the existing "st" 
> driver hasn't been touched in over 12 years.
>
>
> I've raised a ticket with Bacula Systems about supporting the IBM 
> Lin_tape driver and pointed them to the IBM Tape Drive Programming 
> manual - 
> https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Tape+drivers+and+software&product=ibm/Storage_Tape/Tape+device+drivers&release=1.0&platform=Linux&function=all#Others
>
> There are some interesting extensions in LTO5 onwards which should 
> make life easier for positioning, etc.
> (The 'append open' command being one of the more useful commands which 
> appears to be available for all versions of LTO.)
>
>
> As for what's currently tripping up Bacula, Section 4 (linux) says:
>
> If the read function reaches end of the data on the tape, input/output 
> error (EIO) is returned and ASC, ASCQ keys (obtained by request sense
> IOCTLs) indicate the end of data. IBMtape also conforms to all SCSI 
> standard read operation rules, such as fixed block versus variable block
>
> The lesson there is that using "read" to seek to EOD isn't the best 
> way of doing it (especially when you have 'append open', which will do 
> it for you automagically)
>
> There's a lot of difference between a basic dumb tape drive and a 
> servoed intelligent setup like AIT, SDLT, LTO and the other half inch 
> formats which know exactly where they are on the tape at all times.
>
> Sure, you can _use_ the smart drives in dumb mode but you'll get far 
> better performance if you can make use of the smart features
>
> I'm running a bunch of tests on various drives at the moment, and as 
> such should be able to see how well the lin_tape driver supports 
> non-IBM stuff. (As far as I can see in the source there's no test to 
> check if a drive is specifically IBM or not and the code is all GPL)
>
>
>>> I learned a long time ago - because "I" can make something work - "I"
>>> will own it indefinitely.  When I tire of that, we move to a 
>>> solution that everyone can make work.  Unless Bacula's Team 
>>> officially supports it, I won't force it :)
> I agree, but it's in everyone's interest for Bacula to support the 
> driver. (Think of it like mysql vs postgres support, etc)
>
>
>> Good philosophy -- it causes a lot less grief.
>>
>> Best regards,
>> Kern
>>
>>> Jim Richardson
>>>
>>> -----Original Message-----
>>> From: Alan Brown [mailto:ajb2 AT mssl.ucl.ac DOT uk]
>>> Sent: Tuesday, March 21, 2017, 2:18 PM
>>> To: Jim Richardson <jim AT securit360 DOT com>; Kern Sibbald 
>>> <kern AT sibbald DOT com>; bacula-users AT lists.sourceforge DOT net
>>> Cc: Simone Caronni <negativo17 AT gmail DOT com>; Roberts, Ben 
>>> <ben.roberts AT gsacapital DOT com>; Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>
>>> Subject: Re: [Bacula-users] Dell TL1000 IBM3850-HH7
>>>
>>> On 19/03/17 17:02, Jim Richardson wrote:
>>>
>>>>      I am not interested in the IBM driver if I can get the ST to work.
>>> I can understand why, but....
>>>
>>>
>>> There would be significant advantage in using the IBMtape driver 
>>> over the generic ST driver if Bacula could be modified to handle its 
>>> oddity on forward/back spacing.
>>>
>>>
>>> The IBMtape driver supports multipathing to tape drives and robots 
>>> (Character and Generic devices), which the standard Linux ST and SG 
>>> drivers don't have. For 99.9% of installations that's irrelevant, 
>>> but if you have a multipathed fabric (iSCSI, FC or SAS) then it 
>>> provides a _major_ leap in robustness and gets rid of the issue of a 
>>> single drive or changer showing up as multiple /dev/st* units.
>>>
>>> This is a nuisance under the ST driver when you're using udev and 
>>> pointing to the drive using /dev/tape/by-id (which is the only 
>>> reliable way to get to a drive on a fabric), because on any fabric 
>>> disturbance udev may repoint that symbolic link to a different 
>>> /dev/nst*, at which point things start breaking badly.
>>>
>>> Remember, drives have to be unlocked by the initiator WWID that 
>>> locked them, and all locks are additive. So when the secondary FC 
>>> controller issues an unlock and eject command, the drive doesn't 
>>> remove the lock that was set by the primary FC controller; it then 
>>> fails to eject and generates an error. The robot will then throw 
>>> another error ("removal prevented") which may (or may not) require 
>>> manual acknowledgement before it can access other drives, and as a 
>>> result the night's backup sequence comes to an early halt.
>>>
>>> (This is also the usual cause of "My tapes won't eject" problems in 
>>> robots on SAN fabrics)
>>>
>>> The IBMtape driver also has a lot more debugging, logging and 
>>> monitoring capabilities than the generic ST driver - which is rather 
>>> long in the tooth, to say the least.... :)
>>>
>>> As far as I've been able to determine, the IBMtape driver doesn't 
>>> care if the drives it's talking to are actually IBM or HP ones 
>>> (they're the only 2 LTO drive makers left now), so if Bacula could 
>>> use it, the driver is a worthwhile addition to any LTO-based backup 
>>> system.
>>>
