[Bacula-users] WG: AW: Problem on Solaris platform

2009-04-29 03:22:15
Subject: [Bacula-users] WG: AW: Problem on Solaris platform
From: "Fahrer, Julian" <julian AT fahrer DOT net>
To: <bacula-users AT lists.sourceforge DOT net>
Date: Wed, 29 Apr 2009 09:16:03 +0200
It might also be the bus on the changer side. Maybe try it on different hardware?

You don't need to install anything to compile Bacula on Solaris 10. It ships 
with everything needed (at least for every option I compile Bacula with).
Let me know if you need anything for that.

-----Original Message-----
From: Arno Lehmann [mailto:al AT its-lehmann DOT de] 
Sent: Wednesday, 29 April 2009 00:06
To: Fahrer, Julian
Subject: Re: AW: [Bacula-users] Problem on Solaris platform

Hi,

28.04.2009 23:22, Fahrer, Julian wrote:
> Hi,
> 
> that really looks like a hardware issue. Did you only "check" the cables or 
> did you try new ones? I have seen similar problems with old RAID systems and 
> have been told that this just happens. Old SCSI buses simply break over 
> time.

Well, this system is not located close to me... the guys operating it 
exchanged the *external* cabling, but did not touch the internal 
cabling of the library (quite naturally).

Actually, the HBA and cables in use now are old, but the problem 
existed with new HBA and cables, too.

>> The machine is running Bacula version 2.2.8 (yes, I know it's 
>> outdated, but that's what blastwave offers, and compiling is not an 
>> option currently...).
> 
> Why not? Compiles out of the box ;)

Sure - once you've got the build environment. And that's not something 
that should be installed on that box.

(I'm preparing a Solaris build system here, but with low priority).

> Kind regards

Thanks!

Arno

> Julian
> 
> -----Original Message-----
> From: Arno Lehmann [mailto:al AT its-lehmann DOT de] 
> Sent: Tuesday, 28 April 2009 22:26
> To: bacula-users
> Subject: [Bacula-users] Problem on Solaris platform
> 
> Hello,
> 
> I'm experiencing a problem on a machine running Sun's Solaris. It's an 
>   x86 machine running Solaris 10:
> 
> # uname -a
> SunOS blah.domain.tld 5.10 Generic_137138-09 i86pc i386 i86pc
> 
> The machine is running Bacula version 2.2.8 (yes, I know it's 
> outdated, but that's what blastwave offers, and compiling is not an 
> option currently...). The tape library is a two-drive Qualstar one 
> using AIT-5 drives connected to a parallel-SCSI HBA.
> 
> Sometimes - not easily reproduced - the tape drive or autochanger 
> seems to have problems like these:
> 
> 16-Apr 02:49 blah-sd JobId 2097: Error: block.c:569 Write error at 
> 472:3118 on device "AIT5-drv1" (/dev/rmt/1cbn). ERR=I/O error.
> 16-Apr 02:49 blah-sd JobId 2097: Error: Error writing final EOF to 
> tape. This Volume may not be readable.
> dev.c:1669 ioctl MTWEOF error on "AIT5-drv1" (/dev/rmt/1cbn). ERR=I/O 
> error.
> 16-Apr 02:49 blah-sd JobId 2097: End of medium on Volume "A00131" 
> Bytes=470,705,485,824 Blocks=7,296,401 at 16-Apr-2009 02:49.
> 16-Apr 02:49 blah-sd JobId 2097: 3307 Issuing autochanger "unload slot 
> 19, drive 1" command.
> 16-Apr 02:50 blah-sd JobId 2097: 3995 Bad autochanger "unload slot 19, 
> drive 1": ERR=Child exited with code 1
> Results=/dev/rmt/1cbn: no tape loaded or drive offline
> Unloading drive 1 into Storage Element 19...mtx: Request Sense: Long 
> Report=yes
> mtx: Request Sense: Valid Residual=no
> mtx: Request Sense: Error Code=70 (Current)
> mtx: Request Sense: Sense Key=Illegal Request
> mtx: Request Sense: FileMark=no
> mtx: Request Sense: EOM=no
> mtx: Request Sense: ILI=no
> mtx: Request Sense: Additional Sense Code = 3B
> mtx: Request Sense: Additional Sense Qualifier = 90
> mtx: Request Sense: Field in Error = 00
> mtx: Request Sense: BPV=no
> mtx
> 
> At first this looks like a "regular" tape or drive error. 
> Unfortunately, as the output shows, it also affects the autoloader, which 
> can't unload the tape.
> 
> In the system log, I find errors like these at those times:
> 
> Apr 27 20:40:04 blah.domain.tld itmpt: [ID 556182 kern.info] itmpt0: 
> target 2 fallback from Ultra Wide to Ultra Narrow
> 
> *** this seems to indicate a serious SCSI problem, probably 
> hardware-related.
> 
> Apr 27 20:40:04 blah.domain.tld scsi: [ID 107833 kern.warning] 
> WARNING: 
> /pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10b5,8114@0/pci103c,322a@8/st@2,0
>  
> (st7):
> Apr 27 20:40:04 blah.domain.tld        SCSI transport failed: reason 
> 'reset': giving up
> 
> *** then the bus is reset
> 
> Apr 27 20:40:07 blah.domain.tld scsi: [ID 107833 kern.warning] 
> WARNING: 
> /pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10b5,8114@0/pci103c,322a@8/st@2,0
>  
> (st7):
> Apr 27 20:40:07 blah.domain.tld        Error for Command: 
> rezero/rewind           Error Level: Fatal
> Apr 27 20:40:07 blah.domain.tld scsi: [ID 107833 kern.notice] 
> Requested Block: 3460                      Error Block: 3460
> Apr 27 20:40:07 blah.domain.tld scsi: [ID 107833 kern.notice]  Vendor: 
> SONY                               Serial Number:
> Apr 27 20:40:07 blah.domain.tld scsi: [ID 107833 kern.notice]  Sense 
> Key: Not Ready
> Apr 27 20:40:07 blah.domain.tld scsi: [ID 107833 kern.notice]  ASC: 
> 0x4 (LUN not ready), ASCQ: 0x0, FRU: 0x0
> 
> The cabling has been checked, double- and triple-checked by now. The 
> HBA has been replaced. Mtx typically works correctly; only after 
> things are broken is some manual intervention required, i.e. the 
> library needs a power-cycle or manual unloading of the affected cartridge.
> 
> There is no pattern I can see regarding the tape cartridges, but I'm 
> quite sure the problem always affects the same drive (though those two 
> drives are not necessarily evenly loaded).
> 
> I think it's either a hardware problem with the tape library, or a 
> SCSI-related driver issue.
> 
> As I have neither another Qualstar or AIT device here nor another 
> Sun machine for comparison, I'd like to know if any of you have seen 
> similar problems. Or know better than I do what Solaris is actually 
> complaining about :-)
> 
> Some advice on which manufacturer - Sun or Qualstar - to contact first 
> would also be appreciated!
> 
> Thanks,
> 
> Arno
> 
> 

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de
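
When reporting a case like this to a vendor, it can help to turn the "mtx: Request Sense:" lines quoted above into a compact summary. The following is only a minimal sketch: the function names parse_request_sense and asc_ascq are made up for illustration, and the input format is assumed to be exactly what the quoted log shows, including mail quote markers and wrapped lines. It does not interpret the codes; their meaning has to be looked up in the device documentation.

```python
MARKER = "mtx: Request Sense:"

# Excerpt copied from the autochanger output quoted in this thread.
SAMPLE = """\
> Unloading drive 1 into Storage Element 19...mtx: Request Sense: Long
> Report=yes
> mtx: Request Sense: Valid Residual=no
> mtx: Request Sense: Error Code=70 (Current)
> mtx: Request Sense: Sense Key=Illegal Request
> mtx: Request Sense: Additional Sense Code = 3B
> mtx: Request Sense: Additional Sense Qualifier = 90
> mtx
"""


def parse_request_sense(text):
    """Collect 'mtx: Request Sense: key=value' pairs into a dict.

    Hypothetical helper: tolerates leading mail quote markers ('> ')
    and a field name that was wrapped onto the next line.
    """
    fields = {}
    pending = None  # field text seen so far that has no '=' yet
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()
        if MARKER in line:
            body = line.split(MARKER, 1)[1].strip()
        elif pending is not None:
            body = pending + " " + line  # continuation of a wrapped field
        else:
            continue  # unrelated line
        if "=" in body:
            key, value = body.split("=", 1)
            fields[key.strip()] = value.strip()
            pending = None
        else:
            pending = body
    return fields


def asc_ascq(fields):
    """Return (ASC, ASCQ) as integers, assuming mtx prints them in hex."""
    asc = int(fields["Additional Sense Code"], 16)
    ascq = int(fields["Additional Sense Qualifier"], 16)
    return asc, ascq


if __name__ == "__main__":
    parsed = parse_request_sense(SAMPLE)
    print(parsed["Sense Key"], ["0x%02X" % v for v in asc_ascq(parsed)])
```

The sense key together with the ASC/ASCQ pair is what a support engineer would typically ask for first, so having it in one line saves a round trip.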
