ADSM-L

Re: [ADSM-L] TSM 8.1.1 on Linux crash

2018-01-09 12:17:22
Subject: Re: [ADSM-L] TSM 8.1.1 on Linux crash
From: Martin Janosik <martin.janosik AT CZ.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 9 Jan 2018 18:14:08 +0100
Hello there,

errno 16 usually corresponds to 'device busy' (EBUSY). I have observed this error
mostly on systems with a shared library (shared=yes) and with storage agents
that had problems communicating with the TSM server.
Are you facing the issue on a single tape drive only, or on all of them? Are they
by any chance also zoned to other systems, e.g. a 2nd cluster node?
Can you send a few lines from the actlog related to the failing drive, from
before the failure occurred?
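
If you want to double-check the errno mapping on your own box, a minimal
sketch along these lines should do. It uses only the Python standard library;
describe_errno is just a helper name I made up for illustration, and the
message text returned by the OS may differ between platforms.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "unknown")
    # os.strerror returns the platform's own message text for the code.
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 16 is the error number reported when TSM fails to open the drive.
    print(describe_errno(16))
```

On Linux, 16 resolves to EBUSY, which fits something else still holding the
device open.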

M. Janosik

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 2018-01-09
17:58:09:

> From: Remco Post <r.post AT PLCS DOT NL>
> To: ADSM-L AT VM.MARIST DOT EDU
> Date: 2018-01-09 17:59
> Subject: [ADSM-L] TSM 8.1.1 on Linux crash
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>
> Hi All,
>
> over here we have a few TSM servers on Linux (RHEL 7.4) running TSM
> 8.1.1.021 (to be upgraded to .100 soon) and we see something new that we
> never saw on AIX. If for some reason a tape gets left in a drive
> (IBM 3592), the only way to get the drive working again is to reboot
> the drive. Until then TSM is unable to open the drive (error 16).
> Unfortunately we seem to be able to reliably cause severe issues in
> TSM by rebooting a tape drive:
>
> kernel: NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [dsmserv:6021]
>
> The only way out is to reboot the entire server.
>
> We never had such issues with TSM on AIX… is this something
> Linux-specific, that we can hang TSM in non-interruptible routines (kernel
> space) by simply rebooting a tape drive?
>
> --
>
>  Met vriendelijke groeten/Kind Regards,
>
> Remco Post
> r.post AT plcs DOT nl
> +31 6 248 21 622
>
