I/O errors on VTL

spiffy

Hi all

We recently implemented two EMC VTLs (DL4200) in our environment. We did quite a bit of remediation work before hooking up the VTLs, which included updating the HBA drivers/firmware (driver is 9.1.7.18), updating TSM to 5.4.5.1 (on Windows Server 2003 SP2), and running new fiber cables into a new fiber switch (EMC DCX switch).
We still have our physical library connected too.

Since all this, I have been getting lots of I/O write errors on both the virtual tapes (drives emulate 3592 J1A drives with compression enabled) and the physical tapes (LTO2 drives and tapes).

I can see piles of write and read errors in the TSM activity log, along with the messages where it knocks the tapes into read-only mode, and I usually just go in and mark them READWRITE again.
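
The manual reset described above could be scripted along these lines. This is only a minimal sketch: it assumes the activity log has been saved as plain text, the helper names `readonly_volumes` and `reset_commands` are made up for illustration, and the generated `update volume ... access=readwrite` lines are meant to be reviewed before anything is run, since resetting volumes does not address whatever is causing the write errors.

```python
import re

# Matches the volume name in an ANR1411W message (wrapped lines are joined first).
READONLY_RE = re.compile(r"ANR1411W Access mode for volume (\S+) now set to")


def readonly_volumes(actlog_text):
    """Return volumes flagged read-only, in first-seen order, without duplicates."""
    flat = " ".join(actlog_text.split())  # collapse line wraps into single spaces
    seen = []
    for vol in READONLY_RE.findall(flat):
        if vol not in seen:
            seen.append(vol)
    return seen


def reset_commands(volumes):
    """Build one administrative command per volume, for review before use."""
    return [f"update volume {vol} access=readwrite" for vol in volumes]


# Sample text in the same shape as the activity log messages.
sample = """\
12/05/2009 14:37:27 ANR1411W Access mode for volume DB0728 now set to
"read-only" due to write error. (PROCESS: 475)
12/05/2009 14:38:02 ANR1411W Access mode for volume DB0708 now set to
"read-only" due to write error. (PROCESS: 473)
"""

for cmd in reset_commands(readonly_volumes(sample)):
    print(cmd)
```

Keeping the volume list separate from the command strings makes it easy to check which tapes keep reappearing before deciding whether to reset them.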

Now sometimes I do see corresponding ql2300 errors in my Windows event viewer, but not all the time.

I have no clue where I need to start here. I am using the recommended HBA drivers, and I am at a stable version of TSM.
I will probably open a call with both IBM and EMC to see what we can track down, but I was wondering if anyone here has some hints on what to look at.

thanks
 
Could you paste the actlog details describing the I/O errors? If you are on Windows 2003, there are some Microsoft KB patches for the HBA; you can find them in the QLogic driver download section.
 
I have KB957910 installed on both my TSM servers for the HBA driver version (this hotfix updates the diskdump.sys and storport.sys files).


Here is a snippet from the actlog from today:

12/05/2009 14:37:15 ANR0408I Session 28106 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28106)
12/05/2009 14:37:15 ANR0409I Session 28106 ended for server DCRTSM (Windows).
(SESSION: 28106)
12/05/2009 14:37:27 ANR8311E An I/O error occurred while accessing drive
DCRDB12 (mt5.12.0.4) for WRITE operation, errno = 121.
(PROCESS: 475)
12/05/2009 14:37:27 ANR8311E An I/O error occurred while accessing drive
DCRDB19 (mt5.19.0.4) for READ operation, errno = 121.
(PROCESS: 474)
12/05/2009 14:37:27 ANR1411W Access mode for volume DB0728 now set to
"read-only" due to write error. (PROCESS: 475)
12/05/2009 14:37:27 ANR0515I Process 475 closed volume DB0728. (PROCESS: 475)
12/05/2009 14:37:27 ANR8468I 3592 volume DB0728 dismounted from drive DCRDB12
(mt5.12.0.4) in library DCRVLIBB. (PROCESS: 475)
12/05/2009 14:37:28 ANR8337I 3592 volume DB0461 mounted in drive DCRDB16
(mt5.16.0.4). (PROCESS: 475)
12/05/2009 14:37:28 ANR0513I Process 475 opened output volume DB0461.
(PROCESS: 475)
12/05/2009 14:37:28 ANR0408I Session 28107 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28107)
12/05/2009 14:37:28 ANR0408I Session 28108 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28108)
12/05/2009 14:37:28 ANR0408I Session 28109 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28107)
12/05/2009 14:37:28 ANR0409I Session 28109 ended for server DCRTSM (Windows).
(SESSION: 28107)
12/05/2009 14:37:28 ANR0409I Session 28108 ended for server DCRTSM (Windows).
(SESSION: 28108)
12/05/2009 14:37:28 ANR8468I 3592 volume DA0113 dismounted from drive DCRDA18
(mt4.18.0.4) in library DCRVLIBA. (SESSION: 28095)
12/05/2009 14:37:28 ANR0409I Session 28107 ended for server DCRTSM (Windows).
(SESSION: 28107)
12/05/2009 14:37:47 ANR8311E An I/O error occurred while accessing drive
DCRDB01 (mt5.1.0.4) for READ operation, errno = 1117.
(PROCESS: 477)
12/05/2009 14:38:02 ANR8311E An I/O error occurred while accessing drive
DCRDB20 (mt5.20.0.4) for READ operation, errno = 121.
(PROCESS: 473)
12/05/2009 14:38:02 ANR8311E An I/O error occurred while accessing drive
DCRDB15 (mt5.15.0.4) for WRITE operation, errno = 121.
(PROCESS: 473)
12/05/2009 14:38:02 ANR1411W Access mode for volume DB0708 now set to
"read-only" due to write error. (PROCESS: 473)
12/05/2009 14:38:02 ANR0515I Process 473 closed volume DB0708. (PROCESS: 473)
12/05/2009 14:38:02 ANR8468I 3592 volume DB0708 dismounted from drive DCRDB15
(mt5.15.0.4) in library DCRVLIBB. (PROCESS: 473)
12/05/2009 14:38:03 ANR8337I 3592 volume DB0256 mounted in drive DCRDB17
(mt5.17.0.4). (PROCESS: 473)
12/05/2009 14:38:03 ANR0513I Process 473 opened output volume DB0256.
(PROCESS: 473)
12/05/2009 14:38:15 ANR8311E An I/O error occurred while accessing drive
DCRDB19 (mt5.19.0.4) for READ operation, errno = 1117.
(PROCESS: 474)
12/05/2009 14:38:23 ANR8325I Dismounting volume DA0364 - 2 minute mount
retention expired.
12/05/2009 14:38:23 ANR0408I Session 28110 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28110)
12/05/2009 14:38:23 ANR0409I Session 28110 ended for server DCRTSM (Windows).
(SESSION: 28110)
12/05/2009 14:38:23 ANR8336I Verifying label of 3592 volume DA0364 in drive
DCRDA05 (mt4.5.0.4). (SESSION: 27552)
12/05/2009 14:38:23 ANR8468I 3592 volume DA0364 dismounted from drive DCRDA05
(mt4.5.0.4) in library DCRVLIBA. (SESSION: 27552)
12/05/2009 14:38:29 ANR8325I Dismounting volume DA0039 - 2 minute mount
retention expired.
12/05/2009 14:38:29 ANR0408I Session 28111 started for server DCRTSM (Windows)
(Tcp/Ip) for library sharing. (SESSION: 28111)
12/05/2009 14:38:29 ANR0409I Session 28111 ended for server DCRTSM (Windows).
(SESSION: 28111)
12/05/2009 14:38:29 ANR8336I Verifying label of 3592 volume DA0039 in drive
DCRDA13 (mt4.13.0.4). (SESSION: 27901)
12/05/2009 14:38:29 ANR8468I 3592 volume DA0039 dismounted from drive DCRDA13
(mt4.13.0.4) in library DCRVLIBA. (SESSION: 27901)
12/05/2009 14:38:46 ANR8311E An I/O error occurred while accessing drive
DCRDB19 (mt5.19.0.4) for READ operation, errno = 1117.
(PROCESS: 474)
12/05/2009 14:39:10 ANR0403I Session 28104 ended for node NAMF217059-SQL
(SQL-BACKTRACK). (SESSION: 28104)
12/05/2009 14:39:13 ANR0406I Session 28112 started for node NAMF217059-SQL
(SQL-BACKTRACK) (Tcp/Ip namf217059(59274)).
(SESSION: 28112)
12/05/2009 14:39:21 ANR8311E An I/O error occurred while accessing drive
DCRDB13 (mt5.13.0.4) for WRITE operation, errno = 121.
(PROCESS: 476)
12/05/2009 14:39:21 ANR1411W Access mode for volume DB0713 now set to
"read-only" due to write error. (PROCESS: 476)
12/05/2009 14:39:21 ANR0515I Process 476 closed volume DB0713. (PROCESS: 476)


I also have a TSM trace file I ran, and the actlog from that time (a couple of weeks back, I think, is when I ran it).
The errors come up during any tape function (reclamation, migration, etc.).
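
To see whether the errors cluster on particular drives, operations, or error codes, an excerpt like the one above can be tallied with a short script. This is a rough sketch, not a supported tool: it assumes the log is saved as wrapped plain text, that the `errno` values are Windows system error codes (121 is `ERROR_SEM_TIMEOUT`, 1117 is `ERROR_IO_DEVICE`), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: errno values in ANR8311E on Windows are Windows system error codes.
WIN_ERRORS = {
    121: "ERROR_SEM_TIMEOUT (semaphore timeout period has expired)",
    1117: "ERROR_IO_DEVICE (request failed because of an I/O device error)",
}

# A new activity log record starts with a date and time; wrapped lines do not.
RECORD_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} ")
IO_ERROR = re.compile(
    r"ANR8311E An I/O error occurred while accessing drive "
    r"(\S+) \((\S+)\) for (\w+) operation, errno = (\d+)"
)


def join_records(text):
    """Merge wrapped continuation lines back into one string per log record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def tally_io_errors(text):
    """Count ANR8311E messages per (drive, operation, errno)."""
    counts = Counter()
    for record in join_records(text):
        m = IO_ERROR.search(record)
        if m:
            drive, _device, op, errno = m.groups()
            counts[(drive, op, int(errno))] += 1
    return counts


sample = """\
12/05/2009 14:37:27 ANR8311E An I/O error occurred while accessing drive
DCRDB12 (mt5.12.0.4) for WRITE operation, errno = 121.
(PROCESS: 475)
12/05/2009 14:38:15 ANR8311E An I/O error occurred while accessing drive
DCRDB19 (mt5.19.0.4) for READ operation, errno = 1117.
(PROCESS: 474)
12/05/2009 14:38:46 ANR8311E An I/O error occurred while accessing drive
DCRDB19 (mt5.19.0.4) for READ operation, errno = 1117.
(PROCESS: 474)
"""

for (drive, op, errno), n in tally_io_errors(sample).most_common():
    meaning = WIN_ERRORS.get(errno, "unknown code")
    print(f"{drive} {op} errno={errno} x{n}: {meaning}")
```

If the counts spread evenly across many drives in both libraries, that would point more toward a shared path (HBA, driver, switch, cabling) than toward individual drives or tapes.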
 