--0-2006260309-1084985180=:67156
Content-Type: text/plain; charset=us-ascii
Is your fibre switch showing many errors on your HBA and tape drive ports?
Are the drivers/firmware up to date for the HBAs, switches, etc.? What's the
uptime on this media server? The drives themselves look fine: you have a grand
total of 4 hard/soft errors across your drives, and all four are soft errors,
which are 'recoverable'.
Our 84s were due to hard tape errors... syslog should show some type of
error related to the transport errors... you can probably check with the HBA
and tape vendors about the error codes returned.
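One thing that might help when talking to the vendors is knowing how long each MTWEOF sat before it failed. Below is a rough Python sketch, not anything NetBackup ships, that pairs each "command (0)MTWEOF" line in the bptm log quoted further down with the matching "ioctl (MTWEOF) failed" line from the same process and prints the gap. The function name `mtweof_delays` and the regex are my own assumptions about the log layout. It expects each log entry on a single line (the mail wraps them) and does not handle a pair that crosses midnight.

```python
import re
from datetime import datetime

# Timestamp, [pid], <severity>, then the io_ioctl message text.
LINE = re.compile(r"(\d\d:\d\d:\d\d\.\d{3}) \[(\d+)\] <\d+> io_ioctl: (.*)")

def mtweof_delays(log_text):
    """Return [(pid, seconds)] between issuing MTWEOF and its failure."""
    started = {}   # pid -> time the MTWEOF command was issued
    delays = []
    for line in log_text.splitlines():
        m = LINE.search(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        pid, msg = m.group(2), m.group(3)
        if msg.startswith("command (0)MTWEOF"):
            started[pid] = stamp
        elif msg.startswith("ioctl (MTWEOF) failed") and pid in started:
            delays.append((pid, (stamp - started.pop(pid)).total_seconds()))
    return delays

# Two entries from the quoted log, rejoined onto single lines.
SAMPLE = """\
00:03:49.218 [7267] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845) on drive index 2
00:21:49.411 [7267] <16> io_ioctl: ioctl (MTWEOF) failed on media id 000041, drive index 2, I/O error (bptm.c.15845)
00:31:45.025 [11412] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845) on drive index 0
00:49:45.221 [11412] <16> io_ioctl: ioctl (MTWEOF) failed on media id 000041, drive index 0, I/O error (bptm.c.15845)
"""

if __name__ == "__main__":
    for pid, seconds in mtweof_delays(SAMPLE):
        print(f"pid {pid}: {seconds / 60:.1f} minutes")
```

On the entries in the quoted log the gap comes out at almost exactly 18 minutes every time, which to me looks more like a timeout expiring somewhere than the drive rejecting the write outright. That is only my reading of the timestamps, the log itself does not say so.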
Steve Mickeler <steve AT warning DOT ca> wrote:
netstat -k shows:
st32,err:
Soft Errors 0 Hard Errors 0 Transport Errors 5 Vendor IBM
Product ULTRIUM-TD2 Revision Revision 38D0 Serial No
st31,err:
Soft Errors 2 Hard Errors 0 Transport Errors 25 Vendor IBM
Product ULTRIUM-TD2 Revision Revision 38D0 Serial No
st30,err:
Soft Errors 2 Hard Errors 0 Transport Errors 21 Vendor IBM
Product ULTRIUM-TD2 Revision Revision 38D0 Serial No
st33,err:
Soft Errors 0 Hard Errors 0 Transport Errors 10 Vendor IBM
Product ULTRIUM-TD2 Revision Revision 38D0 Serial No
# modinfo | grep tape
51 78234000 12d04 33 1 st (SCSI tape Driver 1.216)
I'm also seeing Transport Errors on all the drives on all the SSO boxes.
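For comparing the counters across drives and hosts, here is a small Python sketch that pulls the per-instance numbers out of saved `netstat -k` text. It only assumes the "stNN,err:" header and the "Soft Errors / Hard Errors / Transport Errors" layout shown above; the function name `parse_st_errors` is made up for this example.

```python
import re

def parse_st_errors(text):
    """Return {instance: {"soft": n, "hard": n, "transport": n}}."""
    counts = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*(st\d+),err:", line)
        if header:
            current = header.group(1)
            continue
        m = re.search(
            r"Soft Errors (\d+) Hard Errors (\d+) Transport Errors (\d+)", line
        )
        if m and current:
            soft, hard, transport = map(int, m.groups())
            counts[current] = {"soft": soft, "hard": hard, "transport": transport}
    return counts

# The counter lines from the output above.
SAMPLE = """\
st32,err:
Soft Errors 0 Hard Errors 0 Transport Errors 5 Vendor IBM
st31,err:
Soft Errors 2 Hard Errors 0 Transport Errors 25 Vendor IBM
st30,err:
Soft Errors 2 Hard Errors 0 Transport Errors 21 Vendor IBM
st33,err:
Soft Errors 0 Hard Errors 0 Transport Errors 10 Vendor IBM
"""

if __name__ == "__main__":
    stats = parse_st_errors(SAMPLE)
    for inst in sorted(stats):
        print(inst, stats[inst])
    print("total transport errors:", sum(c["transport"] for c in stats.values()))
```

Running the same thing against output saved from each SSO box would show whether the transport errors cluster on one host or path, or are spread evenly.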
On Wed, 19 May 2004, K Chapman wrote:
> It's trying to write the end-of-file mark and fails... if this is
> Solaris, you can do netstat -k and look for st<tape # instance driver>,
> it will show you error counts on the drive... you can also look in the
> syslog, it should show the driver returning the write error along with
> the other info. You're getting errors across all your drives and with
> different tapes... we had something similar and it turned out to be really
> bad drives (and drive tech, damn Exabyte)
>
> Steve Mickeler wrote:
> I've been experiencing quite a few error 84s (media write error) lately.
>
> A job will start running and then I'll notice in the activity monitor that
> the "KB per Second" number for some of the streams stays stuck on the same
> value (e.g. 17804), at which point I know that those jobs are going to fail
> and I'll end up with an error 84 for those streams. The job will then start
> again, sometimes using the same media, sometimes new media, but it will
> generally succeed the second time.
>
> Any ideas as to what is causing the first job to fail ?
>
>
>
> from the bptm log:
>
>
> 00:03:49.218 [7267] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845)
> on drive index 2
>
> 00:21:49.411 [7267] <16> io_ioctl: ioctl (MTWEOF) failed on media id
> 000041, drive index 2, I/O error (bptm.c.15845)
>
> 00:21:49.412 [7267] <2> log_media_error: successfully wrote to error file
> - 05/19/04 00:21:49 000041 2 WRITE_ERROR
>
> 00:21:49.412 [7267] <2> check_error_history: called from bptm line 15869,
> EXIT_Status = 84
>
> 00:21:49.979 [7267] <2> check_error_history: drive index = 2, media id =
> 000041, time = 05/19/04 00:21:49, both_match = 0, media_match = 1,
> drive_match = 0
>
> ---------------------------------------------------------------------------------
>
> 00:31:45.025 [11412] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845)
> on drive index 0
>
> 00:49:45.221 [11412] <16> io_ioctl: ioctl (MTWEOF) failed on media id
> 000041, drive index 0, I/O error (bptm.c.15845)
>
> 00:49:45.222 [11412] <2> log_media_error: successfully wrote to error file
> - 05/19/04 00:49:45 000041 0 WRITE_ERROR
>
> 00:49:45.222 [11412] <2> check_error_history: called from bptm line 15869,
> EXIT_Status = 84
>
> 00:49:45.786 [11412] <2> check_error_history: drive index = 0, media id =
> 000041, time = 05/19/04 00:49:45, both_match = 0, media_match = 2,
> drive_match = 0
>
> 00:49:45.992 [11412] <8> check_error_history: FREEZING media id 000041, it
> has had at least 3 errors in the last 12 hour(s)
>
> ---------------------------------------------------------------------------------
>
> 00:36:44.063 [15640] <2> io_ioctl: command (2)MTBSF 1 from (bptm.c.17369)
> on drive index 2
>
> 00:36:44.071 [15640] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.17395)
> on drive index 2
>
> 00:54:44.272 [15640] <16> io_ioctl: ioctl (MTWEOF) failed on media id
> 000079, drive index 2, I/O error (bptm.c.17395)
>
> 00:54:44.274 [15640] <2> log_media_error: successfully wrote to error file
> - 05/19/04 00:54:44 000079 2 WRITE_ERROR
>
> ---------------------------------------------------------------------------------
>
> 01:03:15.049 [19343] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845)
> on drive index 1
>
> 01:21:15.243 [19343] <16> io_ioctl: ioctl (MTWEOF) failed on media id
> 000017, drive index 1, I/O error (bptm.c.15845)
>
> 01:21:15.244 [19343] <2> log_media_error: successfully wrote to error file
> - 05/19/04 01:21:15 000017 1 WRITE_ERROR
>
> 01:21:15.244 [19343] <2> check_error_history: called from bptm line 15869,
> EXIT_Status = 84
>
> 01:21:15.813 [19343] <2> check_error_history: drive index = 1, media id =
> 000017, time = 05/19/04 01:21:15, both_match = 0, media_match = 0,
> drive_match = 2
>
> 01:22:20.672 [19343] <8> check_error_history: DOWN'ing drive index 1, it
> has had at least 3 errors in last 12 hour(s)
>
>
> ---------------------------------------------------------------------------------
>
> 06:31:57.233 [13507] <2> io_ioctl: command (0)MTWEOF 1 from (bptm.c.15845)
> on drive index 0
>
> 06:49:57.432 [13507] <16> io_ioctl: ioctl (MTWEOF) failed on media id
> 000033, drive index 0, I/O error (bptm.c.15845)
>
> 06:49:57.433 [13507] <2> log_media_error: successfully wrote to error file
> - 05/19/04 06:49:57 000033 0 WRITE_ERROR
>
> 06:49:57.433 [13507] <2> check_error_history: called from bptm line 15869,
> EXIT_Status = 84
>
> 06:49:58.006 [13507] <2> check_error_history: drive index = 0, media id =
> 000033, time = 05/19/04 06:49:57, both_match = 0, media_match = 0,
> drive_match = 1
>
>
> _______________________________________________
> Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
aaarrrggghhh!!!!
FreeBSD rocks
--0-2006260309-1084985180=:67156--