TSM backupset generation fails most of the time for nodes bigger than 2GB

Josep

Our offsite backup is based on the TSM backupset feature. Every weekend we create one backupset for each node. Everything works fine except for the bigger nodes, where the backupset fails:

04/18/2015 23:48:03 ANR8503E A failure occurred in writing to volume N:\ivcrdc.ost. (SESSION: 34176, PROCESS: 1
04/18/2015 23:48:03 ANR3513E GENERATE BACKUPSET: Output error encountered in accessing data storage. (SESSION: 34176, PROCESS: 150)
04/18/2015 23:48:03 ANR3503E Generation of backup set for IVCRD IVCRDC.219999472 (data type File) failed. (SESSION: 34176, PROCESS: 150)
...
...
04/19/2015 03:31:17 ANR8503E A failure occurred in writing to volume N:\29367013.ost. (SESSION: 34183, PROCESS: 157)
04/19/2015 03:31:17 ANR3513E GENERATE BACKUPSET: Output error encountered in accessing data storage. (SESSION: 34183, PROCESS: 157)
04/19/2015 03:31:18 ANR3503E Generation of backup set for ZMSTORE01.DIPCAS.ES as ZMSTORE01.219999486 (data type File) failed. (SESSION: 34183, PROCESS: 157).

This issue happened to us with TSM Server 6.4 and also with a freshly installed TSM Server 7.1.1.100 (the messages above are from the latter version). Both run on Windows Server 2008 R2 64-bit.

Avoiding concurrency between backupset generation and reclamation, deduplication, and expiration as a workaround does not resolve the issue.
Problems with storage access have been ruled out too.

Any ideas or suggestions will be appreciated.
 
According to this technote: http://www-01.ibm.com/support/docview.wss?uid=swg21646562, it's an issue at the OS level when accessing that disk. The technote has information on tracing the problem to get the OS return code. The technote was written for AIX, but it applies to Windows as well. The only difference is that when you find the OS error code, you will need to use "net helpmsg #" to find out what it means. Just as an example, let's say the error code was 21; you would do:

Code:
C:\>net helpmsg 21

The device is not ready.
 
Thank you for your response.
I traced the backupset generation and caught this:
*********************************************************
03:49:50.247 [70][pvr.c][13124][AgentThread]:pVR I/O agent (70) processing WRITE request.
03:49:50.247 [70][pvrfil64.c][1805][FileWrite]:Writing 32768 bytes to FILE volume N:\pansusos_wofima.ost.
03:49:50.247 [69][afrtrv.c][1729][AfGetRtrvOrder]:Reporting rtrvOrder volId: 387, seqNum: 586, offset: 0 in pool 4 for bitfile: 25861320.
03:49:50.247 [70][pvrfil64.c][1835][FileWrite]:Current file position is 1468689783790 bytes.
03:49:50.247 [69][bfrtrv.c][5062][bfGetRtrvInfoExt]:rc 0 from AfGetRtrvOrder getting NON-ACTIVE, restoreorder1 = 387, restoreorder2_3 = 586, restoreorder4_5 = 0.
03:49:50.248 [70][pspvrfio.c][574][PvrFioWrite]:fakeFull is False
03:49:50.248 [69][bfrtrv.c][5218][bfGetRtrvInfoExt]:Bitfile 23701694 found, setting hint to 25861320.
03:49:50.248 [69][bfaggrut.c][4366][BfGetBitfileExtents]:Looking for extents of 23701694 in pool 4
03:49:50.248 [70][pspvrfio.c][620][PvrFioWrite]:Write error with LastError = 665
03:49:50.249 [70][output.c][7474][PutConsoleMsg]:ANR8503E A failure occurred in writing to volume N:\pansusos_wofima.ost.~
03:49:50.249 [70][pvr.c][13563][AgentThread]:pVR I/O agent (70) finished WRITE request; rc=2813.
03:49:50.249 [70][pvr.c][13077][AgentThread]:pVR I/O agent (70) waiting for next request.
03:49:50.250 [68][output.c][7474][PutConsoleMsg]:ANR3513E GENERATE BACKUPSET: Output error encountered in accessing data storage.~
03:49:50.250 [68][bfgenset.c][2780][WriteObjSetStream]:Exit, rc=3017
03:49:50.250 [68][bfgenset.c][3726][SendObjSetData]:exit, rc=3017
***********************************************************
Doing what you suggested on the TSM server's Windows OS:
***********************************************************
T:\>net helpmsg 665

The requested operation could not be completed due to a file system limitation
***********************************************************
It seems the fault lies with the Windows OS rather than with TSM. Searching for the error, I found this KB:
https://support.microsoft.com/en-us/kb/967351
...But after applying it, the same error still occurs in backupset generation.

My action plan is to research the Windows OS error "...file system limitation" a little more and wait a while to see whether TSM support says something. If none of that works, I will migrate TSM to Linux (after 12 years with Windows).
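
For anyone repeating this trace on a long log, the three lines that matter above (the volume being written, the file position, and the LastError value) can be pulled out with a small script. This is only a sketch: the regular expressions are assumptions inferred from the excerpt in this post, not a documented TSM trace format, and the message table holds just the two `net helpmsg` texts quoted in this thread.

```python
import re

# Assumed line shapes, taken from the trace excerpt above (not an official format).
THREAD_RE = re.compile(r"^\S+ \[(\d+)\]")
WRITE_RE = re.compile(r"Writing \d+ bytes to FILE volume (\S+)")
POS_RE = re.compile(r"Current file position is (\d+) bytes")
ERR_RE = re.compile(r"Write error with LastError = (\d+)")

# Only the messages quoted in this thread; look up others with `net helpmsg <code>`.
KNOWN_MESSAGES = {
    21: "The device is not ready.",
    665: "The requested operation could not be completed due to a file system limitation",
}


def find_write_errors(lines):
    """Return one record per write error, with the volume and offset last seen on that thread."""
    state = {}   # thread id -> last volume and file position seen on that thread
    errors = []
    for line in lines:
        thread = THREAD_RE.match(line)
        if not thread:
            continue
        tid = thread.group(1)
        ctx = state.setdefault(tid, {"volume": None, "position": None})
        write = WRITE_RE.search(line)
        pos = POS_RE.search(line)
        err = ERR_RE.search(line)
        if write:
            # Drop the sentence-ending period that follows the volume name.
            ctx["volume"] = write.group(1).rstrip(".")
        elif pos:
            ctx["position"] = int(pos.group(1))
        elif err:
            code = int(err.group(1))
            errors.append({
                "thread": tid,
                "volume": ctx["volume"],
                "position": ctx["position"],
                "code": code,
                "message": KNOWN_MESSAGES.get(code, "unknown, run: net helpmsg %d" % code),
            })
    return errors


SAMPLE = r"""
03:49:50.247 [70][pvrfil64.c][1805][FileWrite]:Writing 32768 bytes to FILE volume N:\pansusos_wofima.ost.
03:49:50.247 [69][afrtrv.c][1729][AfGetRtrvOrder]:Reporting rtrvOrder volId: 387, seqNum: 586, offset: 0 in pool 4 for bitfile: 25861320.
03:49:50.247 [70][pvrfil64.c][1835][FileWrite]:Current file position is 1468689783790 bytes.
03:49:50.248 [70][pspvrfio.c][620][PvrFioWrite]:Write error with LastError = 665
""".splitlines()

if __name__ == "__main__":
    for e in find_write_errors(SAMPLE):
        print(e["volume"], e["position"], e["code"], e["message"])
```

Tracking state per thread id matters because the trace interleaves the writer thread (70) with the retrieve thread (69).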
 
Try a new file device class and put the volume size at 2GB (or lower). You will have multiple volumes for the backupset, but it should work.
 
Great! It Works!
Two changes made: the maximum volume size in the backupset devclass is now 512GB, and the number of simultaneous writing processes was reduced from 5 to 2. All the backupsets have been created successfully. I'm going to try 3 or 4 writing processes to speed up the whole offsite generation.
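
To get a rough idea of how many volumes a capped device class produces, here is a small arithmetic sketch. It uses the file position from the trace earlier in this thread (1468689783790 bytes) as a lower bound for the size of that backupset, since the write failed at that offset. It also assumes that "GB" in the device class means 2^30 bytes, which I have not verified against the server documentation.

```python
GIB = 1024 ** 3  # assumption: the devclass "GB" unit is 2**30 bytes


def volumes_needed(backupset_bytes, max_capacity_bytes):
    """Number of FILE volumes needed when each volume is capped at max_capacity_bytes."""
    if backupset_bytes < 0 or max_capacity_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids float rounding on large byte counts.
    return max(1, -(-backupset_bytes // max_capacity_bytes))


# Offset at which the write failed in the trace; the real backupset is at least this big.
position_at_failure = 1468689783790

if __name__ == "__main__":
    for cap_gib in (512, 2):
        count = volumes_needed(position_at_failure, cap_gib * GIB)
        print("cap %d GiB -> at least %d volumes" % (cap_gib, count))
```

Under those assumptions, a 512GB cap gives at least 3 volumes for that backupset, while the 2GB cap suggested earlier would give at least 684.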

Thank you very much for your ideas. I'm really happy.
 