bhaven41
ADSM.ORG Member
Hi,
We have a production GPFS cluster of 22 Linux nodes.
We take a FlashCopy of the GPFS filesystems (DS8K) and mount it on a dedicated Linux mount server (a 1-node GPFS cluster) for TSM to back up.
This is the mounted filesystem: 1.9 TB with around 10 million small files (and growing).
/dev/rsidev 4.3T 1.9T 2.5T 44% /opt/rsi
This is not the production server itself; however, I cannot use the BA client (only 1 filesystem, so no multisession) and I cannot use image backup (not supported).
So I am using the following mmbackup command to back it up.
/usr/lpp/mmfs/bin/mmbackup /dev/rsidev -n /opt/tivoli/tsm/client/ba/bin/<project_name>/mmbackup.ctrl -t full -s /var/tmp
# more /opt/tivoli/tsm/client/ba/bin/<project_name>/mmbackup.ctrl
# backup server
serverName=<<servername in dsm.sys>>
# backup clients
clientName=<<gpfs client name>>
# number of processes per client
numberOfProcessesPerClient=6
This backup runs for 40+ hours using LAN-free and 6 sessions. (PROBLEM #1)
In the end it fails with the following error. (PROBLEM #2)
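To put PROBLEM #1 in numbers, here is a rough Python sketch of the average rates implied by the figures above. It assumes the whole 1.9 TB and all 10 million files were moved in exactly 40 hours and that 1 TB = 10^12 bytes; since the run took "40+" hours and did not finish cleanly, the real averages are probably lower. The function name is just for illustration.

```python
def average_rates(total_bytes, file_count, hours):
    """Return (bytes per second, files per second) averaged over the run."""
    seconds = hours * 3600
    return total_bytes / seconds, file_count / seconds


# Figures from the post: 1.9 TB, about 10 million files, 40 hours (lower bound).
bytes_per_s, files_per_s = average_rates(1.9e12, 10_000_000, 40)
print(f"{bytes_per_s / 1e6:.1f} MB/s, {files_per_s:.1f} files/s")
# prints: 13.2 MB/s, 69.4 files/s
```

Spread over 6 sessions, that is only a couple of MB/s per session, which suggests the per-file overhead of many small files matters more here than raw bandwidth.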
mmexectsmcmd: Command failed. Examine previous error messages to determine cause.
mmexectsmcmd: Command failed. Examine previous error messages to determine cause.
mmexectsmcmd: Command failed. Examine previous error messages to determine cause.
<<client name>>: TSM dsmc selective command failed to run.
Process 2 on client <<client name>> failed in processing its list of files.
<<client name>>: TSM dsmc selective command failed to run.
Process 3 on client <<client name>> failed in processing its list of files.
<<client name>>: TSM dsmc selective command failed to run (see file /var/mmfs/mmbackup/opt/rsi_090624_05:54:28_4/dsmerror.log).
Process 4 on client <<client name>> failed in processing its list of files.
*** glibc detected *** /usr/lpp/mmfs/bin/tsbackup: realloc(): invalid next size: 0x0000000055a9b000 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3be0a73adb]
/lib64/libc.so.6(realloc+0x1d0)[0x3be0a75c30]
/lib64/libc.so.6(realloc+0x3e)[0x3be0a75a9e]
/usr/lpp/mmfs/bin/tsbackup[0x40009644]
======= Memory map: ========
40000000-4000e000 r-xp 00000000 fd:00 2492080 /usr/lpp/mmfs/bin/tsbackup
40110000-40112000 rwxp 00010000 fd:00 2492080 /usr/lpp/mmfs/bin/tsbackup
40112000-4021f000 rwxp 40112000 00:00 0
.......................
...................
2480336 /usr/lib64/libstdc++.so.5.0.7
2b03235e2000-2b03235eb000 rwxp 000c1000 fd:00 2480336 /usr/lib64/libstdc++.so.5.0.7
2b03235eb000-2b03235fc000 rwxp 2b03235eb000 00:00 0
2b03235fc000-2b0323603000 r-xp 00000000 fd:00 2492119 /usr/lpp/mmfs/lib/libgpfs.so
2b0323603000-2b0323702000 ---p 00007000 fd:00 2492119 /usr/lpp/mmfs/lib/libgpfs.so
2b0323702000-2b0323704000 rwxp 00006000 fd:00 2492119 /usr/lpp/mmfs/lib/libgpfs.so
2b0323704000-2b0323707000 rwxp 2b0323704000 00:00 0
7fff8761c000-7fff877a5000 rwxp 7fffffe76000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
/usr/lpp/mmfs/bin/mmbackup: line 379: 22148: Abort
mmbackup: Command failed. Examine previous error messages to determine cause.
In the error log file:
# more /var/mmfs/mmbackup/opt/rsi_090624_05:54:28_4/dsmerror.log
06/25/2009 20:12:06 ANS1228E Sending of object '/opt/tivoli/tsm/client/ba/bin/<<project_name>>/*' failed
06/25/2009 20:12:06 ANS4005E Error processing '/opt/tivoli/tsm/client/ba/bin/<<project_name>>/*': file not found
06/25/2009 22:11:48 ANS1804E Selective Backup processing of '/opt/rsi/.mmbuTrans4' finished with failures
I do not understand why mmbackup is trying to back up '/opt/tivoli/tsm/client/ba/bin/<<project_name>>/*'. (PROBLEM #3)
Have you seen this kind of problem before?
Is the memory dump caused by '/opt/tivoli/tsm/client/ba/bin/<<project_name>>/*'?
In the TSM server's actlog, I can see 6 sessions completed with messages like this:
06/16/2009 22:48:59 ANE4971I (Session: 1777, Node: <<node_name>.) LanFree data
bytes: 225.72 GB (SESSION: 1777)
All the sessions complete at close to 225 GB each.
That comes to about 1.35 TB in total (225 * 6); however, my filesystem holds 1.9 TB. (PROBLEM #4)
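For reference, the arithmetic behind PROBLEM #4 as a small Python sketch. The per-session figure comes from the actlog line and the filesystem size from the df output above. The conversion of 1000 GB per TB is an assumption (df and TSM may both be reporting binary units, which would change the result somewhat), and the helper name is made up for illustration.

```python
def session_shortfall(per_session_gb, sessions, filesystem_tb, gb_per_tb=1000):
    """Return (total sent by all sessions in TB, gap to the filesystem size in TB)."""
    total_tb = per_session_gb * sessions / gb_per_tb
    return total_tb, filesystem_tb - total_tb


# 6 sessions at 225.72 GB each, against 1.9 TB used on /opt/rsi.
total_tb, missing_tb = session_shortfall(225.72, 6, 1.9)
print(f"sent {total_tb:.2f} TB, unaccounted {missing_tb:.2f} TB")
# prints: sent 1.35 TB, unaccounted 0.55 TB
```

So roughly 0.55 TB of the used space does not show up in the session totals under these assumptions.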
As I understand it, when mmbackup starts 6 sessions, it divides the data equally into 6 file lists.
Please help me understand these problems.
Thank you in advance.