ADSM-L

Re: [ADSM-L] Backup fails with no error message

2014-07-08 12:19:40
Subject: Re: [ADSM-L] Backup fails with no error message
From: Matthew Glanville <matthew.glanville AT KODAK DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 8 Jul 2014 12:17:48 -0400
I have a guess as to what it is.

Unmount the /main/UT file system.

Then check whether there are any files or directories in the /main/UT
mount-point directory itself.

The file system may have been mounted over the top of them, and whatever
is in there may be what is actually causing the issue.

This could also be why the backup of /main/UT/ with -subdir=yes works: it
has a different starting point.

If there are files or directories there, delete them (or move them
somewhere safe if they are important), then mount /main/UT again.
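
The check above could be sketched roughly as follows. This is only an
illustration, not something I have run on your system: the helper name
list_underlying_entries is made up for the example, and the umount/mount
steps are left as comments because they need root, interrupt access to
/main/UT, and in a cluster resource group probably have to be coordinated
with the cluster software.

```shell
#!/bin/sh
# Sketch only. Run as root while /main/UT is unmounted.

# Hypothetical helper: print every entry (dotfiles included) that is
# stored directly in the directory given as $1.
list_underlying_entries() {
    ls -A "$1"
}

# Intended sequence, commented out on purpose:
#   umount /main/UT
#   list_underlying_entries /main/UT   # any output was hidden by the mount
#   mount /main/UT
```

If the helper prints nothing while the file system is unmounted, the
mount-point directory is empty and this guess is probably wrong.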

I saw this before, during a Solaris 9 OS backup a few years ago.
Matthew Glanville | WWIS GI Server team |
Eastman Kodak Co. | 343 State St | Rochester, NY 14650-1232 |
matthew.glanville AT kodak DOT com | 585 724-7523 Office |
www.kodak.com


"ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu> wrote on 07/08/2014
10:18:08 AM:

> From: Thomas Denier <Thomas.Denier AT JEFFERSON DOT EDU>
> To: ADSM-L AT vm.marist DOT edu
> Date: 07/08/2014 10:18 AM
> Subject: Re: Backup fails with no error message
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu>
>
> Andy,
>
> The failure did not occur when I ran the backup with service tracing.
> Further testing revealed that the failure no longer occurred even in
> the absence of tracing. I don't know whether the circumventions
> mentioned in my original e-mail ever had any real effect.
>
> Thomas Denier
> Thomas Jefferson University Hospital
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
> Behalf Of Andrew Raibeck
> Sent: Monday, July 07, 2014 4:19 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: [ADSM-L] Backup fails with no error message
>
> Thomas,
>
> Run the failing backup command and this time add these parameters:
>
> -traceflags=service -tracefile=/sometracefilename
>
> For example:
>
> dsmc inc /main/UT -servername=DC1P1_MAIN -traceflags=service -tracefile=/tsmtrace.out
>
> Name the trace file whatever you want, just make sure to put it in a
> file system with room for a potentially large trace file.
>
> Note: If you anticipate GB and GB of output, you can add the option
> -tracemax=1024 to wrap the trace file at 1 GB. The risk is, if
> whatever happens is not immediately causing the backup to stop, the
> needed trace lines could be written over due to wrapping. But based
> on your description, off-hand I'd say the backup stops when the
> problem occurs so the risk due to wrapping should be low.
>
> After the backup finishes with the RC 12, scan the trace for "GlobalRC"
> (without the quotes) and you should find lines like these:
>
> 07/07/2014 16:12:14.122 [003772] [3812] : ..\..\common\ut\GlobalRC.cpp ( 428): msgNum = 1076 changed the Global RC.
> 07/07/2014 16:12:14.122 [003772] [3812] : ..\..\common\ut\GlobalRC.cpp ( 429): Old values: rc = 0, rcMacroMax = 0, rcMax = 0.
> 07/07/2014 16:12:14.122 [003772] [3812] : ..\..\common\ut\GlobalRC.cpp ( 443): New values: rc = 12, rcMacroMax = 12, rcMax = 12.
>
> This will show you which message is driving the RC change. In my
> example, "msgNum = 1076" corresponds to ANS1076E.
>
> Based on the message, you might be able to figure out the rest; but
> at the least you have a trace file you can send in to support.
>
> Regards,
>
> - Andy
>
> ____________________________________________________________________________
>
> Andrew Raibeck | Tivoli Storage Manager Level 3 Technical Lead |
> storman AT us.ibm DOT com
>
> IBM Tivoli Storage Manager links:
> Product support:
> http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Storage_Manager
>
> Online documentation:
> https://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/Tivoli+Documentation+Central/page/Tivoli+Storage+Manager
>
> Product Wiki:
> https://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/Tivoli+Storage+Manager/page/Home
>
> "ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu> wrote on 2014-07-07
> 15:59:57:
>
> > From: Thomas Denier <Thomas.Denier AT JEFFERSON DOT EDU>
> > To: ADSM-L AT vm.marist DOT edu
> > Date: 2014-07-07 16:00
> > Subject: Backup fails with no error message
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu>
> >
> > We have an AIX system on which backups of a specific file system
> > terminate with exit status 12 but with no error message indicating a
> > reason for this exit status.
> > If I execute the command
> >
> > dsmc inc /main/UT -servername=DC1P1_MAIN
> >
> > as root, I will see typical messages about the number of files
> > processed and about specific files being backed up, followed by the
> > usual summary messages. The exit status will be 12. The summary
> > statistics will show a number of files examined equal to about half
> > the number of files present in the file system. There will not be any
> > error message explaining the exit status or the failure to examine the
> > entire file system.
> >
> > The DC1P1_MAIN stanza in dsm.sys has some unusual features because it
> > is used to back up one of the resource groups for a clustered
> > environment. The stanza includes three 'domain' statements listing the
> > file systems in the resource group. The stanza includes a 'nodename'
> > option specifying the node name that owns the backup files from the
> > resource group. The stanza includes an 'asnode' option specifying the
> > node name used to authenticate sessions from the cluster node involved
> > (we and the system vendor were not able to agree on an acceptable
> > arrangement for storing a TSM password within the resource group).
> > This stanza works fine for the other file systems in the same resource
> > group, and worked fine for /main/UT up until June 26.
> >
> > I have found two ways to circumvent the problem. One circumvention is
> > to run the command
> >
> > dsmc inc /main/UT/ -subdir=y -servername=DC1P1_MAIN
> >
> > to back up the top level directory of the file system rather than the
> > file system as such. An 'lsfs' command shows nothing unusual about the
> > file system; it is a jfs2 file system, like all the other file systems,
> > and uses the same mount options as the other file systems. The other
> > circumvention is to add an 'exclude.dir' line for a specific
> > subdirectory of /main/UT to the include/exclude file. The subdirectory
> > came under suspicion because it was last updated a few hours after the
> > last fully successful backup.
> >
> > The client code is TSM 6.4.1.0. The client OS is AIX 7.1. The TSM
> > server is TSM 6.2.5.0 running under zSeries Linux.
> >
> > Does anyone recognize this as a known problem? If not, does anyone
> > have suggestions for presenting the problem to TSM support? I am
> > having difficulty imagining any kind of productive interaction if I
> > don't have a message identifier to report.
> >
> > Thomas Denier
> > Thomas Jefferson University Hospital
> > The information contained in this transmission contains privileged and
> > confidential information. It is intended only for the use of the
> > person named above. If you are not the intended recipient, you are
> > hereby notified that any review, dissemination, distribution or
> > duplication of this communication is strictly prohibited. If you are
> > not the intended recipient, please contact the sender by reply email
> > and destroy all copies of the original message.
> >
> > CAUTION: Intended recipients should NOT use email communication for
> > emergent or urgent health care matters.
> >