ADSM-L

Re: [ADSM-L] Long running backup pinning log

2010-03-09 15:56:07
Subject: Re: [ADSM-L] Long running backup pinning log
From: Zoltan Forray/AC/VCU <zforray AT VCU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 9 Mar 2010 15:54:34 -0500
Problem/cause solved - it is not the issue with systemstate backups.

It is a new app with a HUGE (142GB) log file that is constantly changing
(folks always forget CHANGINGRETRIES 0 which IMHO should be the default).
It is taking 3-hours for each compress/send pass of this one file.

Anybody know what the heck WEBINATOR (thank you Ahhhhhhnold for all
products that are named *NATOR) is?  Something tells me that a 142GB log
file isn't normal but then I am not the product owner/administrator.

03/09/2010 02:47:08 --- SCHEDULEREC OBJECT BEGIN WEBSERVERS 03/09/2010
01:00:00
03/09/2010 02:47:09 Incremental backup of volume '\\adata\c$'
03/09/2010 02:47:09 Incremental backup of volume '\\adata\d$'
03/09/2010 02:47:09 Incremental backup of volume 'SYSTEMSTATE'
03/09/2010 02:47:16 Performing a full, TSM backup of object 'COM+ REGDB
Writer' component 'COM+ REGDB' using shadow copy.
03/09/2010 02:47:16 Performing a full, TSM backup of object 'Registry
Writer' component 'Registry' using shadow copy.
03/09/2010 02:47:16 Performing a full, TSM backup of object 'System
Writer' component 'System Files' using shadow copy.
03/09/2010 02:47:16 Performing a full, TSM backup of object 'Event Log
Writer' component 'Event Logs' using shadow copy.
03/09/2010 02:47:16 Performing a full, TSM backup of object 'IIS Metabase
Writer' component 'IISMETABASE' using shadow copy.
03/09/2010 02:47:16 Performing a full, TSM backup of object 'WMI Writer'
component 'Windows Managment Instrumentation' using shadow copy.
03/09/2010 02:47:25 Comparing local system files with active version on
the server.  This may take a long time depending on the number of files...
03/09/2010 02:47:25 Directory-->                   0
\\adata\c$\WINDOWS\repair\Backup\BootableSystemState\ComRegistrationDatabase
[Sent]

[snippage]

03/09/2010 02:48:02 Backup of object 'SystemState' component 'System
State' finished successfully.
03/09/2010 02:48:04 Successful incremental backup of
'ADATA\SystemState\NULL\System State\SystemState'

03/09/2010 02:48:04 Normal File-->             3,626
\\adata\c$\adsm.sys\ASR\asr.sif [Sent]
03/09/2010 02:48:04 Normal File-->            16,044
\\adata\c$\adsm.sys\ASR\asrpnp.sif [Sent]
03/09/2010 02:48:04 Normal File-->             2,486
\\adata\c$\adsm.sys\ASR\tsmasr.cmd [Sent]
03/09/2010 02:48:04 Normal File-->             1,470
\\adata\c$\adsm.sys\ASR\tsmasr.opt [Sent]
03/09/2010 02:48:04 Normal File-->            15,384
\\adata\c$\adsm.sys\ASR\waitforevent.exe [Sent]
03/09/2010 02:48:04 Successful incremental backup of 'Automated System
Recovery'

03/09/2010 02:48:04 Directory-->                   0 \\adata\d$\Webinator
[Sent]
03/09/2010 02:48:04 Normal File-->             1,816
\\adata\d$\Webinator\license.key [Sent]
03/09/2010 02:48:04 Directory-->                   0
\\adata\d$\Webinator\Texis\ada [Sent]
03/09/2010 02:48:04 Directory-->                   0
\\adata\d$\Webinator\Texis\dlkmstest  Changed
03/09/2010 02:48:04 Retry # 1  Directory-->                   0
\\adata\d$\Webinator [Sent]
03/09/2010 02:48:04 Retry # 1  Normal File-->             1,816
\\adata\d$\Webinator\license.key [Sent]
03/09/2010 02:48:04 Retry # 1  Directory-->                   0
\\adata\d$\Webinator\Texis\ada [Sent]
03/09/2010 02:48:05 Retry # 1  Directory-->                   0
\\adata\d$\Webinator\Texis\dlkmstest [Sent]
03/09/2010 02:48:05 Directory-->                   0
\\adata\d$\Webinator\Texis\vcu  Changed
03/09/2010 02:48:05 Retry # 2  Directory-->                   0
\\adata\d$\Webinator\Texis\dlkmstest [Sent]
03/09/2010 02:48:05 Retry # 1  Directory-->                   0
\\adata\d$\Webinator\Texis\vcu [Sent]
03/09/2010 02:48:07 Normal File-->        13,245,910
\\adata\d$\Webinator\Texis\monitor.log [Sent]
03/09/2010 06:05:14 Normal File-->   153,358,399,985
\\adata\d$\Webinator\Texis\vortex.log  Changed
03/09/2010 09:57:06 Retry # 1  Normal File-->   153,358,411,322
\\adata\d$\Webinator\Texis\vortex.log  Changed
03/09/2010 13:40:09 Retry # 2  Normal File-->   153,358,459,153
\\adata\d$\Webinator\Texis\vortex.log  Changed

[snippage]



From:
David McClelland <tsm AT NETWORKC.CO DOT UK>
To:
ADSM-L AT VM.MARIST DOT EDU
Date:
03/08/2010 10:32 AM
Subject:
Re: [ADSM-L] Long running backup pinning log
Sent by:
"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



Zoltan,

It's the large gap between Systemstate backup finished and d$ finished
that
interests me. Particularly as '72,395 total objects inspected' doesn't
seem
like a very large number (for a Windows server, that's probably not a
whole
lot more than the standard system files), and almost certainly not a
candidate to gain maximum benefit from journaling. The ANE4691 message you
paraphrase is "Total number of bytes transferred" which has always
suggested
to me 'after compression'.

I'd definitely have a look at what's really going on in verbose mode,
checking retries and also how slow/fast your network data path  is to the
TSM Server and whether the 93.82GB is simply taking an awfully long time
to
transfer. 87% compression is remarkably good which I also find suspicious:
is the client attempting to back up large sparse open database files on
the
D$ by any chance through a slow network connection?

/DMc
London, UK

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Zoltan Forray/AC/VCU
Sent: 08 March 2010 14:53
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Long running backup pinning log

Thanks for the heads-up.

Unfortunately, from reading the APAR, this box does not seem to have the
symptoms. There is no significant gap in backing up each component of
systemstate,  unless this is hidden because of QUIET. We have changed this
to VERBOSE to get more details on where it is slogging.

>From what I see in the dsmsched log, this is what the last backup that
finished (todays is still running), says:

01:43 - Backup started
01:47 - Systemstate backup finished
12:21 - d$ drive backup finished
12:25 - c$ drive backup finished

Total objects examined - 72,395
Total objects backed up - 1,725
Total data backed-up (to paraphase) - 93.82Gb

I did note the "Object compressed by"  is 87%.  Which brings me to the
question - Is the "total amount backed up" after compression?

My guess it lots of time spent in compression (IBM - how about some
compression algorithm controls (low - medium - high).  Haven't ruled out
"retries" - will see when verbose is on.

I don't see journaling making that much of a difference.  The server has
2GB RAM of which 900M was free when I checked it this morning, so it
doesn't seem to be constrained.  The backup is currently running and the
dsm service was using 10-20% CPU.
Zoltan Forray
TSM Software & Hardware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zforray AT vcu DOT edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://infosecurity.vcu.edu/phishing.html



From:
Andrew Raibeck <storman AT US.IBM DOT COM>
To:
ADSM-L AT VM.MARIST DOT EDU
Date:
03/05/2010 04:32 PM
Subject:
Re: [ADSM-L] Long running backup pinning log
Sent by:
"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



Hi Zoltan,

Is the long-running backup working on system state? I haven't found a
direct cause --> effect relationship to client APAR IC63094, but take a
look at that APAR and see if, after applying the fix, the problem you are
seeing goes away.

I should mention to all: IC63094 describes a problem with long-running
system state backups. This APAR may be of interest to you.

Best regards,

Andy

Andy Raibeck
IBM Software Group
Tivoli Storage Manager Client Product Development
Level 3 Team Lead
Internal Notes e-mail: Andrew Raibeck/Hartford/IBM@IBMUS
Internet e-mail: storman AT us.ibm DOT com

IBM Tivoli Storage Manager support web page:
http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Stor

age_Manager



The only dumb question is the one that goes unasked.
The command line is your friend.
"Good enough" is the enemy of excellence.

"ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu> wrote on 2010-03-05
16:26:01:

> [image removed]
>
> Long running backup pinning log
>
> Zoltan Forray/AC/VCU
>
> to:
>
> ADSM-L
>
> 2010-03-05 16:26
>
> Sent by:
>
> "ADSM: Dist Stor Manager" <ADSM-L AT vm.marist DOT edu>
>
> Please respond to "ADSM: Dist Stor Manager"
>
> I recently started having issues with my log going over 80% (5.5.3).
Doing
> a "show logpinned" usually points to a long (18-hours) running backup
> session for one particular node.
>
> This is starting to happen more and more frequently.
>
> Once I kill the session, the log utilization drops to almost nothing
since
> there has been a DB backup run earlier in the day.
>
> Any suggestions on how to handle this besides killing the session?  The
> node is a standard 2K3 box using the 6.1.2.0 client.  It is just a very
> slow/busy box.  The long backups (times vary from 11-19 hours) often
dump
> 120GB of data.
> Zoltan Forray
> TSM Software & Hardware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> zforray AT vcu DOT edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html