Re: [ADSM-L] TSM: Backing Up Large Files
2012-07-23 16:36:44
In my experience, a backup of a single very large file that runs for
hours will pin the server's recovery log, and the log stays pinned
until either the file finishes copying or the backup is cancelled.
The problem is that once the log goes past 81% full, the server starts
delaying transactions and everything gets worse.
I too am waiting to upgrade to 6.x in the hope that it will fix this
issue (among others). I have solved some cases by changing the backup
strategy (e.g. several smaller files instead of a single large one) or
by moving the schedule to a time when there is less network/TSM
activity.
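
To make the "several smaller files" idea concrete, here is a rough Python
sketch of one way to do it. It is only an illustration under my own
assumptions: the function names, the 1 GiB piece size and the ".partNNNN"
naming are made up, and it presumes you are splitting a quiesced copy or
dump of the file into a staging directory, not the live database file.
Whether smaller pieces actually keep the log from being pinned on your
server is something you would have to test.

```python
import os


def split_file(src_path, dest_dir, chunk_bytes=1024 ** 3, buf_bytes=1024 * 1024):
    """Copy src_path into numbered pieces of at most chunk_bytes each.

    Returns the list of piece paths in order. The piece size and the
    '.partNNNN' suffix are illustrative choices, not anything TSM requires.
    """
    os.makedirs(dest_dir, exist_ok=True)
    base = os.path.basename(src_path)
    parts = []
    index = 0
    with open(src_path, "rb") as src:
        while True:
            part_path = os.path.join(dest_dir, f"{base}.part{index:04d}")
            written = 0
            with open(part_path, "wb") as out:
                # Fill this piece in small buffers so memory use stays flat.
                while written < chunk_bytes:
                    buf = src.read(min(buf_bytes, chunk_bytes - written))
                    if not buf:
                        break  # end of the source file
                    out.write(buf)
                    written += len(buf)
            if written == 0:
                os.remove(part_path)  # nothing left to copy; drop the empty piece
                break
            parts.append(part_path)
            index += 1
    return parts


def join_files(parts, dest_path, buf_bytes=1024 * 1024):
    """Reassemble pieces (in the order given) into dest_path after a restore."""
    with open(dest_path, "wb") as out:
        for part_path in parts:
            with open(part_path, "rb") as src:
                while True:
                    buf = src.read(buf_bytes)
                    if not buf:
                        break
                    out.write(buf)
```

The client would then back up the staging directory instead of the
original file, and a restore needs the extra join_files step. The
trade-off is extra disk space and copy time on the client.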
Sorry I can't offer more help.
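
One small thing that may save time with Jeff's suggestion (quoted below)
of matching client and server messages by time stamp: a rough Python
sketch that pairs lines from the two logs when their time stamps fall
within a few seconds of each other. The function names and the 5-second
window are my own assumptions. It expects lines that start with
"MM/DD/YYYY HH:MM:SS" (client log) or "MM/DD/YY HH:MM:SS" (activity log),
as in the excerpts below, skips continuation lines that carry no time
stamp, and assumes the client and server clocks agree.

```python
import re
from datetime import datetime, timedelta

# Date with a 4- or 2-digit year, then a time, then the rest of the message.
STAMP = re.compile(r"^(\d{2}/\d{2}/(?:\d{4}|\d{2}))\s+(\d{2}:\d{2}:\d{2})\s+(.*)$")


def parse_line(line):
    """Return (datetime, message) for a time-stamped line, else None."""
    match = STAMP.match(line.strip())
    if not match:
        return None
    date_text, time_text, message = match.groups()
    date_format = "%m/%d/%Y" if len(date_text) == 10 else "%m/%d/%y"
    when = datetime.strptime(f"{date_text} {time_text}", date_format + " %H:%M:%S")
    return when, message


def correlate(client_lines, server_lines, window_seconds=5):
    """Pair each client message with server messages logged close in time.

    Returns a list of (client_time, client_message, [server_messages]).
    """
    window = timedelta(seconds=window_seconds)
    server = [parsed for parsed in map(parse_line, server_lines) if parsed]
    matches = []
    for parsed in map(parse_line, client_lines):
        if not parsed:
            continue  # wrapped or untimed line
        client_time, client_message = parsed
        nearby = [msg for when, msg in server if abs(when - client_time) <= window]
        if nearby:
            matches.append((client_time, client_message, nearby))
    return matches
```

You would feed it the lines of the client error/schedule log and the
output of an activity log query saved to a file, then look for ANS1809W
entries that line up with ANR2998W or ANR0524W on the server side.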
On Mon, Jul 23, 2012 at 11:13 AM, Nast, Jeff P.
<Jeff.Nast AT essentiahealth DOT org> wrote:
> Hi Charles,
>
> I recently discovered the same thing. We are on TSM Server 5.5.5.2. I don't
> have an answer yet...
>
> I was able to correlate the TSM Client log with the TSM Server activity
> log. See if you can correlate the messages on your client and server
> with the same time stamp.
>
> Here is what I see in the TSM Server activity log that correlates with
> the messages that you are seeing in the client log...
> ------------------------------------------------------------
> 07/11/12 05:22:15 ANR2998W The server log is 81 percent full. The server has
>                   cancelled the oldest transaction in the log. (SESSION: 106387)
>
> 07/11/12 05:22:15 ANR0524W Transaction failed for session 104222 for node
>                   LAB6_DB_AIX (AIX) - data transfer interrupted. (SESSION: 104222)
>
> 07/11/12 05:22:15 ANR2997W The server log is 81 percent full. The server
>                   will delay transactions by 3 milliseconds. (SESSION: 106920)
>
> 07/11/12 05:22:21 ANR0483W Session 104222 for node LAB6_DB_AIX (AIX)
>                   terminated - forced by administrator. (SESSION: 104222)
> ------------------------------------------------------------
>
> So the question is, why cancel backup sessions when the log is at 81%?
> Can I change that threshold?
>
> I have a feeling that this will no longer happen once we migrate to TSM
> Server v6.x...
>
> -Jeff Nast
> Senior Systems Administrator - Storage
> Essentia Health, Duluth MN
>
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT vm.marist DOT edu] On Behalf
> Of Welton, Charles
> Sent: Monday, July 23, 2012 10:39 AM
> To: ADSM-L AT vm.marist DOT edu
> Subject: [ADSM-L] TSM: Backing Up Large Files
>
> Hello:
>
> I need some advice on how to handle backing up large files, more
> specifically, a 4 GB file. I am running a small TSM instance on
> version 5.4.2.0, and the client is also running 5.4.2.0. This is what
> the client log says when trying to back up the file:
>
> 07/23/2012 09:51:42 Retry # 1 Normal File--> 4,457,963,520 \\ami-hph-pacs\d$\Program Files\RamSoft\DB4\PACS46REST.FDB ** Unsuccessful **
> 07/23/2012 09:51:42 ANS1809W A session with the TSM server has been disconnected. An attempt will be made to reestablish the connection.
> 07/23/2012 09:51:57 ... successful
> 07/23/2012 10:24:00 Retry # 2 Normal File--> 4,457,963,520 \\ami-hph-pacs\d$\Program Files\RamSoft\DB4\PACS46REST.FDB ** Unsuccessful **
> 07/23/2012 10:24:00 ANS1809W A session with the TSM server has been disconnected. An attempt will be made to reestablish the connection.
> 07/23/2012 10:24:15 ... successful
>
> It retries about five times and then fails. Here is the output of "q
> option" from my TSM instance:
>
>
> Server Option            Option Setting       Server Option            Option Setting
> ------------------------ -------------------- ------------------------ --------------------
> CommTimeOut              3,600                IdleTimeOut              240
> BufPoolSize              262144               LogPoolSize              512
> DateFormat               1 (mm/dd/yyyy)       TimeFormat               1 (hh:mm:ss)
> NumberFormat             1 (1,000.00)         MessageFormat            1
> Language                 AMENG                Alias Halt               HALT
> MaxSessions              100                  ExpInterval              0
> ExpQuiet                 Yes                  EventServer              Yes
> ReportRetrieve           No                   DISPLAYLFINFO            No
> MirrorRead DB            Normal               MirrorRead LOG           Normal
> MirrorWrite DB           Parallel             MirrorWrite LOG          Parallel
> VolumeHistory            volhist.out          Devconfig                devcnfg.out
> TxnGroupMax              256                  MoveBatchSize            1000
> MoveSizeThresh           2048                 RestoreInterval          1,440
> DisableScheds            No                   NOBUFPREfetch            No
> AuditStorage             Yes                  REQSYSauthoutfile        Yes
> SELFTUNEBUFpoolsize      Yes                  DBPAGEShadow             Yes
> DBPAGESHADOWFile         DBPGSHDW.BDT         MsgStackTrace            On
> QueryAuth                None                 LogWarnFullPerCent       90
> ThroughPutDataThreshold  0                    ThroughPutTimeThreshold  0
> NOPREEMPT                ( No )               Resource Timeout         60
> TEC UTF8 Events          No                   AdminOnClientPort        Yes
> NORETRIEVEDATE           No                   IMPORTMERGEUsed          Yes
> DNSLOOKUP                Yes                  NDMPControlPort          10,000
> NDMPPortRange            0,0                  SHREDding                Automatic
> SanRefreshTime           0
> CommMethod               TCPIP                CommMethod               NAMEDPIPE
> CommMethod               HTTP                 ADSMGROUPname            ADSMSERVER
> SECUREPipes              No                   NPAUDITSuccess           No
> NPAUDITFailure           No                   NPBUFfersize             8192
> TcpPort                  1500                 TcpAdminport             1500
> TCPWindowsize            64512                TCPNoDelay               Yes
> HttpPort                 1580                 HttpsPort                1543
> NamedPipeName            \\.\PIPE\ADSMPIPE    ShmPort                  1
> Message Interval         1                    FileExit
> FileTextExit                                  UserExit
> AcsAccessId                                   AcsTimeoutX              1
> AcsLockDrive             No                   AcsQuickInit             Yes
> SNMPSubagentPort         1521                 SNMPSubagentHost         127.0.0.1
> SNMPHeartBeatInt         5                    TECHost
> TECPort                  0                    UNIQUETECevents          No
> UNIQUETDPTECevents       No                   AssistVCRRecovery        Yes
> AdRegister               No                   AdUnRegister             No
> AdSetDC                                       AdComment
> SHAREDLIBIDLE            No                   3494Shared               No
> SANdiscovery             On
>
> ... and here is "q status" output from my TSM instance:
>
>
> Storage Management Server for Windows - Version 5, Release 4, Level 2.0
>
>
>
>
>
> Server Name: HTSP-TSM1_SERVER1
>
> Server host name or IP address: 10.80.2.128
>
> Server TCP/IP port number: 1500
>
> Server URL:
>
> Crossdefine: Off
>
> Server Password Set: Yes
>
> Server Installation Date/Time: 02/28/2002 13:56:50
>
> Server Restart Date/Time: 11/22/2011 08:26:20
>
> Authentication: On
>
> Password Expiration Period: 9,999 Day(s)
>
> Invalid Sign-on Attempt Limit: 0
>
> Minimum Password Length: 0
>
> WEB Admin Authentication Time-out (minutes): 9,999
>
> Registration: Closed
>
> Subfile Backup: No
>
> Availability: Enabled
>
> Accounting: On
>
> Activity Log Retention: 31 Day(s)
>
> Activity Log Number of Records: 228861
>
> Activity Log Size: 31 M
>
> Activity Summary Retention Period: 30 Day(s)
>
> License Audit Period: 1 Day(s)
>
> Last License Audit: 07/22/2012 21:25:25
>
> Server License Compliance: Valid
>
> Central Scheduler: Active
>
> Maximum Sessions: 100
>
> Maximum Scheduled Sessions: 90
>
> Event Record Retention Period: 31 Day(s)
>
> Client Action Duration: 5 Day(s)
>
> Schedule Randomization Percentage: 10
>
> Query Schedule Period: 2 Hour(s)
>
> Maximum Command Retries: 10
>
> Retry Period: Client
>
> Scheduling Modes: Any
>
> Log Mode: Normal
>
> Database Backup Trigger: Disabled
>
> BufPoolSize: 262,144 K
>
> Active Receivers: CONSOLE ACTLOG NTEVENTLOG
>
> Configuration manager?: Off
>
> Refresh interval: 60
>
> Last refresh date/time:
>
> Context Messaging: Off
>
> Server-free Status: Off
>
> Server-free Batch Size: 200
>
> Table of Contents (TOC) Load Retention: 120 Minute(s)
>
> Machine Globally Unique ID: 70.e3.b0.f1.8c.64.11.db.ae.3d.00.14.5e.23.fe.99
>
> Archive Retention Protection: Off
>
> Encryption Strength: AES
>
> I made a few changes that I thought would help, but they haven't so far. I
> changed the "Retry Period" from a specified time to "Client". I also
> added a client option called "CHANGINGRETRIES" to the client option set
> and set its value to "50". Is there a way to change the number of minutes
> between retries? Can someone please point me in the right direction?
>
> Any suggestions would be greatly appreciated!
>
> Thank you...
>
>
> Charles
>
> This email contains information which may be PROPRIETARY IN NATURE OR
> OTHERWISE PROTECTED BY LAW FROM DISCLOSURE and is intended only for the
> use of the addressee(s) named above. If you have received this email in
> error, please contact the sender immediately.