Backing up GBs of data over the pond taking weeks

rdemaat

Hello all,

I have several file servers in remote locations: Beijing, Madrid, Portugal, and many more. There are lots of remote file servers in the US too: Chicago, Houston, North Carolina, and many more. I'm in the US, in Michigan.

These backups are taking days and weeks to complete. Even after the initial full, the incrementals crawl. Some sites are better than others. For some reason Beijing, although slow, is considerably better than Madrid and Portugal. Is the answer simply to get (buy) a bigger pipe?

I have started new backups of these sites onto a new TSM 7.1 server on Windows 2012. That's all it's receiving. Right now only 5 sites are going to this new server. Looking at 2 Portugal sessions that have been running for 2 days and 21 hours, they have transferred not quite 12 GB. Is there anything on the TSM settings side that might help? There is just one storage pool, with SATA disk behind it.

Here is the OPT file on the Portugal server. All the others are pretty much the same. Thanks in advance.

COMMmethod TCPIP
TCPPort 1500
*TCPServeraddress 10.62.18.250
TCPServeraddress 10.62.19.81
txnbytelimit 25600
tcpbuffsize 512
tcpwindowsize 63
largecommbuffers NO
tcpnodelay YES
resourceutilization 4
useunicodefilenames NO
compression YES
memoryefficientbackup NO
commrestartduration 480
commrestartinterval 120
MAXCMDRetries 5
CHANGINGRetries 1
SCHEDLOGRETENTION 7
ERRORLOGRETENTION 7
SCHEDMODE PROMPTED
MANAGEDSERVICES WEBCLIENT SCHEDULE
SCHEDLOGNAME "C:\Program Files\Tivoli\TSM\baclient\dsmsched.log"
QUIET
ERRORLOGNAME "C:\Program Files\Tivoli\TSM\baclient\dsmerror.log"
PASSWORDACCESS GENERATE
DEDUPLICATION YES
Dirmc prodstd1
INCLUDE "*" prodstd1
INCLUDE "*:\...\*.pst" prodpst
EXCLUDE "*:\...\Spool\...\*"
INCLUDE "*:\...\System32\Spool\...\*" prodstd1
EXCLUDE "*:\...\pagefile.sys"
EXCLUDE "*:\...\MSDOS.SYS"
EXCLUDE "*:\...\IO.SYS"
EXCLUDE "*:\...\*.dsm"
EXCLUDE "*:\...\ntuser.dat*"
EXCLUDE "*:\...\UsrClass.dat*"
EXCLUDE.DIR "*:\...\Recycler"
EXCLUDE.DIR "*:\...\macintosh volume"
EXCLUDE.DIR "*:\...\microsoft uam volume"
EXCLUDE.DIR "*:\...\SYSTEM32\CONFIG"
EXCLUDE.DIR "*:\Temp"
EXCLUDE.DIR "*:\Tsm_images"
EXCLUDE.DIR "*:\SMS_PKGD"
EXCLUDE.DIR "*:\SMSPKGD$"
EXCLUDE.DIR "*:\SOFTWARE\SMSPKG"
EXCLUDE.DIR "*:\Install"
EXCLUDE.DIR "*:\logs"
EXCLUDE.DIR "*:\store"
EXCLUDE.COMPRESS "*:\...\*.Z"
EXCLUDE.COMPRESS "*:\...\*.jpg"
EXCLUDE.COMPRESS "*:\...\*.zip"
EXCLUDE.DIR "*:\...\Temporary Internet Files"
* Exclude *:\...\sqldata\...\*
* Exclude *:\app\sqlsrvr\admin\...\*

NODENAME naportfs01
DOMAIN "\\naportfs01\g$"
DOMAIN "\\naportfs01\h$"
 
How big are the 'pipes' between your remote sites and Michigan? This is not necessarily just a 'get a bigger pipe' issue.

How many remote servers are you backing up per site?

Have you looked into node de-duplication?
 
Can you show the backup stats for one of these nodes?
 
moon-buddy: I'll have to get back to you on the current pipe size. I'll need to ask a network person. Any specific questions I should ask?
Each site is just 1 'file server' getting backed up to me.
And isn't the DEDUPLICATION YES in the opt file causing node de-dupe?


marclant: Are these the stats you are referring to? These are for Chicago. This one's a bit faster but still slow. I'm pretty sure these are for a nightly incremental of Chicago: 32 GB in 23 hours. I don't have (or see) any of these stats for Beijing, Madrid, and Portugal, I think because those backups haven't officially finished yet.

5/20/15 6:23:01 PM GMT-04:00 ANE4952I (Session: 2991, Node: NACHICFS01) Total number of objects inspected: 322,142 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4954I (Session: 2991, Node: NACHICFS01) Total number of objects backed up: 479 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4958I (Session: 2991, Node: NACHICFS01) Total number of objects updated: 3 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4960I (Session: 2991, Node: NACHICFS01) Total number of objects rebound: 0 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4957I (Session: 2991, Node: NACHICFS01) Total number of objects deleted: 0 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4970I (Session: 2991, Node: NACHICFS01) Total number of objects expired: 14 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4959I (Session: 2991, Node: NACHICFS01) Total number of objects failed: 2 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4197I (Session: 2991, Node: NACHICFS01) Total number of objects encrypted: 0 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4982I (Session: 2991, Node: NACHICFS01) Total objects deduplicated: 0 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4914I (Session: 2991, Node: NACHICFS01) Total number of objects grew: 0 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4916I (Session: 2991, Node: NACHICFS01) Total number of retries: 21 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4977I (Session: 2991, Node: NACHICFS01) Total number of bytes inspected: 422.76 GB (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4975I (Session: 2991, Node: NACHICFS01) Total number of bytes processed: 32.79 GB (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4984I (Session: 2991, Node: NACHICFS01) Total bytes before deduplication: 0 B (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4198I (Session: 2991, Node: NACHICFS01) Total bytes after deduplication: 0 B (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4961I (Session: 2991, Node: NACHICFS01) Total number of bytes transferred: 32.79 GB (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4963I (Session: 2991, Node: NACHICFS01) Data transfer time: 82,032.14 sec (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4966I (Session: 2991, Node: NACHICFS01) Network data transfer rate: 419.17 KB/sec (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4967I (Session: 2991, Node: NACHICFS01) Aggregate data transfer rate: 400.11 KB/sec (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4968I (Session: 2991, Node: NACHICFS01) Objects compressed by: 4% (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4981I (Session: 2991, Node: NACHICFS01) Deduplication reduction: 0.00% (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4976I (Session: 2991, Node: NACHICFS01) Total data reduction ratio: 92.25% (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANE4964I (Session: 2991, Node: NACHICFS01) Elapsed processing time: 23:52:18 (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANR0403I Session 2991 ended for node NACHICFS01 (WinNT). (SESSION: 2991)
5/20/15 6:23:01 PM GMT-04:00 ANR0406I Session 2992 started for node NACHICFS01 (WinNT) (Tcp/Ip nachicfs01.na.haworthinc.com(49716)). (SESSION: 2992)
5/20/15 6:23:01 PM GMT-04:00 ANR2507I Schedule DAILY1830 for domain STANDARD started at 05/19/2015 18:30:00 for node NACHICFS01 completed successfully at 05/20/2015 18:23:01. (SESSION: 2992)
5/20/15 6:23:01 PM GMT-04:00 ANR0403I Session 2992 ended for node NACHICFS01 (WinNT). (SESSION: 2992)
5/20/15 6:23:01 PM GMT-04:00 ANR0406I Session 2993 started for node NACHICFS01 (WinNT) (Tcp/Ip nachicfs01.na.haworthinc.com(49717)). (SESSION: 2993)
5/20/15 6:23:01 PM GMT-04:00 ANR0403I Session 2993 ended for node NACHICFS01 (WinNT). (SESSION: 2993)
 
Yes, that's what I was referring to.

As suspected for a WAN, the network throughput is slow:
Network data transfer rate: 419.17 KB/sec

When the TSM client sends data verbs to the server, it counts the time from when it starts sending until the server acknowledges receipt. The sum of the time taken for all the data verbs is the "Data transfer time". So divide bytes transferred by data transfer time and you get your network throughput.

The Elapsed processing time is not much longer than the data transfer time, which means that the bulk of the time is spent sending data.
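
As a rough check of that arithmetic, here is a small Python sketch using the Chicago figures from the stats above. It assumes the client reports binary units (1 GB = 1024 * 1024 KB); the variable and function names are made up for illustration. Under that assumption both rates should land within a fraction of a KB/sec of the reported 419.17 and 400.11, the gap being rounding in the "32.79 GB" figure.

```python
# Figures copied from the NACHICFS01 session stats (ANE4961I, ANE4963I, ANE4964I).
BYTES_TRANSFERRED_GB = 32.79
DATA_TRANSFER_TIME_SEC = 82_032.14
ELAPSED = "23:52:18"

KB_PER_GB = 1024 * 1024  # assumption: the client reports binary units


def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


kb_sent = BYTES_TRANSFERRED_GB * KB_PER_GB
elapsed_sec = hms_to_seconds(ELAPSED)

# Network rate: bytes transferred divided by data transfer time.
network_rate = kb_sent / DATA_TRANSFER_TIME_SEC
# Aggregate rate: bytes transferred divided by total elapsed time.
aggregate_rate = kb_sent / elapsed_sec
# Fraction of the whole session that was spent sending data.
transfer_share = DATA_TRANSFER_TIME_SEC / elapsed_sec

print(f"Network rate:   {network_rate:.2f} KB/sec")
print(f"Aggregate rate: {aggregate_rate:.2f} KB/sec")
print(f"Share of elapsed time spent transferring: {transfer_share:.1%}")
```

The last line is the point of the comparison: if nearly all of the elapsed time is data transfer time, scanning the file system is not the bottleneck.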

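Looks like you already have client-side compression and client-side dedup turned on in the opt file, although in these stats dedup isn't reducing anything (Deduplication reduction: 0.00%) and compression is only saving 4%.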

I think the next place to focus on is the network connection, see if you can tweak it to go faster.
 
RDEMAAT,

Sorry I did not notice that NODE de-dup has been turned on already.

As MARCLANT said, look at the WAN link. Please do still ask the network folks about the 'pipe' sizes. You may also want to ask whether the WAN is switched, shared, or dedicated. A dedicated link offers a faster connection even if the speed ratings are the same for all three.
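
Another number worth asking the network folks for is the round-trip time (ping latency) to each site. A single TCP session cannot move more than one window of data per round trip, so on a long-latency link the window size can cap throughput regardless of pipe size. The sketch below is only an illustration: it assumes the effective TCP window really is the 63 KB set by tcpwindowsize in the posted opt file, and the round-trip times are hypothetical placeholders, not measurements from these sites.

```python
TCP_WINDOW_KB = 63  # tcpwindowsize from the posted opt file (value is in KB)


def max_rate_kb_per_sec(window_kb: float, rtt_ms: float) -> float:
    """Upper bound for one TCP session: one full window per round trip."""
    return window_kb / (rtt_ms / 1000.0)


# Hypothetical round-trip times; replace with values measured by ping.
for rtt_ms in (20, 100, 150, 250):
    rate = max_rate_kb_per_sec(TCP_WINDOW_KB, rtt_ms)
    print(f"RTT {rtt_ms:>3} ms -> at most {rate:,.0f} KB/sec per session")
```

If the measured latency puts this ceiling near the rates you are actually seeing, the window size and the number of parallel sessions are worth looking at before buying bandwidth.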
 