Filespace collocation groups versus collocation groups?

ldmwndletsm

ADSM.ORG Member
#1
If you have a lot of file systems on a client and want to collocate their data, would it be better to create a set of file space collocation groups for the same node (each would obviously list only one file space)? Or would it be better to create multiple nodes (multiple stanzas), where each stanza handles specific file systems (e.g. 'domain /filesystem' in dsm.sys) and each node is then added to a different (node) collocation group?

We have a lot of file systems on this one host, but half of them will be written to their own storage pool (A), and only one or two other clients will use that same pool, but they have little data. The other half of the file systems will use a different pool (B), but so will a number of other clients. So pool B will be used by a bunch of clients. We will be using two stanzas (two nodes) for this. Each stanza will use the 'domain /filesystem' statement to explicitly list what is to be backed up.

I'm not too worried right now about pool A since there are so few nodes on it. Most of its file systems are not frequently written to except when they're first created, and most daily activity occurs on the newest file system. Once that one is close to filling up, a new file system is created and we move on from there, so we just keep adding more. A little space is left on each one so changes can occur later, and they do, but again, most of the current activity occurs on the latest file system.

My concern with pool B is this. When we first start out, the first-time incrementals will be the largest, but they will complete okay since we won't add more file systems to the corresponding stanza than we can get through in one night. However, as time goes on and more and more file systems are added, the nightly incrementals might take too long: walk a file system, back it up, move to the next file system, walk that, and so on.

Could we expedite this by using collocation? We will be writing to disk and then the data will be moved to tape (no disk cache), but it will not stick around long on disk (maybe a day or so). So it may be moot since it will go to disk first?

Splitting things out by file space seems a mess since there are so many, and all the file spaces in a collocation group have to belong to the same node, right? So we'd end up with one collocation group for each file space, each listing the same node and a unique file system. Kinda messy.
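
To make the comparison concrete, here is a rough Python sketch that just prints the server commands each layout would need. The node names, file system paths and group-naming scheme are all made up for illustration, and the DEFINE COLLOCGROUP / DEFINE COLLOCMEMBER syntax (including the FILESPACE= parameter) is written as I understand it from the administrative command reference, so check it against your server level before running anything.

```python
def filespace_group_commands(node, filespaces):
    """Layout 1: one collocation group per file space, all for the same node."""
    cmds = []
    for fs in filespaces:
        # Hypothetical naming scheme: node name plus the file space path.
        group = f"{node}_{fs.strip('/').replace('/', '_')}".upper()
        cmds.append(f"DEFINE COLLOCGROUP {group}")
        cmds.append(f"DEFINE COLLOCMEMBER {group} {node} FILESPACE={fs}")
    return cmds


def node_group_commands(stanzas):
    """Layout 2: one node per dsm.sys stanza, one collocation group per node."""
    cmds = []
    for node in stanzas:
        group = f"{node}_GRP".upper()
        cmds.append(f"DEFINE COLLOCGROUP {group}")
        cmds.append(f"DEFINE COLLOCMEMBER {group} {node}")
    return cmds


if __name__ == "__main__":
    # Hypothetical host with four file systems going to pool B.
    filespaces = ["/data01", "/data02", "/data03", "/data04"]
    print("\n".join(filespace_group_commands("HOSTB", filespaces)))
    print()
    # Same file systems split across two hypothetical stanzas/nodes.
    stanzas = {
        "HOSTB_1": ["/data01", "/data02"],
        "HOSTB_2": ["/data03", "/data04"],
    }
    print("\n".join(node_group_commands(stanzas)))
```

With layout 1 the number of groups grows with every new file system, while with layout 2 it only grows when a new stanza is added. Either way the storage pool itself would still need to be set to collocate by group.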
 

RecoveryOne

ADSM.ORG Senior Member
#2
The time it takes to scan and process a client filesystem has nothing* to do with where the actual backed-up data is stored, as the client is querying the TSM database.

There are options you can undertake to speed up the backup process such as resourceutilization. Client performance tuning is more of an art than a science.

Collocation is designed to help you use your backup server storage more efficiently, facilitate faster restores, or get the 'best of both', so to speak. For example, I have a set of servers that don't perform any data reduction and rarely delete data. I have defined them into a collocation group to get the best capacity out of my tape resources, so I'm not always running a reclaim on 60% utilized tapes. Here's a good doc on collocation: https://www.ibm.com/support/knowled...0/com.ibm.itsm.srv.doc/t_colloc_planning.html
By using groups I was able to free up 20 or so LTO6 volumes in my primary tape pool.

If I'm understanding your concern correctly, you are worried about a client that has say 20 filesystems with several million files underneath each going to one 'node name'. So, resource utilization will help in scanning and backing up those files. However, there could be a point where you will need to set up multiple agents and tie them together with proxy and target node definitions. Then have each proxy agent process a subset of those files assuming the server and disk IO can keep up.
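
As a rough illustration of splitting the work across agents, here is a small Python sketch that spreads file systems over a few agent nodes by file count and prints a dsm.sys-style stanza for each. The node names, file counts, server address and the resourceutilization value are all hypothetical, and the stanza is only an outline of the options involved (nodename, asnodename, domain, resourceutilization), not a complete or verified configuration.

```python
import heapq


def split_filesystems(file_counts, agents):
    """Greedily assign file systems to agents, largest first, always giving
    the next file system to the agent with the fewest files so far."""
    heap = [(0, name) for name in agents]
    heapq.heapify(heap)
    plan = {name: [] for name in agents}
    ordered = sorted(file_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for fs, count in ordered:
        load, name = heapq.heappop(heap)
        plan[name].append(fs)
        heapq.heappush(heap, (load + count, name))
    return plan


def render_stanza(agent, target, filesystems, server_address, resource_util=4):
    """Outline of a dsm.sys stanza for one agent backing up as the target node.
    Assumes the server side already has something like
    GRANT PROXYNODE TARGET=<target> AGENT=<agent> in place."""
    lines = [
        f"SERVERNAME {agent}",
        f"  TCPSERVERADDRESS {server_address}",
        f"  NODENAME {agent}",
        f"  ASNODENAME {target}",
        f"  DOMAIN {' '.join(filesystems)}",
        f"  RESOURCEUTILIZATION {resource_util}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical file counts per file system (in millions of files).
    counts = {"/fs1": 5, "/fs2": 3, "/fs3": 2, "/fs4": 1}
    plan = split_filesystems(counts, ["AGENT1", "AGENT2"])
    for agent, filesystems in plan.items():
        print(render_stanza(agent, "HOSTB", filesystems, "tsm.example.com"))
        print()
```

Balancing by file count rather than by size is a guess at what dominates the nightly scan time; if your incrementals are bound by changed data instead, swap in whatever measure fits.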

There are also third-party products that help serialize the TSM backup operations. One that I am familiar with, but do not have deployed, is this: http://www.general-storage.com/PRODUCTS/dsmISI-MAGS/dsmisi-mags.html

Hope this helps.

*Unless we're talking about some TDP products. There, where some control files (for VMs, say) are stored can affect backup performance.
 
