One client schedule or multiple: which is better?

proehl1

I'm a new TSM admin, apart from having worked with it a while back, perhaps 12+ years ago, on the mainframe.

My question is how best to schedule nodes to make the most of the backup window. We have 600+ nodes backing up and have MAXSCHedsessions set to 40 and MAXSessions to 150. Does it matter whether you schedule all nodes to start at the same time, as opposed to dividing nodes among multiple schedules spread over the backup window? We have a single client schedule that starts at 8PM with a 12-hour duration. Is this how most people would have their client scheduling set up, i.e. just let them all go at once? Would dividing the clients up using multiple schedules perhaps work better?

Currently backups run fine most nights, but sometimes large numbers of nodes show as Missed and we are having a difficult time determining the reason.
 
Why is MAXSCHEDsessions set so low? I would raise it to at least 80%. I would also create a handful of schedules that start throughout the night and early morning, and I would assign nodes to them according to their backup time. With 600+ nodes, I hope you will consider splitting them over at least two instances of TSM. That seems like too many for one instance.
 
We have resourceutilization set to 8 for some nodes; we have MAXSCHEDsessions set low so as not to hit the max for allowed sessions if they run concurrently.

We are considering splitting off onto another instance but haven't done so yet.

Other than being able to have certain clients back up at a particular time, are there other advantages to having a handful of schedules that start throughout the night/early morning?
 
There are two main ways of distributing load across the available backup window. The first (and probably best) is to subdivide the backup window into sub-windows. Here I run multiple windows, with a new one opening every 2 hours (e.g. 6pm, 8pm, etc.). That gets me a rough balance throughout the window.
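As a rough illustration of the sub-window idea, here is a minimal Python sketch that spreads nodes round-robin over windows opening every 2 hours. The node names, the 6pm first window, the count of six windows and the calendar date are all made-up placeholders, and round-robin is only the simplest possible assignment; in practice you would more likely group nodes by how long their backups run.

```python
from datetime import datetime, timedelta

def build_windows(first_start, window_count, step_hours=2):
    """Return the start times of sub-windows that open every step_hours."""
    return [first_start + timedelta(hours=step_hours * i)
            for i in range(window_count)]

def assign_nodes(nodes, windows):
    """Round-robin nodes over the windows so each gets a roughly equal share."""
    plan = {w: [] for w in windows}
    for i, node in enumerate(nodes):
        plan[windows[i % len(windows)]].append(node)
    return plan

# Hypothetical inputs: 600 placeholder node names, six windows from 6pm.
nodes = [f"NODE{i:03d}" for i in range(1, 601)]
windows = build_windows(datetime(2000, 1, 1, 18, 0), window_count=6)
plan = assign_nodes(nodes, windows)

for start, members in plan.items():
    print(start.strftime("%H:%M"), len(members))
```

Each window's node list could then be associated with its own client schedule on the server.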

The second relies on the use of the polling scheduling mode. You can set a randomisation percentage at the server which distributes client session startup across the "duration" setting for each client schedule. That means you don't get bursts of nodes hitting the server at the start of their window.

I use both techniques. Gets you an even spread of sessions.
 
We have resourceutilization set to 8 for some nodes; we have MAXSCHEDsessions set low so as not to hit the max for allowed sessions if they run concurrently.

I would question this logic.

For me, I set MAXSCHED higher than the number of sessions I have scheduled in any given window. This avoids nodes timing out and eventually missing the backup window.
 
Way back when I did some of this on the mainframe, with a lot fewer clients then, I recall having done it more or less as is being suggested, i.e. we subdivided the backup window and had multiple schedules. I don't recall why we did it that way, so I am looking for some wisdom on which method works best and why.

As I understand the logic behind our current scheduling approach, we believe 40 hosts backing up at any given time is optimal. To keep this level of activity going throughout the backup window until all hosts complete, we schedule all hosts to start at the beginning of the window and use MAXSCHEDsessions to throttle what actually runs to that limit. And we observe that 40 hosts are active at any given point. To allow for non-backup sessions, and for extra sessions from hosts running 4-6 sessions based on a resourceutilization of 8, we set MAXSessions quite a bit higher. However, we do have sessions that time out and hosts that miss the backup window.
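To put rough numbers on that approach, here is a small back-of-envelope sketch in Python. It takes the figures above at face value (600 hosts, 40 running at once, a 12-hour window) and assumes every slot is refilled the instant a host finishes, which is optimistic. The function names are invented for illustration only.

```python
def max_average_minutes(node_count, concurrent_nodes, window_hours):
    """Longest average backup time (minutes) that still lets every node
    finish inside the window, assuming slots are refilled instantly."""
    nodes_per_slot = node_count / concurrent_nodes
    return window_hours * 60 / nodes_per_slot

def peak_sessions(concurrent_nodes, sessions_per_node):
    """Sessions in use if every active node opened sessions_per_node
    sessions at once (a worst case, since only some nodes do this)."""
    return concurrent_nodes * sessions_per_node

print(max_average_minutes(600, 40, 12))  # 48.0
print(peak_sessions(40, 6))              # 240
```

Under these assumptions each of the 40 slots has to get through 15 hosts, so the average backup needs to stay under about 48 minutes for nothing to be missed. The second figure is only a ceiling for the case where all 40 active hosts open 6 sessions each, but it is worth comparing against the MAXSessions value of 150.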

We use server-prompted scheduling. Is that just not wise? Is client polling, so as to be able to use randomization, the preferred way to go for most?
 
For me, prompted scheduling is a double-edged sword. It does permit you to run client actions on hosts in a timely fashion (e.g. to have an operator rerun a failed backup); however, you lose much of your ability to stop sessions bursting onto the server.

If you were to increase various timeouts and increase the duration of your schedules you could continue to gain the advantage of prompted schedules. There's a downside to that though - cruddy sessions hanging around. How much hassle this generates will depend entirely on your client base though...ymmv.

For me, though, it just isn't worth it. Typically here we don't restart a backup using the TSM server. It generally failed for a reason, and that reason is probably beyond the ability of an operator to fix. Fortunately our failures are infrequent. The main exception to this is a failed tape write; however, we have a fairly good degree of control over our media (using private volumes to track media quality), so this doesn't happen much (in fact, in the last 11 months it's happened exactly once). In the event that a critical client does fail its backup, that backup is respawned from the client (i.e. the TSM scheduler isn't involved).

There are people with a different personal preference, but for mine it's better to use polling mode and to define a matched set of schedule durations with a randomisation percentage. In my case the durations vary depending on the class of schedule, but for most baclient incrementals it's a 4-hour duration with a 10% randomisation (i.e. all scheduled backups will start within 0.4 hours, or 24 minutes, of their startup window opening). The vast majority of clients finish within 2 hours of the window opening.
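The arithmetic behind that 0.4 hours can be sketched in a few lines of Python. This is only an illustration: the function name is invented, the 100-node count is a placeholder, and the uniform spread of start times is an assumption about how the randomisation behaves rather than something taken from the TSM documentation.

```python
import random

def randomised_start_offsets(node_count, duration_hours, randomise_pct, seed=0):
    """Return the spread (minutes) and one start offset per node.

    The spread is randomise_pct percent of the schedule duration. Offsets
    are drawn uniformly across it, which is an assumed distribution.
    """
    spread_minutes = duration_hours * 60 * randomise_pct / 100
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    offsets = [rng.uniform(0, spread_minutes) for _ in range(node_count)]
    return spread_minutes, offsets

# 4-hour duration with 10% randomisation, for a hypothetical 100 nodes.
spread, offsets = randomised_start_offsets(100, 4, 10)
print(spread)  # 24.0
```

Instead of 100 sessions arriving in the same instant, they arrive over a 24-minute span, which is what takes the burst off the server at the start of each window.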
 