ADSM-L

BKSTG behaviour when trying to copy a very large existing pool

1997-03-06 10:21:37
Subject: BKSTG behaviour when trying to copy a very large existing pool
From: Sheelagh Treweek <sheelagh.treweek AT COMPUTING-SERVICES.OXFORD.AC DOT UK>
Date: Thu, 6 Mar 1997 15:21:37 +0000
We have a large site-backup pool (about 25 million files and almost
2TB data) and have always maintained the primary tape copy and a
single copy storage pool. We have started to write a second copy
storage pool to be held (a long way) offsite.

To start the copy, we have been writing disk-to-tape for the copy
pools and then migrating disk-to-tape. So the recent data is secure.

Catching up the backlog is proving a more interesting challenge than
anticipated. The BKSTG process has a little think (for about 40 minutes)
popping out little messages like:

  "... Removable volume M00281 is required for storage pool backup."
  "... 3590 volume M00281 is expected to be mounted (R/O)."

for lots of tapes, before actually mounting a tape and starting any
tape-to-tape operations. Curiously, a little later some of the volumes
are repeated. Could it be that the BKSTG is trying to catch up a node
at a time? [ - we do not have collocation enabled -] I have looked at
the node occupancy figures and there are a few nodes completely up to
date on the third copy but not enough to confirm the suggestion.

It would appear that primary pool tapes which have been most recently
written to are being requested first, which seems eminently sensible
to me.

At the present rate of a few GB/day, it is going to take a *very*
long time to achieve this. Can anyone offer any tips/suggestions to
help me optimise this operation? Has anyone else done any similar
large-scale catchup operation? Did it ever finish?
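
As a rough sense of scale for the catch-up, the arithmetic can be sketched as below. This is only a back-of-envelope estimate: the backlog is rounded to 2000 GB from the "almost 2TB" quoted above, the sample rates of 2, 3 and 5 GB/day are assumed readings of "a few GB/day", and the function name is purely illustrative. It ignores new data arriving each day and time lost to pre-emption and restarts.

```python
def catchup_days(backlog_gb: float, rate_gb_per_day: float) -> float:
    """Days needed to copy a fixed backlog at a constant daily rate."""
    if rate_gb_per_day <= 0:
        raise ValueError("rate must be positive")
    return backlog_gb / rate_gb_per_day


# "almost 2TB" from the post, rounded up; decimal units assumed.
BACKLOG_GB = 2000.0

# Assumed readings of "a few GB/day"; not measured figures.
for rate in (2.0, 3.0, 5.0):
    days = catchup_days(BACKLOG_GB, rate)
    print(f"{rate:4.1f} GB/day -> {days:6.0f} days (~{days / 365:.1f} years)")
```

Under these assumptions the backlog alone takes somewhere between one and three years to copy, which is why the copy rate, rather than the 40-minute start-up, dominates the total.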

We have a limited time-window/resources just now with a busy regular
workload and just 4 tape decks. It will be a little easier in a few
weeks' time when we go up to 8 tape decks. Quite often the process(es)
get pre-empted by restore/file-recall operations and with a 40-minute
lead time to restart this is a bit of a headache.

[I have learned that BKSTG gets pre-empted before a RECLAMATION operation
(even though the reclamation sometimes restarts and the BKSTG doesn't!)
I think Dwight was asking about process priority a few days ago - I
have never seen any order documented though; DB BACKUP and MIGRATION must
also have higher priority than BKSTG. A definitive list would be helpful.]


Thanks a lot for any insight. [RS6000/3494-3590s/AIX-4.1.5/ADSM 2.1.5.12]

Regards, Sheelagh

------------------------------------------------------------------------------
Sheelagh Treweek                         Email: sheelagh.treweek AT oucs.ox.ac DOT uk
Oxford University Computing Services     Tel:   +44 (0)1865 273205
13 Banbury Road, Oxford, OX2 6NN, UK     Fax:   +44 (0)1865 273275
------------------------------------------------------------------------------