So, how does everyone get reasonable performance out of the bpduplicate
utility? I've just been watching it and it struck me that it's a particularly
stupid piece of software. I presume that this must be deliberate to encourage
the purchasing of vault[1].
I have a bpduplicate job running at the moment; it's busy shoe-shining the tape
head in the destination drive, and probably in the source drive as well.
I have a single DLT7k in a library that I would like to use to duplicate each
day's backups for offsite storage. I have 8 DLT7k drives in the source library,
the one I use for the daily backups. So I kick off a bpduplicate job with the
following arguments:
    bpduplicate -dstunit $COPYLIB \
        -dp $OFFSITEPOOL \
        -hoursago $DUPLICATIONPERIOD \
        -fail_on_error 0 \
        -mpx \
        -v >> $TMPFILE 2>&1
Bpduplicate goes away and mounts a single tape in the destination library and a
single tape in the source library and proceeds to step through the backup
images one at a time, one tape at a time. I set the destination library so that
it would accept multiple retention levels per tape and multiplexed data
streams. We use multiplexed data streams during the evening backups in order to
keep the drives streaming.
I initially thought that the single destination drive would be the rate
limiting factor, but it isn't. Not by a long chalk. The limiting factor is
bpduplicate reading multiplexed streams from a single source drive to a single
destination drive, one at a time. This means that when it's duplicating a
highly multiplexed stream, the source drive scans the tape at full speed while
the destination drive receives only the odd chunk of data to write and, as a
result, keeps stopping and starting. It's even worse if the images on the
source tape are small, because then the source drive shoe-shines as well as the
destination drive.
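To put rough numbers on that, here is a back-of-envelope sketch, not a
measurement. It assumes the source drive reads at a nominal 5 MB/s (the DLT7000
native rate, ignoring compression) and that the images are evenly interleaved
on the tape, so a single image pulled out of an N-way multiplexed tape gets
about 1/N of what passes the head. est_dest_rate is a made-up helper for
illustration, not a NetBackup command.

```shell
#!/bin/sh
# Rough model only: one image extracted from an evenly interleaved tape
# receives about 1/N of the data passing the source head.
# est_dest_rate is a hypothetical helper, not part of NetBackup.
est_dest_rate() {
    src_rate=$1   # assumed source drive read rate, MB/s
    mpx=$2        # assumed number of interleaved streams on the tape
    awk -v r="$src_rate" -v n="$mpx" 'BEGIN { printf "%.2f\n", r / n }'
}

# Print the estimated feed rate to the destination for a few multiplex levels.
for mpx in 1 2 4 8; do
    echo "mpx=$mpx -> about $(est_dest_rate 5 "$mpx") MB/s to the destination drive"
done
```

Under those assumptions, anything much above 2-way multiplexing leaves the
destination drive well short of the rate it needs to keep streaming.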
I did attempt to get multiple bpduplicate jobs to write data from several tapes
to the single destination drive, but they appear just to lock each other out
while only one process makes use of the destination drive.
When the destination storage unit is set to allow multiplexed data streams, why
doesn't bpduplicate mount multiple source tapes and run at the speed of the
destination drives? Or at least allow multiple duplication jobs to write to the
destination drives? After all, multiplexing is designed to allow multiple
client systems to write to a single drive, so why doesn't it work while
duplicating?
The only solution I can think of at the moment is to create and make use of an
intermediate disk storage unit, but I basically don't have the space for that
and I thought the whole purpose of multiplexing data streams to tape was to do
away with the need for large disk pools.
[1] I don't have any confidence that vault will improve my duplication
throughput; it appears only to manage source/destination *pairs* of drives,
which, unless someone knows better, leaves me exactly where I am just now.
--
Colin Smith
European Unix systems administrator
EMEA Global Infrastructure Solutions
Jays Close, Viables Industrial Estate,
Basingstoke, Hampshire, RG22 4PD, UK