Hi Tam,
I can't say exactly what's going on at your place, but here are some of my
experiences with staging that might help you out a bit.
I don't think staging sorts the list of eligible save sets by age or anything;
it just builds a list of whatever falls into the category you've defined and
then starts. This can result in the newest save set being staged first, or in
any order at all, as long as the save sets are eligible.
If your staging was interrupted by the tape drive failing (or for some other
reason), you can end up with save sets on both a tape volume and a disk volume.
The next time staging starts, it tries to stage a save set from disk to a tape
volume that already holds that save set, and then fails with a message like
"volume not eligible" in daemon.raw or daemon.log.
The way staging works is that it is actually a clone followed by a deletion of
the cloned save sets from the original volume. So if staging is interrupted for
some reason, it never gets to the deletion part, and you're left with identical
save sets on two different volumes (disk and tape). The next time staging
starts, it might try to use the same volume again, fail, and very often hang.
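The two-phase behaviour is easy to picture with a little sketch (plain Python
with made-up names, just to illustrate the idea; this is not NetWorker code):

```python
# Hypothetical sketch of staging as clone-then-delete (not actual NetWorker code).
def stage(saveset, disk, tape, interrupted_after_clone=False):
    """Stage one save set: clone it to tape, then delete the disk copy."""
    tape.add(saveset)                 # phase 1: clone to the tape volume
    if interrupted_after_clone:       # e.g. the tape drive fails right here
        return                        # ...so the deletion never happens
    disk.remove(saveset)              # phase 2: remove the original from disk

disk, tape = {"ss1", "ss2"}, set()
stage("ss1", disk, tape)                                # normal run
stage("ss2", disk, tape, interrupted_after_clone=True)  # interrupted run
# "ss2" now exists on both volumes; a later staging pass that picks the same
# tape volume as its destination will refuse it ("volume not eligible").
print(sorted(disk & tape))   # the duplicated save set(s)
```

The point is just that there is no rollback between the two phases, so any
interruption in between leaves a duplicate behind.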
Furthermore, if you have backups going directly to tape and staging decides to
start while the tape drive is busy with a backup job, your staging will also
hang. This is very likely to happen if you only have one tape drive.
I've actually stopped using staging because of the issues above. Especially at
smaller installations with only a single tape drive, it can easily foul up and
hang pretty much everything. Instead of staging, I'm using scheduled cloning
and setting the clone retention times on the tape volumes to however long I
want to keep the data. On the disk device, I've set browse and retention to a
couple of weeks (the client's browse and retention settings). This way I get to
decide when the data is moved to tape, which is usually a couple of times a
week, and since the retention times of the save set clones on the disk device
are so short, those copies are removed after a couple of weeks.
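As a rough sketch of that retention scheme (illustrative Python; the day
counts and names are just examples from my setup, not NetWorker defaults):

```python
# Illustrative sketch: clone to tape with a long retention, keep a short
# browse/retention on the disk copy, and let the disk copies expire on their
# own. All numbers and names here are examples, not NetWorker defaults.
DISK_RETENTION_DAYS = 14      # "a couple of weeks" on the adv_file device
TAPE_RETENTION_DAYS = 365     # clone retention set on the tape volumes

def expired(copies, age_days):
    """Return the copies of a save set that have expired at a given age."""
    return [medium for medium, retention in copies if age_days > retention]

copies = [("disk", DISK_RETENTION_DAYS), ("tape", TAPE_RETENTION_DAYS)]
print(expired(copies, 7))    # nothing has expired yet
print(expired(copies, 30))   # the disk copy is gone, the tape clone remains
```

The effect is the same as staging (data ends up on tape, disk space frees up),
but without a clone-then-delete step that can be interrupted halfway.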
HTH
/tony
-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On
Behalf Of tammclaughlin
Sent: Monday, May 06, 2013 5:33 PM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] Staging: Looks like the wrong save sets have been staged
I have an issue with a staging policy where the most recent save sets have been
staged rather than the oldest.
Let me explain this:
Some background:
Networker 7.6.09
backup to adv_file devices, then clone to tape; staging from adv_file to tape.
staging policy: start: 95%, stop: 91%, oldest first; max days: 7; recover: 5
days; check fs: every 120 minutes
We have a faulty tape drive so currently running on 1 drive until the
replacement arrives.
As the weekend backups were full backups, on Friday I forced staging by
changing the thresholds, to ensure I had as much free space as possible for the
weekend. This was to allow the backup jobs to be cloned to tape with minimal
contention if staging kicked in.
So today I saw that the backups had hung because NetWorker was waiting on a
stage tape that was 100% full.
When I investigated I found that nsrstage was not running and the filesystems
were within the threshold limits.
What was happening was that a job was trying to clone a save set that had just
been created, but the save set was now on a stage tape.
It could not load the stage tape because there was only 1 drive which had the
"destination" tape for the clone.
So why did the stage tape have most recent save sets?
I looked at the volumes on the filesystem, which seems to give some clues.
filesystem: /diskbackup1
Total: 16TB, 1.8TB free (currently)
volume   size on disk   monthly backup size
notes    13T            6.8T
unix     17G            300G
linux    79M            400G
The most recent save sets staged were from the volumes unix and linux and very
few from the volume notes. In fact almost all of the unix and linux save sets
have been staged.
Now some of the largest save sets from notes are 500GB so it's possible that
all of the linux save sets can be staged in the time it takes to stage just one
notes save set.
I expected the staging policy to compile a list of the oldest save sets across
all devices and then move those to tape, which would mean that the most recent
would still be on disk. So it seems that staging is selecting save sets in a
different manner.
Could it be that it treats each volume separately?
Could it be looking for the oldest save sets in each volume and starting to
stage them? While still writing the larger notes save sets, it goes back to
look at the other volumes and takes from them, as the notes volume is still
busy with such a large save set?
Another possibility is that the next filesystem check starts while it is still
staging, cannot read from the notes volume because it is still in use, takes
from the unix/linux volumes instead, and just keeps taking until it meets the
required threshold?
Thanks.
+----------------------------------------------------------------------
|This was sent by tam.mclaughlin AT gmail DOT com via Backup Central.
+----------------------------------------------------------------------