Re: [Networker] How does nsrclone order a list of SSIDs for cloning?

2009-12-28 20:00:19
Subject: Re: [Networker] How does nsrclone order a list of SSIDs for cloning?
From: Anacreo <anacreo AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 28 Dec 2009 18:56:08 -0600
Not really on topic... but maybe this workflow will help you with your own
design...

With our VTL I've switched all of our backups (except for database) to 1
monthly full (last weekend of the month) and daily incrementals all set to a
4 week retention.

Then we have a script (which runs at the beginning of the month) that sets
that one full backup to an extended period, 13 months.
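
As a rough sketch only (this is not the script referred to above), extending the retention could look something like the following. It assumes the save set IDs of the monthly fulls arrive one per line on stdin, for example from an mminfo query on level=full, and that retention is changed with nsrmm -S <ssid> -e <time>. The names NEW_RETENTION, DRYRUN and extend_retention are made up for illustration, and the "13 months" time string is an assumption about what nsrmm accepts on your release. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: extend the retention of save sets whose ssids arrive on stdin.
# NEW_RETENTION, DRYRUN and extend_retention are illustrative names.
NEW_RETENTION="${NEW_RETENTION:-13 months}"
DRYRUN="${DRYRUN:-1}"

extend_retention ()
{
  # One ssid per line on stdin; blank lines are skipped.
  while read ssid
  do
    [ -n "$ssid" ] || continue
    if [ "$DRYRUN" = "1" ]; then
      # Dry run: show the command instead of running it.
      echo "nsrmm -S $ssid -e \"$NEW_RETENTION\""
    else
      nsrmm -S "$ssid" -e "$NEW_RETENTION"
    fi
  done
}

# Possible use (assumed query, adjust the pool and time window to your site):
#   mminfo -q "pool=SourceEDLPool,level=full,savetime>10 days ago" -r ssid |
#     sed -e 1d | extend_retention
```

Leaving DRYRUN at 1 for the first few runs makes it easy to check which save sets would be touched before anything is changed.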

Then there is another script that runs monthly (although I may push this up
to weekly) that uses nsrstage to move any data older than 6 weeks that is
still on the VTL off to tape...

This allows me to regulate how full I keep the EDL by adjusting that 6 week
period to any arbitrary number.
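
To illustrate the tuning knob, here is a small sketch that builds the mminfo query from parameters so the cutoff can be changed without editing the script. The helper name build_query is hypothetical; the query fields are the same ones used in the script further down.

```shell
#!/bin/sh
# Sketch: build the staging query from parameters.
# build_query is a hypothetical helper, not part of NetWorker.
build_query ()
{
  # $1 = source pool, $2 = age cutoff in mminfo's time syntax, $3 = location
  printf '!ssrecycle,first=0,!incomplete,pool=%s,savetime<%s,location=%s\n' \
    "$1" "$2" "$3"
}

# Default to 6 weeks; override from the environment to free more or less space.
TIMEPERIOD="${TIMEPERIOD:-6 weeks ago}"
QUERY=`build_query "SourceEDLPool" "$TIMEPERIOD" "EDL"`
echo "$QUERY"
```

Running it with TIMEPERIOD="4 weeks ago" in the environment would select two more weeks of data for staging.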


Here is the pushing script which keeps 4 sessions going concurrently
(remember to leave tape drives free for recoveries...)  If you can use this
or have any improvements/feedback please let me know.

---- FreeEDL.sh -----
#!/usr/bin/sh

# This gets executed every day at 10:00am, after conclusion of backups
#

progname=`basename $0`
pidcount=`pgrep $progname| wc -l`
if [ $pidcount -gt 1 ]; then
  echo "$progname is already running." 1>&2;
  exit 1
fi

SOURCEPOOL="SourceEDLPool"
STAGEPOOL="TargetRetentionPool"
TIMEPERIOD="6 weeks ago"
MAX_CLONE_SESSIONS=4
stage_sessions=0
pidcount=0

QUERY="!ssrecycle,first=0,!incomplete,pool=${SOURCEPOOL},savetime<${TIMEPERIOD},location=EDL"

get_ssid ()
{
  # Emit "volume<TAB>ssid/cloneid" per save set; plain awk does not take -e.
  SSID=`mminfo  -r "volume,ssid,cloneid" -q "$QUERY" -o "mo" | sed -e 1d |
    awk '{ printf "%s\t%s/%s\n",$1,$2,$3 }'`
}

stage_ssids ()
{
 TMPFILE="/tmp/nsrstage.${volume}.log"
 # echo "Would stage ${ssidlist}"
 /usr/sbin/nsrstage -s uschi1leg01 -b "${STAGEPOOL}" -v -m -S ${ssidlist} 1> "${TMPFILE}" 2>&1 &
 LASTPID="$!"
 echo "\t${LASTPID} is cloning ${ssidlist}."
 RUNNINGPIDS="${RUNNINGPIDS} ${LASTPID}"
 TMPFILES="${TMPFILES} ${TMPFILE}"
 pidcount=`expr ${pidcount} + 1`
}

seen_ssid ()
{
  if [ -z "$full_ssid_list" ]; then
    return 1;
  fi

  for i in $full_ssid_list
  do
    # String compare: the values are ssid/cloneid pairs, not integers.
    if [ "$1" = "$i" ]; then
      return 0;
    fi
  done
  return 1;
}

# Determine if PIDs are still running.
check_pids ()
{
  newRUNNINGPIDS=""
  pidcount=0

  for i in $RUNNINGPIDS
  do
    if [ -d /proc/$i ]; then
      newRUNNINGPIDS="${newRUNNINGPIDS} $i"
      pidcount=`expr ${pidcount} + 1`
    fi
  done
  RUNNINGPIDS=$newRUNNINGPIDS
}

next_stage ()
{
  while [ $pidcount -ge $MAX_CLONE_SESSIONS ]
  do
    sleep 10
    check_pids
  done
  # nsrim -X
}

# Get SSIDs that need to be staged.
get_ssid

# Only run if SSIDs are found.
if [ "${SSID}" ]; then

  ssid_count=`echo "${SSID}" | wc -l`
  set - ${SSID}

  echo "Starting Stage"
  if [ -z "${volume}" ]; then
    volume=$1
  fi


  volume_count=0
  full_ssid_list=""

  while [ $# -ge 1 ]; do
    if seen_ssid $2; then
        shift; shift;
        continue
    else
        dummy=
        # echo "Processing $1 $2"
    fi
    if [ "${volume}" = "$1" ]; then
      volume=$1
      ssidlist="${ssidlist} $2"
    else
      volume_count=`expr $volume_count + 1`
      stage_sessions=`expr $stage_sessions + 1`
      next_stage
      stage_ssids
      volume=$1
      ssidlist="$2"
    fi
    volume=$1
    full_ssid_list="$full_ssid_list $2"
    shift; shift;
  done

  next_stage
  stage_ssids
  wait

fi

echo "Looking through ${TMPFILES}"
if [ -n "${TMPFILES}" ]; then
  for i in ${TMPFILES}
  do
    if [ -r "${i}" ]; then
      echo "From ${i}:"
      cat $i
      rm $i
    fi
  done
fi

# get_ssid_details

exit

------ end -----
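
The throttling idea in the script above, stripped down to a sketch that runs anywhere: keep at most MAX_JOBS background jobs alive and poll until a slot frees up. Here run_job is a stand-in for the nsrstage call, and kill -0 is used in place of the /proc check as an assumption for systems without /proc. The names reap, throttle, PEAK and LAUNCHED are illustrative.

```shell
#!/bin/sh
# Sketch: bounded number of concurrent background jobs.
MAX_JOBS=4
RUNNING=""
COUNT=0
PEAK=0
LAUNCHED=0

# Stand-in for the real work (e.g. one nsrstage per volume).
run_job ()
{
  sleep 1
}

# Rebuild RUNNING so it holds only the PIDs that still exist.
reap ()
{
  still=""
  COUNT=0
  for p in $RUNNING
  do
    if kill -0 "$p" 2>/dev/null; then
      still="$still $p"
      COUNT=`expr $COUNT + 1`
    fi
  done
  RUNNING="$still"
}

# Block until fewer than MAX_JOBS jobs are running.
throttle ()
{
  reap
  while [ "$COUNT" -ge "$MAX_JOBS" ]
  do
    sleep 1
    reap
  done
}

for n in 1 2 3 4 5 6
do
  throttle
  run_job "$n" &
  RUNNING="$RUNNING $!"
  COUNT=`expr $COUNT + 1`
  LAUNCHED=`expr $LAUNCHED + 1`
  if [ "$COUNT" -gt "$PEAK" ]; then
    PEAK=$COUNT
  fi
done
wait
echo "launched $LAUNCHED jobs, peak concurrency $PEAK"
```

Polling errs on the safe side: a job that has exited but has not yet been reaped by the shell still counts as running, so the limit is never exceeded.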




Below is the NDMP version, which is slightly more harrowing because there is no
nsrstage command for NDMP, so we must clone, ensure that the clone worked,
and then delete the original...
This will work with more than 1 session in parallel; however, we've had a
data mover crash following some patches, so I throttled it down to 1 since we
don't require 2 copies going at once...
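
The clone, verify, delete sequence can be sketched on its own like this. It is a simplified, one-at-a-time illustration, not the script below: clone_cmd and delete_cmd are hypothetical stand-ins for nsrndmp_clone and nsrmm -d, and the save set IDs are invented. It relies on "wait <pid>" returning the exit status of that background job, which needs a POSIX shell such as bash or ksh.

```shell
#!/bin/sh
# Sketch of "clone, verify, then delete".
# clone_cmd and delete_cmd are stand-ins for nsrndmp_clone and nsrmm -d.
clone_cmd ()
{
  # Pretend that save set 222 fails to clone.
  [ "$1" != "222" ]
}

delete_cmd ()
{
  # Record what would have been deleted instead of deleting anything.
  DELETED="$DELETED $1"
}

DELETED=""
KEPT=""

for ssid in 111 222 333
do
  clone_cmd "$ssid" &
  pid=$!
  # wait <pid> hands back the exit status of that background job.
  if wait "$pid"; then
    delete_cmd "$ssid"
  else
    # Clone failed: keep the original and remember it for follow-up.
    KEPT="$KEPT $ssid"
  fi
done

echo "deleted:$DELETED"
echo "kept:$KEPT"
```

The important property is that the delete is reachable only through a zero exit status from the clone; anything else leaves the original in place.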

----- FreeEDLNDMP.sh ----

#!/usr/bin/bash
###
# I don't normally script BASH... but
#   the standard sh won't retrieve the exit
#   status of background processes
###

progname=`basename $0`
pidcount=`pgrep $progname| wc -l`
if [ $pidcount -gt 2 ]; then
  echo "$progname is already running." 1>&2;
  exit 1
fi


SOURCEPOOL="Source NDMP VTL Pool"
CLONEPOOL="Destination NDMP Tape Pool"
TIMEPERIOD="4 weeks ago"
MAX_CLONE_SESSIONS=1
clone_sessions=0
pidcount=0

QUERY="!ssrecycle,first=0,!incomplete,pool=${SOURCEPOOL},savetime<${TIMEPERIOD},location=SL500"

get_ssid ()
{
  SSID=`mminfo  -r "volume,ssid,cloneid" -q "$QUERY" -o "mo" | sed -e '1d'`
}

clone_ssids ()
{
 TMPFILE="/tmp/nsrstage.${volume}.log"
 # Quote the pool name: it contains spaces.
 /usr/sbin/nsrndmp_clone -s uschi1leg01 -b "${CLONEPOOL}" -v -S ${ssidlist} 1> "${TMPFILE}" 2>&1 &
 LASTPID="$!"
 pidtracker="${pidtracker}${LASTPID} $ssidlist"$'\n'
 echo $'\t'"${LASTPID} is cloning ${ssidlist}."
 RUNNINGPIDS="${RUNNINGPIDS} ${LASTPID}"
 TMPFILES="${TMPFILES} ${TMPFILE}"
 pidcount=`expr ${pidcount} + 1`
}

seen_ssid ()
{
  if [ -z "$full_ssid_list" ]; then
    return 1;
  fi

  for i in $full_ssid_list
  do
    if [ "$1" = "$i" ]; then
      return 0;
    fi
  done
  return 1;
}

# Determine if PIDs are still running.
check_pids ()
{
  newRUNNINGPIDS=""
  pidcount=0

  for i in $RUNNINGPIDS
  do
    if [ -d /proc/$i ]; then
      newRUNNINGPIDS="${newRUNNINGPIDS} $i"
      pidcount=`expr ${pidcount} + 1`
    else
      wait $i
      lastexit=$?
      deletessids=`echo "${pidtracker}" | grep "^${i} " | cut -d' ' -f2-`
      if [ $lastexit -eq 0 ]; then
        echo $'\t'"$i was successful, deleting $deletessids."
        for d in $deletessids
        do
          nsrmm -yd -S $d
        done
        echo "success: nsrmm -yd -S $deletessids" >> /tmp/deletetracker
      else
        echo $'\t'"$i was not successful ($lastexit). NOT deleting $deletessids."
        echo "failed: no nsrmm -yd -S $deletessids" >> /tmp/deletetracker
      fi
    fi
  done
  RUNNINGPIDS=$newRUNNINGPIDS
}

next_clone ()
{
  while [ $pidcount -ge $MAX_CLONE_SESSIONS ]
  do
    sleep 10
    check_pids
  done
  # nsrim -X
}

# Get SSIDs that need to be cloned.
get_ssid

# Only run if SSIDs are found.
if [ "${SSID}" ]; then

  ssid_count=`echo "${SSID}" | wc -l`
  echo "Found "$ssid_count" SSIDs..."
  set - ${SSID}
  # strike the header row
  # shift; shift; shift

  echo "Starting Clone"
  if [ -z "${volume}" ]; then
    volume=$1
  fi


  volume_count=0
  full_ssid_list=""

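  while [ $# -ge 1 ]; do
    CLONEID=$3

    # Compare on ssid/cloneid, the same form stored in full_ssid_list below.
    if seen_ssid "$2/${CLONEID}"; then
        echo "Skipping $1 $2/${CLONEID}"
        shift; shift; shift;
        continue
    else
        dummy=
        echo "Processing $1 $2/${CLONEID}"
    fi
    if [ "${volume}" = "$1" ]; then
      volume=$1
      ssidlist="${ssidlist} $2/${CLONEID}"
    else
      volume_count=`expr $volume_count + 1`
      clone_sessions=`expr $clone_sessions + 1`
      next_clone
      clone_ssids
      volume=$1
      ssidlist="$2/${CLONEID}"
    fi
    volume=$1
    # Was ${CLONE_ID}, an unset variable, so duplicates were never detected.
    full_ssid_list="$full_ssid_list $2/${CLONEID}"
    shift; shift; shift;
  done

  # Clone the final batch (the last volume's save sets), as FreeEDL.sh does.
  next_clone
  clone_ssids

  # Poll until every clone has exited so that check_pids can verify each
  # exit status and delete the originals; a bare "wait" would skip that.
  while [ $pidcount -gt 0 ]
  do
    sleep 10
    check_pids
  done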

fi

exit

echo "Looking through ${TMPFILES}"
if [ -n "${TMPFILES}" ]; then
  for i in ${TMPFILES}
  do
    if [ -r "${i}" ]; then
      echo "From ${i}:"
      cat $i
      rm $i
    fi
  done
fi

# get_ssid_details

exit


On Mon, Dec 28, 2009 at 12:47 PM, bbartick <
networker-forum AT backupcentral DOT com> wrote:

> brerrabbit wrote:
> >
> > bbartick wrote:
> > > We recently rolled out a VTL and I'm in the process of scripting the
> > > cloning process. From the testing I've done, it seems as though nsrclone
> > > does not clone the SSIDs in the order you pass to the command.
> > > I'd love to clone the oldest savesets first so that I can roll them off
> > > the VTL first.
> > > I was thinking I could do something similar to:
> > >
> > > mminfo -ot -q "pool=VTL1,copies=1,!incomplete,savetime<21 days ago" -r ssid --> output to a file $FILE
> > >
> > >
> > > Then I will clone the ssids
> > >
> > > nohup nsrclone -b "$POOL" -y "$RETENTION" -J "$NODE" -S -f $FILE &
> > >
> > > I was surprised to see my savesets aren't being cloned in the order of
> > > the savesets in the file.
> > > Can anyone tell me how nsrclone determines which savesets will be
> > > cloned first?
> > >
> > > Thank you,
> > >
> > > Brett
> >
> >
> > I don't believe that this info is publicly available; we've looked into
> > this previously and it wasn't.  In fact, we had a NetWorker resident from
> > EMC on site and he couldn't get the info either.  Maybe things have changed.
> > I've got a mod to our clone script I'd like to implement, and having that
> > info would simplify things greatly.
> >
> > --brerrabit
>
>
> My guess is that the internal logic has something to do with minimizing the
> number of tape changes. It looks like if you are cloning a particular
> saveset, the system will try to find other savesets from the mounted tape
> that might need to be cloned. With a physical tape, this makes perfect
> sense, but with a VTL and quick tape changes, maybe there should be an
> option to nsrclone to obey the provided list of saveset IDs.
> BTW: my pseudo code above had a typo. copies should be "2", not "1".
>
> +----------------------------------------------------------------------
> |This was sent by bbartick AT us.nomura DOT com via Backup Central.
> |Forward SPAM to abuse AT backupcentral DOT com.
> +----------------------------------------------------------------------
>
> To sign off this list, send email to listserv AT listserv.temple DOT edu and
> type "signoff networker" in the body of the email. Please write to
> networker-request AT listserv.temple DOT edu if you have any problems with
> this list. You can access the archives at
> http://listserv.temple.edu/archives/networker.html or
> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>
