Subject: [Veritas-bu] Restore priority
From: larry.kingery AT veritas DOT com (Larry Kingery)
Date: Fri, 12 Oct 2001 22:03:04 -0400 (EDT)
Yes, but if a drive goes down for some reason, then all N-1 drives
will end up being used for backup.
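
To make that arithmetic concrete, here is a toy model in Python. It is
not NetBackup code: the function name and its parameters are made up
for illustration. It assumes backups may occupy at most the storage
unit's configured drive count, and that a restore can take any drive
that is up and idle.

```python
def drives_free_for_restore(physical_drives, stu_drive_count, drives_down=0):
    """Toy model: count the UP drives left idle when backups saturate the stu."""
    drives_up = physical_drives - drives_down
    # Backups use at most the configured count, and never more than what is up.
    used_by_backups = min(stu_drive_count, drives_up)
    return drives_up - used_by_backups


print(drives_free_for_restore(10, 9))                 # all drives up -> 1
print(drives_free_for_restore(10, 9, drives_down=1))  # one drive down -> 0
```

With 10 drives and a stu configured for 9, one drive stays free for
restores only as long as all 10 are up.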

As for the scheduling issue, jobs are allocated to storage units in
alphabetical order, filling up the first and then continuing onto the
second.  A simple way to balance the load in the example would be to
make two storage units per media server with 5 drives each, and name
them such that the first is on one media server, the second is on the
other, etc.  Obviously the fewer drives you use per storage unit, the
more "load balanced" you will be.  BUT, remember that the more stu's
you have, the more work the scheduler is going to have to do at times,
so keep it reasonable.
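
A rough sketch of the naming trick, in Python. This is a simplified
model of the alphabetical fill described above, not the real scheduler;
the stu names, server names and the allocate() helper are hypothetical.

```python
def allocate(jobs, storage_units):
    """Fill storage units in alphabetical order of name.

    storage_units maps stu name -> (media server, drive count).
    Returns the number of jobs that land on each media server.
    """
    per_server = {}
    remaining = jobs
    for name in sorted(storage_units):
        server, drives = storage_units[name]
        take = min(drives, remaining)  # fill this stu before moving on
        per_server[server] = per_server.get(server, 0) + take
        remaining -= take
    return per_server


# One 10-drive stu per media server: the first name soaks up every job.
one_stu_each = {"stu_a": ("server1", 10), "stu_b": ("server2", 10)}

# Two 5-drive stus per media server, named so the servers alternate.
interleaved = {
    "stu_1": ("server1", 5), "stu_2": ("server2", 5),
    "stu_3": ("server1", 5), "stu_4": ("server2", 5),
}

print(allocate(10, one_stu_each))  # {'server1': 10, 'server2': 0}
print(allocate(10, interleaved))   # {'server1': 5, 'server2': 5}
```

The model ignores everything else the scheduler weighs (media
availability, multiplexing, policy limits), so treat it only as a
picture of why the naming order matters.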

Then again, if your media server and infrastructure can't handle 10
drives at once at capacity, I don't see why you'd create a stu with 10
drives (why run 10 drives each at 80% instead of 8 drives at 100%?).
And if it can, why would one care about load balancing (why is running
one server at N% capacity less preferable than two servers at N/2%)?

L

White, Steve writes:
> But what Larry was saying isn't that you should leave a drive DOWN, but
> that when you create the Storage Unit(s) which represent your robot and
> specify how many drives they should have, make sure the total number of
> drives is one less than you actually have.  The drive is up, but since
> it's not configured in a storage unit, it will not be used for backups.
> Restores don't use storage units so they can use the drive if there are
> no others available.
> 
> Steve White 
> 
> -----Original Message----- 
> From: Anthony Soprano [mailto:Anthony.Soprano AT home DOT com]
> Sent: Friday, October 12, 2001 4:18 PM 
> To: Larry Kingery; Jason Ahrens 
> Cc: Veritas BU 
> Subject: RE: [Veritas-bu] Restore priority 
> 
> 
> This works well, but keep in mind that if a drive is DOWN it will wind
> up being the one NOT used for backups, so the restore drive will be
> available but DOWN.
> 
> On a similar note, I wish that in an SSO setup with 2 media servers and
> 10 drives, the workload got distributed evenly by the scheduler.
> Currently it will try to hand off the jobs in such a way that the first
> media server may grab all the drives for use and media server 2 will be
> left out in the cold with no available drives.  To compensate you lower
> the drive count in the st_unit config, BUT this greatly limits how SSO
> provides resource sharing.  So if I set up each st_unit to use 5 drives
> for backups, each media server will get a fair shake.  However, if a
> media server goes down or is otherwise unable to do backups, the hard
> setting of 5 per server will prevent SSO from recovering from the media
> server loss.  This can be changed manually thru the st_unit config in
> the case of a failure, but...
> 
> If there were some kind of supply (drives) vs. demand (jobs+media
> servers) mechanism in NBU you could say:
>
> If demand for drives is 20 (as above, 2 hosts x 10 drives) then
> distribute the drives in manner X.
> If demand for drives is 10 (as above, one host down) then distribute
> the drives in manner Y.
>
> Or if the jobs just got handed out to the media servers in a
> round-robin manner it might get past this hoarding of resources by the
> first media server.
> 
> 
> A.S. 
> 
> -----Original Message----- 
> From: veritas-bu-admin AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of
> Larry Kingery
> Sent: Thursday, October 11, 2001 10:39 AM 
> To: Jason Ahrens 
> Cc: Veritas BU 
> Subject: Re: [Veritas-bu] Restore priority 
> 
> 
> Configure your storage unit to use fewer drives than the number of
> physical drives.  Since restores do not use storage units,
> they will be able to use the unused drives.
> 
> > 
> > Yesterday night, we had a system go down. The system was rebuilt and a
> > restore was started to bring back the data.
> >
> > The restore was started just minutes after the backup window opened.
> > This means that the queue was full of backup jobs. It also appeared
> > that the restore did not take priority over the backups, and it would
> > have taken hours to wait for the queue to clear. This was not
> > acceptable and we ended up killing all the backup jobs so the restore
> > could happen, and requeued all the backups.
> >
> > I'm thinking there has to be a better way.
> >
> > How can I instruct NetBackup to consider restores at a higher priority
> > than backups, so that when the first free drive and required tape
> > become free, the restore will occur? I wouldn't ask that backups in
> > progress be halted/suspended for the restore, just that the restore go
> > to the top of the queue.
> > 
> > Thanks 
> > 
> > Jason 
> > 
> > -- 
> > Jason Ahrens 
> > Systems Administrator/Backup Specialist 
> > PSINet Limited 
> > http://www.psi.ca
> > The Internet SuperCarrier 
> > 
> > _______________________________________________ 
> > Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu 
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> > 
> 
-- 
Larry Kingery 
              All wiyht. Rho sritched mg kegtops awound?
