[Veritas-bu] Question posed to ACSLS/STK8500 users.

2006-12-08 15:16:50
Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
From: jpiszcz at lucidpixels.com (Justin Piszcz)
Date: Fri, 8 Dec 2006 15:16:50 -0500 (EST)
Nope, only 1 NIC.  Even so, I do specify that in vm.conf, just in
case.

Justin.

On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:

> Hmmm, do your media servers have multiple NICs, and are you using IP
> multipathing software (like in.mpathd under Solaris)?  If so, then make
> sure that you have set ACS_SSI_HOSTNAME appropriately in your vm.conf
> file.  The acs daemon inserts the value (or inferred value) of
> ACS_SSI_HOSTNAME into all communications with the ACS server.  Also, if
> you are using ACLs on the ACS server, make sure that they match the
> name/IP used in ACS_SSI_HOSTNAME.
> 
>   Cheers
>   Mike
> 
> 
> On 1:43:52 pm 2006-12-08 Justin Piszcz <jpiszcz at lucidpixels.com> wrote:
> > It is 100% correct.  Yep.  I ran about 5 test backups to each drive
> > in the robot.  No problems.  It is only when there is a burst of jobs.
> >
> > Justin.
> >
> > On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
> >
> > >  Justin,
> > >
> > >  Are you absolutely certain that you have your drive mapping done
> > >  properly? The fact that the job fails 30 minutes after the initial
> > >  mount attempt makes it sound like you are failing with a media
> > >  mount time out.  The most common cause (especially with ACS
> > >  environments) is a simple mismatch between the /dev/rmt path and
> > >  your ACS path (i.e. ACS,LSM,PANEL,DRIVE).  The SL8500 is also very
> > >  difficult to address properly, since the ACS path has little
> > >  correlation with the physical location of the drive.
> > >  Probably the quickest test you can perform is to verify that your
> > >  jobs are being affected by the media mount timeout.  If you
> > >  shorten the media mount timeout parameter to, say, 10 minutes, your
> > >  jobs should fail 10 minutes after they start if the mount timeout
> > >  is what fails the jobs.
> > >  You should also track down which drives are failing to mount, and
> > >  see if there is a correlation.
> > >
> > >    Cheers
> > >    Mike
> > >
> > >
> > > >
> > > >  Message: 7
> > > >  Date: Fri, 8 Dec 2006 11:08:39 -0500 (EST)
> > > >  From: Justin Piszcz <jpiszcz at lucidpixels.com>
> > > >  Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
> > > >  To: veritas-bu at mailman.eng.auburn.edu
> > > >  Message-ID: <Pine.LNX.4.64.0612081102150.15271 at p34.internal.lan>
> > > >  Content-Type: TEXT/PLAIN; charset=US-ASCII
> > > >
> > > >  All,
> > > >
> > > >  My group is setting up two Sun/StorageTek SL8500s.  Sun did the
> > > >  install of ACSLS, there were no problems on their side.  Each
> > > >  SL8500 is in its own environment.  On each SL8500, we have 8
> > > >  media servers, connected to four drives each, giving us a total
> > > >  of 32 drives.  For testing, I did the following.  Ran a
> > > >  NON-MULTIPLEXED backup to each drive, to ensure each drive
> > > >  worked properly.  To do this I kicked off four jobs in
> > > >  succession. When I do this, I utilize all 4 drives.  I did this
> > > >  with each media server without a single problem.  However, when
> > > >  testing everything together, all 32 drives, I kick off 45 jobs
> > > >  for example.  It says there are 32 active jobs in netbackup,
> > > >  which is correct.  The problem is, randomly, 2 or 3 jobs will
> > > >  hang at "Mounting MediaID.." and then the drive will go down
> > > > >  after 30 minutes.  Why is this?  With an L700, I can send 500-1000
> > > > >  jobs to all of the drives in it and there is never a mounting
> > > >  problem.  There is nothing wrong with any of the drives, they
> > > >  are brand new.  I can use ACSLS and dismount the media from the
> > > >  drives and then re-run my earlier test backups, one at a time to
> > > >  each of the four drives per-media server without any issues.  It
> > > >  is only when the robot receives a 'burst' of jobs that this
> > > > >  happens.
> > > >  Has anyone experienced anything like this before?
> > > >
> > > >  Thanks for any help and responses,
> > > >
> > > >  Justin.
> > > >
> > > >
> > >
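
Mike's two suggestions in the thread (check whether the failures line up with the media mount timeout, and track which drives fail to look for a correlation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the job records, host names, and ACS drive addresses below are made up, and nothing here reads real NetBackup or ACSLS output. You would have to fill in the records yourself from your own job logs.

```python
from collections import Counter

# Hypothetical job records, invented for illustration:
# (media_server, "ACS,LSM,PANEL,DRIVE" address, minutes until failure or None if OK)
JOBS = [
    ("media1", "0,0,1,0", None),
    ("media1", "0,0,1,1", 30),
    ("media2", "0,1,1,0", None),
    ("media2", "0,1,1,1", 30),
    ("media1", "0,0,1,1", 31),
]


def failures_by_drive(jobs):
    """Count failed jobs per ACS drive address."""
    return Counter(drive for _, drive, failed_after in jobs
                   if failed_after is not None)


def looks_like_mount_timeout(jobs, timeout_minutes, tolerance=2):
    """Return True if every failed job died within `tolerance` minutes
    of the configured media mount timeout."""
    failed = [m for _, _, m in jobs if m is not None]
    return bool(failed) and all(abs(m - timeout_minutes) <= tolerance
                                for m in failed)


if __name__ == "__main__":
    # Drives that fail most often are listed first.
    for drive, count in failures_by_drive(JOBS).most_common():
        print(drive, count)
    print(looks_like_mount_timeout(JOBS, timeout_minutes=30))
```

If the same few drive addresses dominate the failure count, a mapping mismatch between the device path and the ACS address is the more likely suspect. If the failures are spread evenly but all land near the timeout value, the burst of mount requests itself deserves a closer look.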