Are you using UDP for communication with your acs server? (UDP is the default.)
If so, try switching to TCP.
Cheers
Mike
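
For anyone following along, here is a minimal sketch of the ACS_SSI_HOSTNAME sanity check discussed further down the thread. It assumes vm.conf holds simple "NAME = value" lines; the function names, the sample contents, the hostname media1.example.com, and the handling of "#" lines are all hypothetical and only for illustration, not anything NetBackup itself provides.

```python
def read_vm_conf_entries(text):
    """Collect 'NAME = value' lines from vm.conf-style text into a dict of lists."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines, '#' lines and lines without '=' are not entries.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        entries.setdefault(name.strip(), []).append(value.strip())
    return entries


def check_ssi_hostname(text, expected):
    """Return 'missing', 'duplicate', 'mismatch' or 'ok' for ACS_SSI_HOSTNAME."""
    values = read_vm_conf_entries(text).get("ACS_SSI_HOSTNAME", [])
    if not values:
        return "missing"
    if len(values) > 1:
        return "duplicate"
    return "ok" if values[0].lower() == expected.lower() else "mismatch"


# Hypothetical vm.conf contents, inlined so the sketch runs on its own.
sample = """\
# illustration only
ACS_SSI_HOSTNAME = media1.example.com
"""

print(check_ssi_hostname(sample, "media1.example.com"))
```

In practice you would read the real vm.conf on each media server and pass in the name or IP that the ACS server's access control expects, so a mismatch shows up before a burst of jobs does.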
On 2:16:50 pm 2006-12-08 Justin Piszcz <jpiszcz at lucidpixels.com> wrote:
> Nope, only 1 NIC. And even so, yeah, I do specify that in the vm.conf
> just in case.
>
> Justin.
>
> On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
>
> > Hmmm, do your media servers have multiple NICs, and are you using
> > IP multipathing software? (like in.mpathd under Solaris) If so,
> > then make sure that you have set the ACS_SSI_HOSTNAME
> > appropriately in your vm.conf file. The acs daemon inserts the
> > value (or inferred value) of ACS_SSI_HOSTNAME into all
> > communications with the acs server. Also, make sure that if you
> > are using acls on the acs server, that they match the name/IP used
> > in ACS_SSI_HOSTNAME.
> > Cheers
> > Mike
> >
> >
> > On 1:43:52 pm 2006-12-08 Justin Piszcz <jpiszcz at lucidpixels.com> wrote:
> > > It is 100% correct. Yep. I ran about 5 test backups to
> > > each drive in the robot. No problems. It is only when there is
> > > a burst of jobs.
> > > Justin.
> > >
> > > On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
> > >
> > > > Justin,
> > > >
> > > > Are you absolutely certain that you have your drive mapping
> > > > done properly? The fact that the job fails 30 minutes after
> > > > the initial mount attempt makes it sound like you are failing
> > > > with a media mount time out. The most common cause
> > > > (especially with ACS environments) is a simple mismatch
> > > > between the /dev/rmt path and your ACS path (i.e.
> > > > ACS,LSM,PANEL,DRIVE). The SL8500 is also very difficult to
> > > > address properly, since the ACS path has little correlation
> > > > with the physical location of the drive. Probably the
> > > > quickest test you can perform is to verify that your jobs are
> > > > being affected by the media mount timeout. If you shorten
> > > > the media mount timeout parameter, to say 10 minutes, your
> > > > jobs should fail 10 minutes after they start if the mount
> > > > timeout is what fails the jobs. You should also track down
> > > > which drives are failing to mount, and see if there is a
> > > > correlation.
> > > > Cheers
> > > > Mike
> > > >
> > > >
> > > > >
> > > > > Message: 7
> > > > > Date: Fri, 8 Dec 2006 11:08:39 -0500 (EST)
> > > > > From: Justin Piszcz <jpiszcz at lucidpixels.com>
> > > > > Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
> > > > > To: veritas-bu at mailman.eng.auburn.edu
> > > > > Message-ID: <Pine.LNX.4.64.0612081102150.15271 at p34.internal.lan>
> > > > > Content-Type: TEXT/PLAIN; charset=US-ASCII
> > > > >
> > > > > All,
> > > > >
> > > > > My group is setting up two Sun/StorageTek SL8500s. Sun did
> > > > > the install of ACSLS, there were no problems on their side.
> > > > > Each SL8500 is in its own environment. On each SL8500, we
> > > > > have 8 media servers, connected to four drives each, giving
> > > > > us a total of 32 drives. For testing, I did the following.
> > > > > Ran a NON-MULTIPLEXED backup to each drive, to ensure each
> > > > > drive worked properly. To do this I kicked off four jobs in
> > > > > succession. When I do this, I utilize all 4 drives. I did
> > > > > this with each media server without a single problem.
> > > > > However, when testing everything together, all 32 drives, I
> > > > > kick off 45 jobs for example. It says there are 32 active
> > > > > jobs in netbackup, which is correct. The problem is,
> > > > > randomly, 2 or 3 jobs will hang at "Mounting MediaID.." and
> > > > > then the drive will go down after 30 minutes. Why is this?
> > > > > With an L700, I can send 500-1000 jobs to all of the drives
> > > > > in it and there is never a mounting problem. There is
> > > > > nothing wrong with any of the drives, they are brand new.
> > > > > I can use ACSLS and dismount the media from the drives and
> > > > > then re-run my earlier test backups, one at a time to each
> > > > > of the four drives per-media server without any issues. It
> > > > > is only when the robot receives a 'burst' of jobs that this
> > > > > happens. Has anyone experienced anything like this before?
> > > > >
> > > > > Thanks for any help and responses,
> > > > >
> > > > > Justin.
> > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > Veritas-bu maillist - Veritas-bu at mailman.eng.auburn.edu
> > > > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu