And do you see any errors in acsss_event.log on the ACSLS server when the
drives get downed?
Hampus Lind
Rikspolisstyrelsen
National Police Board
Tel dir: +46 (0)8 - 401 99 43
Tel mob: +46 (0)70 - 217 92 66
E-mail: hampus.lind at rps.police.se
-----Original Message-----
From: veritas-bu-bounces at mailman.eng.auburn.edu
[mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Justin Piszcz
Sent: 8 December 2006 21:05
To: Hall, Christian N.
Cc: Mike Dunn (veritas-bu); veritas-bu at mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Question posed to ACSLS/STK8500 users.
Yes, everything matches perfectly. Remember, if I run the backups slowly,
one at a time, I can see each of the 4 drives being used on each media
server. When I run a burst of jobs, though, 29-30 of them work (1 tape per
drive) and a RANDOM 2-3 drives do not work (which ones differs each time I
do it).
Currently I am not using MPX so I can easily test, i.e. 1 job = 1 tape drive.
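
One way to check whether the failing drives really are random is to note
which drives fail in each burst test and count how often each one shows up.
The sketch below is only an illustration: the media server and drive names
are hypothetical placeholders typed in by hand, not output from NetBackup or
ACSLS, and the helper names are made up for this example.

```python
from collections import Counter

# Hypothetical hand-recorded results: one list per burst test, holding the
# (media_server, drive) pairs that hung at mount time in that test.
runs = [
    [("media1", "drive2"), ("media5", "drive0")],
    [("media3", "drive1"), ("media5", "drive0"), ("media7", "drive3")],
    [("media2", "drive3"), ("media6", "drive1")],
]


def tally_failures(runs):
    """Count in how many burst tests each drive failed."""
    counts = Counter()
    for failed in runs:
        counts.update(set(failed))  # count a drive at most once per test
    return counts


def repeat_offenders(runs, min_runs=2):
    """Return the drives that failed in at least min_runs tests, sorted."""
    counts = tally_failures(runs)
    return sorted(drive for drive, n in counts.items() if n >= min_runs)


if __name__ == "__main__":
    for drive, n in tally_failures(runs).most_common():
        print(drive, n)
    print("failed more than once:", repeat_offenders(runs))
```

If the same drive keeps appearing, that points at a per-drive problem such as
a mapping mismatch; if the failures stay evenly spread, the burst itself is
the more likely trigger.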
Justin.
On Fri, 8 Dec 2006, Hall, Christian N. wrote:
> Justin,
>
> Do the ACS,LSM,PANEL,DRIVE numbers in ACSLS match the serial number
> results from tpautoconf -t on the master server for /dev/rmt/*cbn?
> Can you please display the output? Did you perform this test from your
> master server, or from each host that is a media server? After you
> attempt your multiplexing, do you have stuck tapes?
>
> Chris
>
> -----Original Message-----
> From: veritas-bu-bounces at mailman.eng.auburn.edu
> [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Justin
> Piszcz
> Sent: Friday, December 08, 2006 2:44 PM
> To: Mike Dunn (veritas-bu)
> Cc: veritas-bu at mailman.eng.auburn.edu
> Subject: Re: [Veritas-bu] Question posed to ACSLS/STK8500 users.
>
> It is 100% correct. Yep. I ran about 5 test backups to each drive in
> the robot. No problems. It is only when there is a burst of jobs.
>
> Justin.
>
> On Fri, 8 Dec 2006, Mike Dunn (veritas-bu) wrote:
>
> > Justin,
> >
> > Are you absolutely certain that you have your drive mapping done
> > properly? The fact that the job fails 30 minutes after the initial
> > mount attempt makes it sound like you are failing with a media mount
> > timeout. The most common cause (especially with ACS environments) is a
> > simple mismatch between the /dev/rmt path and your ACS path (i.e.
> > ACS,LSM,PANEL,DRIVE). The SL8500 is also very difficult to address
> > properly, since the ACS path has little correlation with the physical
> > location of the drive.
> >
> > Probably the quickest test you can perform is to verify that your jobs
> > are being affected by the media mount timeout. If you shorten the
> > media mount timeout parameter to, say, 10 minutes, your jobs should
> > fail 10 minutes after they start if the mount timeout is what fails
> > the jobs.
> >
> > You should also track down which drives are failing to mount, and see
> > if there is a correlation.
> >
> > Cheers
> > Mike
> >
> >
> > >
> > > Message: 7
> > > Date: Fri, 8 Dec 2006 11:08:39 -0500 (EST)
> > > From: Justin Piszcz <jpiszcz at lucidpixels.com>
> > > Subject: [Veritas-bu] Question posed to ACSLS/STK8500 users.
> > > To: veritas-bu at mailman.eng.auburn.edu
> > > Message-ID: <Pine.LNX.4.64.0612081102150.15271 at p34.internal.lan>
> > > Content-Type: TEXT/PLAIN; charset=US-ASCII
> > >
> > > All,
> > >
> > > My group is setting up two Sun/StorageTek SL8500s. Sun did the
> > > install of ACSLS; there were no problems on their side. Each SL8500
> > > is in its own environment. On each SL8500, we have 8 media servers,
> > > connected to four drives each, giving us a total of 32 drives. For
> > > testing, I did the following. Ran a NON-MULTIPLEXED backup to each
> > > drive, to ensure each drive worked properly. To do this I kicked
> > > off four jobs in succession. When I do this, I utilize all 4 drives.
> > > I did this with each media server without a single problem.
> > > However, when testing everything together, all 32 drives, I kick off
> > > 45 jobs for example. It says there are 32 active jobs in NetBackup,
> > > which is correct. The problem is, randomly, 2 or 3 jobs will hang
> > > at "Mounting MediaID.." and then the drive will go down after 30
> > > minutes. Why is this? With an L700, I can send 500-1000 jobs to
> > > all of the drives in it and there is never a mounting problem.
> > > There is nothing wrong with any of the drives, they are brand new.
> > > I can use ACSLS and dismount the media from the drives and then
> > > re-run my earlier test backups, one at a time to each of the four
> > > drives per media server without any issues. It is only when the
> > > robot receives a 'burst' of jobs that this happens.
> > >
> > > Has anyone experienced anything like this before?
> > >
> > > Thanks for any help and responses,
> > >
> > > Justin.
> > >
> > >
> >
> > _______________________________________________
> > Veritas-bu maillist - Veritas-bu at mailman.eng.auburn.edu
> > http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> >