Veritas-bu

Re: [Veritas-bu] How to replace a broken tape drive in NetBackup 6.0MP4?

2007-09-13 07:19:07
Subject: Re: [Veritas-bu] How to replace a broken tape drive in NetBackup 6.0MP4?
From: Dominik Pietrzykowski <dominik_pietrzykowski AT toll.com DOT au>
To: Justin Piszcz <jpiszcz AT lucidpixels DOT com>, veritas-bu AT mailman.eng.auburn DOT edu
Date: Thu, 13 Sep 2007 21:03:31 +1000
Justin,

Try this. It doesn't always work, and sometimes I still have to delete the
media server.



tpautoconf -report_disc


Document ID: 271280
http://support.veritas.com/docs/271280

How to replace devices in a shared storage option configuration on NetBackup
media servers
Exact Error Message
Device Configuration must be run on all servers to complete drive
replacement

Details:
This is related to VERITAS NetBackup Enterprise Server (tm) (version 5.x).
Step 5 in the Media Manager System Administrator's Guide for Windows (Page
335 - Appendix A) is a little confusing and this document hopes to clarify
the steps involved.

If you replace an existing device in your shared drive configuration with a
new device, the serial number of the device will likely change. This change
can lead to a wrong configuration in Media Manager of the global device
database and also the local device databases on each server.

The tpautoconf options used in this procedure are available only with
NetBackup release 5.0 or later. Media servers that are also robot control
hosts must be running NetBackup release 5.0 or later to use this procedure.

1. Configure the new device on all servers sharing the device. The device
must be available through the operating system of each server. This device
configuration may require remapping, rediscovery, and possibly a reboot of
the operating system (refer to the NetBackup Media Manager Device
Configuration Guide for more information).

2. On a server running NetBackup 5.0 or later, run tpautoconf -report_disc
on one of the reconfigured servers to produce a list of new and missing
hardware. This command will scan for new hardware, and produce a report
showing the new and replaced hardware.

3. Ensure that all servers that are sharing the new hardware are up and are
running NetBackup services.

4. On a server running NetBackup 5.0 or later, run tpautoconf with the
-replace_drive drive_name or -replace_robot robot_number option. Also
specify the -path hardware_path option. hardware_path is the new hardware
path, as shown in the output of the report created in step 2. The serial
number is read from the new hardware device and the Media Manager global
database is updated. Also, any servers (running NetBackup 5.0 or later)
sharing the new device will replace the serial number in their local
databases.

5. If the new device is a drive, run the device configuration wizard on all
servers that are sharing the drive. If the new device is a robot, run the
wizard on the server that is the robot control host and on all servers that
are running any version of NetBackup 4.5.
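
For a UNIX media server, steps 2 and 4 can be sketched as a dry run that only
prints the two commands. This is a minimal sketch, not a tested procedure:
print_replace_steps is a hypothetical helper name, and the directory
/usr/openv/volmgr/bin (the one used for tpconfig in the original message
below) is assumed to hold tpautoconf as well. The hardware path must be
copied from the -report_disc report.

```shell
#!/bin/sh
# print_replace_steps: hypothetical dry-run helper; it only prints commands.
#   $1 = drive name as configured in NetBackup
#   $2 = new hardware path, copied from the tpautoconf -report_disc report
# VOLMGR_BIN is an assumed install location; override it if yours differs.
VOLMGR_BIN=${VOLMGR_BIN:-/usr/openv/volmgr/bin}

print_replace_steps() {
    if [ $# -ne 2 ]; then
        echo "usage: print_replace_steps drive_name hardware_path" >&2
        return 1
    fi
    # Step 2: report new and missing hardware.
    printf '%s\n' "$VOLMGR_BIN/tpautoconf -report_disc"
    # Step 4: point the existing drive entry at the new hardware.
    printf '%s\n' "$VOLMGR_BIN/tpautoconf -replace_drive $1 -path $2"
}
```

Calling print_replace_steps XXXXXXXXXX_02 /dev/st/nh1c0t0l0 prints both
commands for review before anything is run; the path shown here is only
illustrative.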


Example:
C:\>tpautoconf -replace_drive IBMULT3580-TD22 -path Tape2

Found a matching device in global DB, IBMULT3580-TD22 on host nbusso1
update of local DB on host nbusso1 completed
globalDB update for host nbusso1 completed

Found a matching device in global DB, IBMULT3580-TD22 on host nbusso2
update of local DB on host nbusso2 completed
globalDB update for host nbusso2 completed

Found a matching device in global DB, IBMULT3580-TD22 on host nbusso3
update of local DB on host nbusso3 completed
globalDB update for host nbusso3 completed

Found a matching device in global DB, IBMULT3580-TD22 on host nbusso4
update of local DB on host nbusso4 completed
globalDB update for host nbusso4 completed

Found a matching device in global DB, IBMULT3580-TD22 on host nbusso5
update of local DB on host nbusso5 completed
globalDB update for host nbusso5 completed

Device Configuration must be run on all servers to complete drive replacement


NOTE:
Step 5 is at times a little confusing when reviewing the output of
tpautoconf -replace_drive and reading the description provided in Step 4. In
case the device being replaced is a drive, please stop and restart the
device manager service on all the hosts sharing the drive.

The last line in the output of Step 4 states "Device Configuration must be
run on all servers to complete drive replacement." Restarting the Device
Manager service on all hosts sharing the affected drive should be sufficient
for all non-clustered NetBackup media servers sharing drives using shared
storage option.
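
When many hosts share the drive, the -replace_drive output itself can be used
to list the hosts whose Device Manager service still needs a restart. A
minimal sketch, assuming the output keeps the "Found a matching device in
global DB, <drive> on host <host>" form shown in the example;
hosts_to_restart is a hypothetical helper name, not a NetBackup command.

```shell
#!/bin/sh
# hosts_to_restart: hypothetical helper, not part of NetBackup.
# Reads tpautoconf -replace_drive output on stdin and prints every host
# named in a "Found a matching device in global DB" line, once per host.
hosts_to_restart() {
    sed -n 's/^Found a matching device in global DB, .* on host \([^[:space:]]*\)[[:space:]]*$/\1/p' |
        sort -u
}

# Usage (on a server where tpautoconf is available):
#   tpautoconf -replace_drive IBMULT3580-TD22 -path Tape2 | hosts_to_restart
```

The result is one host name per line, which can be walked through by hand
when restarting the service on each host.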


-----Original Message-----
From: Justin Piszcz [mailto:jpiszcz AT lucidpixels DOT com] 
Sent: Thursday, 13 September 2007 8:45 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] How to replace a broken tape drive in NetBackup
6.0MP4?

This is how it plays out:

1. Drive 02 is down: from tpconfig -d:

$ sudo /usr/openv/volmgr/bin/tpconfig -d 
Id  DriveName           Type   Residence
       Drive Path                                                       Status
****************************************************************************
1   XXXXXXXXXX_02        hcart2 TLD(0)  DRIVE=2
       /dev/st/nh1c0t0l0                                                DOWN

2. Our vendor comes out and replaces the failed drive.

3. We run bp.kill_all and then /etc/init.d/netbackup start:

media-server$ sudo /usr/openv/volmgr/bin/tpconfig -d
Id  DriveName           Type   Residence
       Drive Path                                                       Status
****************************************************************************
0   XXXXXXXXXX_01        hcart2 TLD(0)  DRIVE=1
       /dev/st/nh0c0t0l0                                                UP
1   XXXXXXXXXX_02        hcart2 TLD(0)  DRIVE=2
       MISSING_DRIVE:HUM5AB0C31                                         DOWN

Currently defined robotics are:
   TLD(0)     robotic path = /dev/sg/h2c0t0l0,

EMM Server = master-server

4. The new drive shows up in /dev/st:

$ ls -l /dev/st/
lrwxrwxrwx    1 root     root            9 Sep 13 03:05 nh0c0t0l0 -> /dev/nst0
lrwxrwxrwx    1 root     root            9 Sep 13 03:05 nh1c0t0l0 -> /dev/nst1

5. The question is how do you fix this in NetBackup?

6. The way we currently do it is by deleting the media server from the
    master server with vmoprcmd -delete -devhost <media-server> and then
    re-adding it.
    I am reading through the docs now, but basically if you just:
    a. stop ltid
    b. update the drive via tpconfig
    then this does not seem to fix the problem.

7. What does Veritas support say?
    1. stopltid
    2. startltid
    3. but this does not fix the problem either.

8. How do I know this?
    1. fastest way -> try to clean the drive and I get:
    a. error(82) no media/drive available
    2. other way -> no backups or restores will ever use this tape drive
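
A quicker check than a cleaning attempt is to scan the tpconfig -d listing
for MISSING_DRIVE entries. A minimal sketch, assuming the two-line-per-drive
layout shown in step 3; missing_drives is a hypothetical helper name, and the
string after "MISSING_DRIVE:" is presumed (not confirmed here) to be the old
drive's serial number.

```shell
#!/bin/sh
# missing_drives: hypothetical helper, not part of NetBackup.
# Reads tpconfig -d output on stdin. For each drive whose path line starts
# with MISSING_DRIVE:, prints the drive name and the string after the colon.
missing_drives() {
    awk '
        # Drive line: Id DriveName Type Residence ...
        $1 ~ /^[0-9]+$/ && NF >= 4 { name = $2 }
        # Path line of a drive whose device is no longer found
        $1 ~ /^MISSING_DRIVE:/ {
            sub(/^MISSING_DRIVE:/, "", $1)
            print name, $1
        }
    '
}

# Usage:
#   sudo /usr/openv/volmgr/bin/tpconfig -d | missing_drives
```

An empty result means no drive is flagged as missing in the listing.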

--

This robot is an L700. I will mention that when drives are swapped out of
other types of hardware, such as ones that use ACSLS, I do not see this issue.

This case involves an L700 directly attached using fiber (SCSI) robotic
control.

--

Symantec has some docs but they seem to be pretty outdated:

http://seer.support.veritas.com/docs/259835.htm (4.5)

Has anyone here implemented procedures that are less invasive and
actually work?

Using 6.0MP4 here.

Justin.
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu