Problems with EMC Celerra: filer-to-server backup

balkark

I am using TSM 5.5.2.0 and thought that the 3-way backup gave me the most flexibility, in that I can back up to any native storage in TSM.

So I created a disk pool, a tape pool to migrate to, and a tape copy pool.
I created the domain and registered the node and datamover without errors.
I was able to do backups and thought this was easy, until one of the backups did not finish. I cancelled it, and now I can't back up again.

TSM Error code keeps coming up.

I opened a call with IBM, who say the error is being generated by the NAS. I called EMC and they said 3-way backups are not supported.
So I am dead in the water until I get this sorted out.

Is anyone doing 3-way backups with an EMC Celerra who could provide some guidance?

12/14/09 9:49:10 PM EST ANR1063I Full backup of NAS node EDZVSWICSN01, file system /root_vdm_6/REORGPROD, started as process 20 by administrator ADMIN. (SESSION: 263, PROCESS: 20)
12/14/09 9:49:40 PM EST ANR9999D_2547733210 (ndlog.c:120) Thread<42>: NDMPLOG: E Medium error (SESSION: 263, PROCESS: 20)
12/14/09 9:49:41 PM EST ANR9999D_1212503635 (ndlog.c:120) Thread<42>: NDMPLOG: E Backup is aborted. (SESSION: 263, PROCESS: 20)
12/14/09 9:49:41 PM EST ANR9999D_3456149487 (ssremote.c:3515) Thread<42>: Unknown error 4 from spiStoreSeg. (SESSION: 263, PROCESS: 20)
12/14/09 9:49:41 PM EST ANR1078E NAS Backup to TSM Storage process 20 terminated - internal server error detected. (SESSION: 263, PROCESS: 20)
12/14/09 9:49:41 PM EST ANR0985I Process 20 for BACKUP NAS (FULL) running in the BACKGROUND completed with completion state FAILURE at 21:49:41. (SESSION: 263, PROCESS: 20)
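For anyone who ends up with an activity log pasted as one long run-together line like this, here is a minimal Python sketch that splits it back into entries and pulls out the ANR message IDs. It assumes every entry starts with a timestamp in the `MM/DD/YY H:MM:SS AM/PM` form seen in this post. The function names `split_entries`, `message_id`, and `failures` are made up for illustration and are not part of TSM.

```python
import re

# Assumption: each entry begins with a timestamp like "12/14/09 9:49:10 PM".
TIMESTAMP = re.compile(r"\d{1,2}/\d{1,2}/\d{2} \d{1,2}:\d{2}:\d{2} [AP]M")
# Message IDs as they appear in this thread: "ANR", four digits, one type letter.
MESSAGE_ID = re.compile(r"ANR\d{4}[A-Z]")


def split_entries(raw):
    """Split a run-together activity-log string into one string per entry."""
    starts = [m.start() for m in TIMESTAMP.finditer(raw)]
    if not starts:
        stripped = raw.strip()
        return [stripped] if stripped else []
    bounds = starts + [len(raw)]
    return [raw[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def message_id(entry):
    """Return the first ANR message ID in an entry, or None if there is none."""
    match = MESSAGE_ID.search(entry)
    return match.group(0) if match else None


def failures(entries):
    """Keep entries whose ID ends in E (error) or D (the ANR9999D diagnostic)."""
    kept = []
    for entry in entries:
        mid = message_id(entry)
        if mid is not None and mid[-1] in "ED":
            kept.append(entry)
    return kept


if __name__ == "__main__":
    # Two entries copied from the post above, still fused together.
    sample = (
        "12/14/09 9:49:41 PM ESTANR9999D_1212503635 (ndlog.c:120) Thread<42>: "
        "NDMPLOG: E Backup is aborted.(SESSION: 263, PROCESS: 20)"
        "12/14/09 9:49:41 PM ESTANR1078E NAS Backup to TSM Storage process 20 "
        "terminated - internal server error detected. (SESSION: 263, PROCESS: 20)"
    )
    for entry in failures(split_entries(sample)):
        print(message_id(entry), "|", entry)
```

Filtering on the trailing letter makes the NDMPLOG lines (here "Medium error" and "Backup is aborted") stand out from the informational start and completion messages.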
 
I used 3-way NDMP for 2 years, until 2 months ago when I switched back to native NDMP. What errors do you see on the datamover itself?

server_log server_x
 
Attaching the log from the backup run.
What I was focusing on was the medium error portion.

Write fails in local or remote wirte msg, moverAddressType=1, mp=0x24ede60
2009-12-14 22:04:46: NDMP: 3: Thread bkup115 Medium error
2009-12-14 22:04:46: NDMP: 3: < LOG type: 2, msg_id: 0, entry: Medium error, hasAssociatedMsg: 0, associatedMsgSeq: 0 >
2009-12-14 22:04:46: NDMP: 4: Thread bkup115 Write failed on archive volume 1
2009-12-14 22:04:46: NDMP: 4: < LOG type: 3, msg_id: 0, entry: Write failed on archive volume 1, hasAssociatedMsg: 0, associatedMsgSeq: 0 >
2009-12-14 22:04:46: NDMP: 4: Thread nasa00 Backup root directory: /root_vdm_7/content
 

Attachment: naslog.txt

It seems like the Celerra does not like your media type. Are you specifying the correct management class when you issue the "backup node" command?
 
By the way, 3-way NDMP is supported by EMC. They are trying to blow you off.
 

Attachment: EMC_Celerra.pdf

I tried switching the copygroup destination from disk, to physical tape, to virtual tape.
The funny thing is that I had this working and then decided to cancel a running job. I think something broke in the process.
Apparently you can crash your server when cancelling this kind of job on 5.5.2:
http://www-01.ibm.com/support/docview.wss?uid=swg1IC59290
 
I would call EMC back; they can help you run a trace and see what's going on.
 
EMC is trying to sell me Avamar and their NAS accelerator, so they are not playing nice.
 
Escalate to your EMC account manager; that's BS. Your config is in their support matrix and they need to help you.
 
Based on my discovery that cancelling a backup can screw up your TSM server, I figured I would install a TSM 5.5.4.0 server and see what's up.
I kicked off the backup to a disk pool and it worked like a champ.

Going back to TSM support now to find out why they have been pulling my balls.
My apologies to my EMC support folks for the bashing recently.
 
TSM support suggested upgrading to TSM 5.5.4.0, since it worked on the other server, but unfortunately that did not solve my problem. I guess it needs to be escalated to the development team.
 
balkark: I know this was a while ago, but we had this issue before and fixed it. There can be multiple causes:

1. Make sure your snapshot timeout is longer than the default 5-15 minutes. Make it 30 or 60 minutes.
2. Make sure you clear the snapshots before you start any new backup.
3. You can only run 4 concurrent NAS backups per NAS head.
4. Make sure your NDMP user account did not get reset.
5. If you are using tape, you may need to run the probe, verify, and list to make sure the chain/target/LUN (CTL) is still showing up. With LTO2 drives it sometimes gets lost; this does not happen with LTO4 drives.
6. Make sure your NDMP head did not fail over to another one, since the CTL only shows on the configured one.
 
Thanks for the suggestions, eperez.
I think I found the problem, even though it was not easy to find. My TSM Windows server is multihomed.

The NAS is set up to use VLAN tagging to get multiple VLANs over single connections.

I changed the TCP binding order in Windows:
Network Connections, Advanced Settings, Adapters and Bindings tab.
I just picked the adapter that is on the same IP network as the NAS.

The true problem appears to be network related.
I also have an AIX server that I will be setting up. Not sure if I will run into the same problem there.
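
For anyone checking a multihomed backup server for the same kind of issue, here is a minimal Python sketch that asks the operating system which local IP address it would use as the source for traffic toward the NAS. It relies on the standard trick of calling `connect()` on a UDP socket, which sends no packets. The function name `local_address_toward` and the address `192.0.2.10` are placeholders, not values from this thread. This only shows the OS routing choice; whether the TSM server advertises that same address to the filer for the 3-way data connection is an assumption you would still need to verify.

```python
import socket


def local_address_toward(remote_ip, port=10000):
    """Return the local IP the OS would pick as source for traffic to remote_ip.

    The port does not matter for the route lookup; 10000 is used only because
    it is the usual NDMP port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket transmits nothing. It only makes the OS
        # select a route and bind a matching local source address.
        sock.connect((remote_ip, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Placeholder address (documentation range); replace with your NAS IP.
    nas_ip = "192.0.2.10"
    print("Source address toward", nas_ip, "is", local_address_toward(nas_ip))
```

If the address printed is not the one on the NAS-facing adapter, that points toward the same kind of binding or routing problem described above. The same check runs unchanged on AIX or any other platform with Python.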
 