Re: [ADSM-L] VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"

2015-03-11 15:24:57
Subject: Re: [ADSM-L] VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
From: Andrew Raibeck <storman AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 11 Mar 2015 15:18:44 -0400

Hi Steve,

I have no firm diagnosis or solution, but this could be something that
happens during snapshot cleanup, when the volumes are unmounted, if you are
using the HOTADD transport. One possibility (but by no means the only one) is:

kb.vmware.com/kb/2010953

One way to try to isolate this is to manually create a snapshot, and leave
it in place for the same amount of time the backup normally runs for that
VM. Do you see the frozen VM issue occur? Next, remove the snapshot you
just created. Does the VM seem to freeze while the snapshot is being
removed?
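
To make the "does it freeze?" check less subjective, you could poll the VM
from another machine while the manual snapshot exists and while it is being
removed. The following is only a minimal sketch: the TCP probe against the
RDP port (3389), the 5-second interval, and the host name "target-vm" are
assumptions for illustration, not anything TSM or VMware provides.

```python
import socket
import time
from typing import Callable, List, Tuple


def tcp_probe(host: str, port: int = 3389, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor(probe: Callable[[], bool], duration_s: float, interval_s: float = 5.0,
            clock: Callable[[], float] = time.time,
            sleep: Callable[[float], None] = time.sleep) -> List[Tuple[float, bool]]:
    """Call probe() every interval_s seconds for duration_s seconds."""
    samples = []
    deadline = clock() + duration_s
    while clock() < deadline:
        samples.append((clock(), probe()))
        sleep(interval_s)
    return samples


def outage_windows(samples: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    start is the first failed sample; end is the first successful sample after
    it, or the last sample seen if the VM never came back.
    """
    windows = []
    start = None
    last_t = None
    for t, ok in samples:
        if not ok and start is None:
            start = t
        elif ok and start is not None:
            windows.append((start, t))
            start = None
        last_t = t
    if start is not None:
        windows.append((start, last_t))
    return windows


if __name__ == "__main__":
    # "target-vm" is a hypothetical host name; poll for one hour.
    results = monitor(lambda: tcp_probe("target-vm"), duration_s=3600)
    for begin, end in outage_windows(results):
        print(time.ctime(begin), "->", time.ctime(end))
```

Comparing the outage windows against the times you created and removed the
snapshot should show whether the freeze lines up with snapshot removal.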

The main factors are how long the backup takes and how much I/O the target
VM does during that time. The longer the backup runs under high I/O, the
larger the snapshot redo log grows, and the longer it takes to consolidate
the disks when the snapshot is removed.
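
As a rough illustration of that relationship, the redo log cannot grow by
more than the VM's write rate multiplied by the time the snapshot is open.
The sketch below computes that upper bound; the function name and the
figures (20 MiB/s for 90 minutes) are made up for the example and are not
from this thread. Real growth is usually smaller, because rewriting a block
that is already in the redo log does not add to it.

```python
def redo_log_upper_bound_gib(write_mib_per_s: float, backup_minutes: float) -> float:
    """Upper bound on redo log growth: sustained write rate x snapshot lifetime."""
    return write_mib_per_s * backup_minutes * 60 / 1024


if __name__ == "__main__":
    # Illustrative numbers only: 20 MiB/s of guest writes during a 90-minute backup.
    print(f"{redo_log_upper_bound_gib(20, 90):.1f} GiB")  # about 105.5 GiB
```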

I'm not sure what those error log messages are about, but I cannot tie them
to the problem you are describing in this thread.
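
If you want to check whether those messages cluster around the times the VMs
were unresponsive, a small script can tally them per VM. This is a sketch
under assumptions: it expects entries shaped like the excerpts quoted below
(a MM/DD/YYYY HH:MM:SS timestamp followed by the message ID, with
continuation lines), and the function names are my own. The byte offsets
rely on the 512-byte sector size that VDDK uses.

```python
import re
from collections import Counter

ENTRY_START = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (ANS\d{4}[EIW])\b")
VM_NAME = re.compile(r"virtual\s+machine '([^']+)'")
RETRY = re.compile(r"startSector=(\d+),\s*numSectorsToRead=(\d+)")
SECTOR_BYTES = 512  # VDDK sector size


def split_entries(text):
    """Group lines into entries; a new entry begins at each timestamped line."""
    entries = []
    for raw in text.splitlines():
        # Drop email quote markers and indentation, then rejoin wrapped lines.
        line = raw.lstrip("> ").rstrip()
        if not line:
            continue
        if ENTRY_START.match(line):
            entries.append(line)
        elif entries:
            entries[-1] += " " + line
    return entries


def summarize(text):
    """Return (ANS9365E count per VM, list of (timestamp, byte offset, byte length))."""
    errors_per_vm = Counter()
    retries = []
    for entry in split_entries(text):
        m = ENTRY_START.match(entry)
        if m.group(2) == "ANS9365E":
            vm = VM_NAME.search(entry)
            if vm:
                errors_per_vm[vm.group(1)] += 1
        elif m.group(2) == "ANS0361I":
            r = RETRY.search(entry)
            if r:
                start, count = int(r.group(1)), int(r.group(2))
                retries.append((m.group(1), start * SECTOR_BYTES, count * SECTOR_BYTES))
    return errors_per_vm, retries


SAMPLE = """\
03/05/2015 07:58:19 ANS9365E VMware vStorage API error for virtual
machine 'XXXXXXXX'.
   TSM function name : VixDiskLib_Read
   API error message : NBD_ERR_DISKLIB
03/05/2015 07:58:19 ANS0361I DIAG: VmProcessExtent(): Retrying
failed read: vddksdkRead() rc=4398, startSector=31955968,
numSectorsToRead=512
03/05/2015 09:14:51 ANS9365E VMware vStorage API error for virtual
machine 'ZZZZZZZZ'.
"""

if __name__ == "__main__":
    per_vm, retried = summarize(SAMPLE)
    print(dict(per_vm))
    print(retried)
```

Each retried read in the excerpts covers 512 sectors, that is, 256 KiB.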

Best regards,

- Andy

____________________________________________________________________________

Andrew Raibeck | Tivoli Storage Manager Level 3 Technical Lead |
storman AT us.ibm DOT com

IBM Tivoli Storage Manager links:
Product support:
http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Storage_Manager

Online documentation:
http://www.ibm.com/support/knowledgecenter/SSGSG7/welcome
Product Wiki:
https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 2015-03-09 10:08:31:

> From: "Schaub, Steve" <Steve_Schaub AT BCBST DOT COM>
> To: ADSM-L AT VM.MARIST DOT EDU
> Date: 2015-03-09 10:12
> Subject: Re: VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>
> Andy,
>
> These are the only odd messages I saw from these backups.  Just to
> clarify, the backups didn't freeze, but the servers became
> unresponsive to the end user while the backup was running.  As soon
> as the backup completed, the servers became responsive again.
> Looking at the Windows app event logs, there is a gap showing no
> activity for the duration of the backup.  There were some entries in
> the system event log during the backup.  It was like someone hit the
> "pause" button, which I could understand if it looked like either
> VSS or the VMware snapshot manager hung, but as far as I can tell they
> were successful.
>
> Thanks,
> -steve
>
> 03/05/2015 07:58:19 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
>    TSM function name : VixDiskLib_Read
>    TSM file          : vmvddksdk.cpp (2766)
>    API return code   : 1
>    API error message : NBD_ERR_DISKLIB
> 03/05/2015 07:58:19 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=31955968, numSectorsToRead=512
> 03/05/2015 09:14:51 ANS9365E VMware vStorage API error for virtual machine 'ZZZZZZZZ'.
>    TSM function name : VixDiskLib_Read
>    TSM file          : vmvddksdk.cpp (2766)
>    API return code   : 1
>    API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:14:51 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=24463360, numSectorsToRead=512
> 03/05/2015 09:16:02 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
>    TSM function name : VixDiskLib_Read
>    TSM file          : vmvddksdk.cpp (2766)
>    API return code   : 1
>    API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:16:02 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=76280832, numSectorsToRead=512
> 03/05/2015 09:19:05 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
>    TSM function name : VixDiskLib_Read
>    TSM file          : vmvddksdk.cpp (2766)
>    API return code   : 1
>    API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:19:05 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=82475520, numSectorsToRead=512
> 03/05/2015 10:06:56 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
>    TSM function name : VixDiskLib_Read
>    TSM file          : vmvddksdk.cpp (2766)
>    API return code   : 1
>    API error message : NBD_ERR_DISKLIB
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
> Behalf Of Andrew Raibeck
> Sent: Monday, March 09, 2015 9:24 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: [ADSM-L] VE 7.1.1.1 backup "freezing" a VM & question
> about "megablocks"
>
> Hi Steve,
>
> Off-hand I am not sure what this is, but a couple of things:
>
> 1. Are there any anomalous messages in the error log during the
> timeframe of the backup that exhibits the problem?
>
> 2. Consider capturing a dump (*) of the TSM client backup process,
> e.g., dsmcsvc.exe, while the backup is in the "frozen" state. Then open
> a PMR with TSM support and send in the dump, the dsmerror.log, and
> the dsmsched.log.
>
> (*) In case you are not familiar with capturing a dump: start task
> manager, find the TSM client process that appears frozen, right-
> click on the process, and select "Create Dump File".
>
> Best regards,
>
> - Andy
>
> "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 2015-03-06 11:28:19:
>
> > From: "Schaub, Steve" <Steve_Schaub AT BCBST DOT COM>
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Date: 2015-03-06 11:29
> > Subject: VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> >
> > First, many thanks to Wanda & others who have been so helpful in
> > answering my previous VE questions!
> >
> > We had a situation yesterday where 2 VE backups were causing the VMs
> > to go unresponsive.  No response to ping, unable to RDP, etc.
> > As soon as the backup finished (or was killed in one case), the
> > servers picked back up where they left off.  They never rebooted, but
> > you can actually see in the Windows event logs a gap where no activity
> > happens.  Has anyone seen this behavior before?  VE is at 7.1.1.1, the
> > hosts are ESXi 5.0 U2, vCenter is 5.5, Windows is 2008 R2.
> >
> > Secondly, while reading the docs, I ran across the idea of performing
> > periodic full backups in VE due to fragmentation of "megablocks"?  Is
> > this needed?  If so, how do you manage it (how frequently, do you try
> > to scatter the fulls across every day, how do these interact with
> > daily incrementals, etc)?  If it matters, all our backups land on a
> > VTL.
> >
> > Thanks,
> >
> > Steve Schaub
> > Systems Engineer II, Backup/Recovery
> > Blue Cross Blue Shield of Tennessee
> > 423-535-6574 (desk)
> > 423-785-7347 (cell)
> >
> > -----------------------------------------------------
> > Please see the following link for the BlueCross BlueShield of
> > Tennessee E-mail disclaimer:
> > http://www.bcbst.com/email_disclaimer.shtm