Re: [ADSM-L] TSM rant

The bug is IT02929; for us it affects exports as well as restores (so I'm 
guessing it would affect backupsets as well).
And yes, it hits any copypool, LTO or not.

OTOH, it's not necessarily hard to live with, depending on your situation.  
More details of that below.

What you get with 7.1.1 that may make IT02929 worth living with for you:

- much more function in the OC
- ability to make the active log larger than 128G (important for people doing 
dedup)
- migration from DISK pools occurs at filespace level instead of node level 
(fixes a pain point for people using VE, or other PROXY relationships)

My experience with IT02929:
It can hit you if
1) the data you are trying to restore or export is on both a primary and 
copypool volume
2) the copypool volume's access is reado (read-only) or readw (read/write)

So if your copypool is tape, and you vault your copypool daily (so your 
copypool volumes are generally marked OFFSITE), you are unlikely to ever see 
the problem.
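
As a rough illustration of the two conditions above, this Python sketch flags 
copy pool volumes whose access state would leave a restore or export exposed. 
The Volume record, the exposed_volumes helper, and the volume and pool names 
are hypothetical; the access values are assumed to be the ones accepted by 
UPDATE VOLUME ACCESS=, and the rule is only the one described in this message, 
not the APAR text.

```python
from dataclasses import dataclass

# Access states that leave a copy pool volume mountable, per the message above.
MOUNTABLE = {"readwrite", "readonly"}


@dataclass
class Volume:
    name: str     # hypothetical volume label
    stgpool: str  # storage pool the volume belongs to
    access: str   # access value, spelled as for UPDATE VOLUME ACCESS=


def exposed_volumes(volumes, copypool):
    """Return the volumes in `copypool` that the server could still mount."""
    return [
        v for v in volumes
        if v.stgpool == copypool and v.access.lower() in MOUNTABLE
    ]


vols = [
    Volume("A00001", "COPYPOOL", "offsite"),    # vaulted, not a concern
    Volume("A00002", "COPYPOOL", "readwrite"),  # onsite and mountable
    Volume("A00003", "TAPEPOOL", "readwrite"),  # primary pool, ignored
]
print([v.name for v in exposed_volumes(vols, "COPYPOOL")])
```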

We found it's easy to circumvent. Before we do an export, or if we see the 
problem occurring during a restore, we do:

   update vol * wherestgpool=copypoolname whereaccess=readw access=unavailable

When we are done with exports, we reverse it with:

   update vol * wherestgpool=copypoolname whereaccess=unavailable access=readw

So for us it is quite manageable, as we don't do that many restores.  Your 
circumstances may differ. 
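
For anyone who wants to script the two-step workaround above, here is a 
minimal Python sketch. It only builds the two update vol commands quoted in 
this message and makes sure the second one is issued even if the work in 
between fails. The names copypool_blocked and run are hypothetical; run is 
assumed to be whatever callable you already use to submit a command to the 
server, and is not shown here.

```python
from contextlib import contextmanager


def block_cmd(copypool):
    # Same command as in the message: hide readw copy pool volumes.
    return f"update vol * wherestgpool={copypool} whereaccess=readw access=unavailable"


def restore_cmd(copypool):
    # Same command as in the message: put the volumes back to readw.
    return f"update vol * wherestgpool={copypool} whereaccess=unavailable access=readw"


@contextmanager
def copypool_blocked(copypool, run):
    """Mark copy pool volumes unavailable for the duration of the block."""
    run(block_cmd(copypool))
    try:
        yield
    finally:
        # Always reverse the change, even if the export or restore failed.
        run(restore_cmd(copypool))


# Demonstration: collect the commands instead of sending them to a server.
issued = []
with copypool_blocked("copypoolname", issued.append):
    issued.append("(exports or restores run here)")  # placeholder only
print("\n".join(issued))
```

Note that the reverse command, as written in the message, sets every 
unavailable volume in the pool back to readw, including any that were 
unavailable for some other reason before you started.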


Wanda Prather
TSM Consultant
ICF International Enterprise and Cybersecurity Systems Division



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Thomas Denier
Sent: Friday, January 30, 2015 10:02 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM rant

The 7.1.1.100 server code has a rather serious bug affecting restores. If copy 
storage pool volumes are available the TSM server will mount both primary pool 
volumes and copy pool volumes when performing a restore. This is expected to be 
fixed in 7.1.1.200. The last time I checked, the target date for 7.1.1.200 was 
second quarter of 2015. That pretty much rules out a 6.2 to 7.1 upgrade, unless 
you are prepared to live with the bug described above. We are currently at 
6.2.5.0, and expect to upgrade to 6.3.5.0 or 6.3.5.100.

Thomas Denier,
Thomas Jefferson University

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Rhodes, Richard L.
Sent: Friday, January 30, 2015 8:39 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM rant

We just completed our conversion from v5 to v6.2.5 at the end of December.  In 
general we're very happy with v6.2.5, but then we don't use any of the newer 
features like dedup - and don't plan to.  (We have DataDomain for our dedup 
load.)

V6 has really solved our v5 pain points:  very long expirations (over 24 hours) 
and very slow db processing.  We run a "morning report" every morning against 
our TSM servers.  It generates a lot of info about an instance that we keep for 
documentation/reporting.  Some of our "morning reports" ran over an hour due to 
heavy SQL commands.  Now on V6 the morning report runs in a few minutes!  V6 
has definitely raised the scalability of TSM.

Yes, I have a long list of complaints about TSM, but in general we are happy 
with V6.

Now that we are completely on V6.2.5, we have to upgrade quickly because 
support ends in April.  Our debate is whether to go to v6.3.x or jump to 
7.1.x.  I'd be interested in any recommendations for this.

Thanks

Rick


PS:  Just in case you are curious, here are the reports we generate in our 
"morning report":

  print "= r000 report index and beginning time stamp "
  print "= r010 activity summary "
  print "= r011 admin schedule activity "
  print "= r015 scratch count "
  print "= r019 scratch tape usage "
  print "= r020 tape vol summary for all TSM instances"
  print "= r021 reclaimable tapes by pct-reclaim "
  print "= r022 volume info "
  print "= r023 volumes per stgpool status and maxscratch  "
  print "= r024 volume average utilization by stgpool "
  print "= r025 q dr "
  print "= r030 q path (not emailed)"
  print "= r036 drive activity "
  print "= r040 q db"
  print "= r045 q log"
  print "= r050 log consumption and utilization"
  print "= r055 log pin info (not emailed)"
  print "= r065 q sess"
  print "= r070 q stgpool"
  print "= r075 q copygroup (not emailed)"
  print "= r076 q events for exceptions - missed backups (not emailed)"
  print "= r077 slow backups"
  print "= r080 db backups"
  print "= r085 expiration - completions "
  print "= r090 expiration - detail (not emailed)"
  print "= r095 drive and media errors"
  print "= r097 nodes with tcp_ip or tcp_name changes"
  print "= r100 recplan dir listings"
  print "= r105 q volhost type=dbb"
  print "= r110 q volhost type=dbs (not emailed)"
  print "= r120 stgpool volumes: 7 day trend (not emailed)"
  print "= r125 aix errpt"
  ###print "= r129 tdp notes - summary "
  ###print "= r130 tdp notes - full (not emailed)"
  ###print "= r131 tdp notes - incremental (not emailed)"
  ###print "= r132 tdp notes - logs (not emailed)"
  print "= r140 session per node where count > 1 (not emailed) "
  print "= r141 q option (not emailed)"
  print "= r145 occupancy by server "
  print "= r150 occupancy by domain "
  print "= r152 occupancy by stgpool "
  print "= r153 occupancy by collocation group "
  print "= r155 occupancy by node (not emailed)"
  print "= r157 q auditocc (not emailed)"
  print "= r159 nodes locked (not emailed)"
  print "= r160 nodes with no associations"
  print "= r161 nodes with no associations EXCLUSION LIST"
  print "= r165 nodes never backed up (not emailed)"
  print "= r166 zzrt nodes with associations (not emailed)"
  print "= r167 nodes by collocation group (not emailed)"
  print "= r170 filespaces not backed up in 7 days (not emailed)"
  print "= r175 filespaces never backed up (not emailed)"
  print "= r180 server critical errors"
  print "= r184 backup objects and bytes per domain (not emailed)"
  print "= r185 backup objects and bytes per node (not emailed)"
  print "= r190 q vol  for tape (not emailed)"
  print "= r195 q libvol (not emailed)"
  print "= r999 report end timestamp"
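
If you want to reuse an index like the one above (for example to decide which 
sections go into the email), a small parser is enough. This is only a sketch 
in Python, not how our script does it: parse_index and the dictionary keys are 
made-up names, and it assumes every entry has the form shown above, with 
"(not emailed)" as the only marker and leading # characters meaning the 
report is commented out.

```python
import re

# Matches lines such as:   print "= r030 q path (not emailed)"
# Leading '#' characters mark an entry that is commented out.
LINE = re.compile(r'^\s*(#*)\s*print "= (r\d+) (.*?)\s*"\s*$')


def parse_index(lines):
    """Turn report index lines into a list of dictionaries."""
    entries = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip anything that is not an index line
        disabled, report_id, title = m.groups()
        emailed = "(not emailed)" not in title
        title = title.replace("(not emailed)", "").strip()
        entries.append({
            "id": report_id,
            "title": title,
            "emailed": emailed,
            "enabled": not disabled,
        })
    return entries


sample = [
    '  print "= r040 q db"',
    '  print "= r030 q path (not emailed)"',
    '  ###print "= r129 tdp notes - summary "',
]
for entry in parse_index(sample):
    print(entry)
```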






-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Remco Post
Sent: Thursday, January 29, 2015 6:07 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: TSM rant

> On 29 Jan 2015, at 22:39, Skylar Thompson <skylar2 AT U.WASHINGTON DOT 
> EDU> wrote:
>
> TSM between v6.1 and the end of v6.2 was really rough, mostly related 
> to DB2. By v6.3 it got a lot more stable. I'm glad we upgraded from v5 
> early, though, since we really benefit from DB2's improved indexing 
> and table compression - between two TSM instances we have close to 2 
> billion file versions tracked by TSM. That would have overwhelmed any
> v5 server, but we get by with relatively modestly-sized hardware.

v5 was stable as a rock, but was going nowhere. All of the new features since 
6.1 are possible thanks to DB2. And I vividly recall the days of TSM v4 where 
the day would start with yet another patch of the server and hoping that that 
version would run for more than 24 hours. And yes… I’m still waiting for 
somebody to explain why you need 64 GB of RAM for a medium-sized server without 
dedup or replication.

>
> On Thu, Jan 29, 2015 at 09:29:53PM +0000, David Ehresman wrote:
>> I've been a TSM admin since the V3 days if memory serves, and sometimes it 
>> doesn't so much anymore.  I've had far fewer problems with TSM since the 
>> move to DB2. We are now at 7.1.1.  I do not know DB2 and have not had a need 
>> to learn it.  We back up about 400 servers, 350 VMs, 100 databases, and an 
>> Exchange system which is just big.  FWIW, we do not do software (TSM) dedup.
>
> --
> -- Skylar Thompson (skylar2 AT u.washington DOT edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine

--

 Kind Regards,

Remco Post
r.post AT plcs DOT nl
+31 6 248 21 622

