ADSM-L

Re: Wishlist Item

2004-09-14 09:22:38
Subject: Re: Wishlist Item
From: "Coats, Jack" <Jack.Coats AT BANKSTERLING DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 14 Sep 2004 08:23:48 -0500
Eric,

        Thanks for the response!

        Why bother?  For the same reason that TSM's 'Incremental
Forever' approach makes sense.  TSM, without co-location, has the same
problem you mentioned: fragmentation that slows down restores.  TSM already
does the scan and attribute checking that allow its Incremental approach
to work.  And using the TSM Journal even helps with that some.

        My thought is that if TSM is already paying the price (for
the most part), then it might make sense to take the next step in
optimization.  Especially for any enterprise site that is backing up a
number of substantially similar machines (at least on the system
partition).

        Even if it didn't work out, this would be a good study for the
IBM/Tivoli TSM support team to do, to see whether it is 'worth it', to them
and to us as paying customers.

        I have no illusion that every suggestion will be taken, but ones
that make sense, and give an edge in the market place, with 'minimal' effort
(easy for me to say, I'm not the developer ;) - could be a good thing.

        Not understanding the internals of TSM, I would think it would
need: a field added to the DB record for each file (a big hit on DB size) to
keep an MD5 or similar checksum; the server software updated to accept the
checksum and use it as an indexed search key; the client updated to send and
check the MD5 as well; and the Journal software updated to allow the cache to
hold and use the MD5 key.  (I suggest MD5 as it is a well known algorithm
that should cover most sins, but I am sure a real statistical study would be
needed to see what is sufficient to give an assurance that it is really the
file needed.)
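
        To make the list above concrete, here is a minimal sketch in
Python.  It is not how TSM works internally; the class DedupIndex, its
fields and the node names are all hypothetical, and a plain dict stands in
for the indexed checksum column in the server database.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupIndex:
    """Hypothetical server-side index (an assumption, not TSM's design)."""
    by_checksum: dict = field(default_factory=dict)   # (size, md5) -> object id
    file_records: list = field(default_factory=list)  # one record per client file
    stored_bytes: int = 0

    def backup(self, node: str, path: str, data: bytes) -> bool:
        """Record a file for a node; return True only if the data was stored."""
        # In the proposal the client would compute this and send it first.
        checksum = hashlib.md5(data).hexdigest()
        key = (len(data), checksum)
        is_new = key not in self.by_checksum
        if is_new:
            self.by_checksum[key] = len(self.by_checksum)  # assign a new object id
            self.stored_bytes += len(data)
        # Every client file keeps its own record, pointing at the shared object.
        self.file_records.append((node, path, self.by_checksum[key]))
        return is_new


index = DedupIndex()
payload = b"identical system file contents"
first = index.backup("NODE_A", "C:/WINDOWS/winfile.exe", payload)   # data stored
second = index.backup("NODE_B", "C:/WINDOWS/winfile.exe", payload)  # reference only
```

        Note that the sketch still writes one record per client file, which
is the DB size hit mentioned above; only the file data is shared.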

        I am sure this will add some overhead, but hopefully enough would be
gained to overcome the cost!

        ... Still, it is just an idea.  And the TSM team has probably
thought of it anyway, and already dealt with it in house.

        ... Jack

-----Original Message-----
From: Loon, E.J. van - SPLXM [mailto:Eric-van.Loon AT KLM DOT COM]
Sent: Tuesday, September 14, 2004 2:55 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Wishlist Item

Hi Jack!
Before we started using ADSM we ran a backup application called ESM on our
MVS mainframe. It was one of the most sophisticated backup applications at
that time (1993). It was created by a company called Legent and later on
Legent was bought by Computer Associates. They relabeled it to CA-ESM and
now (but I don't know this for sure) it's called BrightStor.
This product did what you mentioned. It stored all files just once, so the
winfile.exe file was only stored once for all Windows clients. The problem
with this approach was that the actual scan process involves some kind of
CRC check. Just checking the attributes and file size is not enough to
determine whether a file has changed. This scan took far too much time. Also,
since some files are identical during backup to versions backed up earlier
on, your storage pool got heavily fragmented and thus a restore took a
very long time to complete.
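
The point about attributes versus a CRC check can be shown with a small
Python sketch. It is only an illustration, not ESM's actual scan: the two
byte strings and both function names are made up, and the attribute check
is reduced to comparing sizes.

```python
import zlib


def unchanged_by_attributes(old: bytes, new: bytes) -> bool:
    """Attribute-style check, reduced here to comparing the size only."""
    return len(old) == len(new)


def unchanged_by_crc(old: bytes, new: bytes) -> bool:
    """Content check: compare CRC-32 values computed over every byte."""
    return zlib.crc32(old) == zlib.crc32(new)


old_version = b"version=1.0"
new_version = b"version=1.1"  # same length, different content

size_says_same = unchanged_by_attributes(old_version, new_version)
crc_says_same = unchanged_by_crc(old_version, new_version)
```

The size check reports the two versions as unchanged while the CRC check
does not, but the CRC has to read every byte of the file, which is where
the scan time goes.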
Also, storage is getting cheaper and cheaper, so why bother?
Kindest regards,
Eric van Loon
KLM Royal Dutch Airlines


-----Original Message-----
From: Coats, Jack [mailto:Jack.Coats AT BANKSTERLING DOT COM]
Sent: Monday, September 13, 2004 22:25
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Wishlist Item


Yes, I know I am dreaming, but...



An open source program on Sourceforge (pcbackup or backuppc, something like
that) has a very nice feature.



If a file is already backed up, it only keeps one copy of that file for ALL
its clients!  What technique does it use to figure out if the files are
identical without comparing them?  I didn't research it that far, but I
assume it uses something like file size and a checksum of some kind.
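
A rough Python sketch of that guess (file size plus a checksum), purely as
an illustration. It is not taken from the program mentioned above; the
function names file_key and group_identical are made up, and MD5 is just
one possible checksum.

```python
import hashlib
import os
from collections import defaultdict


def file_key(path, chunk_size=1 << 20):
    """Return (size, MD5 hex digest) for a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def group_identical(paths):
    """Group paths whose (size, checksum) keys match; skip unique files."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Files that land in the same group are treated as identical without being
compared byte for byte, so the assurance is only as good as the checksum.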



Anyway, if you have substantially identical client computers you are backing
up, just keeping one rather than N copies is better than any compression
known to man!  It would be another field or so in the database for every
file, but it might be worth it!  At least as an option.



... Jack


