ADSM-L

Re: [ADSM-L] Multiple journal engines on a single server

2012-07-31 20:24:54
Subject: Re: [ADSM-L] Multiple journal engines on a single server
From: "Prather, Wanda" <Wanda.Prather AT ICFI DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 1 Aug 2012 00:17:19 +0000
Hi Geoff,

I don’t think you told us whether the physical server is Windows or AIX, so I’m 
writing here from my experience with Windows.

When you install the journal engine, it does create a separate journal DB for 
each drive that you want journaled.
I have never installed multiple instances of the journal.

The “journal engine” doesn’t actually participate in the backup.  Yep, it isn’t 
part of the backup at all.

What the journal engine does is invoke a Windows function that monitors file 
system activity and keeps a list of the files that have changed (in the 
journal DB).  So when the backup runs, it is essentially doing a “dsmc 
selective -filelist=filea,fileb,filec, etc”, getting that list of files from 
the journal DB.
So the journal engine doesn’t play into the speed of the backup.
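
To make the difference concrete, here is a minimal Python sketch (not TSM code) contrasting a full tree walk with a journal-style change list. Everything in it is an assumption for illustration: `ToyJournal`, `full_scan_candidates` and `demo` are hypothetical names, and comparing modification times is a simplification of what a real incremental backup checks.

```python
import os
import tempfile


def full_scan_candidates(root, last_backup_time):
    """Walk the whole tree and stat every file, as a non-journal incremental must."""
    examined = 0
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            examined += 1
            # Simplification: only the modification time is compared here.
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed), examined


class ToyJournal:
    """Hypothetical stand-in for the journal DB: paths recorded as changes happen."""

    def __init__(self):
        self._changed = set()

    def record_change(self, path):
        self._changed.add(path)

    def drain(self):
        """Return the pending change list and reset it, as after a good backup."""
        pending = sorted(self._changed)
        self._changed.clear()
        return pending


def demo(total_files=50, changed_count=3):
    journal = ToyJournal()
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i in range(total_files):
            path = os.path.join(root, f"file{i:03d}.txt")
            with open(path, "w") as fh:
                fh.write("data")
            os.utime(path, (1000, 1000))  # pretend it predates the last backup
            paths.append(path)
        last_backup_time = 2000
        for path in paths[:changed_count]:
            os.utime(path, (3000, 3000))  # pretend the file changed afterwards
            journal.record_change(path)
        scan_list, examined = full_scan_candidates(root, last_backup_time)
    return scan_list, examined, journal.drain()


if __name__ == "__main__":
    scan_list, examined, journal_list = demo()
    print(f"full scan examined {examined} files to find {len(scan_list)} changes")
    print(f"journal handed over {len(journal_list)} changes with no tree walk")
```

Both approaches end up with the same list of changed files; the full scan just has to touch every file to get there, which is the cost that grows with the size of the filetree rather than with the change rate.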

On a big fileserver (70+ million files, say), the change rate is usually very, 
very low.  So getting the backup done is usually pretty trivial, when all you 
are doing is backing up the new/changed stuff and you don’t have to traverse 
the filetree(s).

So backing up “quicker” doesn’t have much to do with how many journal engines 
you have, AFAIK.
With the journal engine operating, it’s just about how much data you have to 
move during the backup, no extra time traversing the file tree.

That being said, there ARE serious reasons to make lots of LUNs.


·        The biggest one:  when you use the journal engine, IBM still 
recommends that you do a periodic backup with -nojournal, to pick up things 
that the Windows monitoring function the journal engine uses has missed.  So 
periodically, you still have to do a dsmc incr that traverses the filetree.  My 
experience with Win2K3 was that with over 70,000,000 files it becomes nearly 
impossible to traverse the filetree in less than 24 hours.  You want to use 
something like Windows DFS or mounted drives to make the directory trees look 
“nice” and rational for the users, but actually have the directories spread 
across smaller separate LUNs, so that no single LUN holds a cazillion files.


·        If you have multiple LUNs, you can set resourceutilization=10 and have 
up to 4 pairs of backup sessions running at once.



·        If something invalidates the journal (and something will), you’ve only 
invalidated the journal on part of your backups, so you don’t have to scan the 
whole filetree to revalidate it (and you will have to revalidate the journal 
for a LUN, at some time)



·        Think about restores.  Journaling helps you back up; it doesn’t help 
you restore.  The bigger the LUN, the harder it is and the longer it takes to 
put it back.  You can get two 1 TB LUNs restored in almost the same time as 
one, with multiple restore streams.




I can’t think of any reason you’d need to install multiple journal services, 
unless the change rate on the filesystem is too high for 1 journal service to 
keep up with it.  And if you look at the parms for the journal service, there 
are buffering values and other misc stuff to tune the engine so that it can 
better keep up with the change rate, before resorting to multiple services.

I’m sure other folks will have different experiences to share.

W


From: avalnche96 AT yahoo DOT com [mailto:avalnche96 AT yahoo DOT com]
Sent: Tuesday, July 31, 2012 5:03 PM
To: Prather, Wanda
Subject: Re: RE: [ADSM-L] Multiple journal engines on a single server

I understand, Wanda.  The customer does not want us to use NDMP, so they are 
moving some data to a physical server so we can use journaling.




------ Original Message ------
From : Prather, Wanda
To : Geoff Gill;
Sent : 7/31/2012 14:29
Subject : RE: [ADSM-L] Multiple journal engines on a single server


Easy answer, you can't use journaling on a NAS, as the client can't be 
installed there.



If it's a Netapp, use snapdiff, solves the problem easily.



If it's a non-Netapp NAS, you either suffer through NDMP, or you set up proxy 
relationships and let clients back up the shares via CIFS.











-----Original Message-----

From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of Geoff Gill

Sent: Tuesday, July 31, 2012 10:36 AM

To: ADSM-L AT VM.MARIST DOT EDU

Subject: [ADSM-L] Multiple journal engines on a single server



Hi All,



Sometimes when I read things that make sense it causes me to question if it 
really works so I thought I'd throw this out there to see if anyone is doing it.

I've read that you can put up multiple journal engines on a single server, and 
I'm wondering if anyone has tested it and whether you found any advantages or 
disadvantages. Because NDMP has been squashed for a specific customer, I need 
to find out the best of what's left. Hearing statements like, "we want to move 
all the data that x number of servers currently access on the NAS to a single 
server", makes me question how we're going to handle this single backup.



I currently can't tell you how many millions of files or how much data we're 
talking about, but it seems to me that it would make sense to create multiple 
drives to spread it out, so that multiple journal engines can track 
everything, and I was hoping it might make the backup quicker. One other 
question is whether it would be better to schedule separate backup windows for 
those different drives to help spread things out and get it backed up "at 
least somewhat timely".



Any suggestions would be welcome.

Thank You

Geoff Gill





