Subject: Re: [Networker] Rejected posting to NETWORKER AT LISTSERV.TEMPLE DOT EDU
From: Randal N Manchester <rnm AT WHOI DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 4 Jan 2010 17:17:12 -0500
I tried playing around with having the period set to several days, together 
with different hi/lo water marks. But the problem I kept running into was, as 
you said, that the file system the adv_file devices were on would hit 100% 
full, and then all my backups would pause while the staging process ran for 
several hours before freeing any space. One of the things I don't like about 
the way automatic staging works is that it doesn't free the space until the 
entire staging operation is finished. I've found that with staging running 
every few hours, each staging pass is pretty short and I pretty much always 
have a free tape drive for restores (which is really only a problem on the 
storage node, which only has the two LTO4s). The trick is to have a large 
enough adv_file space to handle several large backups at once. That's getting 
harder as some of my users are getting large disk arrays too and expecting 
them to be backed up.
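
If anyone wants to poke at the same knobs from the command line, a rough 
sketch follows; the server name, pool, and save set id below are just 
placeholders, and the exact attribute names can vary a bit between NetWorker 
releases:

    # Dump the staging policy resource(s) to see the current thresholds
    # (high/low water marks, max storage period, recover space interval).
    echo "print type: NSR stage" | nsradmin -s backupserver

    # Manually stage (migrate) one save set off to a tape pool right away
    # instead of waiting for the next automatic pass; "Default" and <ssid>
    # are only examples.
    nsrstage -s backupserver -b Default -m -S <ssid>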

Before I went to staging (a few months ago) I was sending all my backups 
directly to tape; all the tape drives were constantly in use and usually only 
pushing around 10 MB/s. Restores were agonizingly slow due to the interleaving 
of savesets on tape (sometimes 12 hours or more to restore 100GB). Slow 
clients would sometimes drop to 10 kb/s and hog the tape drives so restores 
couldn't run. Since going 100% D2D2T, all of that has gone away. I don't have 
any more incomplete/aborted savesets sucking up tape space and my restores are 
way faster (no more interleaving). I forgot to mention, I also have a 12TB 
iSCSI disk array sitting in another building on campus that I clone DR stuff 
to. That's also an adv_file device. I just put that in a few weeks ago and am 
very happy with it. I have a shorter retention period set on the clones and a 
small script that launches nsrclone out of cron. I see about 80-100 MB/s to 
that, even without jumbo frames enabled.
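
A stripped-down version of that kind of cron job might look roughly like the 
following; the server, group, and pool names here are just placeholders, and 
the mminfo query will depend on how you pick what to clone:

    #!/bin/sh
    # Clone the last day's DR save sets to the off-site adv_file pool.
    # Server/group/pool names below are placeholders -- substitute your own.
    SERVER=backupserver
    GROUP=DR
    POOL="DR Clone"

    # Gather the save set ids for the group; grep drops mminfo's column header.
    SSIDS=`mminfo -s $SERVER -q "group=$GROUP,savetime>=last day" -r ssid 2>/dev/null | grep -v ssid`

    # Nothing to do if the group didn't run.
    [ -z "$SSIDS" ] && exit 0

    # Clone the save sets to the off-site pool.
    nsrclone -s $SERVER -b "$POOL" -S $SSIDS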


On Jan 4, 2010, at 2:54 PM, Tom Birkenbach wrote:

> Thanks for sharing!
> Setup sounds similar to what I'm running.
> Server: Sun x4200
> Storage Node: Sun X4500 (x2) w/48 500GB drives.
> 
> I don't do any direct backup to the server and (try to) direct all backup
> I/O to the two storage nodes (Thumpers).
> 
> Have you thought of extending your store period beyond 4 hours?  I was
> hoping to keep at least one cycle (a week) available on disk.  My restore
> requests typically don't go back beyond one week, and restoring from disk
> would be much more convenient and quicker.  However, keeping your store
> period at just 4 hours, I suspect, makes the system much easier to set up
> and maintain.  If I were to direct all the fulls to one AFTD, the bigger
> systems (e-mail) would quickly chew up the available disk space, causing
> staging to kick in much sooner.  So many configuration options...  ;-)
> 
> -----Original Message-----
> From: Randal N Manchester [mailto:rmanchester AT whoi DOT edu] 
> Sent: Monday, January 04, 2010 2:26 PM
> To: EMC NetWorker discussion; Tom Birkenbach
> Subject: Re: [Networker] adv_file - bucket of disk or sized partitions
> 
> I'm doing something similar.
> 
> My config:
> 
> Server: Sun x4450, 24GB Mem, 24 core. 4 x LTO3 SCSI, Promise 610f with 16
> 1TB SATA drives in RAID6 and in a single ZFS pool.
> Storage Node: X4500 (thumper), 16GB Mem, 4 core. 2 x LTO4 fibre, 48 500GB
> SATA (internal).
> 
> Both boxes have 4 e1000 copper NICs set up as an aggregate interface.
> I have ZFS file systems for full, non-full and default, with adv_file
> devices set up on each.
> About 600+ clients, several of them multi-terabyte.
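
(For anyone setting up something similar, the wiring between the ZFS file 
systems and the adv_file devices amounts to roughly the following; the pool 
name, paths, and exact nsradmin attributes are only illustrative and may vary 
by release:)

    # One ZFS file system per backup class ("tank" is a placeholder pool name)
    zfs create tank/full
    zfs create tank/nonfull
    zfs create tank/default

    # Register a mount point with NetWorker as an adv_file device
    # (attribute names are approximate -- check nsr_device(5) on your release)
    echo 'create type: NSR device; name: "/tank/full"; media type: adv_file' | nsradmin -s backupserver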
> 
> Here's how my staging is set up:
> 
> High water: 50
> Low water: 40
> Selection: Oldest save sets
> Max store period: 4 hours
> Recover space: 4 hours
> Check: 3 hours
> 
> I basically suck in the backups over the LAN as fast as I can straight to disk,
> then stage them quickly off to tape. With a group of about 20+ clients I see
> about 150MB/s going to the disks. When the staging runs I see about 100MB/s
> average to the LTO3 drives (plus or minus about 30 MB/s) and about 140MB/s
> to the LTO4 drives.
> 
> So far this seems to work pretty well in my environment (about 600+
> clients). My backups typically start in a burst that hits around 150 MB/s
> for a while, then gradually slopes down to around 10 MB/s or less for the
> slower clients. My tapes always stream at full speed. On a busy weekend full
> set I see around 5-7TB waiting to stage on each of the two servers. Rarely
> hits 100% full.
> 
> A few tests I ran with bigasm show my max throughput to my adv_file devices at
> around 180MB/s (on the promise, haven't tested the x4500).
> 
> On Jan 4, 2010, at 11:45 AM, Tom Birkenbach wrote:
> 
>> In an effort to control disk space usage and automate staging when using
>> adv_file devices (or AFTDs), I've been using/testing a practice where I set
>> up a "disk partition" (specifically a Solaris ZFS file system) for each
>> group, sizing the partition for about 2.5 weeks' worth of storage.  It's
>> been working well enough, but I'm curious...
>> 
>> What are others doing with "adv_file" devices?
>> How is containment (i.e. space utilization and growth) being maintained?
>> 
>> I'm just wondering if there's a better/easier/simpler way rather than
>> creating all these different partitions/file-systems.  I know some of you
>> are using rather clever and intuitive scripts for staging.  I haven't done
>> much in the development and testing of such scripts, but I hope to.
>> 
>> Your thoughts and input are greatly appreciated.  Thanks!!
>> 
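
On the sizing question: one thing that can make the per-group approach less 
painful is letting a ZFS quota do the sizing rather than carving real 
partitions. A minimal sketch, where the pool name, group names, and sizes are 
all made-up examples:

    # One file system per group, capped by a quota sized for roughly
    # 2.5 weeks of that group's backups (names/sizes are examples only).
    zfs create -o quota=5T backup/grp_email
    zfs create -o quota=1T backup/grp_research

    # Growing a group later is just a property change, not a repartition.
    zfs set quota=6T backup/grp_email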

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

