Amanda-Users

Re: [Fwd: [SAGE] Backing up Multi-Terabyte filesystems.]

2008-07-09 10:03:12
Subject: Re: [Fwd: [SAGE] Backing up Multi-Terabyte filesystems.]
From: Jordan Desroches <jordan.d.desroches AT Dartmouth DOT EDU>
To: amanda-users AT amanda DOT org
Date: Wed, 9 Jul 2008 09:24:40 -0400
We have a similar topology where I work. We have a NetApp (with 3TB of space used at the moment) that we back up with a combination of spinning disk and tape.

The spinning disk solution is a Linux box with large direct-attached storage (in this case a Coraid) behind it. Every night we create a hard-link copy of the pre-existing data on the spinning disk backup, and then NFS mount the NetApp and rsync over the new data. This gives us space-efficient archival backups using the same technique that Apple's Time Machine uses on OS X.
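A minimal sketch of that nightly cycle, in Python, for anyone who wants to see the mechanics. The directory names and the NFS mount point are hypothetical, the rsync flags are one plausible choice rather than the ones used here, and symlinks and special files are ignored. The hard-link step is done with os.link so the space sharing can be checked directly.

```python
import os
import tempfile


def hardlink_snapshot(previous, snapshot):
    """Recreate the tree under `previous` at `snapshot`, hard-linking every file."""
    for dirpath, _dirnames, filenames in os.walk(previous):
        rel = os.path.relpath(dirpath, previous)
        target_dir = os.path.normpath(os.path.join(snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # A hard link is a second name for the same inode: no data is copied.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))


def rsync_command(source, snapshot):
    """Build the rsync call that refreshes the new snapshot from the NFS mount."""
    # By default rsync writes a changed file to a temporary name and renames it,
    # which breaks the hard link, so the older snapshot keeps the old contents.
    # -a preserves metadata; --delete drops files that vanished from the source.
    return ["rsync", "-a", "--delete",
            source.rstrip("/") + "/", snapshot.rstrip("/") + "/"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "backup.1")  # last night's snapshot (hypothetical name)
        new = os.path.join(root, "backup.0")  # tonight's snapshot (hypothetical name)
        os.makedirs(os.path.join(old, "home"))
        with open(os.path.join(old, "home", "notes.txt"), "w") as fh:
            fh.write("unchanged data\n")

        hardlink_snapshot(old, new)
        shared = os.path.samefile(os.path.join(old, "home", "notes.txt"),
                                  os.path.join(new, "home", "notes.txt"))
        print("unchanged file shares one inode:", shared)
        # The mount point below is an assumption; the command is only printed.
        print(" ".join(rsync_command("/mnt/netapp/export1", new)))
```

Only files that changed since the previous night consume new space; everything else is an extra directory entry pointing at data already on disk.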

We have a belt-and-suspenders approach to backups, so we also have an AMANDA server with a tape library that performs full backups once a week. These backups are kept for two weeks, and the tapes are rotated through. This backup set is primarily for disaster recovery. As our storage needs grow, this backup will probably be pushed off to once a month.

Speed was a big concern for us, and we've noticed that the NetApp is limited to, or is limiting, a single CIFS or NFS connection (or client, we haven't figured out which yet) to around 15 MB/s. To get around this, we broke up the NetApp into several exports and mount them from several IPs, which gives us greater aggregate speed. With the disk-based backups, we run several rsync connections simultaneously, and with the AMANDA server, multiple simultaneous streams.
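To put rough numbers on why the split matters, here is a small Python sketch. It assumes each stream sustains the full 15 MB/s and that streams scale linearly, which is an idealization and not something we measured; the stream count of four is an illustrative figure, and the helper names are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

PER_STREAM_MB_S = 15.0  # per-connection ceiling mentioned above


def hours_to_copy(total_tb, streams, per_stream_mb_s=PER_STREAM_MB_S):
    """Hours to move total_tb (decimal TB) if every stream sustains the ceiling."""
    total_mb = total_tb * 1_000_000
    return total_mb / (per_stream_mb_s * streams) / 3600


def run_parallel(jobs, runner=subprocess.call):
    """Run one command per export concurrently; return results in job order."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        return list(pool.map(runner, jobs))


if __name__ == "__main__":
    # 3 TB is the amount currently used on the NetApp.
    print(round(hours_to_copy(3, 1), 1))  # 55.6 hours over a single connection
    print(round(hours_to_copy(3, 4), 1))  # 13.9 hours over four ideal streams
```

run_parallel would be handed one rsync command per export, for example one list of arguments for each mount point; it is not invoked here because the mounts are site-specific.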

We have a lot of NTFS data, and we use a free program from M$ called "subinacl" to save the ACLs, which we then dump to the disk and tape backups. Subinacl can also restore the ACLs.

Hope this helps anyone looking for a similar solution :-)

Jordan

On Jul 8, 2008, at 5:55 PM, Chris Hoogendyk wrote:

If anyone has commentary, case studies, or examples of this sort of thing with Amanda, I could pass them along by posting them, or a summary, to the SAGE list.


---------------

Chris Hoogendyk

-
 O__  ---- Systems Administrator
c/ /'_ --- Biology & Geology Departments
(*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogendyk AT bio.umass DOT edu>

---------------
Erdös 4




-------- Original Message --------
Subject:        [SAGE] Backing up Multi-Terabyte filesystems.
Date:   Tue, 08 Jul 2008 14:59:35 -0600
From:   Ray Frush <phred AT frii DOT com>
To:     SAGE Members Mailing List <sage-members AT usenix DOT org>



We've recently discovered a limitation (design flaw) in the backup strategy for our NetApp NAS solution at my place of employment. When we keep our file systems "small" (2TB) the NetApp solution works as promised by all the sales engineers. However, several of our more recent projects required 16TB of space (for each project), and breaking them up into 2TB buckets has been problematic. So, we allowed larger file systems. We discovered, later, that the NetApp backup solution starts to have problems with space reservation when the file systems grow beyond 4TB. The NearStore shows over 70% of its capacity as reserved but unused, and we've run out of capacity. Expanding the NearStore doesn't solve the problem since most of the space never gets used, just reserved for use.

We have 3 new projects starting up this month that will eventually need 24-30TB each. Our engineering teams have asked for even larger file systems (8TB) to keep the task of managing file links simplified. Since it's not clear if NetApp can address this problem, we've started researching alternatives.

It occurred to me that members of this community may have already had to solve this problem, so it would be a good question to ask: How do (would) you back up several 8TB file systems?

VTL is a possibility, but the databases created by the backup software start to become increasingly unwieldy. If you've solved this with VTL, how did you split out the loads?

Other methods?
I look forward to your creative ideas.

--
Ray Frush


-----------------------------------------------
Jordan Desroches
Systems Administrator
Thayer School of Engineering at Dartmouth College
8000 Cummings Hall
Hanover, NH 03755

phone: 603-646-8192
fax:       603-646-9856




