Subject: Re: [Amanda-users] Amanda 2.6.0p2 + S3 results in cURL errors
From: "Dustin J. Mitchell" <dustin AT zmanda DOT com>
To: amanda-users AT amanda DOT org
Date: Thu, 11 Sep 2008 17:32:09 -0400

On Thu, Sep 11, 2008 at 2:12 PM, moekyle
<amanda-forum AT backupcentral DOT com> wrote:
> Here is what I posted to the bug tracker in sourceforge.

Oh, dear -- the sourceforge bug tracker is completely unused these
days.  If others on the list have submitted things to the sourceforge
tracker, please re-post them here.  I should find a way to hide that
tracker.

> delete_file(S3Device *self, int file)
> {
>     gboolean result;
>     GSList *keys;
>     char *my_prefix = g_strdup_printf("%sf%08x-", self->prefix, file);
>
>     result = s3_list_keys(self->s3, self->bucket, self->prefix, "-", &keys);
>     if (!result) {
[snip]
> Not sure what the my_prefix is trying to do, but removing it from the
> s3_list_keys call fixed my issue, and now backups work correctly.
>
>     result = s3_list_keys(self->s3, self->bucket, self->prefix, "-", &keys);

If you look at the documentation for s3_list_keys, you'll see that it
lists all keys matching PREFIX*DELIMITER*, but includes only the
PREFIX*DELIMITER portion of each.  The S3 device names objects in a
bucket like
  slot-01f0000001e-filestart
  ..
  slot-01f0000001eb00000000000003ad.data
  slot-01f0000001eb00000000000003ae.data
  slot-01f0000001eb00000000000003af.data
  ..
The first object is the header, and the remainder are blocks of data,
where the 'b' in the middle is the border between the file number
(${FILENUM}, 0x1e in this case) and the block number within that file
(0x3ad..0x3af shown here).  'slot-01' here is the device prefix
(${DPFX}).  The pattern in the released source code looks for all
objects matching "${DPFX}f${FILENUM}*", and requests the full name of
each.  It then deletes each of those objects -- in effect, deleting
all data with the given file number.
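
To make the naming concrete, here's a tiny standalone sketch of how
such keys are composed -- illustration only, not the device code; the
variable names are mine, but the formats match the layout above:

    /* illustration: composing S3 object names per the scheme above */
    #include <stdio.h>

    int main(void)
    {
        const char *dpfx = "slot-01";      /* device prefix (${DPFX}) */
        unsigned int filenum = 0x1e;       /* file number (${FILENUM}) */
        unsigned long long block = 0x3ad;  /* block number within the file */
        char header[64], data[64];

        /* header object: ${DPFX}f${FILENUM}-filestart */
        snprintf(header, sizeof(header), "%sf%08x-filestart", dpfx, filenum);
        /* data object: ${DPFX}f${FILENUM}b${BLOCKNUM}.data */
        snprintf(data, sizeof(data), "%sf%08xb%016llx.data", dpfx, filenum, block);

        printf("%s\n%s\n", header, data);
        /* prints:
         *   slot-01f0000001e-filestart
         *   slot-01f0000001eb00000000000003ad.data
         */
        return 0;
    }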

With your patch, you're asking for all objects matching "${DPFX}*-*",
and getting back only the portion matching "${DPFX}*-".  This matches
the "filestart" object for *all* files, but returns only the first
part of each object name:
  slot-01f00000001-
  slot-01f00000002-
  slot-01f00000003-
  ..
These keys do not exist, so the deletion should fail -- but perhaps
Amazon does not respond with an error when deleting a nonexistent
object?  Either way, the end result is that you've effectively
disabled delete_file, so you are probably using more S3 storage than
you need, and will find old data intermingled with new data when you
attempt a recovery.  Your change avoids the error Lisa reported simply
by masking the problem.
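
To see why those dangling prefixes come back, here's a toy model of
the PREFIX*DELIMITER* truncation -- a hypothetical stand-in I wrote
for illustration, not the real listing code:

    /* toy model: a key whose remainder (after PREFIX) contains DELIM is
     * collapsed to its PREFIX...DELIM portion; keys without the
     * delimiter are not listed at all */
    #include <stdio.h>
    #include <string.h>

    static const char *collapse(const char *key, const char *prefix,
                                char delim, char *out, size_t outlen)
    {
        const char *p;
        size_t n;

        if (strncmp(key, prefix, strlen(prefix)) != 0)
            return NULL;                  /* key is not under the prefix */
        p = strchr(key + strlen(prefix), delim);
        if (p == NULL)
            return NULL;                  /* no delimiter: key not listed */
        n = (size_t)(p - key) + 1;        /* keep through the delimiter */
        if (n >= outlen)
            return NULL;
        memcpy(out, key, n);
        out[n] = '\0';
        return out;
    }

    int main(void)
    {
        char buf[64];
        /* your patch lists with prefix "slot-01" and delimiter "-", so
         * the filestart key collapses to a name that no object has: */
        puts(collapse("slot-01f0000001e-filestart", "slot-01", '-',
                      buf, sizeof(buf)));
        /* prints: slot-01f0000001e- */
        return 0;
    }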

On looking more deeply, I think I see the problem: s3_list_keys, or in
particular list_fetch, limits the response to 100k.  I'll need to dig
into this a little further, but I should have a patch out shortly.
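
(For the curious: if the fix turns out to involve resuming a truncated
listing, its shape would be roughly the loop below.  This just
simulates the pattern in memory; fetch_page is a hypothetical
stand-in, not an Amanda function, and the page size is arbitrary.)

    /* sketch: keep listing until the result is no longer truncated */
    #include <stdio.h>

    #define NKEYS 7
    #define PAGE  3   /* stand-in for whatever cap list_fetch imposes */

    static const char *all_keys[NKEYS] = {
        "k1", "k2", "k3", "k4", "k5", "k6", "k7"
    };

    /* hypothetical: return up to PAGE keys starting at *marker, advance
     * the marker, and report whether more remain (S3's IsTruncated) */
    static int fetch_page(int *marker, const char **out, int *count)
    {
        *count = 0;
        while (*count < PAGE && *marker < NKEYS)
            out[(*count)++] = all_keys[(*marker)++];
        return *marker < NKEYS;
    }

    int main(void)
    {
        const char *page[PAGE];
        int marker = 0, count, truncated;

        do {
            truncated = fetch_page(&marker, page, &count);
            for (int i = 0; i < count; i++)
                printf("got %s\n", page[i]);
        } while (truncated);
        return 0;
    }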

-- 
Storage Software Engineer
http://www.zmanda.com
