Amanda-Users

RE: Have 2.4.4p1 RAIT over file: working sort of...

2003-11-21 03:27:02
Subject: RE: Have 2.4.4p1 RAIT over file: working sort of...
From: "Dana Bourgeois" <em-lists AT netgods DOT us>
To: "'Jean-Louis Martineau'" <martinea AT iro.umontreal DOT ca>
Date: Fri, 21 Nov 2003 00:21:23 -0800
Wasted more time on the weirdness...

Did a number of tests and all pointed to the info file not getting updated
properly as the cause of the problem.  Added a bunch of echo debug
statements to monitor the state of my changers (chg-multi).  From my tests
(ask if you want copies of the output, it's long) I concluded that the
chg-rait script is working OK.  What I see is what I expect to see.  I tried
adding a null device to the device lines but chg-multi says it ignores them
and it apparently does.  It reads them in but then no reference is made to
them.

Followed it all down to the actual call to mt (or in this case, ammt) and
found a couple of things.  

First, despite the code in chg-multi, mt will not do the right thing with
the file: driver.  It said something like "no such device".  So if it works
at all it is because it is using ammt.  'ammt -f file:/backup1/daily15
rewind' run by hand works just fine and updates the info file.  I also
straced this action to see the open of the info file and the read followed
by the write.  No surprises yet.

Second, added the strace to the chg-multi script and found that it errored.
Now this is good.  When I strace the whole amlabel command, everything
works, but it fails without the strace.  Here, I got a failure with strace.  So
I looked at the trace output and damn me if it doesn't have the *SAME* open,
read and write calls!  Aaaaaa.......

The write to the info file is a literal:
3324  open("/backup3/daily15/info", O_RDWR|O_CREAT|O_LARGEFILE, 0600) = 3
3324  open("/backup3/daily15/data/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 4
3324  fstat64(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
3324  fcntl64(4, F_SETFD, FD_CLOEXEC)   = 0
3324  getdents64(0x4, 0x804a8d8, 0x1000, 0) = 288
3324  getdents64(0x4, 0x804a8d8, 0x1000, 0x804b908) = 0
3324  close(4)                          = 0
3324  brk(0)                            = 0x804c000
3324  brk(0x804e000)                    = 0x804e000
3324  read(3, "position 3\n", 8192)     = 11
3324  read(3, "", 8192)                 = 0
3324  _llseek(3, 0, [0], SEEK_SET)      = 0
3324  ftruncate64(0x3, 0)               = 0
3324  write(3, "position 0\n", 11)      = 11
3324  close(3)                          = 0

B..b..b..b..but the info file contents are "position 3"!!  How the HELL is
this happening?  If it didn't drive me near crazy it would be funny.
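
For anyone who wants to replay the traced sequence outside of Amanda, here is a minimal sketch of the same open / read / seek / truncate / write / close steps.  It is an assumption-laden stand-in, not the ammt code: the function name rewrite_position is made up, and it runs against a throwaway temp file rather than the real info file.

```python
import os
import tempfile


def rewrite_position(info_path, new_position):
    """Replay the traced syscall sequence on info_path (hypothetical helper).

    Returns the old contents that were read before the rewrite.
    """
    # open(..., O_RDWR|O_CREAT, 0600)
    fd = os.open(info_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        old = os.read(fd, 8192)           # read(3, "position 3\n", 8192)
        os.lseek(fd, 0, os.SEEK_SET)      # _llseek(3, 0, [0], SEEK_SET)
        os.ftruncate(fd, 0)               # ftruncate64(3, 0)
        os.write(fd, f"position {new_position}\n".encode())
    finally:
        os.close(fd)                      # close(3)
    return old.decode()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "info")
        with open(path, "w") as f:
            f.write("position 3\n")
        print("before:", rewrite_position(path, 0), end="")
        with open(path) as f:
            print("after: ", f.read(), end="")
```

If the file still reads "position 3" after a sequence like this has returned successfully, one thing worth checking is whether something else rewrites the file afterwards.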

Bedtime again...

...ummmm, holdup.  I did a man on llseek and got:

       int _llseek(unsigned int fd, unsigned long offset_high,
                   unsigned long offset_low, loff_t *result,
                   unsigned int whence);

Isn't that 5 arguments instead of the 4 I see above?
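
As far as I can tell, strace folds offset_high and offset_low into a single 64-bit offset and prints the value stored through the result pointer in brackets, so the five kernel arguments show up as four fields: fd, offset, [result], whence.  A small sketch of that arithmetic, with a made-up function name:

```python
def combine_llseek_offset(offset_high, offset_low):
    """Combine the two 32-bit halves passed to _llseek into one 64-bit offset.

    Hypothetical helper for illustration only; it mirrors how the kernel
    interprets the (offset_high, offset_low) pair.
    """
    return (offset_high << 32) | (offset_low & 0xFFFFFFFF)


if __name__ == "__main__":
    # The trace above shows offset 0, i.e. both halves are zero.
    print(combine_llseek_offset(0, 0))
    # One full 32-bit wrap: high word 1, low word 0.
    print(combine_llseek_offset(1, 0))
```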


Dana Bourgeois


> -----Original Message-----
> From: owner-amanda-users AT amanda DOT org 
> [mailto:owner-amanda-users AT amanda DOT org] On Behalf Of Dana Bourgeois
> Sent: Wednesday, November 19, 2003 11:35 PM
> To: 'Jean-Louis Martineau'
> Cc: amanda-users AT amanda DOT org
> Subject: RE: Have 2.4.4p1 RAIT over file: working sort of...
> 
> 

<snip>


