Bacula-users

Re: [Bacula-users] What is the real meaning of the VolErrors column in the table Media

2017-06-28 10:29:42
From: Kern Sibbald <kern AT sibbald DOT com>
To: Panayiotis Gotsis <pgotsis AT noc.grnet DOT gr>, bacula-users AT lists.sourceforge DOT net
Date: Wed, 28 Jun 2017 16:28:40 +0200
Hello,

You are running a very old version of Bacula, so VolErrors probably does not have any significant meaning. The concept is that when the Storage daemon sees an error with a volume, it increments that field. If you want to see what is implemented, you will need to look at the VOLUME_CAT_INFO structure in the SD.

Note: the volume status can change from Append to Error and then back to Append, and VolErrors is, for the moment, a lifetime count. If it is a very big number, I would probably want to know what was going on. Otherwise, it is probably nothing to worry about.

To the best of my knowledge nothing ever clears the field once it is set.
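
As a rough illustration of the "very big number" rule of thumb above, here is a minimal sketch that lists volumes whose lifetime VolErrors count exceeds a cut-off, regardless of their current VolStatus. It assumes an in-memory SQLite table standing in for the catalog's Media table (only three of its columns are modelled), and the threshold value is hypothetical, not something Bacula defines.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the Bacula catalog's
# Media table. Only the columns needed for this sketch are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Media (VolumeName TEXT, VolStatus TEXT, VolErrors INTEGER)"
)
conn.executemany(
    "INSERT INTO Media VALUES (?, ?, ?)",
    [("EDE132L6", "Full", 0), ("EDE467L6", "Append", 1)],
)

# Hypothetical cut-off for "a very big number"; pick one that suits your site.
ERROR_THRESHOLD = 10


def volumes_over_threshold(conn, threshold):
    """Return (VolumeName, VolStatus, VolErrors) rows with VolErrors > threshold.

    VolStatus is deliberately not filtered: a volume can be back in Append
    status while still carrying errors in its lifetime count.
    """
    cursor = conn.execute(
        "SELECT VolumeName, VolStatus, VolErrors FROM Media "
        "WHERE VolErrors > ? ORDER BY VolErrors DESC",
        (threshold,),
    )
    return cursor.fetchall()


# Volumes that have ever recorded an error.
print(volumes_over_threshold(conn, 0))
# Volumes past the hypothetical cut-off.
print(volumes_over_threshold(conn, ERROR_THRESHOLD))
```

With the two volumes from this thread, only EDE467L6 shows up at a threshold of 0, and nothing shows up at the higher cut-off.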

Best regards,

Kern


On 06/28/2017 12:04 PM, Panayiotis Gotsis wrote:
Hello all,

We are using version bacula_5.2.6+dfsg-9 of bacula for our
environment. Until recently, our setup saved backups to disk
files. We have recently introduced a TS4500 library and have
experienced the following problem.

For one of the tapes of the TS4500 library, there is a count of 1 in
the VolErrors column of the Media table. We have seen tapes or disk
files marked with Status Error, but not just an increase in this
counter.

This is what is stored in the DB.

mysql> select VolumeName,PoolId,LastWritten,VolJobs,VolFiles,VolBytes,VolErrors,VolStatus,VolRetention From Media WHERE LastWritten>0 AND PoolId=6 ORDER BY LastWritten;
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
| VolumeName | PoolId | LastWritten         | VolJobs | VolFiles | VolBytes      | VolErrors | VolStatus | VolRetention |
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
| EDE132L6   |      6 | 2017-06-15 01:20:43 |    2947 |     5361 | 2888346811392 |         0 | Full      |      7776000 |
| EDE467L6   |      6 | 2017-06-27 17:11:14 |     456 |      629 |  275665453056 |         1 | Append    |      7776000 |
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
2 rows in set (0.00 sec)

It seems that some error got recorded for the EDE467L6 Volume but this
was not a big problem, or otherwise the Volume would be marked as
Error. The volume is still being used for backups.

In the examples/sample-query.sql file, under the section "List
Volumes likely to need replacement from age or errors", we see that
VolErrors is used in the query. However, we have not found any
reference to what it truly means. In addition, as we are monitoring
the output of this query via icinga, we are pretty sure of when it
happened (give or take 10 minutes) and there is no error related to
this Volume within this time period.
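
For illustration, one way a monitoring check could pin down when the counter moved is to compare two successive polls of VolErrors per volume. This is only a sketch: the function name and the snapshot dictionaries are hypothetical, and it assumes the counter is cumulative, so any positive difference between polls means new errors were recorded in that interval.

```python
def error_increases(previous, current):
    """Return {volume: delta} for volumes whose VolErrors grew between polls.

    Both arguments map VolumeName -> VolErrors as read from the Media table.
    A volume missing from the previous snapshot is treated as having had 0
    errors, so its whole current count is reported as the increase.
    """
    increases = {}
    for name, count in current.items():
        delta = count - previous.get(name, 0)
        if delta > 0:
            increases[name] = delta
    return increases


# Hypothetical snapshots taken by two consecutive monitoring runs.
previous = {"EDE132L6": 0, "EDE467L6": 0}
current = {"EDE132L6": 0, "EDE467L6": 1}

print(error_increases(previous, current))
```

Recording the poll time next to each snapshot narrows the window in which to search the Storage daemon and job logs.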

I have tried to check the source code for hints of where this value is
set, and I have seen that it is part of the MEDIA_DBR structure, but
it is not clear where its increase is triggered.

Can anyone shed some light on what it actually means, on whether some
backup job actually failed, or on what kind of procedure I can use to
pinpoint the problem more specifically?

Thanks




_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
