Re: [Networker] Session statistics broken for over 2Tb.
2012-10-26 08:33:35
Again, I don't see why the mediadb has anything to do with that. I
have nsrstage running. It has to write 5Tb of data (check below). I see
incorrect numbers for "amount kb" and "total kb" which are statistics of
the running staging session. Obviously, these are calculated from the
size of the save set which comes from the mediadb, but if that was a
problem, it should have been a problem with the staging process itself,
not just the statistics.
My database was migrated from Solaris 2.6 (?) all the way to CentOS
6.3, but I am sure that various Legato upgrades (5 to 6 and 6 to 7)
included media db conversions.
On 10/26/2012 01:33 PM, Francis Swasey wrote:
My own mediadb (and the rest of /nsr/res) has come along from 32-bit OS's into
the 64-bit era as well. I also have experience with the savegroup emails not
always being correct. When the amount of data crosses the 2TB limit, it is a
crap shoot whether the data displayed in nsradmin's 'show session' will be
correct or not. I often get push back from my customers and I have to explain
to them that their backups are too big (in that regard, I like this bug!) and
because of that they will need to do an mminfo query to see the real size of
their saveset.
I also opened an issue with EMC, and got my name added to the RFE for this
problem, which is NW113348. I don't know if EMC will ever have a real fix for
it (other than the clean slate restart with running scanner on all your media
volumes [shudder]). However, eventually some bright programmer will stumble on
the exact combination of what is doing it and be able to write a conversion
program to read in the 32-bit /nsr/res constructs and write out the correct
64-bit /nsr/res constructs. Still, I'm not going to hold my breath waiting!
It is yet another reason to keep individual save sets below 2TB... Yeah, I
know, that's not realistic anymore.
-- Frank
On Oct 26, 2012, at 4:57 AM, Yaron Zabary<yaron AT aristo.tau.ac DOT il> wrote:
This doesn't make sense, because both mminfo and nsradmin's 'show session'
know the correct size, as can be seen below. The problem seems to be some
variable declared as 'int' rather than 'long int' in nsradmin and NMC.
On 10/26/2012 09:47 AM, Tony Albers wrote:
AFAIK this is a known issue if you've upgraded a 32-bit media database
from an old NetWorker to a newer 64-bit NetWorker and mediadb.
I don't think there's any way around it other than building a completely new
64-bit backup server and then moving the data to it. That is, use scanner
to populate the new media db (yes, I know).
/tony
Tony Albers - Technical Consultant - Proact Systems A/S
Tel: +45 7010 1132 - Mobile: +45 2210 5208 - Fax: +45 7010 1142
toal AT proact DOT dk www.proact.dk - We secure mission-critical information -
On 10/25/2012 05:48 PM, Yaron Zabary wrote:
Hello all,
I have a script that digs some statistics out of nsradmin's
session statistics. It works nicely for sessions smaller than 2Tb, but
breaks above that. I suspect that nsradmin uses 32-bit counters. For
example:
[root@legato ~]# nsradmin
NetWorker administration program.
Use the "help" command for help, "visual" for full-screen mode.
nsradmin> . type: NSR
Current query set
nsradmin> option hidden;
Hidden display option turned on
Display options:
Dynamic: Off;
Hidden: On;
Raw I18N: Off;
Resource ID: Off;
Regexp: Off;
nsradmin> option dynamic
Dynamic display option turned on
Display options:
Dynamic: On;
Hidden: On;
Raw I18N: Off;
Resource ID: Off;
Regexp: Off;
nsradmin> show session statistics
nsradmin> print
session statistics: id = 285113144, jobid = 0,
name = dayan-ng.tau.ac.il, mode = browsing,
"group = ", "pool = ", "volume = ", rate kb = 0,
amount kb = 0, total kb = 0, amount files = 0,
total files = 0, start time = 1350993680,
connect time = 185605, num volumes = 0,
used volumes = 0, completion = running,
flags = 0, "level = ", id = 285113524,
jobid = 76501, name = cloning session,
mode = recovering, "group = ", pool = DDPool,
volume = DDPool.001.RO, rate kb = 0,
amount kb = 129176321, total kb = 1018277321,
amount files = 0, total files = 0,
start time = 1351029303, connect time = 149982,
num volumes = 0, used volumes = 0,
completion = running, flags = 4, "level = ",
id = 285113525, jobid = 76501,
name = legato.tau.ac.il, mode = saving,
"group = ", pool = TAUDefault, volume = JDF648,
rate kb = 0, amount kb = 0, total kb = 0,
amount files = 0, total files = 0,
start time = 1351029303, connect time = 149982,
num volumes = 0, used volumes = 5,
completion = running, flags = 26, "level = ";
nsradmin>
[root@legato ~]# /usr/local/TAUSRC/Local/ToolBox/monstage.pl
76501 r=0MB/s size=841GB/971GB time=16205m ETA=5/23:43
The size is reported correctly with nsradmin's session attribute:
[root@legato ~]# /usr/local/TAUSRC/Local/ToolBox/showsessions.pl|nl
1 dayan-ng.tau.ac.il:root browsing
2 cloning session:1 of 7 save set(s) reading from DDPool.001.RO 4431 GB of 5313 GB
3 legato.tau.ac.il:cloning session saving to pool 'TAUDefault' (JDF648)
NMC is no better. It thinks that the size of this staging session is
1018Gb. I had this investigated under SR#44358972, but they claimed that
this was OK with 7.6.3HF and was related to NW138153. Networker is now
7.6.4.2.Build.1060, but the problem is still here.
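The numbers above look consistent with a counter truncated to 32 bits. A signed 32-bit KB counter overflows at 2^31 KB (2 TiB) and an unsigned one wraps at 2^32 KB (4 TiB). The sketch below is only an illustration of that arithmetic, not anything taken from NetWorker itself: the helper names wrapped32 and unwrap_candidates are made up, and it assumes that showsessions.pl's "GB" means 10^6 KB.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# 2**32: the point at which an unsigned 32-bit counter wraps to zero.
my $WRAP = 4294967296;

# Hypothetical helper: what a 32-bit unsigned counter would display
# for a given true KB count.
sub wrapped32 {
    my ($true_kb) = @_;
    return $true_kb - $WRAP * int($true_kb / $WRAP);
}

# Hypothetical helper: possible true values for a reported counter,
# assuming it wrapped 0 .. $max_wraps times.
sub unwrap_candidates {
    my ($reported_kb, $max_wraps) = @_;
    return map { $reported_kb + $_ * $WRAP } 0 .. $max_wraps;
}

# "total kb" of the cloning session as printed by nsradmin above.
my $reported_total_kb = 1018277321;

for my $kb (unwrap_candidates($reported_total_kb, 2)) {
    # Assumption: 1 GB = 10^6 KB, as showsessions.pl appears to use.
    printf "%.0f KB = %d GB\n", $kb, int($kb / 1e6);
}
```

Under those assumptions, the candidate with exactly one wrap (1018277321 + 4294967296 = 5313244617 KB) comes out at 5313 GB, which is the total that showsessions.pl reports. The "amount kb" value does not line up as exactly, presumably because the two commands were run at different times.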
Does anyone know which version has this corrected?
#!/usr/bin/perl
use lib "/usr/local/TAUSRC/Local/ToolBox";
use Nsradmin;
require "timelocal.pl";

set_nsradmin("/usr/sbin/nsradmin");
$server  = "legato";
$query   = "type: NSR ";
$show    = "session statistics";
$options = "hidden; dynamic";
@reslist = query($server, $query, $show, $options);
#
# A reslist is a list of resources. Resources are a
# hash of attributes, which have a name and value lists.
#
$found = 0;    # set to 1 once a "cloning session" entry has been seen
foreach $res (@reslist) {
    %attrlist = %{$res};
    $attr     = "session statistics";
    @vallist  = @{$attrlist{$attr}};
    foreach $val (@vallist) {
        # Each value has the form "name = value".
        if ($val =~ /jobid/) {
            (undef, $jobid) = split(/ = /, $val);
        }
        if ($val =~ /total kb/) {
            (undef, $totalkb) = split(/ = /, $val);
        }
        if ($val =~ /amount kb/) {
            (undef, $amountkb) = split(/ = /, $val);
        }
        if ($val =~ /connect time/) {
            (undef, $ctime) = split(/ = /, $val);
            # Average rate in KB/s; avoid dividing by a zero connect time.
            $rate = $ctime > 0 ? $amountkb / $ctime : 0;
            if ($found == 1) {
                #print "$totalkb $amountkb \n";
                $left  = $totalkb - $amountkb;
                # Seconds remaining; 0 when no rate is known yet.
                $leftt = $rate > 0 ? $left / $rate : 0;
                $eta   = time() + $leftt;
                ($sec, $min, $hour, $mday, $monx, $year, $wday, $yday, $isdst) =
                    localtime($eta);
                $rate    = int($rate / 1024);             # MB/s
                $left    = int($left / 1024 / 1024);      # GB remaining
                $leftt   = int($leftt / 60);              # minutes remaining
                $totalkb = int($totalkb / 1024 / 1024);   # GB total
                printf "%d r=%dMB/s size=%dGB/%dGB time=%dm ETA=%d/%d:%02d\n",
                    $jobid, $rate, $left, $totalkb, $leftt, $mday, $hour, $min;
                $found = 0;
                #last;
            }
        }
        $found = 1 if ($val =~ /cloning session/);
    }
}
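For anyone who wants to check the script's arithmetic without a NetWorker server, the rate and time-remaining calculation can be pulled out into a small standalone function. This is only a sketch: staging_eta is a made-up name, not part of Nsradmin.pm or of the script above. It is fed the "amount kb", "total kb" and "connect time" values of the cloning session from the nsradmin output. If those counters have wrapped, the result is wrong no matter how the arithmetic is done.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script.
# Returns (rate in KB/s, remaining KB, remaining seconds), or an empty
# list when no rate can be derived yet.
sub staging_eta {
    my ($amount_kb, $total_kb, $connect_secs) = @_;
    return () if $connect_secs <= 0 || $amount_kb <= 0;
    my $rate    = $amount_kb / $connect_secs;    # average KB/s so far
    my $left_kb = $total_kb - $amount_kb;
    $left_kb = 0 if $left_kb < 0;                # never report negative work
    return ($rate, $left_kb, $left_kb / $rate);
}

# Values of the cloning session as printed by nsradmin in this thread.
my ($rate, $left_kb, $left_secs) = staging_eta(129176321, 1018277321, 149982);

printf "rate=%.0f KB/s left=%d GB time=%dm\n",
    $rate, int($left_kb / 1024 / 1024), int($left_secs / 60);
```

Returning an empty list when there is no usable rate lets the caller skip the printout instead of dividing by zero or reusing a stale value.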
--
-- Yaron.