Re: [Networker] To many open files

2005-01-06 10:07:04
Subject: Re: [Networker] To many open files
From: Robert Maiello <robert.maiello AT PFIZER DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Thu, 6 Jan 2005 10:06:12 -0500
Ah yes, Legato should be well aware of this.  You didn't post which
version of Networker you are running.  My guess is 6.1.x or the distributed
7.1.1.  A recap for everyone:

To reproduce the problem:

a.) Run one of the versions mentioned above
b.) Disable all tape drives
c.) Run many groups and load the server up as much as possible
d.) Watch Networker crumble and log this error in daemon.log.

Solution to the problem:

I. Immediate solution:

a.) Get the server as unloaded as possible.
b.) See if the stuck jukebox, or the jukebox that is out of writable
    media, can be addressed.
c.) If the condition persists for too long and the backup load cannot
    be reduced, the only option is to shut down Networker (show over).

Others have reported success adjusting savegroup parallelism and load to
keep the open file count under the limit (1024).

II. Long-term cause and solution:

This condition exists because of the way Networker is compiled for Solaris
(32-bit).  To scale:

a.) Run a 64-bit build of Networker for Solaris (7.1.2 onward, or request
a 64-bit 7.1.1 build from Legato).

b.) Raise the open file limit in /etc/system as described by others.  We
saw the 32-bit 7.1.1 hit the 1024 limit despite larger values in
/etc/system.

c.) Of course, the root cause that generated the open files will still
need to be found, e.g. tapes not mounting for whatever reason.

In short, later versions should scale better.
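
To check which limit a process actually receives after an /etc/system
change, it can be queried from inside a process.  A minimal sketch in
Python, assuming the standard `resource` module is available on the server;
`fd_limits` and `describe` are names made up for illustration.  It reports
the limits of the Python process itself, not of the Networker daemons, so
treat it only as a sanity check of the system-wide setting.

```python
import resource

def fd_limits():
    """Return (soft, hard) open-file limits for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard

def describe(value):
    # RLIM_INFINITY means no limit is enforced at this level.
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)

if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

The soft limit is the one a process runs into first.  A 32-bit binary may
still be held to a lower ceiling than the reported value, as noted above
for the 32-bit 7.1.1 build.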

Let me know if you need an open file counter to see how far away from
meltdown you are :)
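
For reference, a rough sketch of what such a counter could look like.
This is not the tool mentioned above, only an illustration in Python.  It
assumes the platform exposes one entry per open descriptor under
/proc/<pid>/fd and that you have permission to read that directory
(typically root for another user's process).  The names
`count_open_files`, `headroom` and `FD_LIMIT` are hypothetical.

```python
import os
import sys

FD_LIMIT = 1024  # per-process limit quoted in this thread

def count_open_files(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd, one per open descriptor."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    return len(os.listdir(fd_dir))

def headroom(open_count, limit=FD_LIMIT):
    """Return how many more descriptors fit under the limit (never negative)."""
    return max(limit - open_count, 0)

if __name__ == "__main__":
    # Use the PID given on the command line, else this process itself.
    # When inspecting itself, the count may include the descriptor that
    # is briefly opened to list the directory.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    used = count_open_files(pid)
    print(f"pid {pid}: {used} open, {headroom(used)} left of {FD_LIMIT}")
```

Run it with the PID of the daemon you want to watch as the only argument,
for example from cron, to see how the count moves as groups start.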


Robert Maiello
Pioneer Data Systems

On Wed, 5 Jan 2005 16:08:49 -0500, Ciolek, Ken <Ken.Ciolek AT AIG DOT COM> wrote:

>Running on a Solaris 2.8 platform.
>
>
>-----Original Message-----
>From: Chad Smykay [mailto:csmykay AT rackspace DOT com]
>Sent: Wednesday, January 05, 2005 4:01 PM
>To: 'Legato NetWorker discussion'; Ciolek, Ken
>Subject: RE: [Networker] To many open files
>
>Ken,
>
>Are you running it on a UNIX or Linux OS?  If so, then you have too many
>open files on the system, and either
>
>A.  the system cannot handle that many open files, or
>B.  you have to increase the number of open files allowed by your kernel.
>
>Let us know if you have any more questions.
>
>
>Chad Smykay, RHCE, LCNA
>Systems Storage Administrator
>Rackspace Managed Hosting (TM) - The Managed Hosting Specialist (TM)
>
>-----Original Message-----
>From: Legato NetWorker discussion [mailto:NETWORKER AT LISTMAIL.TEMPLE DOT EDU]
>On Behalf Of Ciolek, Ken
>Sent: Wednesday, January 05, 2005 12:33 PM
>To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
>Subject: [Networker] To many open files
>
>My backup server is running really slowly and the daemon log is showing
>"too many open files".
>
>
>
>
>
>
>--
>Note: To sign off this list, send a "signoff networker" command via email
>to listserv AT listmail.temple DOT edu or visit the list's Web site at
>http://listmail.temple.edu/archives/networker.html where you can
>also view and post messages to the list. Questions regarding this list
>should be sent to stan AT temple DOT edu
>=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
