Home > DNX_Affinity

DNX_Affinity

DNX_Affinity is a project mainly written in C, based on the GPL-2.0 license.

This is a fork of DNX to enable individual clients to be affined to specific hosts.

DNX README File

DNX - Distributed Nagios eXecutor is a Nagios Event Broker (NEB) plug-in module that distributes checks amongst several servers ("worker nodes") to reduce load and check latency introduced by a large Nagios installation.

DNX exists as two parts:

  • The DNX NEB module itself (dnxServer.so) and
  • The DNX Client (dnxClient).

dnxServer.so works as any other NEB module would; it is loaded into the same process address space as Nagios upon start up.

dnxClient resides on a separate host and sends requests to the thread started by the DNX NEB module on the Nagios server ("head node") for checks to perform and then returns the status data back to the module for Nagios' interpretation and subsequent actions (alerts, notifications, etc).

Management Interface

DNX provides a simple management interface, which allows the administrator to perform management tasks against installed DNX clients. The tool is called 'dnxstats' and is installed (by default) into the $(prefix)/bin directory on the DNX server (installed using install or install-server).

The dnxstats utility has a simple interface, which you can discover using the --help command-line option:

$ /usr/local/nagios/bin/dnxstats --help Usage: dnxstats [options] Where [options] are: -s specify target host name (default: localhost). -p specify target port number (default: 12480). -c send to server. (Hint: Try sending "HELP".) -v print version and exit. -h print this help and exit.

The functionality provided by dnxstats is really provided by the stats server (the DNX client, in this case). To find out the level of functionality supported by a given server, send the "HELP" command:

$ /usr/local/nagios/bin/dnxstats -s 10.1.1.1 -c "HELP" DNX Client Management Commands: SHUTDOWN RECONFIGURE DEBUGTOGGLE RESETSTATS GETSTATS stat-list stat-list is a comma-delimited list of stat names: jobsok - number of successful jobs jobsfailed - number of unsuccessful jobs thcreated - number of threads created thdestroyed - number of threads destroyed thexist - number of threads currently in existence thactive - number of threads currently active reqsent - number of requests sent to DNX server jobsrcvd - number of jobs received from DNX server minexectm - minimum job execution time avgexectm - average job execution time maxexectm - maximum job execution time avgthexist - average threads in existence avgthactive - average threads processing jobs threadtm - total thread life time jobtm - total job processing time Note: Stats are returned in the order they are requested. GETCONFIG GETVERSION HELP

Thus, the dnxstats utility is really a dumb client, sending text specified by the user, and dumping raw response text to the console.

Except for single-word commands, all command strings should be enclosed in double or single quotes, so that the shell interprets the entire command string (text following the -c option) as a single argument.

Nagios Support

Currently, DNX has been tested with Nagios 2.7, 2.8, 2.9, 2.10, 2.11, 3.0 and 3.0.1. As we continue to release new versions of DNX, we'll continue to enhance support for the latest Nagios versions.

Advanced Features

Local Execution Only Checks

You may have some check which must be run from one host only (firewall issues, SAN connections, proprietary libraries, etc). To accomodate this, you may flag certain checks to not be distributed. They will execute in the normal fashion as if DNX was not loaded. This could also be used to perform checks on the nagios server itself (check_load, check_nagios, etc). This could be problematic in that it makes it harder to run these same checks on the worker nodes. We recommend the use of nrpe for all such checks (nagios engine server and worker nodes), if for no other reason than for simplicity. The worker nodes will use nrpe to check the nagios engine server as well as each other.

Plug-in Propagation

One concern with DNX is that the plugins will exist on different servers and must all be identical. Thus, included with DNX is a simple script (sync_plugins.pl) which will run like a plugin that dnxServer.so would execute upon startup. This script provides a mechanism to ensure that all of your plugins exist on each of the worker nodes (this is important because you can't be sure which node will actually perform the checks). The script uses rsync to push all of the plugins in your plugin directory (/usr/local/nagios/libexec, for example) from the nagios engine server (the plugin authority) to each worker node. For this to work you must set up SSH key sharing (as the nagios user) between your servers for the nagios user so that this rsync will work without a password. Note that this could be considered a security risk by some people/organizations. If you do not like this mechanism, you may write your own sync plugin, or disable DNX's interal ability to sync plugins and do so by your own devices.

Also note that each worker node needs to be set up similarly, if not identically to other worker nodes (any external libraries, perl modules, paths, etc). And lastly, you should be aware that the rsync may clobber meta data of plugins on the first run, which you may want to fix manually. For example, check_icmp needs to be run as root with the set uid bit, if check_icmp is updated (or moved to the worker node for the first time) by rsync, it will loose ownership by root. Once the plugin is in place with the permissions correct, rsync will leave it alone, however.

Additional Information

For information on building, installation and configuration, please refer to the INSTALL file.

For the latest changes, please refer to the NEWS file.

For detailed information on source changes between versions, please refer to the ChangeLog file.

Please see the COPYING file for details on the GNU General Public License, under which this software is released.

Previous:Joshs--Snippets