Gnucomo - User's Manual

GnuCoMo can be downloaded from the GnuCoMo website. If you are reading this file, you probably already have a copy of GnuCoMo on your system. If you are serious about using GnuCoMo, it is a good idea to check the website periodically for updates and bugfixes.

To be able to install and run GnuCoMo you'll need several other packages:

PostgreSQL: PostgreSQL is the database we use for GnuCoMo. Most Linux distributions provide ready-to-install packages for PostgreSQL. We need at least the postgresql, postgresql-server, postgresql-libs and postgresql-devel packages. If you want to compile PostgreSQL from source, go to the PostgreSQL homepage and download the sources via one of the ftp sites. We need libpq++ support for GnuCoMo. Though we appreciate the performance improvements that the PostgreSQL 7.3 server provides, we recommend sticking to the PostgreSQL 7.2 client versions for now, at least until we've solved the problems with the libpq++ libraries and PostgreSQL 7.3.
libpqxx: The C++ client interface for the PostgreSQL database server is a separate project, not distributed with PostgreSQL. This library replaces the old libpq++ library. The place to find libpqxx is the Gborg website. Make sure you install libpqxx in a default library path or set your LD_LIBRARY_PATH environment variable to include the directory in which libpqxx is installed. The default for libpqxx is /usr/local/libpqxx/lib.
PHP: PHP is used as the programming language for gcm_daemon. If packages are available for your system, install at least the php and php-pgsql packages. You can get PHP sources and documentation from the PHP website. If you're compiling PHP yourself, don't forget to include PostgreSQL support.
libxml2: We use XML for configuration that we can't (or don't want to) store in the database and for documentation in XMLDOC format. The libxml2 library usually comes with your Linux system; install both the libxml2 and libxml2-devel packages. The libxml2 sources can be downloaded via the XMLSoft site.
AXE: It's not likely that you'll find precompiled AXE packages on the net, so you'll have to compile from source. Get the AXE sources from the AXE page; you'll need version 0.3 or later. Instructions on compiling and installing AXE are given later in this document.
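
If libpqxx is not installed in a default library path, the LD_LIBRARY_PATH setting mentioned in the libpqxx entry can be sketched as follows. The function name add_lib_dir is made up for this illustration; the directory is the libpqxx default named above.

```shell
#!/bin/sh
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# add_lib_dir is a hypothetical helper, not part of GnuCoMo.
add_lib_dir() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Default libpqxx location named in this manual.
add_lib_dir /usr/local/libpqxx/lib
```

The lines have to run in the shell that starts the GnuCoMo programs, for example from your shell's startup file, because a child process cannot change its parent's environment.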

The following packages are optional and provide additional functionality to GnuCoMo:

GnuPG: Recommended for encryption and signing of data that is transported over the network. Not used at this moment.
XMLDoc: We use XMLDoc to process our documentation. Download and installation instructions can be found on the XMLDoc site.
Apache: [HELPME: basic instructions]
Python + tkinter: A GUI-based configuration tool named MalfisInter (mi) is being worked on. This tool is programmed in Python and requires XML and TkInter support.

2.2 Compiling

If you're lucky enough to find precompiled packages for your system and have root permissions to install them, things are easy for you. Otherwise you will have to compile from source, which takes a bit more time, even if you already have the standard developer tools (C and C++ compilers, make, (f)lex and yacc or bison) installed. You will need those tools for GnuCoMo anyway. You will also need the GNU automake and autoconf packages.

Another essential utility is the bzip2 compressor package; you will need it to unpack the archives. bzip2 and bunzip2 come standard with Linux, but may not be available on older Unix distributions. Sources for bzip2 can be found at http://sources.redhat.com/bzip2/. Most of the packages mentioned in this document come with detailed compilation and installation instructions, and I recommend reading the README and INSTALL files before compiling and installing them.
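
Before you start compiling, it can save time to check that the tools named in this section are on your PATH. The following is a minimal sketch. The function missing_tools is hypothetical, and the command names (cc or gcc, lex or flex, and so on) are assumptions about how the tools are installed on your system.

```shell
#!/bin/sh
# Print each required build tool that cannot be found on PATH.
# A requirement written as "a|b" is satisfied by either command.
missing_tools() {
    for spec in "$@"; do
        found=no
        old_ifs=$IFS
        IFS='|'
        for tool in $spec; do
            if command -v "$tool" >/dev/null 2>&1; then
                found=yes
            fi
        done
        IFS=$old_ifs
        if [ "$found" = no ]; then
            echo "$spec"
        fi
    done
}

# Tools named in this section; the exact command names are assumptions.
missing_tools "cc|gcc" "c++|g++" make "lex|flex" "yacc|bison" \
    autoconf automake bzip2
```

No output means that every tool was found. Each line of output names a tool, or a pair of alternatives, that still has to be installed.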

2.2.1 Compiling AXE

For compiling AXE you need the X Window System (X11) headers and libraries. Under Linux you might need to install the XFree86-devel package, but these headers are usually available whenever the C compiler is installed (also on proprietary Unixes). [TODO]

2.2.2 Compiling GnuCoMo

The usual ./configure; make install will compile the binary applications and install gcm_input and logrunner into /usr/local/bin. You will need to copy the web interface scripts and gcm_daemon manually to the appropriate directories.

2.3 Setting up the database

Gnucomo won't do anything useful without a PostgreSQL database, so you will have to set this up first. Make sure you have the PostgreSQL server running and you have a valid userid and password to access the server. You also need permission to create new databases. If you don't have it, you will need to log in as the database administration user (usually a user called postgres) and create a new user for the database.

To create the initial Gnucomo database, you need to use the SQL script create.sql, which is in the src/database directory of the Gnucomo distribution. Type the following commands to create the database:


   createdb gnucomo
   psql gnucomo -f create.sql

[TODO] How to set up database security is yet to be described.

2.4 The Gnucomo configuration file

All Gnucomo applications use a single configuration file, named gnucomo.conf by default. There are three places where this configuration file can be stored. Gnucomo applications first look for the configuration file in /etc; if the file is not found there, it should be in /usr/local/etc. Furthermore, you can have an additional configuration file in your home directory, named $HOME/.gnucomo.conf. The configuration file in /etc or in /usr/local/etc holds the system-wide parameters. The configuration file in a user's home directory may hold user-specific preferences. The system-wide configuration is mandatory; any user-specific configuration is optional.
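
The lookup order described above can be sketched in shell, for instance to check which file the applications will pick up. This is an illustration only, not the code Gnucomo itself uses. The function names and the optional prefix argument are hypothetical, and how the user-specific file is combined with the system-wide one is not shown.

```shell
#!/bin/sh
# Print the path of the system-wide gnucomo.conf, trying /etc before
# /usr/local/etc. The optional first argument is a hypothetical directory
# prefix that only exists to make the function easy to try out.
find_system_conf() {
    root=${1:-}
    for dir in /etc /usr/local/etc; do
        if [ -f "$root$dir/gnucomo.conf" ]; then
            echo "$root$dir/gnucomo.conf"
            return 0
        fi
    done
    echo "gnucomo.conf: no system-wide configuration found" >&2
    return 1
}

# Print the path of the optional per-user file, if there is one.
user_conf() {
    if [ -f "$HOME/.gnucomo.conf" ]; then
        echo "$HOME/.gnucomo.conf"
    fi
}
```

Calling find_system_conf without an argument searches the real /etc and /usr/local/etc directories.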

The configuration parameters are organized as a two-level tree, written in XML. The first level of the tree constitutes the 'sections', each referring to a set of related parameters. The second level contains the parameters themselves. Each parameter is a single value, i.e. there are no provisions to create structured parameters. Here is an example of what the Gnucomo configuration file may look like:

   <?xml version='1.0'?>
   <gnucomo version='0.0.8'>
      <database>
         <type>PostgreSQL</type>
         <name>gnucomo</name>
         <user>arjen</user>
         <password>guess-again:-)</password>
      </database>
      <logging>
         <method>file</method>
         <destination>/var/log/gcm_input</destination>
         <level>0</level>
      </logging>
   </gnucomo>

At the root of the tree is an XML element, called gnucomo. The direct children of the root element in the example above are database and logging. These are the 'sections', corresponding to different aspects of the Gnucomo system. As you may have guessed, the database section holds information we need to access the Gnucomo database on the PostgreSQL server and the logging section specifies how much logging we want the Gnucomo applications to generate and where we want that information to go.

Here is a full list of the parameters that are used by Gnucomo, subdivided in their various sections:

database

All parameters that provide Gnucomo applications access to the database.

type: The type of the database server. Currently, only PostgreSQL is supported.
host: The hostname of the system on which the database server runs.
port: The TCP port used by the database server.
name: The name of the database to use for Gnucomo.
user: The user name with which to log in on the database server.
password: The password used for logging in to the database server.

logging

The amount of logging generated by Gnucomo applications and where to send it.

method: The method for sending log information.
destination: Where the logging information is sent.
level: Choose a higher level for more information.

logfile

Primarily used by logrunner. A list of files that are scanned for system logging that is sent to the Gnucomo server. There can be more than one logfile element in a Gnucomo configuration file.

name: Name of the file to scan for logging, e.g. /var/log/messages.
type: The type of the logging information. Types supported by Gnucomo are system log, apache access log and apache error log.
fromhost: Hostname (not a FQDN) of the system for which the log entries are sent. Only log entries that originate from this host are sent to the Gnucomo server. This option applies to system log files that may have log entries for several hosts, in case the system acts as a syslog server.
filter: A regular expression to filter out log entries before sending to the Gnucomo server. There may be several filter elements in a single logfile element.
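
To illustrate how the logfile parameters fit together, the following sketch prints one logfile section in the same layout as the configuration example earlier in this section. The function logfile_section is hypothetical. The exact spelling that Gnucomo accepts for the type value is an assumption ('system log' is taken from the description above), and the filter expressions are made up.

```shell
#!/bin/sh
# Print one <logfile> section for gnucomo.conf.
# Usage: logfile_section NAME TYPE FROMHOST [FILTER...]
# Pass an empty string as FROMHOST to leave that element out.
# Values are not XML-escaped, so avoid &, < and > in them.
logfile_section() {
    name=$1
    type=$2
    fromhost=$3
    shift 3
    printf '%s\n' "      <logfile>"
    printf '%s\n' "         <name>$name</name>"
    printf '%s\n' "         <type>$type</type>"
    if [ -n "$fromhost" ]; then
        printf '%s\n' "         <fromhost>$fromhost</fromhost>"
    fi
    for filter in "$@"; do
        printf '%s\n' "         <filter>$filter</filter>"
    done
    printf '%s\n' "      </logfile>"
}

# Hypothetical example: the system log of host 'server' with two filters.
logfile_section /var/log/messages "system log" server "CRON" "last message repeated"
```

The output belongs between the gnucomo start and end tags of the configuration file, next to the database and logging sections.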

2.5 Getting the web interface up and running

Your primary view onto Gnucomo is through the web interface. Gnucomo uses a web server, e.g. Apache, with PHP scripts to implement its user interface. At present, the web interface is the only way to manage the information in the Gnucomo database and view the results that Gnucomo produces.

To be able to use the web interface, you need to copy the PHP scripts from the src/web and src/phpclasses directories into a directory accessible through the web server. On most Red Hat Linux systems, this is usually in /var/www/html. One way to accomplish this is to create a subdirectory gnucomo under the Document Root of the webserver. The main gnucomo scripts go into this directory and the PHP class library can be copied into either a subdirectory classes or ../phpclasses. Here's an example installation:

  [gnucomo] # mkdir $DocumentRoot/gnucomo
  [gnucomo] # cp src/web/* $DocumentRoot/gnucomo
  [gnucomo] # cp -r src/phpclasses $DocumentRoot

2.6 And now...

If you like GnuCoMo [TODO] Report success under Non-Linux [TODO]

3 Using Gnucomo

Once you have the database and the web interface set up, Gnucomo is almost ready to go. To put Gnucomo to work, you need to enter at least one object into the database and set up a way to feed information to Gnucomo.

3.1 The initial entries

Gnucomo maintains information about monitored systems such as servers, workstations, switches and the like. In Gnucomo, these are called objects. Objects are distinguished by their name, usually a hostname in your network or, better still, a fully qualified name on the internet. You can use the web interface to create an object by logging in and selecting the picture that looks like a computer cabinet in the button bar at the top. Enter the object's name, for example server.gnucomo.org, and click the Create button.

3.2 Feeding log files

Once an object is known by Gnucomo, i.e. the object's name is stored in the database, you can start to fill the database with log entries. Log entries are stored by a variety of programs into one or more log files on the object's hard disk. For example, system messages on most Linux distributions are put in a file called /var/log/messages. There are several ways to transfer the contents of a log file into the Gnucomo database, each of which involves gcm_input. Gcm_input is the Gnucomo application that reads various kinds of raw input and tries to extract as much information as possible. The useful information that is extracted from the input is stored in the Gnucomo database.

The most direct way is to invoke gcm_input manually and have it read the log file through its standard input:

   gcm_input -h server.gnucomo.org </var/log/messages

You'll have to explicitly specify the hostname of the object because there is no way for gcm_input to know where the log file comes from. Take care not to feed the same log file another time through gcm_input. This will lead to duplicate log entries in the Gnucomo database.

Feeding information into the Gnucomo database can also be done through a mail server, such as sendmail or postfix. Create a mail alias that invokes gcm_input as a mailer program. For example:

   gnucomo:        "|gcm_input"

With such an email alias, Gnucomo clients can simply send log files and other reports to this email address:

   mail gnucomo@server.gnucomo.org </var/log/messages

The third method to feed log files into the Gnucomo database is by using logrunner. Where the previous methods can only feed complete logfiles as a whole, logrunner can feed partial logfiles to the Gnucomo server. Logrunner keeps track of the growth of a logfile and converts the newly added log entries into an XML message suitable for gcm_input. A practical way to deploy logrunner is:

  /usr/local/bin/logrunner -1 | mail gnucomo@gnucomo.server

Note the '-1' option for logrunner. This option is needed to keep logrunner from generating multiple XML documents in one output stream. Which logfiles are scanned by logrunner is specified in the Gnucomo configuration file.

3.3 Viewing results

Use the web interface to review log entries, notifications and parameters. The Objects page shows statistics about the data that is collected for each monitored object. Clicking on the links in the statistics will display the specific data for that object. The icon of each object takes you to a form for editing information about that object, such as an identification code, physical location and services used by that object.

After feeding the log entries of monitored objects to the Gnucomo server, these log entries can be analyzed by the Gnucomo analyzer daemon, gcm_daemon. The present version of Gnucomo does not yet support the operation of a real daemon, so you will have to start gcm_daemon explicitly. This performs a number of analyses on the log entries and updates the statistics for each object. When the gcm_daemon finds something out of the ordinary, it will create a Notification. Notifications are created to alert you to the fact that something may be wrong with that object.

The checks made by Gnucomo are related to services that run on a monitored object. A few checks are predefined in Gnucomo. You can add more checks to suit your situation through the Services page. Click on the icon of a specific service to edit the list of checks that are performed for log entries generated by that service. Each check consists of a pattern in the form of a regular expression and an action. For example, the following pattern matches a log entry from postfix which states that outgoing mail is delivered:

  to=.+, relay=.+, delay=[0-9]{1,2}, status=sent \(250 .+\)
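
Before you enter a pattern on the Services page, you can try it out against a line from your own log. The sketch below uses grep -E for this. Treating the pattern as a POSIX extended regular expression is an assumption, since Gnucomo's own matcher may differ in details. The sample log entry is made up for the illustration.

```shell
#!/bin/sh
# The check pattern shown above.
PATTERN='to=.+, relay=.+, delay=[0-9]{1,2}, status=sent \(250 .+\)'

# Succeed when a log line on standard input matches the pattern.
# grep -E is an assumption about the regular expression dialect.
matches_check() {
    grep -E -q -- "$PATTERN"
}

# Hypothetical postfix log entry, for illustration only.
SAMPLE='postfix/smtp[1234]: 1A2B3C: to=<user@example.org>, relay=mail.example.org[192.0.2.1], delay=3, status=sent (250 Ok: queued as 4D5E6F)'

if printf '%s\n' "$SAMPLE" | matches_check; then
    echo "pattern matches"
else
    echo "no match"
fi
```

Note that the pattern only accepts delays of one or two digits, so a delivery that took 100 seconds or more is left for a later check.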

The action is performed when the regular expression matches with a log entry. Each check is tagged with a rank number for defining the order in which the checks are applied. The first pattern that matches is used to perform the action. The very last check is a 'catch all' pattern, predefined in the Gnucomo database. When a pattern matches, one of four actions is executed:

ignore: Simply ignore this log entry.
notify: Create a notification. The Issue for the notification is stated in the Argument field.
abuse: Add one to the number of abuses detected for the IP address or hostname that is mentioned in the log file. The argument for this action is almost always "$1", which is the first part of the pattern between parentheses. Note that this requires a special form of regular expression in the pattern, one that selects certain parts as a subexpression.
forgive: Remove one abuse for the IP address or hostname in the argument.
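
The "$1" argument of the abuse action may be easier to understand with an example. The sketch below uses sed to print the first parenthesized subexpression of a pattern, which is the part "$1" refers to. The pattern, the log entry and the function name are all made up for this illustration, and sed -E is only a stand-in for Gnucomo's own matcher.

```shell
#!/bin/sh
# Print the first parenthesized subexpression of the pattern given as
# argument, for each matching line on standard input.
# The pattern must not contain a slash, since sed uses it as delimiter.
first_subexpression() {
    sed -n -E "s/.*$1.*/\\1/p"
}

# Hypothetical check pattern: the parentheses select the IP address.
ABUSE_PATTERN='Failed password for .+ from ([0-9.]+) port'

# Hypothetical log entry, for illustration only.
LINE='sshd[4711]: Failed password for root from 192.0.2.44 port 51234 ssh2'

printf '%s\n' "$LINE" | first_subexpression "$ABUSE_PATTERN"
```

With a check like this, an abuse action with the argument "$1" would count the abuse against the address selected by the parentheses.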

A notification is always created with a specific Issue which states what Gnucomo thinks is going wrong. A number of issues are predefined in the Gnucomo database. On the Issues page, you can create your own issues and have Gnucomo generate notifications with them.

3.4 Abuse and intrusion detection

Gnucomo maintains a list of IP addresses that have committed some kind of abuse. This can be attempts to send spam, attempts to break into a system from the internet or any other kind of malicious action. These abuses are detected by creating the proper patterns in checks for log entries as discussed in the previous section. Each time an abuse is detected in the log, one abuse is added to the record for that Object and IP address. A reference to the log entry in which the abuse was detected is also maintained. When the number of abuses for a certain IP address exceeds the limit (32 in this version of Gnucomo), the status for the IP address is set to 'dropped'. You can review the list of abusing IP addresses by clicking on the 'Abuse list' in the Objects page.

The most useful application of the abuse list is to maintain a firewall and block all IP addresses that have the 'dropped' status. To do this automatically, you need to provide access to the database from a script that is probably run by root. A special user 'firewall' that can only read the abuse list can be created with the following SQL commands:

CREATE USER firewall WITH PASSWORD 'secret';
GRANT SELECT ON object_abuse TO firewall;

When the Gnucomo database runs on a different system than the one on which the firewall is maintained, the database server needs to provide access from external systems. This implies setting up the PostgreSQL configuration and firewall rules. The following script then augments the firewall with the information from the Gnucomo abuse list:

#!/bin/sh
#
#  Create a firewall script from the gnucomo abuses table
#

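# $1 is the id of the object whose abuse list is used.
# -t and -A make psql print bare values, without headers or row counts.
psql "sslmode=require host=server.gnucomo.org dbname=gnucomo user=firewall password=secret" \
         -t -A -c "select source from object_abuse
         where status='dropped' and objectid=$1" | grep -v '^$' > /tmp/gnucomo-abuses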

while read ADDRESS
do
   echo iptables -I INPUT -s $ADDRESS    -j DROP
done < /tmp/gnucomo-abuses
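
Because the generated commands are meant to be run as root, it is worth checking every entry before it ends up in an iptables command. The following sketch is one way to do that. It assumes the abuse list contains plain dotted IPv4 addresses, which is an assumption about the contents of the source column. The function names are hypothetical.

```shell
#!/bin/sh
# Succeed only if the argument is a dotted IPv4 address with each of
# the four numbers in the range 0 to 255.
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        !/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i > 255) exit 1 }'
}

# Read addresses from standard input, one per line, and print an
# iptables command for each valid one. Anything else is reported on
# standard error and skipped.
emit_drop_rules() {
    while read -r address; do
        if is_ipv4 "$address"; then
            echo "iptables -I INPUT -s $address -j DROP"
        else
            echo "skipping suspicious entry: $address" >&2
        fi
    done
}
```

In the script above, the while loop could then be replaced by emit_drop_rules < /tmp/gnucomo-abuses.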

4 Managing system parameters

Besides log entries which represent the transient events that occur in an object, Gnucomo also maintains the state of the object. The state of an object is defined by the set of parameters and their values. Examples of parameters are the file systems, users, processes, installed packages, and so on. By managing those parameters with Gnucomo, you can hold a firm grip on the state of your systems. Especially on large sites with many workstations, servers, switches and other objects, managing the state of all those systems is not a trivial task. The following sections explain how Gnucomo can help.

4.1 System resources

By monitoring the resources of a system, such as memory, cpu, file systems and network interfaces, you can keep track of the system's performance and capacity. The utilization of these resources varies frequently over time. Viewing the values of resource properties provides an excellent insight into the system's behaviour. Gnucomo can maintain a record of the system resources by using DYNAMIC properties in parameters. You can define classes of parameters with properties that will reflect the state of your systems and feed the numbers into the Gnucomo database by using scripts that extract the information from a system and convert this into XML format.

As an example, consider the processing load of a system. Two metrics can be of interest to keep an eye on the system load: the total number of processes and the number of runnable processes. These numbers are easily obtained with the standard UNIX commands ps and uptime. To maintain this in Gnucomo, a class is needed to define the properties we want to maintain. For this example, we create a class systemload with the properties processes and runqueue. Note that both these properties are DYNAMIC. The properties for the systemload class can be defined by using Gnucomo's web interface or typing SQL statements directly into the database:

INSERT INTO parameter_class (name, property_name, description, property_type, min, max, notify)
  VALUES ('systemload', 'processes', 'Total number of processes', 'DYNAMIC', 0, 1000, 'f');
INSERT INTO parameter_class (name, property_name, description, property_type, min, max, notify)
  VALUES ('systemload', 'runqueue', '5 minute average length of the run queue', 'DYNAMIC', 0, 5.00, 'f');

When the class is defined in the Gnucomo database, we are ready to feed the information into Gnucomo through gcm_input. The following shell script will create the appropriate XML document with the parameter values:

#!/bin/sh
#
# Gnucomo system load report.
#
# Create a parameter report with two values:
# The total number of processes and the 5-min load average.
# 


HOST=`hostname`
TIME=`date`

echo "<?xml version='1.0'?>"
echo "<gcmt:message xmlns:gcmt='http://gnucomo.org/transport/'>"
echo "  <gcmt:header>"
echo "      <gcmt:messagetype>XML</gcmt:messagetype>"
echo "      <gcmt:hostname>$HOST</gcmt:hostname>"
echo "      <gcmt:time>$TIME</gcmt:time>"
echo "   </gcmt:header>"
echo "   <gcmt:data>"

echo "   <gcmt:parameters gcmt:class='systemload'>"

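# Count processes without the ps header line.
PROCESSES=`ps ax | sed 1d | wc -l | awk '{ print $1 }'`
# The 5-minute load average is always the second-to-last field of uptime.
LOADAV=`uptime | awk '{ print $(NF-1) }' | tr -d ,`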

echo "<gcmt:parameter name='Load'>"
echo "   <gcmt:description>System processing load</gcmt:description>"
echo "   <gcmt:property name='processes'>$PROCESSES</gcmt:property>"
echo "   <gcmt:property name='runqueue'>$LOADAV</gcmt:property>"
echo "</gcmt:parameter>"

echo "    </gcmt:parameters>"
echo "   </gcmt:data>"
echo "</gcmt:message>"
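
The position of the load averages in the uptime output depends on how long the system has been up, so picking a fixed field number is fragile. A more robust way to get the 5-minute value is sketched below. The function name load5 is hypothetical, and the sketch assumes the load averages are printed with a decimal point.

```shell
#!/bin/sh
# Print the 5-minute load average from uptime-style text on standard
# input. The three load averages are the last three fields of the line,
# so the 5-minute value is the second-to-last field.
load5() {
    awk '{ print $(NF-1) }' | tr -d ,
}

# Two hypothetical uptime lines: more than a day up, and less than a day.
echo ' 10:15:01 up 12 days,  3:04,  2 users,  load average: 0.10, 0.25, 0.40' | load5
echo ' 10:15:01 up  3:04,  2 users,  load average: 0.10, 0.25, 0.40' | load5
```

Both sample lines yield 0.25, although the number of fields before the load averages differs. On a live system you would use uptime | load5.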

Filesystems are predefined in Gnucomo. Gcm_input understands the output of most df implementations directly. The following examples show how to feed filesystem information into Gnucomo:

df -lPk -x tmpfs | gcm_input -h `hostname`
df -lPi -x tmpfs | gcm_input -h `hostname`

Make sure to add the -k option so df reports the filesystem sizes in kilobytes.

4.2 Installed packages

Gnucomo can maintain a list of all packages installed on an object. This list is stored in the database as parameters of class package. Each package parameter has one property: the version number of the package. Gnucomo keeps an historical record of the version number for each package, so you will know when the version of a package changes. Furthermore, you can compare the installed packages of two objects, so you can easily find the differences between two installations.

To maintain the installed packages of an object, all you need is to feed a complete list into the Gnucomo database through gcm_input. The package list is a text file with a package on each line of the form packagename-version. Note the dash ('-') between the name and the version. On many Linux systems that use the RedHat Package Manager (RPM), such a list is easily obtained and fed into gcm_input:

   rpm -qa | gcm_input -h example.gnucomo.org

You will have to repeat this procedure at regular intervals. Each time you feed a new package list to Gnucomo, Gnucomo will compare the new list with the package parameters in the database and notify you of any changes.

For keeping your systems up-to-date, it is often convenient to maintain a repository of the most recent packages. Gnucomo can help you with this task. To enter the list of packages and updates from a distribution into the Gnucomo database, you can create a virtual object and create the package parameters for this object. For example, to maintain a list of RPMs for RedHat Linux 7.3, create an object (using the web interface) with the name redhat-7.3. Using this object, enter the first batch of package parameters from a directory listing of the CD-ROM. However, you cannot use the output from ls directly as input for gcm_input. You need to strip two suffixes from the filenames to make the listing look like rpm -qa output. Furthermore, a repository of updates often contains multiple versions of a package file. You want to make sure that the latest version of each package is recorded in the Gnucomo database. The (python) script report_repository.py will perform these tasks:

   python report_repository.py /mnt/cdrom/RedHat/RPMS | gcm_input -h redhat-7.3
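
To show what stripping the two suffixes means, here is a minimal sketch in shell. It covers only that one step and does not select the latest version of each package, so it is not a replacement for report_repository.py. The function name is hypothetical, and the file names are assumed to have the usual name-version-release.arch.rpm form.

```shell
#!/bin/sh
# Turn RPM file names into the form that rpm -qa prints by dropping the
# last two dot-separated suffixes (architecture and "rpm").
# Lines that do not end in .rpm are left out.
strip_rpm_suffixes() {
    sed -n 's/\.[^.]*\.rpm$//p'
}

# Hypothetical directory listing, for illustration only.
printf '%s\n' 'bash-2.05a-13.i386.rpm' 'TRANS.TBL' 'kernel-2.4.18-3.i686.rpm' |
    strip_rpm_suffixes
```

For the sample listing this prints bash-2.05a-13 and kernel-2.4.18-3. On a real directory you would feed it the output of ls instead.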

You can repeat this command to enter additional packages into the database, for example to add packages from other CD-ROMs or from a directory where you keep downloaded updates. Remember, however, to use the -i ("incremental") option with gcm_input. Without this option, gcm_input will regard the list of packages as a full list and remove any parameters that are not present in the new list.

With a list of available packages for a specific distribution in the Gnucomo database, along with the list of packages installed on your systems, you can easily obtain an overview of which packages you need to update. To make this overview, use the web interface to compare the packages on an object with the packages on the virtual object.

Table Of Contents

1 What is GnuCoMo ?

1.1 About this Document

1.2 Reporting Bugs

2 Installation

2.1 Getting the software

2.2 Compiling

2.3 Setting up the database

2.4 The Gnucomo configuration file

2.5 Getting the web interface up and running

2.6 And now...

3 Using Gnucomo

3.1 The initial entries

3.2 Feeding log files

3.3 Viewing results

3.4 Abuse and intrusion detection

4 Managing system parameters

4.1 System resources

4.2 Installed packages
