Category Archives: Linux

SLURM RPMs on COPR

I’ve started building SLURM RPMs using the Fedora COPR service, which allows Fedora developers to produce packages using the Fedora infrastructure. At the moment the builds only work for EPEL6, EPEL7 and Fedora 22: a change in the RPM defaults for Fedora 23 and above means those builds fail. I’ve reported the bug upstream and will investigate further when I have time.

My repository is at:

https://copr.fedorainfracloud.org/coprs/verdurin/slurm/

I’ve always found it a little strange that upstream includes an embedded .spec file but doesn’t provide the RPMs themselves, so this is an attempt to work around that omission.

They could probably do with a bit of a cleanup (I doubt rpmlint will be very happy), but that will come later.
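
If you’d like to try them, on Fedora the repository can be enabled with the DNF COPR plugin (a sketch, assuming dnf-plugins-core is installed); on EPEL, the .repo file linked from the COPR page can be dropped into /etc/yum.repos.d instead:

# Fedora, with dnf-plugins-core installed
dnf copr enable verdurin/slurm
dnf install slurm    # package name as built from the upstream slurm.spec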

Simple OpenStack monitoring with Ganglia and Nagios

I’ve been running an OpenStack-based cloud for a while. While the modularity of OpenStack is a strength, enabling its fast pace of development, it also means that the interactions between components can be quite complex, with all the potential for obscure errors this implies. For instance, upgrades to one component (such as a GlusterFS backend) can cause problems elsewhere. Here’s a description of some simple monitoring I’ve added to catch such problems earlier.

This assumes you already have Ganglia and Nagios available. There are two parts: a regular Ganglia check and a Nagios service that checks the Ganglia value, raising an alert if it crosses your chosen threshold. In my case, one of the sets of metrics I’m interested in is the number of instances in different states – active, error, build and shutoff. If there are too many in the build state, that may mean there’s a problem with the shared /var/lib/nova/instances directory, or with the scheduler, for example.

Here’s the script that runs on each compute node, triggered by cron every 10 minutes:

#!/bin/bash
# Script to check some OpenStack values

. /root/keystonerc_admin

INSTANCES_ERROR=$(nova list --all-tenants | grep ERROR | wc -l)
INSTANCES_ACTIVE=$(nova list --all-tenants | grep ACTIVE | wc -l)
INSTANCES_BUILD=$(nova list --all-tenants | grep BUILD | wc -l)
INSTANCES_SHUTOFF=$(nova list --all-tenants | grep SHUTOFF | wc -l)

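# Publish the counts to Ganglia; -d (dmax) and -x (tmax) of 1200s give headroom over the 10-minute cron interval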
/usr/bin/gmetric -d 1200 -x 1200 --name=instances_error --value=${INSTANCES_ERROR} --type=uint8
/usr/bin/gmetric -d 1200 -x 1200 --name=instances_active --value=${INSTANCES_ACTIVE} --type=uint8
/usr/bin/gmetric -d 1200 -x 1200 --name=instances_build --value=${INSTANCES_BUILD} --type=uint8
/usr/bin/gmetric -d 1200 -x 1200 --name=instances_shutoff --value=${INSTANCES_SHUTOFF} --type=uint8
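
For reference, the cron entry might look something like this (the script path and file name are my own choices, nothing above depends on them):

# /etc/cron.d/openstack-ganglia-metrics
*/10 * * * * root /usr/local/sbin/openstack-ganglia-metrics.sh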

The file keystonerc_admin needs to contain the OpenStack Nova API credentials. The --name value will be used in the Nagios check.
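
A minimal keystonerc_admin looks something like this (all values are placeholders):

export OS_USERNAME=admin
export OS_TENANT_NAME=admin
export OS_PASSWORD=secretpassword
export OS_AUTH_URL=http://controller.example.com:5000/v2.0/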

In Nagios, assuming that you’ve already defined the hosts to be checked and created a cloud servicegroup, this service definition will raise an alert if the instance status values collected by the Ganglia script exceed the specified thresholds:

define service {
                use default-service
                service_description INSTANCES ERROR
                servicegroups cloud
                check_command check_with_gmond!instances_error!5!10
                notification_options c,w,f,s,r
                host_name host.to.be.monitored
                normal_check_interval 30
}

define service {
                use default-service
                service_description INSTANCES BUILD
                servicegroups cloud
                check_command check_with_gmond!instances_build!3!5
                notification_options c,w,f,s,r
                host_name host.to.be.monitored
                normal_check_interval 30
}

check_with_gmond is defined as a command that calls the plugin check_ganglia.
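
For completeness, here’s roughly how that command might be defined, assuming the stock check_ganglia plugin with its usual -h/-m/-w/-c options:

define command {
                command_name check_with_gmond
                command_line $USER1$/check_ganglia -h $HOSTADDRESS$ -m $ARG1$ -w $ARG2$ -c $ARG3$
}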

This simple approach can be extended to monitor images in Glance, volumes in Cinder etc.
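
As a sketch, the same cron script could grow a few more metrics along these lines (the exact CLI commands and status strings vary between OpenStack releases, so treat these as illustrative):

# Hypothetical additions to the Ganglia script above
IMAGES_ACTIVE=$(glance image-list | grep -c active)
VOLUMES_ERROR=$(cinder list --all-tenants | grep -c error)

/usr/bin/gmetric -d 1200 -x 1200 --name=images_active --value=${IMAGES_ACTIVE} --type=uint8
/usr/bin/gmetric -d 1200 -x 1200 --name=volumes_error --value=${VOLUMES_ERROR} --type=uint8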

The two scripts are available as gists on GitHub: Ganglia script and Nagios script.

Vidyo and Skype audio on Fedora 20

Just a quick note that the fix mentioned here for Skype, when using PulseAudio 4, is also needed for Vidyo. With Skype, the symptom I found was that as soon as it emitted any sound, a horrible continuous buzzing started. With Vidyo, I couldn’t hear any other participants in conference calls (it’s used a lot by CERN and the LHC experiments). With the fix, both worked normally.

The permanent fix is to edit the .desktop file so that the environment variable is set correctly at login. Edit the file:

/etc/xdg/autostart/vidyo-vidyodesktop.desktop

and change the line

Exec=VidyoDesktop -AutoStart

to read

Exec=env PULSE_LATENCY_MSEC=60 VidyoDesktop -AutoStart

As noted in this Bugzilla report, you may also need to apply a similar change to

/usr/share/applications/vidyo-vidyodesktop.desktop
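
Before editing the .desktop files, the workaround can be tested by launching the applications from a terminal with the variable set (assuming the binaries are on your PATH):

PULSE_LATENCY_MSEC=60 VidyoDesktop
PULSE_LATENCY_MSEC=60 skype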

OpenStack ephemeral disk problem

I’m administering an OpenStack cloud, running the Folsom release.  One of the users requested some ephemeral disk space for their instances, so I created a custom flavour to meet their requirements.  Unfortunately, all the instances went straight to the error state, because the scheduler couldn’t find a valid host.  This was strange, because I knew there were several hosts with sufficient space.  Here are the error messages:

2013-06-12 13:20:19 DEBUG nova.scheduler.filter_scheduler [req-XXX] Attempting to build 1 instance(s) schedule_run_instance /usr/lib/python2.6/site-packages/nova/scheduler/filter_scheduler.py:66
2013-06-12 13:20:19 WARNING nova.scheduler.manager [req-XXX] Failed to schedule_run_instance: No valid host was found. Exceeded max scheduling attempts 3 for instance XXX
2013-06-12 13:20:19 WARNING nova.scheduler.manager [req-XXX] [instance: XXX] Setting instance to ERROR state.

After some searching, I came across this question on a Rackspace forum.  The moderator suggested it was an Ubuntu packaging problem.  The same problem exists in the Red Hat packages I’m using, so I applied the change myself and it fixed the ephemeral disk creation errors I was seeing.

I’ll mention this to the Red Hat/Fedora developers.

A direct link to the diff is here.

Unlikely petabyte network values in rrdtool/ganglia

Of late, the networking graphs in our Ganglia monitoring have suffered from irritating, improbable spikes (30PB…) that effectively render them meaningless.  At first I tried the removespikes.pl script that I saw mentioned by other people with the same problem.  This didn’t work all that well, either over- or under-shooting what was required.  It also felt like treating the symptoms rather than the cause.  After all, Ganglia is just plotting what it receives from rrdtool.

Eventually I found a suggestion to apply a maximum value in the header of RRD files with rrdtool.  This way, I could rule out these (pretty much) impossible values.  Here’s an example command:

rrdtool tune bytes_in.rrd --maximum sum:9.0000000000e+09

Clearly, care is needed to make sure legitimate values aren’t excluded, e.g. on interfaces running at 10 gigabit or higher speeds.  It’s been working well for the past week and the network graphs are now meaningful again (after manually removing the outlying values).
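
To apply the cap to every host’s traffic RRDs in one go, a loop along these lines works (the path assumes the default Ganglia RRD location of /var/lib/ganglia/rrds; adjust to suit):

for f in /var/lib/ganglia/rrds/*/*/bytes_{in,out}.rrd; do
    rrdtool tune "$f" --maximum sum:9.0000000000e+09
done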

Fedora 18 NFS and iptables changes

In previous versions of Fedora, if you wanted to add some firewall protection to NFS, you could configure static ports for the various daemons in the file /etc/sysconfig/nfs as follows:

RQUOTAD_PORT=49152
STATD_PORT=49153
MOUNTD_PORT=49154
LOCKD_TCPPORT=49155
LOCKD_UDPPORT=49155
STATD_OUTGOING_PORT=49156

Those settings don’t appear to have the expected effect on Fedora 18. These are the ports in use when they are applied:

# systemctl restart nfs-lock.service
# systemctl restart nfs-server.service
# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  47768  status
    100024    1   tcp  36896  status
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    2   tcp   2049  nfs_acl
    100227    3   tcp   2049  nfs_acl
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100227    2   udp   2049  nfs_acl
    100227    3   udp   2049  nfs_acl
    100021    1   udp  49155  nlockmgr
    100021    3   udp  49155  nlockmgr
    100021    4   udp  49155  nlockmgr
    100021    1   tcp  49155  nlockmgr
    100021    3   tcp  49155  nlockmgr
    100021    4   tcp  49155  nlockmgr
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100011    1   udp    875  rquotad
    100011    2   udp    875  rquotad
    100011    1   tcp    875  rquotad
    100011    2   tcp    875  rquotad
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd

Note that mountd and rquotad are on their default ports and status is on random ports, not the ones we specified.
The new approach is to pass arguments to the individual daemons instead, still in /etc/sysconfig/nfs:

#
# Optional arguments passed to rquotad
RPCRQUOTADOPTS="--port 49152"
#
# Optional arguments passed to in-kernel lockd
LOCKDARG=""
# TCP port rpc.lockd should listen on.
LOCKD_TCPPORT=49155
# UDP port rpc.lockd should listen on.
LOCKD_UDPPORT=49155
#
# Optional arguments passed to rpc.nfsd. See rpc.nfsd(8)
RPCNFSDARGS=""
# Number of nfs server processes to be started.
# The default is 8. 
RPCNFSDCOUNT=8
# Set V4 grace period in seconds
#NFSD_V4_GRACE=90
#
# Optional arguments passed to rpc.mountd. See rpc.mountd(8)
RPCMOUNTDOPTS="--port 49154"
#
# Optional arguments passed to rpc.statd. See rpc.statd(8)
STATDARG="--outgoing-port 49153 --port 49156"
#
# Optional arguments passed to rpc.idmapd. See rpc.idmapd(8)
RPCIDMAPDARGS=""
#
# Optional arguments passed to rpc.gssd. See rpc.gssd(8)
RPCGSSDARGS=""
#
# Optional arguments passed to rpc.svcgssd. See rpc.svcgssd(8)
RPCSVCGSSDARGS=""
#
# To enable RDMA support on the server by setting this to
# the port the server should listen on
#RDMA_PORT=20049 
#
# Optional arguments passed to blkmapd. See blkmapd(8)
BLKMAPDARGS=""

and now the list of ports:

# rpcinfo -p
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  49156  status
    100024    1   tcp  49156  status
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    2   tcp   2049  nfs_acl
    100227    3   tcp   2049  nfs_acl
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100227    2   udp   2049  nfs_acl
    100227    3   udp   2049  nfs_acl
    100021    1   udp  49155  nlockmgr
    100021    3   udp  49155  nlockmgr
    100021    4   udp  49155  nlockmgr
    100021    1   tcp  49155  nlockmgr
    100021    3   tcp  49155  nlockmgr
    100021    4   tcp  49155  nlockmgr
    100005    1   udp  49154  mountd
    100005    1   tcp  49154  mountd
    100005    2   udp  49154  mountd
    100005    2   tcp  49154  mountd
    100005    3   udp  49154  mountd
    100005    3   tcp  49154  mountd
    100011    1   udp  49152  rquotad
    100011    2   udp  49152  rquotad
    100011    1   tcp  49152  rquotad
    100011    2   tcp  49152  rquotad

I saw this in a Bugzilla report.
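
With the daemons pinned to static ports, matching firewall rules become straightforward. Here’s a sketch of the iptables lines for /etc/sysconfig/iptables, assuming a trusted client network of 192.168.1.0/24 (the subnet is an assumption; the ports are the ones configured above):

# rpcbind (111), nfs (2049) and the static ports 49152-49156 configured above
-A INPUT -s 192.168.1.0/24 -p tcp -m multiport --dports 111,2049,49152:49156 -j ACCEPT
-A INPUT -s 192.168.1.0/24 -p udp -m multiport --dports 111,2049,49152:49156 -j ACCEPT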