More tea please.


For my PhD, I’ve been using the University of Southampton’s Iridis Compute Cluster, a.k.a. “supercomputer”. I’m using this to run the fitness tests for some genetic algorithm optimisation things I’m working on. Each fitness test takes 10 to 30 seconds, so the more I can run in parallel, the better (up to a point…). Using this cluster, I can run my work much faster. Getting to this point took a lot of beating though.

In what appears to have been some kind of twisted marketing stunt, many places report that the Iridis cluster runs Windows. It doesn’t. If it did, I wouldn’t entertain going near it. It runs Red Hat EL5.

After filling in the paperwork to get an account on this wondrous cluster, I shelled in and went about compiling my work so I could run it on the cluster. I was expecting some packages not to be installed, as I do on most systems that I initially approach. Unfortunately one library that I wanted to use, CGAL wasn’t installed, nor was it in the repositories. So a request for this to be installed would have involved getting one of the sysadmins (who are already stretched on fixing some killer, and I really do mean killer, GPFS performance issues) to install it from source, and would take far too long.

Option 1: Build from Source

So, I went about building it myself from source. Like a good little source-building monkey, I climbed the dependency tree, building the various other things that I needed for this. Things like cmake and boost… This became painful, as the 32-bit headers for libc weren’t installed. CGAL has some sort of fatal bug on 64-bit systems. Damn. I wasn’t going to build the C library. That’s where I drew the line. I went home, drank some tea, watched some Family Guy, slept.

Option 2: Static linking

With a freshly caffeinated brain, I decided to try static linking. This’d hopefully solve my problems because I could compile all the libraries from my local machine into one giant executable that I’d transport to the remote machine. Sounded good. Then I found that Fedora only provides static libraries for a small set of packages, and has a general dislike of these things. I was a little perturbed by this.

Option 3: Bundling the dynamic linker

I nosed around with the dynamic linker. I found that the dynamic linker can be invoked with the program that it should dynamically link as its argument, e.g:

% /lib/ld-linux.so.2 /bin/bash

I realised that this could fix the issues I was having. So I wrote a small utility that’d bundle together the dynamic linker from my Fedora system along with the executables that I wanted to run and the shared libraries they required. This was nice. I could take a binary from my Fedora system, pass it through this utility, and get a wrapped binary that I could run on RHEL.

A further nicety that I discovered was the “auditing” functionality of the dynamic linker. This allowed me to write a function that’d get called every time the dynamic linker went to load a new shared library from disk. My auditing library would scream its head off if this library wasn’t from the set that within the bundle. This meant I could be sure that the code I was running was the same as the code I was running on my own machine.

There’s more information about the auditing API in the rtld-audit man page.

I stuck with this solution for a few hours, until I discovered some issues with it. Many programs rely on configuration files and other executables on the system they’re running on. ImageMagick is one of these programs. It often invokes other programs to convert between formats, and has a configuration file that modifies this process too. My statically linked program used ImageMagick’s C API to manipulate some images, so it ended up invoking some stuff from the host system. This turned nasty when RHEL’s buggy librsvg was used to convert an SVG to a PNG :(

Option 4: fakechroot

It was clear at this point that I needed some kind of chroot environment. Unfortunately chroot itself requires superuser rights on the system that it’s run on (for some reason that isn’t completely clear to me). So I looked around and found fakechroot. I extended my bundling utility to support bundling entire RPMs from my system. It would then scan through all the files, find the dynamically linked ones, and replace them with a wrapper script that ensured the correct dynamic linker was used.

Suddenly, horror. ImageMagick’s convert would segfault whilst it was somewhere inside pixman. I started debugging this, and rapidly discovered that most of the values I wanted to see in pixman’s code were unavailable to me, as gdb reported they were “optimised out”. I installed Fedora 14 in a VM so I could use the later version of gcc and gdb’s combined power to be able to see these variables. I wrangled on what was going on for quite a while, and roped Jeremy into the situation for a few hours too. We found that the dynamic linker was incorrectly returning a NULL pointer when the address of a specific piece of thread-local storage was requested. This was ghastly. I waded around in this situation for a while. Then, through a random stroke of luck, I found that if I didn’t load my audit library it all worked! Something about the rather unique situation I’d created evoked a bug from glibc. Exciting. I don’t have time to work on it, so I’ll have to leave it there and just not use the audit library.

The result: dynpk

The result of all of this is a tool called dynpk (“din-pack”). You provide it with a list of the RPMs from your own system that you’d like to be bundled, baked, and wrapped into something you can transport to a Linux system of your choice. It’s nice to be able to run Fedora’s ipython on top of a RHEL machine, for example!

Instructions on how to get hold of and use this tool can be found here.

This makes me wonder what the MATLAB and LabVIEW-loving license-server junkies do on the cluster when they find a bug…

Another application

I can tell another Red Hat EL5 related tale as well. All of the Linux computers in the undergraduate computing lab in ECS run Red Hat EL5. Sounds great. However, EL5 is made of old software. It’s probably fine if you’re using it to perform simple office tasks, but seeing bugs that were fixed years-ago still romping around on those desktops is incredibly frustrating.

There’s a student-run project called CSLib that has, for many years now, attempted to solve the lack of software that the undergraduate machines have. Unfortunately, CSLib is never going to match the man power, and hence freedom from bugs and number of packages, that $MAJOR_DISTRO (e.g. Fedora, Ubuntu) achieves. It’s a brave effort, but in order for it to be a catch-all solution, it really needs to use the power of the larger free-software community.

dynpk can provide some relief to people who are in situations where they are essentially forced to use a system that they are not in control of that lacks the software they desire.

The Future

It seems to me that a long-term solution for both the supercomputer and public-machine problems are virtual machines. Yes, I know I’m late to the “let’s all wave our hands in the air about virtual machines” party, but I think this invasion needs to continue much further. The compute cluster should run virtual machine images. Amazon’s EC2 already does this. The supercomputing posse should follow suit. The lab machines I speak of above would also massively benefit from allowing each user to have their own VM image that’s transferred to the machine they’re using when they log in.

Posted at 10:59 pm on Monday 8th November 2010
One Comment

FET430 Firmware Already Fixed

I just had a very quick look into what could be done to fix the FET430 UIF firmware loading issue I wrote about before. The issue looks like it was fixed in this commit. That’s in Linux 2.6.29, which was released 6 days ago.

No work for me to do there then…

Posted at 4:28 am on Sunday 29th March 2009

Location sensitive ssh “tunnelling”

I wanted to always be able to shell to a machine within the Uni network. The Uni network has a firewall that stops incoming requests to most machines. There’s a machine that all undergrads can shell to, which I normally use netcat combined with the ssh ProxyCommand setting. However, it’s a little silly to divert all traffic through another machine when I’m in the network. So, meet the new script I use in the ProxyCommand:


got=`ifconfig eth1 | egrep -o "inet addr:152.78.[0-9]{1,3}\\.[0-9]{1,3}"`
got+=`ifconfig eth0 | egrep -o "inet addr:152.78.[0-9]{1,3}\\.[0-9]{1,3}"`

if [[ "$got" == "" ]] 
    ssh uglogin.ecs.soton.ac.uk nc $HOST 22
    nc $HOST 22
Posted at 8:38 am on Wednesday 11th June 2008

UIF Fixing

I spent today working on a bug in the driver for the TI UIF MSP430 programmer. It stopped initialising properly in 2.6.24, but it worked in 2.6.23. I did a git-bisect between those versions to find the commit that induced the fault. I narrowed the search a bit by telling git-bisect to work on commits only in drivers/usb/, as I hypothesized that the bug was induced somewhere in there.

About 10 builds and 20 reboots later, I found the commit in which the problem was happening, and then read some stuff about USB etc (the LWN device drivers book proved invaluable yet again) and subsequently generated a patch. I’ve sent it to (what I think are) the right places.

If you can’t wait for the next kernel release (if it passes review…), then you can rebuild the ti_usb_3410_5052 module by downloading this tarball, untarring it and then running “make” in the resulting directory, and then “make install” as root. You will need enough of your kernel’s sources hanging around to do this. In Fedora, these are provided in the kernel-devel package.

Update (5th April ’08): The patch has made its way into Linus’s tree, so I think it’ll be in 2.6.25.

Posted at 12:25 am on Monday 24th March 2008

F9 Alpha Kernel oops reporting

I downloaded the Fedora Alpha 9 live CD ISO and ran it on my desktop. It’s got PolicyKit and PackageKit, which are pretty cool. Just after I’d logged in, I was greeted with this box:


I clicked “Yes” (I’d click “always”, but this was on a Live CD so it wouldn’t really have meaning), and then it popped up with:

Oops Sent

Pretty cool. It didn’t ask me for any of my details, which I think is cool. Going to kerneloops.org reveals that it exists to track which crash signatures occur the most.

Posted at 11:15 pm on Saturday 1st March 2008

Audio conversion in GNOME

I walked to the shops earlier. I took my mp3 player with me, which I haven’t used in a while. I keep all local copies of my (new) music in Ogg Vorbis or FLAC, which means that when I transfer to my mp3 player, I have to convert the tracks. Previously, I’ve bashed together a shell script that does the conversion, but today was different. Today I got graphical.

I had a quick search for audio converters for GNOME, and found AudioFormat and audio-convert-mod. audio-convert-mod is in the Fedora repos, so I used that. It was surprisingly enjoyable. It automatically detected the encoders and decoders that were available on my system:

audio-convert-mod: Installed Codecs

The program takes the form of a wizard, which first asks for the files to convert, then the format to convert to, and the destination directory. Then it converts them. That’s it, no hassle.


Posted at 11:34 pm on Thursday 14th February 2008

sendkey: Automated ssh key setup

I sent my “sendkey” script to Ivor. This script automates the injection of one’s ssh public key into a remote host’s authorized_keys file (to allow password-less login). It didn’t work very well when Ivor ran it. I’ve now updated the script:


KEY=`cat ~/.ssh/id_rsa.pub`

ssh $1 bash <<EOF                                                                                                                               
mkdir -p ~/.ssh
chmod u=rwx,g=,o= ~/.ssh                      
echo $KEY >> ~/.ssh/authorized_keys
chmod u=rw,g=,o= ~/.ssh/authorized_keys

It can be run like this:

% ./sendkey remoteusername@remotehost

15/02/2007 Update: I fixed the script so that it made the .ssh directory first… kind of important.

29/02/2007 Update: Klaus pointed out that it might be useful to write how to generate one’s ssh keys:

/usr/bin/ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''

Posted at 10:38 pm on Wednesday 13th February 2008

I2C Device udev rule

I just wrote a udev rule that would make any userspace i2c devices be owned by the i2c group:

KERNEL=="i2c-[0-9]*", GROUP="i2c"

(You can stick that in a file in /etc/udev/rules.d if you’re using Fedora).

Posted at 12:24 am on Tuesday 29th January 2008

A file IO monitoring utility: iomon

Recently I needed to log the calls to read() and write() calls that a program made, including the data that went through them. I hacked together a small program, called “iomon”. It runs on Linux (or at least on my Fedora 8 install anyway). What makes it exciting is that I can use it without modifying the program I’m monitoring.

It uses features of the dynamic linker (the thing that’s resposible for loading the shared libraries that a program needs during execution – see man ld.so) to interpose a monitoring function between a call to a library function and the actual library function. When read(), write() or open() are called, an event is logged.

You can get iomon by checking it out like so:

git clone http://xgoat.com/proj/fiu/iomon/iomon.git/

You can build it by just running “make”:

% cd iomon
% make

You might need to install the glib 2.0 headers (in Fedora, they’re in the glib2-devel package).

Example Usage

I thought it might be useful to demonstrate how to use it with an example. After building iomon, build the following C program (also available here) using gcc:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int main( int argc, char** argv )
        int f, g;

        umask( 000 );
        f = open("test-file1", O_RDWR | O_CREAT);
        g = open("test-file2", O_RDWR | O_CREAT);

        write(f, "test", 4);
        write(g, "second file", 11);

        return 0;

This is the program that we’re going to monitor. As you can probably see, the program just opens a file called “test-file1” and another called “test-file2”. It then writes “test” and “second file” to the first and second file respectively. Not really a very useful program, but it will suffice for this demo.

The next thing to do is set the LD_PRELOAD environment variable to contain iomon.so. The dynamic linker needs to be able to find the shared object file, and so the easiest thing to do is to put the full path into LD_PRELOAD:

% cd iomon
% export LD_PRELOAD=`pwd`/iomon.so

Now every time that you run a program in this shell, iomon.so will be loaded first. At the moment iomon will default to dumping all the read, write and open calls to standard output in raw format. I suggest that you don’t run anything in this mode… iomon is configured through the IOMON environment variable. This takes a list of arguments just like any command line program:

% IOMON="--help" cat /dev/null
  iomon [OPTION...] 


Help Options:
  -?, --help         Show help options

Application Options:
  -f, --file         File to monitor access to.
  -l, --log          Log file
  -t, --text-log     Log in text, instead of binary
  --times            Include times in the log

There’s some helpful information on how to use it. I’ll explain the arguments in a little more detail:

Right! Now we can run the test program:

% IOMON="-t --times -f test-file2" ./test
0.000016 open: 74 65 73 74 2D 66 69 6C 65 32 
0.000138 write: 73 65 63 6F 6E 64 20 66 69 6C 65 

And there we see the call to open for test-file2, followed by the filename in hex. Then there’s the data sent to it through write() also in hex. The logged call to open() happened 16μs after the first call to open(), and the write() call we were interested in happened 122μs later.

Hopefully I’ll be writing about why I needed this utility later.

Posted at 8:06 pm on Monday 21st January 2008

Using I2C from userspace in Linux

I’ve been using various I2C things in Linux for the past year and a bit, and I’ve learnt a few things about it from that. Here I hope to collate some of this information.


Posted at 2:55 am on Sunday 11th November 2007

Site by Robert Spanton. © 2008 - 2011