Every year we design a new game for the next batch of Student Robotics teams. This design process is always tricky, and we’ve learnt a lot from our past mistakes. The game has to be well matched to what the teams can realistically achieve with the resources and time they have. It’s a challenging problem, and the rapidly changing world of cheap hardware that teams can load onto their robots makes it even more interesting.
Since our teams are made up of 16-18 year-olds, it’s pretty standard for them to leave building their robots until the last few weeks before the competition. This means that we don’t really start to see whether we’ve set the game at the right level until right before the competition.
We’re two weeks away from this year’s competition. Over the last couple of weeks we’ve seen some reassuring progress from the teams. They’re definitely meeting the challenge we’ve set them! I thought I’d share some videos from a few of the teams here.
QMC
This video has got to be the best one I’ve seen so far this year. QMC manage to align with and pick up a token, before placing it on a pedestal. I must remind you that these robots are entirely autonomous — there’s no remote control here! In this video, QMC also manage to completely avoid a common failure we see in robot testing: they don’t keep picking up their robot to reorient it towards a token. They leave it to do its own thing. Sounds like a reasonably obvious thing to do, but we see this happen so often!
MAI
Team MAI have done some similarly impressive work. The video below shows their robot locating a token, and then picking it up using a sucker mounted on top of their awesome scissor lift. Their robot then carries the token over to a target location. Really cool.
GMR
So GMR decided to be the first SR team to build a hovercraft and enter it into the competition. It’s a very exciting robot and the GMR guys are certainly very motivated to get it going. Here’s the first time they got it to hover:
A list of this year’s teams can be found on our website.
Quick links to the Student Robotics team websites that have popped up over the last few weeks:
The result of Chris’s internship was a library called libkoki. This library hasn’t really been properly released to the internets yet — we’re working on it, but various other things have had higher priority, like shipping Student Robotics stuff. You’ll be hearing more about it soon… The quick summary is that libkoki is a library for finding visual markers in images.
So the important thing for Chris during his internship was to make libkoki robust. It is more robust in a few ways than ARToolKit, which is a popular choice. Its markers feature CRCs in their patterns, which prevents things like windows and other rectangular structures from being misidentified as markers. libkoki also uses adaptive thresholding so that it can detect markers in heterogeneous lighting conditions, such as when one side of a marker is better lit than the other.
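To give a flavour of what adaptive thresholding involves, here’s a minimal sketch in C. This is my own illustration of the general mean-based approach, not libkoki’s actual implementation; the function name and window parameters are made up.

/* Sketch of mean-based adaptive thresholding -- illustrative only,
 * not libkoki's real implementation. */
#include <stdint.h>

/* Threshold a w*h greyscale image: a pixel is marked "dark" if it is
 * more than 'c' below the mean of the (2*r+1)^2 window centred on it. */
void adaptive_threshold(const uint8_t *grey, uint8_t *out,
                        int w, int h, int r, int c)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            long sum = 0;
            int n = 0;

            /* Mean of the local window, clipped to the image edges */
            for (int dy = -r; dy <= r; dy++) {
                for (int dx = -r; dx <= r; dx++) {
                    int px = x + dx, py = y + dy;
                    if (px < 0 || py < 0 || px >= w || py >= h)
                        continue;
                    sum += grey[py * w + px];
                    n++;
                }
            }

            int mean = (int)(sum / n);
            out[y * w + x] = (grey[y * w + x] < mean - c) ? 1 : 0;
        }
    }
}

A real implementation would typically use an integral image so that the window mean costs O(1) per pixel, but the idea is the same: the threshold tracks the local lighting, so a marker that’s half in shadow still thresholds cleanly.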
Since much of the focus has been on getting libkoki robust (and I think rightly so) and into a working state (which it is in), not much time has yet gone into making it fast. We’re now shipping it as the main part of the Student Robotics vision API, and so we’re quite interested in getting it to perform well.
Profiling
So to get things to perform well, one needs to know where to focus one’s efforts. In the past I’ve used gprof, which is OK, but has various limitations — like not being able to profile shared libraries, which is what libkoki is. I’d read some things about Linux perf recently, so I decided to try it out. (I’d link to it, but their wiki seems to have been down for a while…) In conclusion, perf is pretty amazing. Just running `perf top` results in a summary of which functions are currently taking up CPU time across the whole system, including both userspace and kernel time. No binary modifications are required, and as long as debug symbols are available, it’ll tell you function names.
We use libkoki on a BeagleBoard, which is an ARM Cortex-A8 platform. We ship a slightly old 2.6.32 kernel (it does the job, and developer hours are limited…), in which perf did not support ARM platforms. It turns out that our BeagleBoard’s kernel image has OProfile support compiled in. OProfile appears to do similar things to perf, so I used that.
So, I profiled some simple code that used libkoki and got the following profile from oprofile:
root@beagleboard:~# opreport -l pyenv/lib/libkoki.so
Overflow stats not available
CPU: ARM V7 PMNC, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        symbol name
1955     56.0493  koki_label_image
1018     29.1858  koki_threshold_adaptive
203       5.8200  koki_v4l_YUYV_frame_to_RGB_image
So, much time was spent in koki_label_image. This function splits up a black-and-white thresholded image into connected regions. On closer inspection of some annotated source code that oprofile gave me, it turned out that koki_label_image was spending a lot of its time in the case where the code discovered that two regions were in fact the same. After changing the way that libkoki goes about aliasing one region’s label with another, the profile changed to:
samples  %        symbol name
2326     40.8572  koki_threshold_adaptive
2208     38.7845  koki_label_image
427       7.5004  koki_v4l_YUYV_frame_to_RGB_image
So the processing time on the BeagleBoard has come down from 1.5 seconds to 0.78 seconds. Fun times, but still some way to go!
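I won’t reproduce the actual libkoki change here, but the textbook way to make label aliasing cheap is a union-find (disjoint-set) structure with path compression. A minimal sketch, assuming labels are small integers:

/* Sketch of union-find label aliasing (not libkoki's actual code).
 * parent[i] points at a label that i is equivalent to; a root label
 * points at itself. */
#include <stdint.h>

#define MAX_LABELS 4096
static uint16_t parent[MAX_LABELS];

static void label_init(int n)
{
    for (int i = 0; i < n; i++)
        parent[i] = i;
}

/* Find the canonical label, flattening the chain as we go
 * (path compression) so that repeated lookups stay cheap. */
static uint16_t label_find(uint16_t l)
{
    while (parent[l] != l) {
        parent[l] = parent[parent[l]];  /* point at the grandparent */
        l = parent[l];
    }
    return l;
}

/* Record that two labels refer to the same connected region. */
static void label_union(uint16_t a, uint16_t b)
{
    a = label_find(a);
    b = label_find(b);
    if (a != b)
        parent[b] = a;
}

With this, merging two regions is just a couple of pointer hops rather than any rewriting of the label image.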
My last post was about how the first stage of ARToolKit’s marker detection works. Chris has just started a Student Robotics internship, and is working towards a new vision system for the SR kit. As the first part of the journey towards this, he’s continuing the work of dissecting ARToolKit. Find part 2 of the dissection on his blog, in which Chris covers how ARToolKit finds the boundaries of the regions found in part 1.
Tonight’s PhD creative/productive escape was to continue teasing apart some of the functionality of ARToolkit. I’m pursuing getting some fiducial marker detection, à la ARToolKit, into the core of Student Robotics 2012’s kit (the one that’ll be shipped in October). We won’t be using the exact algorithms found in ARToolKit, as it frequently reports markers that aren’t there, but learning some of what I perceived to be the “magic” of the library’s guts seemed like a good idea.
I first hit Google Scholar to find papers about how ARToolKit and other similar libraries work. Luckily, I’m currently a student at an institution that provides me with access to journal and conference papers. Sadly this is not the case for everyone, which sucks. I read, and skimmed through, a few papers. This gave me an idea of what’s around. Unfortunately I didn’t find a thorough description of ARToolKit’s algorithms. Even ARToolKit’s own description of its vision algorithm leaves big gaps. There are a few papers out there that compare different marker detection libraries. I’d link to them, but they’re behind a paywall, so I’d rather not.
What I suspect is happening is that people take one look at the more interesting functions within ARToolKit’s source and then run a mile. Take, for example, the function labeling2 found in arLabeling.c. Take a look at it now. To understand what’s going on in there you need to be really determined. You need to wade through some manual loop unrolling, preprocessor fudging, and arrays of ints being treated like they should be arrays of structs. More on this in a bit.
What’s interesting is that this code works well enough for people to use it. Wikipedia says:
“ARToolKit is a very widely used AR tracking library with over 160,000 downloads since 2004”
So, either I’m going crazy and have lost the ability to read someone else’s code, or the ARToolKit code leaves a lot to be desired. For what are hopefully intuitive reasons, I’m going to opt for the latter explanation. So here I get to apply some rather extended extrapolation (for more of this see Freakonomics) about how we reached a situation in which at least thousands of people are using a library that’s quite impenetrable. I think it’s a pretty good demonstration of two things. First: usability can count for more than implementation detail. ARToolKit functions fine as a library, and has a usable interface; most users don’t need to care about the internal implementation until it breaks. Second: it’s a demonstration that organically developed things can work, and that one doesn’t need to follow software engineering formalisms to the letter to get the job done.
Still, all this mustn’t belittle the achievement of the authors of ARToolKit. They’ve obviously made a significant contribution to a lot of good projects that people have done, and I’m sure that many of the ideas wouldn’t have nucleated had the library not existed at all. So, time to quit whinging about how impenetrable the code is, and get to work on deciphering it! I’ll be doing this over a series of blog posts. This is the first one, and is about the first stage of ARToolKit’s image processing: labelling. I’ll be ignoring the image acquisition itself because this is a pretty run-of-the-mill operation.
Ok, I lied, I’m actually going to cover thresholding and labelling here. The thresholding step is really simple and happens in the same pass as labelling, so it doesn’t really count. Both the thresholding and labelling happen in the labeling2() function in arLabeling.c. I’ve spent several hours working through this function, simplifying it so that I could understand what was going on. Whilst I was doing this, I committed my changes to a git repository:
git clone https://xgoat.com/pubgit/artoolkit-revenge.git
The contents of the above git repository are not intended to be run. The only reason for this code existing is to help me (and maybe others) understand how ARToolKit works. In doing this, I removed several things that I didn’t care about, such as different pixel colour formats. These things weren’t important to understanding how ARToolKit’s algorithms work. At the time of writing, I’ve got labeling2() down from 344 lines of quite unapproachable code to 169 lines of fodder that I find easier to grok.
Thresholding
ARToolKit identifies black-and-white markers in images from the camera, such as the one shown on the right. It converts the colour image that comes from the camera into a black-and-white image using a threshold specified by the user. If one is working with an RGB image, then the sum of the R, G, and B components for each pixel is compared against the threshold that the user specifies, like so:
r + g + b < threshold * 3
Pixels that satisfy this threshold are suitable for labelling. Other pixels are ignored.
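In code, the thresholding pass boils down to something like the following sketch. This is my own illustration rather than ARToolKit’s actual code, and it assumes packed 8-bit RGB input:

/* Sketch of the thresholding pass -- not ARToolKit's actual code.
 * 'rgb' is w*h pixels of packed 8-bit R,G,B; 'bw' gets 1 for pixels
 * that pass the threshold (candidates for labelling), 0 otherwise. */
#include <stdint.h>

void threshold_rgb(const uint8_t *rgb, uint8_t *bw,
                   int w, int h, int threshold)
{
    for (int i = 0; i < w * h; i++) {
        int r = rgb[i * 3 + 0];
        int g = rgb[i * 3 + 1];
        int b = rgb[i * 3 + 2];

        /* Equivalent to (r + g + b) / 3 < threshold, without the divide */
        bw[i] = (r + g + b < threshold * 3) ? 1 : 0;
    }
}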
Labelling
The pixels that got through the thresholding step are now grouped into sets of connected pixels. Each group of connected pixels is assigned a label, which is just an integer greater than zero. This is done by iterating through the thresholded image pixels, row by row from top-left to bottom-right, storing each pixel's calculated label number in a second image buffer. A pixel's label is decided as follows (there's a sketch of this decision procedure after the list):
- If the pixel above is labelled, take the same label as it.
- Otherwise, if the pixel to the top-right was labelled then some checks are made to determine if two groups of (what are currently) differently labelled pixels require merging to have the same label. If at least one of the pixels to the left or top-left of the current pixel is labelled, then two labelled regions are indeed connected. ARToolKit makes the labels of the two intersecting groups equivalent, simply by recording the fact that these two labels are equivalent -- it doesn't go renumbering the labels stored in the label image buffer (this'd be quite inefficient).
- Otherwise, take the label of the pixel to the top-left if it has one.
- Otherwise, take the label of the pixel to the left if it has one.
- Finally, if none of the above conditions were met, then a new region has been found. A new label number is assigned to the pixel.
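Put together, a stripped-down version of that decision procedure looks roughly like the sketch below. This is my own reconstruction, not ARToolKit’s code; it assumes the thresholded image from the previous section and a hypothetical label_union() helper that simply records two labels as being equivalent.

/* Hedged reconstruction of the labelling pass -- not ARToolKit's code.
 * 'bw' is the thresholded image (1 = candidate pixel), 'labels' gets a
 * label per pixel (0 = unlabelled). */
#include <stdint.h>

void label_union(uint16_t a, uint16_t b);   /* assumed: records that a and b
                                               belong to the same region */

void label_image(const uint8_t *bw, uint16_t *labels, int w, int h)
{
    uint16_t next_label = 1;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int i = y * w + x;
            labels[i] = 0;
            if (!bw[i])
                continue;

            /* Neighbours already visited earlier in this raster scan */
            uint16_t above   = (y > 0)                ? labels[i - w]     : 0;
            uint16_t upright = (y > 0 && x < w - 1)   ? labels[i - w + 1] : 0;
            uint16_t upleft  = (y > 0 && x > 0)       ? labels[i - w - 1] : 0;
            uint16_t left    = (x > 0)                ? labels[i - 1]     : 0;

            if (above) {
                labels[i] = above;
            } else if (upright) {
                labels[i] = upright;
                /* Two previously separate regions may meet at this pixel */
                if (upleft)
                    label_union(upright, upleft);
                else if (left)
                    label_union(upright, left);
            } else if (upleft) {
                labels[i] = upleft;
            } else if (left) {
                labels[i] = left;
            } else {
                labels[i] = next_label++;   /* a brand new region */
            }
        }
    }
}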
Whilst all this labelling is going on, statistics about each label are built up as well. The following stats are collected for each label:
- The number of pixels assigned this label.
- The sum of the x coordinates and the sum of the y coordinates of the pixels with this label.
- The minimum and maximum x and y coordinates of the pixels with this label.
The first two of those statistics are used to calculate the centre-point of a labelled region, whilst the other numbers will be passed back to the caller of labeling2.
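As a rough illustration (again my own sketch, not ARToolKit’s actual structs), the per-label bookkeeping and centre-point calculation amount to something like:

/* Sketch of per-label statistics -- not ARToolKit's actual struct */
struct label_stats {
    int area;            /* number of pixels with this label      */
    long sum_x, sum_y;   /* coordinate sums, for the centre point */
    int min_x, max_x;    /* bounding box                          */
    int min_y, max_y;
};

/* The centre of a region is just the mean pixel position */
static void label_centre(const struct label_stats *s,
                         double *cx, double *cy)
{
    *cx = (double)s->sum_x / s->area;
    *cy = (double)s->sum_y / s->area;
}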
As I said above, label numbers can be made equivalent to each other during the labelling process. This means that after the labelling is complete, there can be redundancy in label numbers. ARToolKit performs a little bit of label number shuffling to remove this redundancy, and ensures that all label numbers are consecutive.
These statistics are passed back along with the labelled image buffer to the caller. I won't go into the precise details here. If you want to know more, have a look at the labeling2 function in the modified sources that I linked to above. I've changed the prototype of the labeling2 function so that it uses arrays of structs that are easier to decipher, so hopefully it'll all make sense.
That's where I'm up to right now with parsing ARToolKit's behaviour. The next instalment will be on the behaviour of arDetectMarker2(), which will use some of the information collected by the labelling process.
It’s recently become warmer here in Southampton. So warm, in fact, that when presented with the task of assembling 42 Student Robotics servo boards, we opted to do it outside in the garden. It got colder as the sun went down, so I got the oil-filled radiator out of the shed and put it underneath the garden table, which we embellished with a blanket skirt to keep the heat in.
After placing one side, we retreated into the house for baking and second side placement.
Looks like it’s going to be a good year of hacking.
Sam‘s Bristol Ignite talk about Student Robotics has just appeared:
Good job Sam!
What do you do with 70-odd silicone sponge shims that you’ve just washed after laser cutting? That’s right, bake them at 150°C for half an hour to dry them out:
In about a month’s time, Student Robotics will be shipping robotics kit to several schools in the Southampton and Bristol area. Sixth formers from these schools will be competing their robots against each other sometime in April next year. The robots are programmed using USB keys. In the past, we’ve tended to purchase the absolute cheapest USB keys we possibly could, and we’ve had quite a lot of them fail.
So this year, we’ll be using USB keys from a known brand. I bought a couple of Kingston 2GB DataTraveler keys and set about abusing one to try to discover whether it’s likely to fail. I hacked together some scripts that rewrote the same 10MB region (of the block device that the key presents) over and over again. Out of curiosity, I also logged the time it took for this region to be written each time.
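The scripts themselves were throwaway, but the gist, sketched here in C rather than as the actual scripts, is to rewrite the same region and time each pass. The device path below is purely illustrative; pointing this at the wrong device will destroy data.

/* Sketch of the USB key abuse test -- not the actual scripts I used.
 * Repeatedly rewrites the same 10MB at the start of a block device and
 * prints how long each pass took. The device path is illustrative! */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK   (1024 * 1024)        /* write in 1MB chunks   */
#define TOTAL   (10 * 1024 * 1024)   /* rewrite 10MB per pass */

int main(void)
{
    /* O_SYNC so we measure the key itself, not the page cache */
    int fd = open("/dev/sdb", O_WRONLY | O_SYNC);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    char *buf = malloc(CHUNK);
    memset(buf, 0xAA, CHUNK);

    for (int pass = 0; pass < 2000; pass++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        lseek(fd, 0, SEEK_SET);
        for (int written = 0; written < TOTAL; written += CHUNK) {
            if (write(fd, buf, CHUNK) != CHUNK) {
                perror("write");
                return 1;
            }
        }
        fsync(fd);

        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) +
                      (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("pass %d: %.2f s (%.2f MB/s)\n",
               pass, secs, (TOTAL / (1024.0 * 1024.0)) / secs);
    }

    free(buf);
    close(fd);
    return 0;
}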
So I ran my scripts and rewrote the 10MB block 2000 times. Here’s what the bandwidth did:
The step is interesting. It could be something to do with the USB key’s wear-levelling. My bet is that the key uses its flash as a big circular buffer (like JFFS). Initially, all the blocks of flash are erased, so no erases have to happen when writing to them. When the end of the circular buffer is reached, it has to erase each block before it can write to it, which drops the bandwidth. Obviously this is pure postulation…
I’m reasonably convinced that this type of USB key won’t fall apart in the robots. The file the students will be loading onto the robot is of the order of 100k, and I think it’s quite unlikely that they’ll reach 2000 programming iterations. So I’m for shipping these keys.
The hacky scripts that I used are in a git repo.