I’m excited to finally get around to another post in this Digging Deep series, in which I hope to delve head first into some Ruby esoterica. This time around, I have an interview with Ara T. Howard about some of the hackery he does when packaging rq.
Hope you enjoy it! Questions follow.
>> I’d like to ask you some questions about the way you package Ruby Queue,
>> but before I do that, can you give me a short description of the project?
DESCRIPTION
ruby queue (rq) is a zero-admin zero-configuration tool used to create instant
unix clusters. rq requires only a central nfs file system in order to manage a
simple sqlite database as a distributed priority work queue. this simple
design allows researchers with minimal unix experience to install and
configure, in only a few minutes and without root privileges, a robust unix
cluster capable of distributing processes to many nodes - bringing dozens of
powerful cpus to their knees with a single blow. clearly this software should
be kept out of the hands of free radicals, seti enthusiasts, and one mr. j
safran.
the central concept of rq is that n nodes work in isolation to pull jobs
from a centrally mounted nfs priority work queue in a synchronized fashion.
the nodes have absolutely no knowledge of each other and all communication
is done via the queue meaning that, so long as the queue is available via
nfs and a single node is running jobs from it, the system will continue to
process jobs. there is no centralized process whatsoever - all nodes work
to take jobs from the queue and run them as fast as possible. this creates
a system which load balances automatically and is robust in face of node
failures.
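
As a rough sketch of the pull model described above, here is a toy queue that several independent workers could share. It is not rq's code: `TinyQueue`, its JSON file format, and the `flock`-based locking are all assumptions made for illustration. rq itself keeps its jobs in sqlite, and `flock` is not always dependable over nfs, so treat this as a picture of the idea rather than something to deploy.

```ruby
require 'json'

# Hypothetical stand-in for rq's queue: a JSON file guarded by an exclusive lock.
class TinyQueue
  def initialize(path)
    @path = path
  end

  # Add a job; a larger priority number is taken first.
  def submit(command, priority = 0)
    transaction do |jobs|
      jobs << { 'command' => command, 'priority' => priority }
      nil
    end
  end

  # Atomically remove and return the highest-priority job, or nil when empty.
  def take
    transaction do |jobs|
      job = jobs.max_by { |j| j['priority'] }
      jobs.delete_at(jobs.index(job)) if job
      job
    end
  end

  private

  # Read, modify, and rewrite the job list while holding an exclusive lock.
  def transaction
    File.open(@path, File::RDWR | File::CREAT) do |f|
      f.flock(File::LOCK_EX)
      text = f.read
      jobs = text.empty? ? [] : JSON.parse(text)
      result = yield jobs
      f.rewind
      f.truncate(0)
      f.write(JSON.generate(jobs))
      f.flush
      result
    end
  end
end

# A worker knows nothing about other workers: it just drains the queue.
def drain(queue)
  while (job = queue.take)
    system(job['command'])
  end
end
```

Every node would run something like `drain(TinyQueue.new('/nfs/exported/priority/q'))` in a loop. Because the only shared state is the queue file, a node that dies simply stops taking jobs and the others carry on.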
although the rq system is simple in its design it features powerful
functionality such as priority management, predicate and sql query, compact
streaming command-line processing, programmable api, hot-backup, and
input/capture of the stdin/stdout/stderr io streams of remote jobs. to date
rq has had no reported runtime failures and is in operation at dozens of
research centers around the world.
URIS:
- http://codeforpeople.com/lib/ruby/rq/
- http://raa.ruby-lang.org/project/rq/
- http://www.linuxjournal.com/article/7922
the short version is this:
rq is a command line application which runs in several modes: in daemon mode
it pulls jobs from an nfs mounted priority queue and runs them. in other
modes it manipulates said queue by submitting jobs to it, queries existing
jobs, etc. rq is designed such that both the queue and code live on an nfs
mount. if sweet graphics help you understand things then this will be
illuminating:
  +-----------+                        +-----------+
  |           |                        |           |
  |  compute  |                        |  compute  |
  |   node    |                        |   node    |
  |           |                        |           |
  +-----------+                        +-----------+
        `.                                  .'
          `.         NFS SERVER           .'
       +-------------------------------------+
       |                                     |
       |   /nfs/exported/priority/q          |
       |                                     |
       |   /nfs/exported/bin/rq              |
       |                                     |
       |   /nfs/exported/bin/ruby            |
       |   /nfs/exported/lib/ruby            |
       |                                     |
       +-------------------------------------+
          .'                              `.
        .'                                  `.
  +-----------+                        +-----------+
  |           |                        |           |
  |  compute  |                        |  compute  |
  |   node    |                        |   node    |
  |           |                        |           |
  +-----------+                        +-----------+
the idea is that you dump some code on an nfs box and start some daemons on
the clients and you’re done.
>> You had mentioned at the MountainWest RubyConf that instead of relying on
>> the sqlite3 gems which build native extensions, you actually manually
>> package and build sqlite3 within rq’s own gem install process, can you
>> explain what this gains you?
actually, rq uses sqlite v2. word on the street is that sqlite v3 is actually
slower than sqlite v2 in some cases and, because the ruby apis are better
tested with v2, i chose to go that route. also, when i began developing rq
sqlite v3 had only just come out and i was looking to be robust, not bleeding
edge.
sqlite, and the ruby bindings, are a pretty simple pair of things to install
if you’ve ever compiled anything. however, things can get complicated since
many linux distros may have one version installed and users have sometimes
installed another, say in /usr/local or whatever, and trying to build a v2
ruby binding against a v3 lib, or vice versa, is a nightmare of course. then
gems takes this and abstracts it one step further - requiring a few
incantations to get the right compiler flags through to the underlying sqlite
setup.rb script, which sometimes fails without error, etc. in short the
sqlite + ruby sqlite install is completely normal with respect to installing
some related open source packages, which is to say not trivial unless you are
the kind of guy who knows what ’strings libsqlite.so|grep version’ does.
so the short answer is that i don’t want people to have to deal with
installing the right version of sqlite and making sure rq finds it.
so, added to all this, is the fact that the target audience for rq is
non-technical users who happen to need to set up a small linux cluster - today.
the original rq installer actually dumped everything it needs onto an nfs
mount, including ruby. the hope was that the user may not have even heard of
ruby but could still get a linux cluster up in an hour or so. as it turned
out that worked well - many users downloaded the rq tar ball, unpacked, ran a
/bin/sh script and voila - live linux cluster.
the problem is with the ruby community, who don’t want a massive installer
that clobbers your ruby and sqlite installation when run (damn them!). they
want a gem, of course.
>> Can you share some of the details of how you actually do this?
it’s pretty straightforward:
1) the rq dist includes the src for sqlite and sqlite-ruby
2) during install, the installer first builds both of them with this logic
require 'rbconfig'
# rq's own libdir! made absolute so it survives a chdir into a source tree
rqlib = File.expand_path('./lib')
c = RbConfig::CONFIG # spelled Config::CONFIG on very old rubies
arch = c['sitearch'] || c['arch'] # i686-linux for example
prefix = File.join rqlib, arch
bindir = File.join prefix, 'bin'
libdir = File.join prefix, 'lib'
# ....
system "./configure --prefix=#{ prefix } && make && make install"
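
Since a build that fails quietly is exactly the problem being avoided here, one way to make the step above fail loudly is to check what `system` returns. This is a minimal sketch, not rq's installer: `run!` and `build` are hypothetical helper names, and the three-step configure/make/make install sequence is assumed from the command shown above.

```ruby
# Run a shell command and raise unless it exits with status 0.
# Kernel#system returns true on success, false on a non-zero exit,
# and nil when the command could not be run at all.
def run!(command)
  ok = system(command)
  raise "build step failed: #{command}" unless ok
  true
end

# Hypothetical helper: configure, build, and install one source tree
# into an absolute prefix, stopping at the first step that fails.
def build(srcdir, prefix)
  Dir.chdir(srcdir) do
    run! "./configure --prefix=#{prefix}"
    run! 'make'
    run! 'make install'
  end
end
```

An installer could then call `build` once for the sqlite sources and once for the sqlite-ruby sources with the same `prefix`, and abort the whole install on the first exception.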
so the result is that rq has both sqlite and the sqlite-ruby bindings
dumped into an arch specific directory that only it knows about. the
directory is the ‘lib’ dir, the very same directory that
gems/install.rb/setup.rb will install during the normal installation
procedure. now it’s simply a matter of having rq.rb do the following:
dirname, basename = File.split(File.expand_path(__FILE__))
require 'rbconfig'
c = RbConfig::CONFIG # spelled Config::CONFIG on very old rubies
arch = c['sitearch'] || c['arch']
prefix = File.join dirname, arch
bindir = File.join prefix, 'bin'
libdir = File.join prefix, 'lib'
# compact drops an unset variable so no stray trailing ':' is left behind
ENV['LD_LIBRARY_PATH'] = [ libdir, ENV['LD_LIBRARY_PATH'] ].compact.join ':'
ENV['PATH'] = [ bindir, ENV['PATH'] ].compact.join ':'
so, at runtime, rq will configure itself such that both its private
sqlite binary and sqlite libraries are first in its path. now, when rq.rb does
require File.join(dirname, 'sqlite') # require rq's sqlite binding
or
system 'sqlite ...'
it can be sure that the sqlite.so or sqlite binary that’s picked up is
its very own - regardless of what the user may have installed in various
locations around the system.
in summary rq simply sets up its own private cache of binary prerequisites
and arranges for them to be found at run time.
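
To see why prepending to PATH is enough, it helps to look at how a command name gets resolved: the first matching executable on the search path wins. The `which` helper below is a hypothetical illustration of that lookup, not part of rq; it only assumes the usual rule that PATH entries are searched left to right.

```ruby
# Return the first executable called `name` on the given search path, or nil.
# Entries are searched left to right, so a directory prepended to PATH
# shadows any copy of the same tool installed elsewhere on the system.
def which(name, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end
```

After the environment setup shown earlier, `which('sqlite')` should point into the arch specific `bin` directory rather than at a system-wide copy, which makes a handy sanity check when debugging an install.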
>> What about Windows? In the case of rq, it’s a non-issue obviously,
>> but have you tried using this technique with projects that need to run
>> on Windows? If so, does it involve some cygwin / mingw mess?
you know - i have compiled both the gsl and narray for windows using the mingw
approach. it’s a pain. you have to compile using mingw but install manually
since rbconfig is F.U.B.A.R for the one click installer.
it’s my personal belief that, despite the benefit it has brought to the
community, the ruby one-click installer is badly broken in that it results in
a ruby that’s lacking one of its most important features: the ability to
bootstrap itself with new binary features. that is to say that a ruby without
a working mkmf.rb and rbconfig.rb is badly broken in my eyes. i’ve spoken to
austin many times about this and we have different opinions on how this should
be solved: he’s for the total ms compatibility approach and i’m for a windows
dist which includes the msys compiler toolchain. either is valid and both
would result in a ruby which could, either after a toolchain install or
without it, bootstrap itself into the wonderful world of ruby extensions.
i’m sure i’ve started a holy war here so i best stop while i’m ahead…
>> Is there anything that’s missing in RubyGems that would help solve this
>> problem?
rubygems is not an application installer - it’s a library installer. i think
rubygems should incorporate all the wonderful work erik veenstra has done such
that a user can install a ruby application that bundles ruby itself - one that
will continue to run as rubys get upgraded and gem libraries come and go. in
short we need a robust way to install applications on systems where people
may never have even heard of ruby. rubygems may not be the ultimate answer,
but it would be interesting to explore integrating rubyscript2exe application
creation from within the gemspec framework.
i’ll point out that this is no oversight of rubygems - it was not designed to
be an application installer but, now that ruby is mainstream we, as
developers, need simple tools to distribute ruby applications to the
uninitiated.
>> Got any other neat packaging tricks?
there is one - but i cannot write about it in public (seriously!).

