Author's Note: This article details the IRIX binary compatibility implementation for the NetBSD operating system. This includes the creation of a new emulation subsystem inside the NetBSD kernel and a lot of reverse engineering to understand and reproduce how IRIX internals work.
Because this article includes an introduction to all kernel subsystems involved with IRIX binary compatibility, we assume the reader has some experience in user-land Unix programming.
Throughout this article, we reference various NetBSD kernel source files and NetBSD manual pages.
Unix systems have two distinct modes of operation, known as user mode and kernel (or system) mode. In user mode, the operating system (OS) executes code provided by users. It could be a Web browser, a computer science student's project, a Web server (in this case, the user running the program is usually the system administrator), and so on. This code is run with limited privileges. It has limited access to the computer's memory, and usually no access at all to the hardware.
When running in kernel mode, the OS is only executing trusted code, which was loaded at boot time. This code is known as the OS kernel. The kernel has full access to the memory and hardware. It is here to provide services to user programs:
User processes call kernel code by issuing a trap. A trap is a hardware or software exception that suspends user process execution and gives control to kernel code. The kernel will handle the exception, after which it may return to user mode and resume the execution of the user process, or it may destroy the user process. Examples of traps are division by zero, memory faults (accessing a virtual address where no physical memory is mapped), timer interrupts (which are used to switch between user processes), or requests by the user process to access some resource controlled by the kernel.
These requests can be opening a file, reading from a network, or creating a new process. The process does this by issuing a system call, such as fork(2). The system call is in fact a CPU instruction that causes a trap.
Here is an example of MIPS assembly to call the
fork(2) system call on
    li $v0,2    # 2 is the system call number for fork()
                # v0 is the register holding the system call number
    syscall     # syscall is the CPU instruction to do a system call
On syscall instruction execution, the kernel executes a particular
trap handler, which is known as the system call handler. For NetBSD/mips, it
can be found in
sys/arch/mips/mips/syscall.c:syscall_plain(). The system
call handler expects an argument, which is the system call number. The
system call handler uses a table, called the system call table, to look up a
kernel function that will be called in order to complete the system call. On
NetBSD, the system call table for native processes is generated from
System calls are the way a user process requests action from the kernel, but there is also a mechanism used by the kernel to notify the user process of unusual conditions: signals. Signals are issued by various traps and system calls, to notify the process that it raised an exception: memory fault (the famous segmentation fault, well known to students learning C), division by zero and so on.
For each signal, the user process can decide to take the default action (by default, some signals cause program abortion, others are simply ignored), to ignore it, or to execute a function called a handler. This choice is made using the signal(3) library call or the sigaction(2) system call.
There is a clean separation between user mode and kernel mode. User processes run on top of the kernel with very little knowledge of what is inside a system call. All they do is issue system calls, expecting the behavior documented by kernel developers in a set of man pages. Most programs do not care about kernel internals and will just work if you change the kernel, as long as the system call behavior is left unchanged.
This is how NetBSD binary compatibility works. When launching a new program, the kernel is able to distinguish between native NetBSD binaries and, for example, Linux or FreeBSD binaries on NetBSD/i386. It will hence use an alternative system call table for this program, which will contain appropriate entries for the emulated OS. For instance, NetBSD/i386 uses sys/compat/linux/arch/i386/syscalls.master to provide the system call table for Linux binaries.
When a Linux binary running on NetBSD does a system call, the NetBSD kernel will run the appropriate function in the Linux system call table. This function emulates the behavior of the Linux system call so that the user program is fooled into thinking that it is running on the Linux kernel whereas it is in fact running on the NetBSD kernel.
Some system calls have the same behavior in NetBSD and in the emulated OS; in this case, the emulation system call table just uses the corresponding native function. Sometimes the behavior is a bit different. For instance, some flags have different values, or the system call semantics differ. In this case, the system call table references an emulation function, which will call the native function after adapting the arguments and/or behavior. This is done, for instance, in sys/compat/linux/common/linux_misc.c:linux_sys_uname() for Linux uname(2) emulation. Last but not least, the emulated system call may have no native equivalent. The emulation function that implements the system call must hence do all the work, or just act as if the work has been done and return, hoping that the user process will not notice the broken behavior (yes, sometimes it works).
The other part of the job is implementing signal emulation. Care must be taken to ensure that the signal handler is called in the same way the emulated OS would have called it. This job involves manipulating machine registers and assembly language, and hence it is quite machine dependent.
IRIX 6.5 is known to be a System V Release 4 (SVR4) derived operating system, and thanks to Christos Zoulas, NetBSD already contains an SVR4 binary compatibility option. The code for this SVR4 emulation can be found in sys/compat/svr4 in the NetBSD kernel sources.
NetBSD already has binary compatibility with major OSes such as Solaris 2 or SCO OpenServer through this SVR4 compatibility option. The first problem was to decide whether IRIX compatibility would be implemented by improving the SVR4 compatibility, or by introducing an IRIX-specific compatibility option.
The answer to this first question is obvious once you compare the system
call tables for plain SVR4 and for IRIX 6.5. The table for SVR4 can be
found in NetBSD kernel source in
table for IRIX can be found on an IRIX system in
NetBSD kernel sources, it can be found in
In the IRIX 6.5 system call table, only the first 88 system calls are plain
SVR4. Following are 147 system calls that are either IRIX specific, or
are just SVR4 system calls with different system call numbers. This
suggests that IRIX binary compatibility in NetBSD should have its own
syscalls.master, since it would be a pain to add dozens of
Thus, we needed a
sys/compat/irix directory in NetBSD kernel sources,
an IRIX-specific syscalls.master file. However, there are a lot of
SVR4 system calls in IRIX, therefore a lot of code in
used by the IRIX binary compatibility. This code is built when the
is built with the IRIX binary compatibility option (COMPAT_IRIX)
is set, even if the SVR4 binary compatibility option (COMPAT_SVR4), is
In order to create a new compatibility option in NetBSD, we need
syscalls.master file for the emulated OS.
Most of the work can be done in an incremental way, starting from code
is duplicated from the NetBSD native version, and modifying it until
binaries work. The only field where a NetBSD version is not very
syscalls.master file, because we know that everything will be
later. A null
syscalls.master file, which defines no system calls at
all is a
Now let us see how an emulation option is made visible to the NetBSD
kernel. Everything is done in the
sys/kern/exec_conf.c. In this file, an array
execsw_builtin is defined. Each entry in this array is a struct
execsw, as defined in
execsw describe a particular execution environment. This
includes foreign OS emulation as well as native situations: there are
execsw_builtin for shell scripts,
a.out native binaries, ELF native binaries, and 32-bit ELF binaries running on 64-bit NetBSD systems.
Information held by the struct execsw includes pointers to a function
responsible for identifying the executable format (
Mach-O...), a probe function that should be able to tell if this exec
switch is able to handle a particular binary, and functions for setting up the
program's initial stack, CPU registers, and for writing a core dump to
execsw also holds a pointer to a struct
emul, which is
sys/sys/proc.h. Whereas the fields of struct
execsw are used
at program creation and termination, the struct
emul is used during the
program's normal operation. It contains pointers to the system call
table, and to various functions used to handle traps and signals.
The distinction between struct
execsw and struct
emul is there because
OSes support several executable formats. For instance, NetBSD itself
a.out or ELF binaries. Both kinds of binaries share the
same system call table and signal handlers, and therefore they have the same
emul. But the binary loading is different, hence they have two
The first job is to create the struct emul for IRIX binary
compatibility. This uses the IRIX system call table (which is empty so far), and all
other fields are copied from the NetBSD native struct
emul, which is found in
sys/kern/kern_exec.c. It is named
emul_netbsd. The struct
emul for IRIX is naturally named
Then we can add the entry for IRIX in the
execsw_builtin array. IRIX uses
ELF binaries, so this entails not much more than copying the native NetBSD
ELF entry and replacing the struct
emul emul_netbsd by our
Now we have registered a new execution environment with the kernel. The next step is to have it actually run something. The struct execsw includes a probe function whose purpose is to tell the kernel if the execution environment described by this entry is able to handle a given binary.
In order to use our new execution environment, we must therefore write
a probe function. Usually, this kind of function tries to find a
signature specific to an OS in an ELF section, or a magic number in a
header that would identify the binary.
For IRIX, things are a bit complicated, since IRIX uses no fewer than three different kinds of ELF executables. These correspond to the three Application Binary Interfaces (ABIs) supported in IRIX: o32, n32, and n64. The ABI is the set of conventions that explains how the stack and registers should be used when calling a function or doing a system call.
o32 is the traditional 32-bit SVR4 ABI for MIPS processors; it is described extensively in the SVR4 ABI MIPS processor supplement. n64 is the 64-bit ABI, used for 64-bit ELF binaries. Finally, n32 is a hybrid ABI used to increase the performance of applications using a 32-bit address space on 64-bit processors. The difference between o32 and n32 is that n32 uses 64-bit registers instead of 32-bit ones where relevant, and passes more function arguments through registers instead of the stack. The goal behind n32 is to improve the performance of 32-bit applications that cannot necessarily be built as 64-bit, for instance because assumptions were made about pointer size.
At the time this paper was written, the NetBSD/mips kernel was only able to run o32 binaries. The goal is hence to match IRIX o32 binaries. o32 binaries are themselves divided into two families: static o32 and dynamic o32. Dynamic o32 binaries are the easy part of the job; therefore we will start with them.
ELF binaries are divided into ELF sections. The sections can be inspected using the objdump command: objdump -h file lists the ELF sections of file, and objdump -j .section -s file dumps .section from file.
All dynamic ELF executables have an
.interp section that contains the
name of the ELF interpreter, which is also known as the dynamic linker. On
NetBSD, this is
ld.elf_so(1) for more
information about ELF dynamic linking.
On program startup, the kernel loads the executable and the interpreter, and then transfers control to the interpreter. The interpreter loads the shared objects into memory, and then transfers control to the dynamically linked program.
On IRIX, things are a bit strange: the interpreter is libc itself.
Libc loads the dynamic linker (
/usr/lib/rld), which in turn loads
all the shared objects and then executes the program by calling its
This IRIX particularity is good for matching IRIX binaries: the
/lib/libc.so.1, which is quite unusual. Another good point
to examine is that because it maintains three different ABIs, IRIX has
different sets of libraries, and therefore three different interpreters:
/lib/libc.so.1 for o32 binaries,
/lib32/libc.so.1 for n32 and
/lib64/libc.so.1 for n64. It is therefore quite easy to check whether a dynamic executable is an IRIX o32 binary: we just have to peek at the
.interp section and see if the interpreter is
Static binaries are trickier. At a glance, the only difference
between o32 and n32 static binaries is the presence of a
.MIPS.options section in o32 static binaries. This could have been a good test, but
unfortunately, it has some false negatives. One can find a few static
binaries in IRIX 5 that do not have this ELF section (the
But since IRIX's file(1) command is able to distinguish o32 and n32, there must
be a reliable difference. The answer is in an IRIX header file:
/usr/include/sys/elf.h. This file defines the ELF header, in which we
can find an
e_flags field. Two bits in this field are used for IRIX binaries
to distinguish between the three ABIs. In order to tell if an IRIX
binary is o32, n32, or n64, we just have to check for these two bits in the ELF header.
This is quite a robust test to distinguish between o32, n32, and n64 IRIX binaries; however, it can still have some trouble when it is used to distinguish between IRIX and non-IRIX static binaries. Fortunately, there are not a lot of static binaries on an IRIX 6.5 system, and it is not trivial to build static o32 programs. (In fact, I was not able to find out how to do it.) Hence, we can afford to use a weak matching scheme; as long as we can match the few static binaries from IRIX 6.5 correctly, we are safe.
Emmanuel Dreyfus is a system and network administrator in Paris, France, and is currently a developer for NetBSD.
Copyright © 2009 O'Reilly Media, Inc.