OpenBSD 4.0: Pufferix's Adventures
by Federico Biancuzzi
On October 18th, OpenBSD celebrated its 11th birthday and ten years of punctual biannual releases. Now it's time for OpenBSD version 4.0, which includes tons of new drivers for wireless, network, and storage chips. Discover what's new and what battles developers must face daily to access documentation and support new hardware.
Warning: Federico Biancuzzi interviewed nearly 20 developers and assembled this long interview under the influence of Humppa-style music!
What is new in OpenBSD 4.0 for wireless drivers?
Damien Bergamini: Five new drivers for WLAN devices have been committed in OpenBSD 4.0. These drivers support the following chipsets:
acx(4): TI ACX100/ACX111
pgt(4): Conexant/Intersil Prism GT Full-MAC
rum(4): Ralink Technology RT2501USB
wpi(4): Intel PRO/Wireless 3945ABG
uath(4): Atheros AR5005UG/AR5005UX USB2.0
All these drivers require firmware that is not freely redistributable, with the exception of rum(4), for which Ralink Technology has allowed us to redistribute the necessary firmware files under a BSD-style license.
uath(4) was imported just before the release and is pretty much work in progress so don't expect too much from it for the moment. Work on it is slow as it is based on reverse-engineering efforts (there is absolutely no documentation for it, not even in the form of a Linux driver).
OpenBSD is the first open-source operating system to have support for the Intel PRO/Wireless 3945ABG and the Atheros USB2.0 adapters without the need for blobs.
The zyd(4) driver (for the ZyDAS ZD1211 chipset) didn't make it into 4.0 due to some remaining issues in the TX path that we were unable to fix in time for the release. These issues are now fully understood, so we'll have a working zyd(4) driver very soon.
A lot of changes have been made to our generic 802.11 layer and to the existing drivers (e.g. ral(4)) to improve interoperability when operating as an access point with a mix of 802.11b and 802.11g client stations.
The rate control algorithm that was used in ural(4) (AMRR) has been made generic and is now part of net80211. It is used by other drivers like acx(4), and the upcoming
ifconfig(8) can now report the received signal strength as a percentage thanks to work done by Reyk Floeter.
New developers are being involved in wireless development now which is very encouraging for the future of wireless support in OpenBSD.
I read that this release includes some new Gigabit Ethernet drivers for chips made by Marvell/SysKonnect and Broadcom. I thought these vendors didn't give away documentation, so what happened?
Mark Kettenis: OpenBSD 4.0 has a new msk(4) driver for the Marvell Yukon 2 Gigabit NICs. These are not radically different from the older SysKonnect NICs supported by the sk(4) driver; Marvell basically replaced the DMA engine while keeping other parts of the chip the same. Still, it took me quite a bit of time to produce a working driver because Marvell doesn't give us documentation. It took me hours of staring at the Linux sky2 driver before I grasped how the new bits worked. Figuring out how interrupts worked was especially hard since Linux does its interrupt handling in a completely different way now. An open source driver really isn't a substitute for proper documentation of the hardware. I can see why Marvell is reluctant to release documentation; the hardware is full of bugs and the list of errata would be embarrassingly long.
Brad Smith: These vendors do not give away documentation and the situation has not changed. For the Broadcom chipsets, what has changed is that Broadcom, as of six months ago, has provided FreeBSD with a bce(4) driver for their new NetXtreme II family of Gigabit chipsets.
Within days of the driver being committed, Theo de Raadt made contact with the Broadcom engineer who wrote the driver, David Christensen; two weeks later he was able to provide Theo and me with engineering samples. With assistance from Reyk Floeter, the driver was ported over to OpenBSD as bnx(4), though with a somewhat rough implementation of the bus_dma code and some known issues at that time; it was a work in progress.
Two weeks later Marco Peereboom managed to get hold of a new Dell server in his lab, which happened to include a set of the PCI Express NetXtreme II chipsets. This, in combination with the fact that the chipset will be very common soon, piqued his interest in trying to assist me with resolving the major remaining issue with the bus_dma code in the TX path. We had two testing and debug sessions, and Marco was able to come up with the minimal set of bus_dma changes to get the driver going. There were still known issues at this point, but it provided us with a driver that was very much usable by the time the release rolled around.
Since the release, improvements have been made in the TX path code to make the driver more robust under heavy UDP transmit traffic load, and to make the driver use the loadfirmware(9) framework so that the firmware bloat can be removed from the kernel.
Although this is a fairly decent vendor driver for a change, it is no substitute for having the proper hardware documentation.
pf(4) now supports Unicast Reverse Path Forwarding (uRPF) checks for simplified ingress filtering. What does this mean, in concrete terms?
Damien Miller: uRPF verifies that the source address of packets received on a network interface matches the routing table. It can be used to filter packets that arrive from unexpected directions, such as ones with spoofed source addresses.
It is similar to the "antispoof" keyword that exists in pf.conf(5) already, but antispoof only works for directly connected networks whereas uRPF works for networks one or more hops away, at the cost of being a little more permissive.
A good description of uRPF can be found in RFC3704. Our uRPF implementation is what they call "Strict RPF" and suffers from the main limitation that they describe: it does not work properly when asymmetric routes are present. It would be cool to have an implementation of "Feasible Path RPF", but that would require a higher degree of cooperation with the routing daemons than presently exists.
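The strict check described above can be pictured with a small simulation. This is only an illustrative sketch in Python, not pf's implementation: the routing table, the interface names (em0, em1, em2), and the function names are all made up for the example.

```python
import ipaddress

# Hypothetical routing table: (prefix, interface used to reach it).
# These entries are invented for illustration; they are not pf data.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/8"), "em1"),
    (ipaddress.ip_network("10.1.0.0/16"), "em2"),
    (ipaddress.ip_network("0.0.0.0/0"), "em0"),
]


def best_route_interface(src):
    """Longest-prefix match: the interface the host would use to reach src."""
    addr = ipaddress.ip_address(src)
    matches = [(net, ifname) for net, ifname in ROUTES if addr in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def urpf_strict_pass(src, in_ifname):
    """Strict RPF: accept only if the route back to src leaves via the
    same interface the packet arrived on."""
    return best_route_interface(src) == in_ifname
```

In this toy table, a packet claiming to come from 10.1.2.3 passes only when it arrives on em2. The same lookup also shows the asymmetric-routing limitation: a legitimate packet from 10.1.2.3 that happens to arrive on em1 would be rejected.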
You have enabled adaptive timeouts by default in pf. Why?
Henning Brauer: We have had that feature--adaptive timeouts--in pf for a long time. The more the state table grows toward its limit, the shorter the timeouts; that is, the more aggressively we time out old states. Since no new states can be established when the state table is full, you really don't want your state table to fill up. Adaptive timeouts help a lot here, and timing out old states in that case is way better than preventing new connections. Thus, following the "sane defaults" paradigm, we have enabled adaptive timeouts by default. The parameters for adaptive timeouts are calculated relative to the state table limit.
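To make the scaling concrete, here is a rough Python sketch of the arithmetic. The interview does not give the numbers; the 60% and 120% thresholds and the linear factor are taken from the description of adaptive.start and adaptive.end in pf.conf(5), and the function name and integer rounding are assumptions made for this example.

```python
def scaled_timeout(base_timeout, states, state_limit=10000):
    """Scale a pf timeout (in seconds) by the current state table load.

    Assumed defaults, per pf.conf(5): scaling starts at 60% of the state
    limit (adaptive.start) and would reach zero at 120% (adaptive.end).
    """
    start = state_limit * 6 // 10
    end = state_limit * 12 // 10
    if states <= start:
        return base_timeout          # table lightly loaded: no scaling
    if states >= end:
        return 0                     # would purge states immediately
    # Linear factor: (end - states) / (end - start)
    return base_timeout * (end - states) // (end - start)
```

With a state limit of 10000 and a base timeout of 86400 seconds, the timeout stays at 86400 up to 6000 states and drops to 43200 at 9000 states. Because the table cannot hold more states than the limit, the timeout in this sketch never actually reaches zero with these defaults.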
It seems that you developed some features that let dhcpd(8) interact with pf. Could you tell us more?
Chris Kuethe: PF/dhcpd integration was motivated by the fact that we have an open wireless network at the University of Alberta that was suffering from users camping on addresses, and ill-maintained machines spreading viruses. It was the spread of worms that we most wanted to control. Infected machines easily generate several thousand states, several hundred complaints at the abuse desk and often slow the network to a crawl. And it's just rude to allow an infected machine off your net.
PF's "overload" table is a partial solution to this. Excessively chatty machines would have their states torn down and would be placed into a table whose members were denied further network access. Unfortunately there was no easy mechanism to remove an address from the table automatically, which would lead to a fairly quick denial of service. I remedied this by making it possible for dhcpd to remove an address from the "overload" table when it was leased to a new hardware device.
Second, we found that dhcpd was abandoning a significant part of our address space because machines were somehow camping on an address--using that address without properly leasing it. To discourage this behaviour, I made dhcpd add abandoned addresses to a table and remove them from that table when the address was properly leased. Machines in the "campers" table can be redirected to a web page instructing the user to use DHCP and have further connectivity denied until they do use DHCP. As soon as an address is leased, it is removed from the overload table.
Users of these features are cautioned against placing too much trust in hardware or IP addresses as they can be easily changed with
dhcpd should be treated as a nuisance mitigation technique; it doesn't completely solve the problem of infected machines, but it does help in keeping you from getting completely swamped when the next worm comes racing through.
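The table bookkeeping Chris describes can be modelled in a few lines. This is a toy Python model and not dhcpd or pf code: the class, its method names, and the idea of remembering the last MAC address per IP are assumptions made purely to illustrate the "overload" and "campers" behaviour.

```python
class LeaseTables:
    """Toy model of two pf tables maintained alongside DHCP leases."""

    def __init__(self):
        self.overload = set()   # addresses denied network access
        self.campers = set()    # abandoned addresses (used without a lease)
        self.last_mac = {}      # ip -> hardware address of the last lease

    def on_overload(self, ip):
        """pf flagged this address as excessively chatty."""
        self.overload.add(ip)

    def on_abandon(self, ip):
        """dhcpd abandoned the address because something is squatting on it."""
        self.campers.add(ip)

    def on_lease(self, ip, mac):
        """dhcpd handed out a proper lease for ip to the device mac."""
        if self.last_mac.get(ip) != mac:
            # A new hardware device got the address: lift the old block.
            self.overload.discard(ip)
        # A proper lease always clears the camper status.
        self.campers.discard(ip)
        self.last_mac[ip] = mac
```

In this model, an overloaded address stays blocked if the same device simply renews its lease, and is released only when a different device leases it. As the caution above notes, hardware addresses are easy to change, so this is nuisance mitigation rather than a security boundary.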
Can you explain the new carp(4) group demotion feature? How can it improve reliability, and how does it interact with applications?
Marco Pfatschbacher: If you are running carp(4) on multiple interfaces and one of the interfaces fails, you want the remaining interfaces to be taken over to the backup host, which avoids routing one part of your traffic into a blackhole. Initially we were just bumping the advertisement skew on all carp(4) interfaces to a value of 240 in case of an error. In response to this, a backup host running in preempted mode would take over all carp(4) interfaces of the failed master.
In more complex setups, however, this all-or-nothing behaviour is not always optimal. To allow more control of which carp(4) interfaces fail over together, we converted the global demotion variable into an interface group attribute. Thus one can move interfaces that together provide one service into a separate group.
Additionally, the value of the demotion counter has been added to a previously unused field of the carp(4) protocol header. This allows us to act smarter in cases of multiple errors: Each error condition (e.g., a link failure) increases the demotion counter and the host with the lowest error count will become master.
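The "lowest error count wins" rule can be sketched as follows. This Python model is a deliberate simplification and not the carp(4) election logic: the Host class, the function names, and the use of the advertisement skew as a tie-breaker are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    advskew: int      # lower skew means a stronger preference for master
    demote: int = 0   # demotion counter: number of current error conditions


def record_error(host):
    """An error condition (e.g. a link failure) raises the counter."""
    host.demote += 1


def clear_error(host):
    """The error condition went away; never drop below zero."""
    host.demote = max(0, host.demote - 1)


def elect_master(hosts):
    """Pick the host with the fewest errors; ties go to the lowest advskew
    (the tie-break is an assumption of this sketch)."""
    return min(hosts, key=lambda h: (h.demote, h.advskew))
```

With two hosts, fw1 (advskew 0) and fw2 (advskew 100), fw1 is master while both are healthy. One error on fw1 hands the role to fw2, and if fw2 then accumulates two errors of its own, fw1 takes over again because it now has the lower count.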
We also made the group demotion counter accessible to userland, such that system daemons can control
bgpd(8) can now hold back a carp(4) takeover until it has synced its routing table.
sasyncd(8) [similarly] prevents carp(4) from preempting until it has received the complete SAs from the current master.
The current value of the demotion counter can be read or set via ifconfig -g group-name. To look at the default group "carp", for example:
$ ifconfig -g carp
carp: carp demote count 0