Multi-Homing: Connecting to Two ISPs
by Vincent Jones
Vincent Jones will be speaking at the O'Reilly Open Source Convention in San Diego, CA, July 23-27, 2001.
Many organizations depend upon Internet connectivity to support critical applications. One popular approach for improving Internet connectivity is to connect to more than one Internet service provider (ISP), a technique called multi-homing.
Multi-homing can be very effective for ensuring continuous connectivity -- eliminating the ISP as a single point of failure -- and it can be cost effective as well. However, your multi-homing strategy must be carefully planned to ensure that you actually improve connectivity for your company, not degrade it.
The concept of physical diversity
First, I want to discuss the network components that can affect overall connectivity. Because most network failures are due to problems in the WAN links, it does little good to connect to a second ISP if both ISP links are carried over the same communications circuit. Even independent circuits, if they are not physically diverse, will still be subject to common failure events such as construction work inside your building or digging in the street outside.
Providing complete physical diversity can be difficult and expensive, but the requirement is not limited to ISP connections. All critical network links for internal communications should also be diversified. Assuming an otherwise well-designed internal network, the easiest way to achieve physical diversity in your ISP connections is to connect from two different locations that are already well-connected to each other. But they must be far enough apart that they don't share any common communications facilities to either ISP.
Redirecting traffic using the Border Gateway Protocol
Once physical connectivity is in place, you need to make it useful. Taking advantage of redundant links requires two conditions to always be present. First, you must be able to detect when a link has failed. Second, you must have a mechanism for redirecting traffic that would normally flow across a failed link to take a different path that is still functional. In a multi-homing environment, both tasks are normally achieved by running Border Gateway Protocol (BGP) between your routers and those of the ISPs.
BGP is often assumed to mean complex configurations on expensive, high-end routers to handle the huge routing tables required to fully describe the Internet. However, depending upon the specific application requirements and the degree of load-balancing you want across all available links, it may be practical to implement multi-homing using the smallest routers you have available that are capable of handling the traffic load.
In other words, implementing multi-homing doesn't have to be an all-or-nothing choice. There are choices you can make along the way based upon the equipment you have available and the level of connectivity you need to provide.
Determining level of connectivity required
At one extreme, when your goal is simply to provide internal users with access to the Internet, you don't need to run BGP at all. As long as the link layer protocol supports the exchange of keep-alive messages from router to router, link failure will be detected by the link layer protocol. Floating static routes can then reliably direct all outbound traffic to a working ISP link.
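The floating static route behavior can be modeled in a few lines of Python. This is only an illustrative sketch, not router configuration: the `StaticRoute` class, the link names, the next-hop addresses (taken from documentation ranges), and the administrative distance values are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StaticRoute:
    """A hypothetical static default route bound to one ISP link."""
    link: str       # name of the outbound link (assumed)
    next_hop: str   # ISP router address (documentation range, assumed)
    distance: int   # administrative distance; lower is preferred


def select_default(routes: List[StaticRoute],
                   link_up: Dict[str, bool]) -> Optional[StaticRoute]:
    """Return the preferred route among those whose link is up.

    link_up stands in for what the link layer keep-alives report.
    """
    usable = [r for r in routes if link_up.get(r.link, False)]
    if not usable:
        return None
    return min(usable, key=lambda r: r.distance)


routes = [
    StaticRoute("isp_a", "192.0.2.1", distance=1),       # primary
    StaticRoute("isp_b", "198.51.100.1", distance=200),  # floating backup
]

both_up = select_default(routes, {"isp_a": True, "isp_b": True})
a_down = select_default(routes, {"isp_a": False, "isp_b": True})
print(both_up.link, a_down.link)
```

The backup route "floats" above the primary because of its higher distance: it is selected only when the keep-alives report the primary link as down.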
Network Address Translation (NAT) is then used to send outbound packets with a source IP address associated by the ISP with that outbound link. Return traffic will automatically come back via the same working link because that link is the only link servicing that address range.
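As a rough sketch of why return traffic follows the working link, the source address can be chosen from whichever ISP's address block matches the outbound link. The pool ranges, link names, and the `translate_source` helper below are hypothetical, and a real NAT implementation would also track ports and sessions and avoid the network and broadcast addresses.

```python
import ipaddress

# Hypothetical address blocks assigned by each ISP (documentation ranges).
NAT_POOLS = {
    "isp_a": ipaddress.ip_network("192.0.2.0/24"),
    "isp_b": ipaddress.ip_network("198.51.100.0/24"),
}


def translate_source(private_src: str, outbound_link: str) -> str:
    """Map a private source address into the outbound ISP's pool.

    The host bits of the private address are reused so the mapping is
    deterministic; this is a simplification for illustration only.
    """
    pool = NAT_POOLS[outbound_link]
    host_bits = int(ipaddress.ip_address(private_src)) & int(pool.hostmask)
    return str(ipaddress.ip_address(int(pool.network_address) | host_bits))


via_a = translate_source("10.0.0.25", "isp_a")
via_b = translate_source("10.0.0.25", "isp_b")
print(via_a, via_b)
```

Because each translated address belongs to only one ISP's block, replies can only be routed back through that ISP.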
Of course this approach will not work if you are providing services to the outside world, as the addresses associated with the failed link will disappear. Similarly, connections that were established over the link that failed will need to be reconnected. However, for many applications this impact is minor.
For example, a typical web surfer would merely need to hit the "page refresh" button. This approach is also sufficient to provide high-availability virtual private networks (VPNs) across the Internet if you use a routing protocol such as OSPF to detect and route around failed IPSec tunnels.
Vincent Jones will be presenting the tutorial, Network Design for High Availability, at the O'Reilly Open Source Convention in San Diego, CA, July 23-27, 2001.
The other extreme would be when you need to support a common IP address range using both ISPs. Then you need to run BGP. This will normally be the case any time your applications include providing services to Internet users, such as access to a common database. You will need to arrange for both ISPs to accept your BGP advertisements of your IP address prefixes. Then your ISPs need to advertise those address prefixes to the rest of the Internet.
Getting your address prefixes advertised is usually not a problem. You do, however, have to use care in your configuration to ensure that you do not inadvertently advertise any other address prefixes. In particular, you must ensure that you do not advertise yourself as a path between the two ISPs. This could cause your links to be consumed by transit traffic of no interest to you. More challenging is setting up your advertisements so that incoming traffic is reasonably balanced between the ISP links. Achieving that can be difficult at best, and nearly impossible at worst.
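The rule against advertising anything but your own prefixes amounts to an outbound filter. The following Python sketch models that filter; it is not BGP configuration, and the `OWN_PREFIXES` list, the `outbound_advertisements` function, and all of the prefixes (documentation ranges) are assumptions for illustration.

```python
import ipaddress
from typing import Iterable, List

# Hypothetical address block owned by the multi-homed organization.
OWN_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def outbound_advertisements(candidates: Iterable[str]) -> List[str]:
    """Keep only prefixes that exactly match one of our own blocks.

    Anything learned from one ISP is dropped, so it is never
    re-advertised to the other ISP and we never become a transit path.
    """
    allowed = set(OWN_PREFIXES)
    return [p for p in candidates if ipaddress.ip_network(p) in allowed]


# Routing table contents: our own block plus routes learned from an ISP.
routing_table = ["203.0.113.0/24", "198.51.100.0/24", "0.0.0.0/0"]
advertised = outbound_advertisements(routing_table)
print(advertised)
```

Filtering by an explicit allow list, rather than by excluding known ISP routes, means a newly learned route can never leak out by accident.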
Choose the right route for you
The final decision is determining which routes to accept from each ISP. This can range from merely accepting a default route (used to detect if the link is up or down) to accepting all routes (so-called "running defaultless"). The former is usually insufficient, because it does not protect you from an ISP that suffers an internal failure cutting it off from the rest of the Internet. The latter requires "carrier-class" routers with lots of memory installed, which are therefore more expensive. Fortunately, there are some "in-between" choices.
Rather than using a simple default route, you can use a conditional default route to protect against ISP failure behind the ISP's router that serves you. A conditional default route is a default route that is defined by a router only if a specific address is already in that router's routing table. Each ISP is only used for a default route if it is advertising one or more routes that indicate it is receiving advertisements from the rest of the Internet. That way, you will always use a default route which promises to be useful.
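The conditional default route logic can be sketched as a simple check: an ISP is eligible to carry the default route only while at least one watched prefix is present among the routes it advertises. The `usable_defaults` function, the ISP names, and the watched prefixes below are hypothetical; in practice you would pick prefixes that the ISP can only be advertising if it is hearing from the rest of the Internet.

```python
from typing import Dict, List, Set


def usable_defaults(received: Dict[str, Set[str]],
                    watched: Dict[str, Set[str]]) -> List[str]:
    """Return the ISPs whose advertisements include a watched prefix."""
    return [isp for isp, prefixes in received.items()
            if prefixes & watched.get(isp, set())]


# Prefixes (documentation ranges) assumed to originate beyond each ISP.
watched = {
    "isp_a": {"198.51.100.0/24"},
    "isp_b": {"198.51.100.0/24"},
}

# isp_a is cut off internally: its link is up but it only sends local routes.
received = {
    "isp_a": {"192.0.2.0/24"},
    "isp_b": {"198.51.100.0/24", "203.0.113.0/24"},
}

eligible = usable_defaults(received, watched)
print(eligible)
```

Here only `isp_b` qualifies, even though the link to `isp_a` is still up, which is exactly the failure a plain default route cannot detect.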
Another option is to have the ISP send you just its local routes. That way, you can optimize your outbound routing so that packets one ISP could deliver locally are not sent to the other ISP, which would add to delivery delays. Care must be taken when using this option, however, because some ISPs have so many local routes that there is no cost benefit in the size of the routers required to handle them compared to running defaultless.
Options can also be combined. In many cases, taking local routes and a conditional default route will provide all the availability benefits of running defaultless, while still allowing the use of low-cost routers. As is always the case in networking, a good understanding of the requirements and the available capabilities is essential to maximizing cost-effectiveness.
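To illustrate how local routes and a conditional default route work together, here is a minimal Python model of the resulting forwarding decision using longest-prefix matching. The `build_table` and `lookup` helpers, the ISP names, and every prefix are assumptions for the example, and the sketch does not model BGP withdrawing an ISP's local routes when its link goes down.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]


def build_table(local_routes: Dict[str, List[str]],
                default_ok: Dict[str, bool],
                preference: List[str]) -> List[Route]:
    """Combine each ISP's local routes with one conditional default."""
    table: List[Route] = []
    for isp, prefixes in local_routes.items():
        for prefix in prefixes:
            table.append((ipaddress.ip_network(prefix), isp))
    # Install the default via the first preferred ISP that passes its check.
    for isp in preference:
        if default_ok.get(isp, False):
            table.append((ipaddress.ip_network("0.0.0.0/0"), isp))
            break
    return table


def lookup(table: List[Route], dst: str) -> Optional[str]:
    """Pick the ISP whose matching route has the longest prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, isp) for net, isp in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


local_routes = {"isp_a": ["192.0.2.0/24"], "isp_b": ["198.51.100.0/24"]}
table = build_table(local_routes, {"isp_a": True, "isp_b": True},
                    preference=["isp_a", "isp_b"])

local_hit = lookup(table, "198.51.100.7")   # customer of isp_b
elsewhere = lookup(table, "203.0.113.9")    # falls through to the default
print(local_hit, elsewhere)
```

The table holds only a handful of entries per ISP plus one default, which is why this combination can run on routers far smaller than those needed for full Internet routes.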