Boosting Internet Access Resilience: ISP Multihoming Explained

Internet access for organizations today is no longer just about connectivity for email and web browsing. A stable Internet connection is a vital component in the chain of IT systems required to conduct business. In the past, the focus around Internet connectivity was typically on cost, with vendors offering solutions that let organizations spread their traffic across consumer and enterprise products.

This approach works well and can provide significant cost savings, especially when employee traffic is directed over low-cost consumer products such as ADSL. However, when conducting B2B transactions through front-end servers hosted in your DMZ, resilience becomes a major concern. In this scenario, a dead Internet link can mean loss of revenue and, potentially more serious, brand damage. In this paper, we discuss several methods that can improve the resilience of an Internet link. While this sounds like it should be a simple case of connecting to multiple Internet Service Providers, the devil, as they say, is in the detail.

Mission-critical Internet

Business networks have been mission-critical for some time now, and resilience and business continuity have always been top of mind for any CIO. However, that focus was generally restricted to internal networks and systems. With more and more business being conducted directly via the web or B2B over Internet links to systems hosted in DMZs, it is no longer acceptable for an Internet link to be down. Loss of access to the Internet can directly impact revenue generation, especially today, as business operating models shift towards offsite cloud computing and software as a service.

A solution to the problem

Multihoming is a method whereby a company connects to multiple ISPs simultaneously. The concept was born out of the need to protect Internet access in the event of either an ISP link failure or an ISP internal failure. In the earlier days of Internet access, most traffic was outbound, with the exception of email. An Internet link failure left internal users with no browsing capability and with email backing up on inbound ISP mail gateways. Once the link was restored, so were browsing and email delivery. The direct impact on the business was small and mostly did not affect revenue. Early solutions to this problem connected multiple links to the same ISP, but while this offered some level of link resilience, it provided no safeguard against an internal ISP failure.

Today, however, most organizations host many Internet-facing services on site, such as VPNs, voice services, webmail, and secure internal system access, while also using business-critical offsite services such as software as a service (SaaS) and other cloud-based solutions. Furthermore, while corporate front-end websites are traditionally hosted offsite with web hosting firms, the real-time information on the corporate websites and B2B sites is provided by back-end systems based in the corporate data center or DMZ. Without a reliable Internet connection, these vital links would be severed.

Varied requirements and complexity

That said, the requirements for multihoming are varied and can range from the simple need for geographic link diversity (single ISP) to full link and ISP resilience, where separate links are run from independent data centers to different ISPs. The latter is the most complex option to deploy but affords the highest availability; the former provides some degree of protection but depends on a higher grade of ISP. A major part of the complexity lies in IP addressing.

The Internet IP addressing system works because each ISP applies for a range of addresses from the Internet registry for its region. The ISP then allocates a range of IP addresses, called an address space, to each of its customers from this pool. No two ISPs can issue the same address space to a customer. Why would this be a problem? Simply put, it’s all about routing. Routing is the process whereby the Internet finds out how to get traffic to your particular server. It’s a bit like Google Maps for the Internet.

For somebody to find your server, a “route” or path needs to exist to the IP address of your server. Since you are getting your Internet service, and hence your IP address space, from your ISP, the ISP is responsible for publishing the route to your server across the entire Internet. The ISP is effectively the source of your route; nobody else can do that for your particular address space. You can see how things can go wrong if the ISP suffers an internal failure. If your specific route disappeared, your server would vanish from the Internet, even if your Internet link itself were still up. This is precisely the kind of issue multihoming tries to solve, but for completeness, we will start with the simpler options and work our way up.
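To make the routing dependency concrete, here is a minimal sketch of a route lookup that picks the most specific matching prefix. The prefixes, the server address, and the labels "isp-a" and "isp-b" are illustrative assumptions (the addresses come from ranges reserved for documentation), and a real router's table is far larger and populated by routing protocols rather than by hand.

```python
import ipaddress


def lookup(routing_table, destination):
    """Return the next hop for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routing_table.items() if addr in net]
    if not matches:
        return None  # no route: the destination is unreachable
    _, best_hop = max(matches, key=lambda m: m[0].prefixlen)
    return best_hop


# Hypothetical view of the Internet's routes (documentation prefixes).
table = {
    ipaddress.ip_network("203.0.113.0/24"): "isp-a",
    ipaddress.ip_network("198.51.100.0/24"): "isp-b",
}

server = "203.0.113.10"  # assumed address of the public server
before = lookup(table, server)

# Simulate an internal ISP failure that withdraws the route.
del table[ipaddress.ip_network("203.0.113.0/24")]
after = lookup(table, server)

print(before, after)
```

Once the route is withdrawn, the lookup finds no path to the server even though nothing in the sketch models the physical link going down, which is the failure mode described above.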

Single Link, Single ISP, Multiple address spaces

While not a multihoming solution in the strictest sense, the single-link, multiple-address option can be useful for small sites. In this scenario, the publicly accessible host is assigned two IP addresses from two different address spaces. You would, of course, need two address spaces from your ISP for this to work. Thus, theoretically, if a routing issue impacts one of the address spaces, the other may still be available. The single physical ISP link is, of course, a single point of failure, so this option offers little in the way of real resilience.

Multiple Links, Single ISP, Single address space per link

This scenario, generally called multi-attached, is a variation of the above. The site now connects through multiple links, each with a different IP address space, but still via a single ISP. If one of the links fails, its IP addresses will become unreachable. However, the IP address on the remaining link will stay available, and your server will still be reachable. Internet Service Providers manage their IP routes with a routing protocol called Border Gateway Protocol, or BGP. This protocol is used to drive the re-routing of traffic over the live link. BGP can be complex and demands a lot from the equipment it runs on. Of course, with complexity comes cost; however, the BGP deployment for this scenario is not as demanding as that of a fully multihomed site and should not attract too much attention from the CFO. While the deployment is a simpler version of full multihoming, it does restrict the corporation to a single ISP, which may not fit the business’s strategic intent.
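The multi-attached behavior can be sketched in a few lines. This is a simplified model under stated assumptions: the link names, the two prefixes, and the server addresses are hypothetical, and a link failure is modeled simply as its prefix no longer being routed.

```python
import ipaddress

# Hypothetical multi-attached site: one ISP, two links,
# one address space per link (documentation addresses).
LINK_PREFIXES = {
    "link-1": ipaddress.ip_network("192.0.2.0/25"),
    "link-2": ipaddress.ip_network("192.0.2.128/25"),
}

# The public server holds one address from each address space.
SERVER_ADDRESSES = ["192.0.2.10", "192.0.2.138"]


def reachable_addresses(links_up):
    """Return the server addresses whose address space is on a live link."""
    live = [LINK_PREFIXES[name] for name in links_up]
    return [
        addr
        for addr in SERVER_ADDRESSES
        if any(ipaddress.ip_address(addr) in prefix for prefix in live)
    ]


both_up = reachable_addresses({"link-1", "link-2"})
link1_down = reachable_addresses({"link-2"})

print(both_up)
print(link1_down)
```

With link-1 down, only the address in the second address space stays reachable, so clients need some way of learning the surviving address (DNS, for example) for the server to remain usable.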

Multiple Links, Multiple ISPs, Single address space

This scenario is what is generally meant when discussing multihoming. BGP manages the visibility of the single address space across the multiple links and ISPs and thus maintains the routes. BGP runs between the corporate routers and those of the two ISPs. The protocol can detect a link failure and divert traffic to the functioning link, even if this is via a different ISP network.
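The failover idea can be illustrated with a toy model of path selection. This is a sketch, not BGP itself: real best-path selection weighs several attributes in sequence, whereas this model assumes only AS-path length matters. The prefix, the AS numbers (taken from the range reserved for documentation), and the ISP labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    prefix: str      # the address space being announced
    via_isp: str     # which ISP carried the announcement
    as_path: tuple   # AS numbers from the observer to the origin


def best_path(advertisements, prefix):
    """Pick the advertisement with the shortest AS path (simplified rule)."""
    candidates = [a for a in advertisements if a.prefix == prefix]
    if not candidates:
        return None  # prefix has vanished from the routing table
    # Tie-break on ISP name only to keep the result deterministic.
    return min(candidates, key=lambda a: (len(a.as_path), a.via_isp))


SITE_PREFIX = "203.0.113.0/24"  # assumed provider-independent space
SITE_ASN = 64500                # assumed site AS number

ads = [
    Advertisement(SITE_PREFIX, "isp-a", (64496, SITE_ASN)),
    Advertisement(SITE_PREFIX, "isp-b", (64497, 64498, SITE_ASN)),
]

normal = best_path(ads, SITE_PREFIX)

# The link to isp-a fails, so its announcement is withdrawn.
surviving = [a for a in ads if a.via_isp != "isp-a"]
after_failure = best_path(surviving, SITE_PREFIX)

print(normal.via_isp, after_failure.via_isp)
```

Because the same address space is announced through both ISPs, withdrawing one announcement leaves the prefix reachable through the other, and the server keeps its address throughout.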

What’s the catch?

There is always a catch; in this case, there are several. To run true dual-ISP multihoming and BGP as a corporation, you would need your own provider-independent (PI) IP address space, and you would need to apply for a unique BGP Autonomous System Number (ASN). The AS Number identifies your site as a valid Internet location in the eyes of BGP. While applying for an ASN is not arduous, it places significant responsibility squarely on you instead of the ISP. Deploying BGP effectively brings your organization closer to the Internet by making you responsible for advertising your public IP address spaces and, thus, your routes. It also means that any operational mistakes you make can ripple out across the Internet.
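One reason a registered ASN is needed is that not every AS number is usable on the public Internet. The sketch below classifies an AS number against the special-purpose ranges as we understand them from the IANA registry and the related RFCs; treat the ranges as assumptions to verify against the current registry. The function name classify_asn is hypothetical.

```python
def classify_asn(asn):
    """Classify an AS number as reserved, documentation, private, or public."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("ASN must fit in 32 bits")
    # Reserved values, including AS_TRANS (23456) and the last ASN of each size.
    if asn in (0, 23456, 65535, 4294967295) or 65552 <= asn <= 131071:
        return "reserved"
    # Ranges set aside for documentation and examples.
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"
    # Private-use ranges, not to be advertised on the public Internet.
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"
    return "public"


for candidate in (64512, 64500, 3356, 0):
    print(candidate, classify_asn(candidate))
```

A multihomed site announcing its own address space to two ISPs needs a number that falls in the "public" category, which is what the registry application provides.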

Address space considerations

Most large organizations that operate true multihoming already have their own provider-independent address space. They requested this address space directly from their Internet registry some time ago, before IP version 4 (IPv4) addresses started running out. Today, it is virtually impossible to be allocated a PI address space from the IPv4 pool. It is possible to run a multihomed scenario using ISP-provided IP address spaces, but the network configurations become considerably more complex and, at some point, start defeating the goal of increasing resilience. In the real world, increased complexity seldom equates to improved stability.