June 5, 2007
In part 1 of this series, I established the problem latency can cause in high-speed networks, what one reader aptly called "big long pipes." To summarize, in large-bandwidth networks that span long distances, latency becomes the bottleneck that retards performance, because of the impact network delay has on TCP windowing. In part 2, I will discuss what to do about it.
Dealing with latency can be a tricky business. The methods used to mitigate the impact of distance depend on many factors, including the services being accessed, the protocols being used, and the amount of money you want to spend. What works for a home user does not work for a multi-national corporation. In general, there are four approaches one can take to deal with latency:
- Tweak the host TCP settings
- Change the protocol
- Move the service closer to the user
- Use a network accelerator
The first and least effective method is to tweak the TCP settings on your hosts. I say least effective for several reasons: it is hard to determine the correct TCP window size; not all operating systems support the RFC 1323 extensions; you may not have control of all the hosts; and available bandwidth may change due to network congestion. Most importantly, some time-sensitive applications such as VOIP will still exhibit problems in high-latency networks, even if you tweak TCP. Still, if you are a home user on a big long pipe, this is the only option for you. Tuning TCP is OS specific. Slaptijack.com has an excellent series on TCP tuning for various operating systems. Below are links to his specific guides as well as other sources:
If you are going to tweak your TCP settings, you need to figure out what values to use. PJ turned me on to Iperf. Iperf is a bandwidth-measuring tool that will help you tune your settings by measuring the available bandwidth and RTT of the path. The Iperf site also has several good links to other tools and resources for tuning TCP settings. Check it out.
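Once you have bandwidth and RTT numbers from Iperf, the window size you need is the bandwidth-delay product. Here is a minimal sketch of the arithmetic; the link speed and RTT below are illustrative values, not measurements:

```python
# Bandwidth-delay product (BDP): the TCP window needed to keep a
# "big long pipe" full of unacknowledged data.
def tcp_window_bytes(bandwidth_bps, rtt_ms):
    """Return the minimum TCP window (in bytes) to saturate the link."""
    return int(bandwidth_bps * (rtt_ms / 1000.0) / 8)

# Example: a 45 Mbps link with 150 ms of round-trip latency.
window = tcp_window_bytes(45_000_000, 150)
print(window)  # 843750 bytes -- far beyond the classic 65535-byte
               # window, which is why RFC 1323 window scaling matters
```

If the computed window exceeds 65,535 bytes, your hosts must support window scaling to take advantage of it.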
One other host tweak that works strictly for browsing is to make sure your web browser has HTTP 1.1 support enabled. HTTP 1.1 allows persistent connections, which avoid repeated TCP handshakes, and lets the browser signal (via the Accept-Encoding header) that it can accept gzip-compressed transfers. If you run a web server, you can enable gzip compression to make life a little easier on your users. The downside to gzip is that it adds CPU overhead to your server.
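To get a feel for why gzip is worth the CPU cost on text-heavy pages, here is a quick illustration using Python's standard gzip module; the sample HTML is invented for demonstration:

```python
import gzip

# Repetitive markup (tables, boilerplate HTML) compresses dramatically,
# so far fewer packets cross the high-latency link.
html = b"<tr><td>row data</td></tr>\n" * 1000
compressed = gzip.compress(html)
print(len(html), len(compressed))  # original vs. compressed size
```

On this kind of repetitive payload the compressed size is a tiny fraction of the original; real pages compress less, but text generally still shrinks severalfold.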
For corporate networks, better options exist. Rather than tune individual hosts, networks and applications can be designed with latency in mind. One effective method is to change the protocol. Latency is a problem because TCP waits for an acknowledgement. UDP is considered a "less reliable" protocol because it does not wait for acks. For many applications UDP will work just as well as TCP, and it will work better over long connections. UDP works best for temporal data, such as streaming audio or video, or multiplayer games, but it can also be used for any application where the only data you care about is the current sample, such as stock prices or weather data. If you control the code, and can deal with lost or mis-ordered packets, UDP may be the way to go. For a good discussion of the differences in coding sockets in UDP instead of TCP, check out: http://michael.toren.net/mirrors/sock-faq/.
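As a minimal sketch of the fire-and-forget model, the sender below never waits for an acknowledgement, so latency cannot stall it; the loopback address, port handling, and sequence-number scheme are illustrative choices, not a prescription:

```python
import socket

def udp_demo(n=3):
    """Send n datagrams over loopback without waiting for any acks."""
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv.bind(("127.0.0.1", 0))          # let the OS pick a free port
    addr = recv.getsockname()

    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for seq in range(n):
        # Prefix a sequence number so the receiver can detect
        # lost or mis-ordered datagrams itself -- UDP will not.
        send.sendto(b"%d:sample" % seq, addr)

    got = [recv.recvfrom(1500)[0] for _ in range(n)]
    send.close()
    recv.close()
    return got

print(udp_demo())
```

Note that the application, not the transport, now owns reliability: if a sample is lost, the code must decide whether to ignore it (fine for stock ticks) or request it again.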
Another viable option is to move the service closer to the user. This is the approach major web sites take. Global companies such as Google and Yahoo create multiple points-of-presence (POPs) that are geographically dispersed. Users are then connected to the nearest POP through the use of global server load balancers (GSLBs). GSLBs generally work using DNS to direct the user to the closest server. The two big players in the GSLB space are Cisco GSS and F5 BigIP. I’ve used both in the past and prefer the Cisco GSS if you have a Cisco network. Some web sites use content delivery networks such as Akamai or RapidEdge to move the service closer to the user by creating a distributed content cache. Another approach is to move your network logically closer to the user. For instance, if you have a large number of users accessing your website from a specific ISP, consider getting a connection to their backbone. This will place you several hops closer to the user by keeping all your traffic “on net.”
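The DNS-based GSLB logic is simple at its core: answer each resolver with the address of the nearest POP. Here is a toy sketch; the region names, the POP table, and the RFC 5737 documentation addresses are all invented for illustration (real products like the GSS also factor in health checks and load):

```python
# Hypothetical POP table: region -> address of the closest data center.
POPS = {
    "us-east": "192.0.2.10",   # RFC 5737 documentation addresses
    "eu-west": "192.0.2.20",
    "ap-east": "192.0.2.30",
}

def resolve(client_region):
    """Return the A record a GSLB might hand this client's resolver."""
    return POPS.get(client_region, POPS["us-east"])  # default POP

print(resolve("eu-west"))   # the European client gets the EU POP
print(resolve("unknown"))   # unrecognized clients fall back to a default
```

Because the decision is made at DNS resolution time, it is only as accurate as the client's resolver location, which is one reason short TTLs are common in GSLB deployments.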
On corporate networks, placing services closer to the user usually involves placing servers in their offices. Three services that most IT shops support, and that benefit from localization, are authentication, DNS, and file services. Microsoft has three technologies that simplify the process of managing multiple distributed servers providing localized services. First, Microsoft Active Directory Services supports multi-master replication of all changes to the directory. This means that changes made on one domain controller, such as adding a new user, are replicated to all other domain controllers in the domain. Second, Microsoft Active Directory-integrated DNS also supports multi-master replication. This is much improved over the old master-slave DNS schemes. Third, Windows 2003 R2 now supports Distributed File Services Replication (DFSR), which provides multi-master replication of file shares. Both Active Directory and DFSR are location aware, meaning users will always attempt to access the local resource first.
If you cannot move the services, adding network accelerators to your network design is a good alternative. Typically, network accelerators are placed in between your WAN routers and your LAN switches. Accelerators use four tricks to deal with latency:
- Local TCP acknowledgment. The accelerator sends an ack back to the sending host immediately. This ensures that the sender keeps putting packets on the wire, instead of waiting for the ack from the actual recipient.
- UDP Conversion. The accelerators change the TCP stream to UDP to cross the WAN. When the packet reaches the accelerator on the far end, it is switched back to TCP. You can think of this as tunneling TCP inside of UDP, although unlike a VPN the UDP tunnel adds essentially no overhead to the stream.
- Caching. The accelerators notice data patterns and cache repeating information. When a sender transmits data that is already in the cache, the accelerators only push the cache ID across the WAN. An example of this would be several users accessing the same file from a CIFS share across your WAN. The accelerators would cache the file after the first user retrieves it, and use a token to satisfy subsequent requests.
- Compression. In addition to caching, network accelerators are able to compress some of the data being transmitted. The accelerator on the other end of the WAN decompresses the data before sending it to its destination. Compressed data can be sent in fewer packets, thus reducing the apparent time to send.
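The caching and compression tricks can be sketched together in a few lines. This is an illustrative model only: the chunking, the SHA-256 "token" format, and the one-byte type prefix are invented here, not how any particular vendor's accelerator works:

```python
import hashlib
import zlib

def encode(chunk, seen):
    """Encode a chunk for the WAN: a short token if the far side has
    already cached it, otherwise the compressed data itself."""
    token = hashlib.sha256(chunk).digest()
    if token in seen:
        return b"T" + token             # 33 bytes instead of the chunk
    seen.add(token)                      # far side caches it on first sight
    return b"D" + zlib.compress(chunk)   # first transfer: compressed data

seen = set()
chunk = b"quarterly-report" * 1000       # same file fetched twice
first = encode(chunk, seen)
second = encode(chunk, seen)
print(len(chunk), len(first), len(second))
```

The first transfer benefits from compression; the second collapses to a fixed-size token, which is why accelerators shine when many users pull the same files across the WAN.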
There are several good vendors of network accelerators. For those of you who care what Gartner thinks, their magic quadrant for this sector can be found here. My personal favorite in this group is the Juniper product line; they work well and scale nicely. Other vendors include:
When designing a global WAN (which was the original premise of this article), using a combination of localized services and network accelerators will drastically improve the performance of your network. As stated in part 1, you need to start your design by profiling the services using your network. Some services can be easily localized, such as authentication or file services. Other services, such as an ERP system or other large centralized database cannot be practically localized. For these applications, network accelerators are the way to go. As a general rule of thumb, if your WAN is expected to average more than 150ms latency end-to-end, you should consider latency mitigation strategies. If you have time sensitive applications such as VOIP or interactive terminal sessions (green screens), your threshold for mitigation is probably 100ms. If you are a home user on a fat pipe with a lot of latency, the best you can do is tweak your OS and wait for your ISP and the web sites or game servers you access to get better.
Other useful links when attempting TCP performance tuning: