On the ‘net: Nothing to Hide, Everything to Gain

The IETF Journal published an article by Shawn and I last week about open source, open standards, and participation by vendors and providers.

Why should a provider—particularly a content provider—care about the open standards and open source communities? There is certainly a large set of reasons why edge-focused content providers shouldn’t care about the open communities. A common objection to working in the open communities often voiced by providers runs something like this: Isn’t the entire point of building a company around data—which ultimately means around a set of processing capabilities, including the network—to hide your path to success and ultimately to prevent others from treading the same path you’ve tread? Shouldn’t providers defend their intellectual property for all the same reasons as equipment vendors?

Middleboxes and the End-to-End Principle

The IP suite was always loosely grounded in the end-to-end principle, defined here (a version of this paper is also apparently available here), is quoted in RFC2775 as:

The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the endpoints of the communication system. Therefore, providing that questioned function as a feature of the communication system itself is not possible. … This principle has important consequences if we require applications to survive partial network failures. An end-to-end protocol design should not rely on the maintenance of state (i.e. information about the state of the end-to-end communication) inside the network.

How are the Internet and (by extension) IP networks in general doing in regards to the end-to-end principle? Perhaps the first notice in IETF drafts is RFC2101, which argues the IPv4 address was originally a locater and an identifier, and that the locater usage has become the primary usage. This is much of the argument around LISP and many other areas of work—but I think 2101 mistates the case a bit. That the original point of an IP address is to locate a topological location in the network is easily argued from two concepts: aggregation and interface level identification. Host can have many IP addresses (one per interface). Further, IP has always had the idea of a “subnet address,” which means a single address space that describes more than one host; this single subnet address has always been carried as a “single thing” in routing. There have always been addresses that describe something smaller than a single host, and there have always been addresses that describe several hosts, so the idea of a single address describing a single host has been built in to IP from the very beginning.

RFC2101 notes that the use of an IP address for both identifier and locater has meant that applications began relying on the IP address to remain constant early on. This resulted in major problems later on, as documented in various drafts around Network Address Translators (NATs). For instance—

RFC3234 relies on the state required to keep NAT translations in a local table for much of its analysis—although other kinds of middle boxes that do not use NAT to function are considered, as well. Most of the middle boxes considered fail miserably in terms of supporting transparent network connectivity because of the local application specific state they keep. A new, interesting draft has been published, however, that pushes back just a little on the IETFs constant publication of documents discussing the many problems of middelboxes. Beneficial Functions of Middleboxes, a draft written by several engineers at Orange, discusses the many benefits of middle boxes; for instance—

  • Middleboxes can contribute to the measurement of packet loss, round trip times, packet reordering, and throughput bottlenecks
  • Middleboxes can aid in the detection and dispersion of Distributed Denial of Service (DDoS) attacks
  • Middleboxes can act as performance enhancing proxies and caches

So which is right? Should we vote with the long line of arguments against middle boxes made throughout the years in the IETF, or should we consider the benefits of middleboxes, and take a view more along the lines of “yes, they add complexity, but the functionality added is sometimes worth it?” Before answering the question, I’d like to point out something I find rather interesting.

The end-to-end rule only works in one direction.

This is never explicitly called out, but the end-to-end rule has always been phrased in terms of the higher layer “looking down” on the lower layer. The lower layer is never allowed to change anything about the packet in the end-to-end model, including addresses, port numbers, fragmentation—even, in some documents, the Quality of Service (QoS) markings. The upper layer, then, is supposed to be completely opaque to the lower layers. On the other hand, there is never any mention made of the reverse. Specifically, the question is never asked: what should the upper layers know about the functioning of the lower layers? Should the upper layer really be using a lower layer address as an identifier?

What is primarily envisioned is a one-way abstraction; the upper layers can see everything in the lower layers, but the lower layers are supposed to treat the upper layers as a opaque blob. But how does this work, precisely? I suspect there is a corollary to the Law of Leaky Abstractions that must say something like this: any abstraction designed to leak in one direction will always, as a matter of course, leak in both directions.

I suspect we are in the bit of mess we are in with regards to NATs and other middle boxes because we really think we can make a one-way mirror, a one way abstraction, actually work. I cannot think of any reason why such a one-way abstraction should work.

To return to the question: I think the answer must be that we must learn to live with middleboxes. We can learn to put some rules around middleboxes that might make them easier to deal with—perhaps something like:

  • Middleboxes would be a lot easier if we stopped trying to stretch layer 2 networks to the entire globe, and the moon beyond
  • Middleboxes are a fact of life that applications developers need to work with; stop using locators as identifiers in future protocol work
  • Network monitoring and management needs to stop relying on IP addresses to uniquely identify devices

At any rate, whatever you think on this topic, you now have the main contours of the debate in the IETF in hand, and direct links to the relevant documents.


The audio is a bit low on this video; it actually recorded a bit lower than this, and the process of amplifying made it poor quality. I’ve already separated the audio recorder from the camera; now I’m upgrading the mic, and playing with the settings on the recorder to make the audio better quality. I’m also playing with putting text and drawings on the side of the video, a concept I intend to use more often in the future.

So this is a bit of a play video, but with a somewhat serious topic: the importance of learning the history of network engineering. Some resources are included below.

A bit history of the Internet
Net Heads versus Bell Heads
On the History of the Shortest Path Problem
The Elements of Networking Style
Software Defined Networks has a great introductory section with a good bit of history