Internet2 Security - Reconnections
Copyright © 2006 by Internet2 and/or the respective authors
Released: March 29, 2006
Questions and comments are welcome and may be directed to <firstname.lastname@example.org>.
At the October 2005 "Reconnections" workshop, network researchers and architects considered challenges to the status quo of modern Internet evolution. The workshop was convened to bridge the gap between the practitioner community responsible for running today's large-scale, high-performance networks and the research community exploring the ideas to be used for their next generation.
Over the course of the two-day workshop, attendees explored and debated many of the cherished principles on which the Internet was founded. They also explored the meaning of and path to "manageability" of large networks, and discussed opposing approaches of incremental change versus clean slate design. The group was able to agree upon a set of recommended "next step" collaboration elements and communications involving the research community, the vendor community, the standards bodies, and the practitioner community. These are described in the final sections of this report.
Workshop Participants
Boroumand, Javad -- Cisco Systems
Brammer, Robert -- Northrop Grumman Information Technology
Catlett, Charles -- Argonne National Labs
Clark, David -- Massachusetts Institute of Technology
Corbató, Steve -- Internet2
Cramer, Christopher -- Duke University
Cropp, Richard -- The Pennsylvania State University
DiFatta, Charles -- Carnegie Mellon University
Esaki, Hiroshi -- JAIRC
Gardner, Michael -- University of Illinois At Urbana-Champaign
George, Jeremy -- Yale University
Gettes, Michael -- Duke University
Gray, Terry -- University of Washington
Guerin, Roch -- University of Pennsylvania
Guok, Chin -- ESnet
Hamilton, Marc -- Sun Microsystems
Hutchins, Ron -- Georgia Institute of Technology
Kassabian, Dikran -- University of Pennsylvania
Klingenstein, Kenneth -- University of Colorado/Internet2
Klingenstein, Nate -- Internet2
LaHaye, Michael -- Internet2
Maltz, David -- Microsoft Research
Martin, David -- IBM Corporation
Miller, Kevin -- Duke University
Moore, Jonathan -- University of Pennsylvania
Morton, David -- University of Washington
Neuman, Clifford -- University of Southern California
Olshansky, Steve -- Internet2
Pepin, James -- University of Southern California
Poepping, Mark -- Carnegie Mellon University
Silvester, John -- University of Southern California
St. Arnaud, Bill -- CANARIE, Inc.
Travis, Gregory -- Indiana University
Vitullo, Peter -- Ford Motor Company
Yun, T. Charles -- Internet2
Zhang, Hui -- Carnegie Mellon University
IT practitioners and researchers from the higher education community and industry gathered in the Fall of 2005 to discuss a) problems with the current Internet, b) what the Internet of 2015 should look like, and c) strategies for getting from here to there. As the conversation evolved, a central theme of "manageability" emerged.
The workshop was convened to bridge the gap between the community responsible for running today's research and education networks and those shaping next generation networks. The hope is that by involving deployers in the design stages, the fundamental concept of manageability can be incorporated more fully into the final result.
While the success of the Internet is both undeniable and astounding, its success has brought with it new requirements and new constraints that have significantly transformed the nature of the net. The result has been a global system that is more fragile and more difficult to manage than one would hope, and which seems to be evolving away from the original Internet design principles that contributed to its success.
For example, the basic tenet of keeping the network core simple and transparent, with complexity constrained to the end-systems, has been seriously undermined by the proliferation of NAT (Network Address Translation) boxes, perimeter firewalls, and vendor initiatives such as Cisco's "Application-Oriented Networking". Even the Internet cornerstone concept of packet switching is being challenged by the layer-1 personal lambda phenomenon, through the use of optical circuit switching. The rationale and significant benefits of those early principles, especially the "keep the core simple and transparent" concept, are as valid today as they were thirty years ago -- but today the need for greater security has created conflict between "open Internet" advocates and those instead favoring strong "perimeter defense." Hence a key strategic problem statement for Internet architects might be: "How do we provide open and transparent network connectivity for willing partners, while providing reasonable protection from threat traffic?"
Internet success brought not just scalability challenges, but diversity in multiple dimensions. Not just diversity of applications, but a variety of individual motivations -- not all honorable. Thus, security issues have plagued the Internet since it grew beyond the cozy research community in which it originated. The character of today's Internet is increasingly dominated by the ongoing arms race between attackers and administrators. All signs are that the arms race will continue to escalate. If we continue to respond with more complexity in the core, the rate of silent failure of user applications will likely skyrocket, and our already significant troubleshooting challenges will only worsen.
Security, manageability, reliability and scalability (especially including new domains such as sensor nets) are key challenges that have led to serious re-thinking of assumptions about the Internet. In particular, security requirements leading to pervasive "Traffic Disruption Appliances", such as firewalls, have led to difficulties in user expectation setting, complexity management, and rapid diagnosis of network problems. This in turn defines the other key problem to be addressed by the workshop: "What is to be done about the manageability crisis of the current Internet?"
Is the recent interest among high-end researchers in personal lambda (optical wavelength) networks a manifestation of these challenges and our inability to meet them? Or is that trend driven by more parochial local control interests?
How much network convergence of services, connectivity classes, or even of geographic regions is desirable in the future Internet?
Is it possible to have policy enforcement points (PEPs) sprinkled throughout the network in a way that does not cause users to conclude that the network is broken (with consequent calls to the NOC)?
Are Internet users becoming de-sensitized to "glitches" or transient problems and is the "Mean Time Between Glitches" getting better or worse?
Can/should the original Internet concept of a simple and transparent core be retained as a guiding principle for next-generation networks?
Can we satisfy both the "open" and the "closed" network enthusiasts? And if so, can it be done in a way that does not drive Network Operations staff stark raving mad?
The fundamental tension between those who prefer an "open Internet," with low friction to packet flow -- i.e. few Policy Enforcement Points (PEPs) in the network -- and those who feel the need for extensive perimeter defense to protect from cyber attack threats continues unabated, though with a growing appreciation that adaptations to the presence of firewalls (e.g. tunneling and encryption) will ultimately render current perimeter defense strategies ineffective.
Still, many believe that networking is now more about selective isolation than about pervasive connectivity. At best, selective isolation implies additional complexity and manageability challenges; at worst, it might mean partitioning the One Network into multiple nets for distinct communities -- which implies a significant loss in value, if you accept the premise that a network's value goes up as the square of the number of users or end points (Metcalfe's law).
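The cost of partitioning is easy to quantify under that premise. A toy calculation (our illustration, not from the workshop):

```python
def metcalfe_value(n: int) -> int:
    """Network value proportional to the square of the number of end
    points (proportionality constant taken as 1 for illustration)."""
    return n * n

# One network of 1,000 end points versus the same end points split into
# two isolated communities of 500 each:
whole = metcalfe_value(1000)        # 1,000,000 units of value
split = 2 * metcalfe_value(500)     # 500,000 -- partitioning halves it
```

Splitting the One Network in two discards half its Metcalfe value, even though every end point remains connected to something.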
Manageability is a vague term, sometimes considered a polite form of "control." Complexity is the enemy of manageability, but complexity is an inevitable response to accommodating conflicting needs. While simplification is one key to manageability, that often means abstracting and hiding the inevitable complexity of any particular solution space so that it doesn't directly confront users or administrators. However, hiding it doesn't always make problems borne of that complexity go away -- it just makes them harder to diagnose.
Control, or autonomy, is a recurring theme that relates to the desire for network isolation. In a network, who gets to control what? Where are the boundaries that limit intentional or accidental damage? Isolation of traffic classes can be achieved at different layers of the protocol stack (e.g. separate fiber, or VLANs, or MPLS, or IPsec); whereas geographic or organizational network autonomy/isolation requires a different Internet paradigm with redefinition of "edges" and "end-to-end" -- more of a federation of networks with connecting gateways, than a single Internet with global addressing.
Campus infrastructure shops are caught in the middle. Deke Kassabian surveyed perceptions of security problems at the University of Pennsylvania; notably, the fact that the network was "open and highly available" topped the list.
Pressures on the Internet and its original design principles are varied. Economics, sociology, and psychology all mix with technical realities to form a complex brew of fears, territoriality, and frustration -- all leading to players in every role singing the same hymn: we want to understand and control the aspects of the network that are important to our success.
Can we preserve the wisdom of the goals and principles that helped to bring about the success of the Internet, in an updated architecture that responds to these new pressures? "Abstractions don't fail," Dave Clark remarked, but the principles upon which the current Internet is teetering are at risk of doing just that as they are increasingly violated to meet some of the goals of modern Internet users.
How did we get here? As networking requirements evolve to include fundamental shifts in connectivity (security) needs as well as unfathomable numbers of connected devices, how has the current Internet coped? Said differently, how did we get to have a network that is incredibly successful, but increasingly fragile, glitch-prone, and difficult to diagnose? How is it that many aspects of this necessary evolution seem to have negative consequences? Each adaptation and accommodation to changing requirements has had both good and bad consequences. Let's review some of them here.
One of the early Internet adaptations to changing security requirements (accelerated dramatically by the advent of Slammer and Blaster in 2003) was the installation of firewalls around administrative domains. Early on, large perimeter firewalls were recognized as a poor solution by those who believed security should be handled at the user edge and that the network should not get in the way of connectivity -- but for many, resistance proved futile. Thus, while the original Internet was transparent and largely open, artificial barriers to unfettered connectivity now abound. Pervasively deployed throughout the net, these devices have blocked countless attacks against unprepared end-systems, but they have also made debugging more difficult and encouraged a false sense of security as attack vectors move from the "outside" to the "inside," all the way to the human central nervous system (think phishing attacks, for example). This reliance on perimeter defense acts like isolated communities of people in the physical world: as soon as a pathogen crosses the border, the entire population quickly becomes infected.
One manageability challenge follows from the fact that firewalls in the network do not in general have a way to communicate directly with users, thus leading to confusion on the part of users about whether they have run afoul of policy, or the network is broken. This is exacerbated by the absence of any "path policy discovery" protocol in the Internet suite. (This would be a protocol, analogous to "Path MTU Discovery," that would allow an end-system or user to determine whether connectivity required for a particular application between two end-points was in fact available, e.g. whether the necessary ports are unblocked all the way between the end-points.)
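In the absence of such a protocol, administrators today fall back on crude end-to-end probes. A minimal sketch in Python (our illustration, not part of any proposed standard):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Crude end-to-end probe: can we complete a TCP handshake to this
    host and port? A False result cannot distinguish a policy block
    (firewall) from a dead host or a broken path -- which is exactly the
    diagnostic gap a path policy discovery protocol would close."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A real path policy discovery protocol would report where along the path the connection was blocked and why; a boolean probe like this tells the user only that something, somewhere, said no.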
Financial accountability requirements for public corporations have upped the ante for security measures, and institutional firewalls are essentially mandated by auditors. Even in higher-ed, IT professionals quickly learn that the correct answer to the question "Do you have a firewall" is "Yes," as Ken Klingenstein said years ago. Terry Gray noted: "An IT manager once told me she budgeted one FTE for each of her border firewalls... but one of my staff observed that if you choose not to have a border firewall, you also need to budget one FTE just to explain why you don't have one." (Deke has found that pointing people to a document explaining why they don't have a border firewall has been effective. After years of writing similar documents, Terry finally solved his problem by installing a border intrusion prevention system.)
Ultimately, attackers hold the trump card in this hand. Encrypted traffic is by definition impossible to inspect. As more and more data flows over TLS or IPsec tunnels, perimeter and network-based protection strategies based on port blocking or packet content inspection are doomed. The rules of the game must be changed soon.
Intrusion Prevention Systems
Increasingly subtle attacks on services that could not be blocked without creating a "self-imposed Denial Of Service attack" led to more sophisticated firewalling strategies such as real-time Intrusion Prevention Systems (IPS) which monitor traffic patterns and packet contents for patterns matching known threat signatures. Such innovations avoid endless debates about which ports should be blocked at the border of an institution, but also bring new failure modes to the network. As adversaries accommodate to various perimeter defense strategies, the consequence is a growth in the use of encryption for back-door traffic, and edge-centric attacks using email and web functionality.
Not surprisingly, users have often found themselves squeezed between the connectivity needs of their applications and institutional security policies, which sometimes default to "closed". Policy makers were not always amenable to opening up ports for "non-standard" applications, and even if they did, there were no guarantees of corresponding open ports across the Internet. The problem is compounded by applications that dynamically select ports.
As an inevitable consequence, application developers have accommodated these firewall realities in predictable ways. There are now a growing number of "firewall-friendly" applications that tunnel their traffic over port 80, originally reserved for HTTP. Because the web has become an essential application that is rarely blocked by policy makers, port 80 (and its corresponding secure/encrypted port, 443) are now the natural refuge for developers of applications, both legitimate and otherwise. VPNs are even tunneled through port 80 at this point, leading Jon Moore to say, "Just publish a spec for running TCP/IP over HTTP and you're done." It turns out this has already been done through other ports to defeat wireless network access control.
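To see why port-based policy is so easily defeated, consider how trivially an arbitrary byte stream can be dressed up as web traffic. A hypothetical sketch (the request path and host name are invented for illustration):

```python
def frame_as_http(payload: bytes, host: str = "example.org") -> bytes:
    """Wrap an arbitrary (non-HTTP) payload in a syntactically valid
    HTTP POST request. To a port-based filter this is just web traffic
    bound for port 80; the real protocol is whatever the payload carries."""
    headers = (
        f"POST /sync HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        f"\r\n"
    ).encode("ascii")
    return headers + payload

request = frame_as_http(b"\x00tunneled-protocol-bytes")
```

Any filter that admits traffic purely because it looks like HTTP on port 80 admits this as well -- hence Jon Moore's quip about "TCP/IP over HTTP."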
Security requirements for controlled access to certain network resources have led to a variety of "hacks" such as "captive portals" which play games with routing and addressing and can make problem diagnosis more difficult. While there are other strategies evolving, e.g. 802.1x, captive portals show no sign of going away soon, since they work with a wider variety of clients, and in places that have lots of desktop hubs/switches.
DOS Attacks and Botnets
A security challenge for which there has not yet been an adequate solution is the Denial of Service (DoS) attack, which seeks to disable one or more hosts via an intensive data barrage from swarms of infected computers, or "bots" -- machines with a security vulnerability that enabled them to be taken over and controlled by an adversary. A botnet might comprise hundreds of thousands of compromised hosts.
Now that cyberattacks are increasingly driven by organized crime, botnets are assembled for their commercial possibilities rather than bragging rights. They can be used to launch spam, adware, or DoS attacks against extortion victims who failed to pay up. "DoS for fun is gone; now it's for profit," remarked John Silvester. Botnets may spend much of their time battling other botnets to establish marketplace dominance. "If there were honor amongst thieves we'd be doomed."
Network Address Translation
NAT is a great example of a technology that breaks the fundamental end-to-end design of the Internet. NAT is generally considered a stop-gap method for address conservation, yet it shows no sign of dying even as IPv6 becomes more widely available. Could that be because networking people failed to understand why NAT is so entrenched? Perhaps NAT's security (asymmetric "unlisted number" connectivity) and autonomy characteristics (preserving site addressing autonomy even when you don't have Provider-Independent IP addresses) are more important than its address-conservation capability. And the dark side? NAT boxes have significantly varying semantics, and wildly varying state timeouts, which affect the ability of certain applications to maintain a connection for very long, and put the question mark back into problem diagnosis -- not to mention that NAT causes problems for applications that put global IP addresses inside their packets, as SIP and most peer-to-peer apps do. Of course, even the perceived virtue of asymmetric "unlisted number" connectivity is no panacea, as we already see attack vectors changing to work around all existing perimeter defense strategies, including NAT.
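One common workaround for those NAT state timeouts is to have long-lived connections refresh the translation entry with TCP keepalives. A sketch, assuming a Linux-style socket API (the idle/interval/count values here are illustrative, not recommendations):

```python
import socket

def enable_keepalive(sock: socket.socket,
                     idle: int = 60, interval: int = 30, count: int = 4) -> None:
    """Ask the OS to send TCP keepalive probes so an otherwise-idle
    connection keeps refreshing the NAT box's translation state before
    its (wildly varying) timeout fires. The per-parameter options are
    Linux-specific, hence the hasattr() guards."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):    # seconds of idle before probing
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):   # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):     # failed probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

That every long-lived application must carry this kind of workaround is itself a symptom of the manageability problem the workshop describes.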
Virtual Private Networks
Another security accommodation is the Virtual Private Network, which can be used to connect remote users to secured enterprise nets, or partner institutions to each other. While providing authenticated network admission control to perimeter protected networks, VPNs are also a "nested" accommodation that are sometimes needed only because the perimeter defense firewall interferes with an application. The VPN creates a "clear channel" between the client and the server, through the perimeter defenses. On the other hand, "VPNs are great attack gateways, and they're hard to diagnose. What's not to love?" said Terry.
Ironically, the encryption inherent in VPNs defeats other security measures that rely on traffic inspection.
Perhaps the ultimate consequence of increasing friction in the Internet (Jim Pepin's term for the effect of firewalls, NAT boxes, etc.) is the move to abandon the Internet entirely in favor of switched optical networks. Cost-effective DWDM and dynamically-switched lambda networks are a recent phenomenon, and some feel that the inability to provide predictable performance, stability, and diagnosability in a conventional shared layer-3 IP network is a primary motivator for the trend.
Here is an example of an implementation decision that makes the network more fragile. It has to do with protocol timeouts. By default, Microsoft's TCP/IP stack terminates a connection if there is no response after five retries. That is less than 20 seconds, which is less time than most switches take to reboot. This decision may have been a response to user impatience -- not wanting users to wait forever for a server -- but implementing that goal at the transport level is a formula for phantom glitches in any applications that maintain persistent connections.
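The arithmetic behind that "less than 20 seconds" figure can be sketched with a simplified model of exponential backoff (the 0.5-second initial RTO is our assumption for a connection with a measured round-trip time, not a figure from the workshop):

```python
def time_to_give_up(initial_rto: float, retries: int) -> float:
    """Simplified model of TCP retransmission backoff: the sender waits
    one RTO after the original send, doubles the wait after each
    retransmission, and abandons the connection once 'retries' timeouts
    have expired. Total wait = RTO * (1 + 2 + ... + 2**(retries-1))."""
    return sum(initial_rto * 2 ** i for i in range(retries))

# With an assumed measured RTO of 0.5 s and five retries:
#   0.5 + 1 + 2 + 4 + 8 = 15.5 seconds -- under 20 s, and less time
# than many switches take to reboot.
total = time_to_give_up(0.5, 5)
```

A switch that takes 30 seconds to reboot will thus kill every persistent connection through it, even though the path recovers moments later.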
Is it possible to make incremental changes to address the manageability problems of the current Internet, or do we need a "clean slate" approach to achieving a secure, reliable, scalable, and *manageable* Internet? Or do the new layer-1 optical networks constitute that clean-slate approach?
Here is an example of an implementation decision that makes problems in the current Internet difficult to diagnose: Most routers provide lots of information about successful packet forwarding, but very little information about packets that are dropped. While they usually provide aggregate error counters, real-time information on specific packet loss in specific flows is hard to come by, and this makes it difficult to diagnose transient performance problems.
SNMP and ICMP are two of the most basic Internet diagnostic technologies, but security and privacy concerns have led to widespread blocking of SNMP ports and ICMP messages, thus undermining their utility for inter-realm problem diagnosis. Completely disabling ICMP seems to have been a popular (if overzealous) response to a set of attacks that exploited a limited number of ICMP implementation vulnerabilities.
Privacy concerns also lead to restricted access to diagnostic info. Details about network architecture and function could potentially be exploited by attackers. Some useful control plane and diagnostic information might safely be revealed if the infrastructure could distribute such information selectively based on the identity of the requestor.
Another tangential issue that needs to be addressed is the ratio of signal to noise. It's not too hard to coax a lot of data out of the network, but to filter, correlate, and transform this data into useful information is much harder. Particularly since this data is generally distributed amongst many systems and technologies, assembling a useful information stream is a challenging and often manual and resource consuming process. How can important events be flagged for administrators? Could a warn/error/fatal-style methodology be applied to network problems?
In Particular, Silent Failure
Even worse than too much noise is no signal at all. Silent failure is the most significant shortcoming of current diagnostic technology. Under a "best effort" packet routing credo, network-layer failures often give no indication of what failed or why. This area, too, is fraught with security issues, since verbose failure reporting could be used to map networks or, if poorly implemented, enable DoS attacks. There was general agreement that all network failures in the future Internet should return some information about what occurred and why, although the security/privacy concerns noted above may temper that goal.
User perception of network failure is colored by the current approach, leading to confusion, frustration, and misplaced blame. Diagnostic information should not be limited only to network administrators, but should be delivered in a coherent fashion to end users as well.
There is a growing sense that the future of network connectivity is in white lists and black lists, implemented via federated trust systems. (Gray observes that protection measures implemented for email tend to show up at layer 3 about two years later.) Buddy lists in IM are perhaps an even better example of application-layer security providing inspiration for future network security paradigms.
Federated identity is designed to facilitate large numbers of peerwise trust relationships. An arbitrary number of authorities are responsible for certifying basic trust information and descriptions for individual entities as a basis for peer-to-peer trust establishment. This allows for trust to form with more of a web topology than a hierarchical topology.
One of the important conclusions of the workshop was that future networks needed to embrace the concept of "trust mediated transparency," which means that it should be possible to establish open/transparent connections among trusted partners. As levels of trust are established, so can levels of transparency.
Can a network ever provide adequate security/privacy protection for applications? The consensus is: no, a prudent planner will assume that the network cannot be trusted, even inside a firewall, thus, application developers should take precautions to protect their data to ensure privacy and integrity. For example, use application-level end-to-end encryption such as that which SSH and SSL offer, with mutual authentication. Nevertheless, the pressure to have the *network* protect hosts and applications is relentless. As encryption and tunneling make it increasingly difficult to distinguish good traffic from hostile, network security may increasingly become focused on verifying traffic sources and destinations. That's where federated trust mechanisms may come into play. Nevertheless, there are still good reasons for application security to be maintained, even if the network takes on some of the security responsibility.
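As a concrete illustration of that advice, here is a sketch of an application-level TLS client context with mutual authentication, using Python's standard ssl module (the file-path parameters are placeholders for a real deployment's CA bundle and client credentials):

```python
import ssl

def mutual_tls_client_context(ca_file=None, cert_file=None, key_file=None):
    """Client-side TLS context following the 'don't trust the network'
    advice: encrypt end to end at the application layer and authenticate
    both parties, rather than relying on perimeter defenses."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.check_hostname = True                     # verify the server's identity...
    ctx.verify_mode = ssl.CERT_REQUIRED           # ...and require a valid certificate
    if cert_file is not None:                     # present a client certificate,
        ctx.load_cert_chain(cert_file, key_file)  # enabling mutual authentication
    return ctx
```

Note the caveat that follows: even a mutually authenticated, encrypted channel says nothing about whether the peer host itself has been compromised.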
Would a concrete guarantee that the data being passed up from the network stack were protected in transit from a properly authenticated source remove the need for applications to handle this themselves? Simply knowing who is sending and that the channel is secure is no guarantee that the data received is benign or that that host itself hasn't been compromised, so application security will always be a requirement.
Convergence and Overlays... or not
Building a new kind of network on top of an existing physical network is a tried-and-true strategy for innovation. Many advanced networking research efforts, and even some production services, are based on overlay approaches, in which the object identifiers, routing and topology of the overlay system have little direct relationship to those of the underlying infrastructure. However, certain characteristics of the base infrastructure can't be masked, and can interfere with the objectives of the overlay net -- for example, latency and jitter. This may be a strong motivation for stronger isolation techniques, such as use of DWDM to provide independent wavelengths.
MPLS (Multi-Protocol Label Switching) is growing in popularity. MPLS was originally designed as a generalized routing protocol to sit between layers 2 and 3 of the stack. Its real-world impact has been broader, as it essentially represents yet another flow ID.
It is currently one of the premier methods for constructing custom overlay networks for carriers -- and even enterprises are starting to use MPLS for isolating traffic classes within their own networks.
Static vs. Dynamic Configuration
We have dynamic routing protocols in place, but many overlay networks are statically configured. Moreover, enterprise networks are under pressure to move toward organizational, rather than geographic, topologies -- and this is often accomplished via static configuration of VLANs.
Many MPLS networks are also configured statically, which may either limit their complexity or lead to management nightmares. One of the workshop participants suggested that the concept of Label Switched Paths on demand might render this static approach inadequate and obsolete. Considering the challenges for manageability and diagnosability, will dynamic MPLS drive us toward "active network" technology?
Dave Clark mentioned that analysis by AT&T researchers showed that Internet routing protocols could converge as quickly as SONET if their parameters were tuned for local use, rather than assuming a network of global proportions. Does this suggest that the concept of a single global network, as compared to federations of smaller networks, is a flawed concept? Clearly we need connections that span the globe, and we need fast route convergence when a path fails. Future network design efforts may need to find better ways to reconcile those conflicting goals as applications such as VOIP and conferencing demand glitch-free path fail-over.
An alternative to overlay networks is "virtualization". Both are attempts to use the same infrastructure for different -- possibly conflicting -- services. Overlays do this by creating logical topologies that do not necessarily map directly to the underlying physical topology. Examples include organizational network topologies implemented via VLANs (wherein all "accounting" systems might be on the same subnet and all "physics" systems might be on another subnet, regardless of where those machines are connected physically).
Often, a new "value added network," or overlay, is implemented by adding servers to a network that perform the function of routers for the new network abstraction. In contrast, virtualization slices the infrastructure "vertically" via software to allow multiple logical instances of a device, rather than "horizontally" (by adding layers). Major router vendors support this notion, although implementation details and constraints vary.
In both cases we are talking about converged services, and in both cases the motivation is economics, since the idea is to avoid replicating physical infrastructure unnecessarily. As with overlay strategies, and indeed with convergence in general, virtualization is a two-edged sword. It saves money on infrastructure, but it means that different services are sharing that same infrastructure -- for better or worse. Sometimes it is difficult to share network facilities without one service class interfering with another, or at the very least, having common points of failure.
While the economic advantages of (all forms of) convergence are clear, more controversial is whether virtualization helps or harms manageability. Virtualization can lead to extremely complicated configuration and deployment scenarios. Operators of converged infrastructures live or die by the quality of their monitoring and diagnostic tools. As deployments grow more complex, and multiple topologies overlap and link at different layers on the same underlying network, visualization tools become increasingly important. A set of advanced visualization tools could allow for intelligent monitoring and modeling of the underlying structures. Competing vendor standards, a lack of interoperability, and the inherent complexity in the system make this less useful today than it could be.
"Make implementers think hard: not deployers, managers, or users." This extremely beneficial overarching philosophy was used in the initial design of TCP to great success. There's very little configuration that needs to be done to manage TCP-based communication, a strength which should be preserved and expanded in future designs.
Autoconfiguration and tuning can be risky. For example, the TCP stack in Windows NT attempted to defend itself from some forms of denial-of-service attack, but high-end streaming applications were inadvertently throttled by that code. In another example, Microsoft intentionally reduced the performance of the TCP stack implementation shipped in their Windows OS after an earlier implementation, with too aggressive a retransmit algorithm, caused virtually every packet to be sent twice.
Should the ideal communication network mimic human social interaction norms?
Networks serve people. All applications, protocols, boxes, wires, and chips have the eventual goal of enabling humans to communicate. When there were serious technical hurdles to overcome and the network was used by a small, focused community, this theme was generally marginalized by the nuts-and-bolts work required for interoperability and performance. The broadened scope of the Internet invited everyone to participate, but this ubiquitous interconnection, and the ephemeral and anonymous nature of the digital persona, violates traditional protections. Nobody ever went back to add the social contract to the network.
This has manifested itself in the repackaging of standard malfeasance in new words for a new world. Spam, phishing, and identity theft are all ages-old scourges that have been enabled and empowered on a scale that has never been seen before. Beyond being a severe nuisance and a potential source of security breaches, there is an underlying assault on the network from these sources. As they propagate, usability itself is slowly eroded.
New applications that represent broad communities of individuals have begun to include some interesting adaptations. "There's lots of subtle social engineering in the instant messaging model, which may reflect reality," mused Dave. He likened the Buddy List to Victorian society: you don't talk to anyone unless you've been introduced to them. It's impossible to send or receive messages to an individual until mutual awareness has been established, and a third party or out-of-band communication generally serves as a broker for this process.
There are more interesting aspects of these implementations. Even in an essentially peer-to-peer application such as messaging, the major instant messaging programs have all decided to funnel every message through a central server. This is clearly a more expensive, slower, and technically more challenging model than direct communication would have been. The decision was apparently made for security reasons, embodying fears that exposing clients' IP addresses or allowing direct access to them would invite all sorts of attacks.
Visibility is a feature IM systems closely control as well. While this initially seems similar to the buddy list, it recapitulates a behavior widely exhibited at lower layers of the Internet too: a primary touted benefit of firewalls is their ability to limit visibility. People like to see everyone and to have nobody see them.
The violation of societal rules on the network is not limited to those broad categories, either. One of the most contentious is the widespread illegal distribution of intellectual property. Boxes have been introduced that scour network traffic for TCP streams resembling transfers of copyrighted music and send out spurious TCP RSTs to interrupt the data stream -- a rather blunt technical response to a political problem! Should networks be engineered so as to limit or prevent the proliferation of illicit file sharing? Can it be done without sacrificing other core values such as decentralization and privacy? Is the addition of some sort of mechanism to protect intellectual property and manage digital rights necessary?
Many phishing attacks rely on applications' inability to resolve the identity of a service in a way that's meaningful to users. More robust identification schemes would improve this. In the end it may be that the network would have to be re-thought to allow it to support a more natural model of human interaction.
Economic pressures also drive the way network devices function. Some companies may attempt to achieve competitive advantage by expanding the capabilities of a box beyond its original intent. Others may strip away functionality that is too expensive to implement relative to its benefit.
Techies don't get to dictate the economics of the systems they build, but cost is always a design constraint and an awareness of marketplace realities may influence the design. The Internet connectivity industry is a sunk cost industry: put the fiber in the ground at great expense, and hope the investment pays off. This sort of economic model can easily lead to overcapacity, as happened during the dot-com boom and in the airline industry. The result is ferocious price competition and elimination or assimilation of competitors until competitive pressure is reduced and prices rise. Such economic climates do not encourage growth, innovation, or visionary risk-taking.
The basic nature of a packet switched network also makes the development of economic models for its use inherently difficult. "What does an ISP sell? What do we buy?" posed Dave. For a backbone operator with only settlement-free peering, there is no economic incentive to carry other network operators' traffic, and end users buy nothing more than an uplink. Many in the industry are starting to believe the core value of a pipe is that control of the pipe allows for some degree of control of the content passing through it. This would allow for revenue streams through preferential content delivery, but it's unclear how that fits with an egalitarian network. Hence the current policy interest in "network neutrality". Analogous paid search functionality has not seen much pushback, though.
Security and manageability are also inherently vulnerable to economic pressures: the costs of neglect are delayed and extremely rare, but potentially catastrophic. Failure to adequately manage these risks will invite regulatory intervention, as has already been seen in the corporate sector via Sarbanes-Oxley.
Lessons from History
Older networking technologies can offer useful insight into future design choices. For example, the Internet does not have separate control and data channels, unlike some legacy networks, such as X.25 and SS7. "Are you serious about putting management bits in the data plane? Didn't escape bits cure us of that idea?" Ken Klingenstein asked. Revisiting what worked and what didn't in the light of today's world and hardware would be a useful exercise.
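The "escape bits" problem Ken alludes to can be made concrete with a sketch of SLIP-style framing (RFC 1055), a classic example of in-band control: because the frame delimiter travels in the same channel as the data, any data byte that collides with it must be escaped, and the escape byte itself must be escaped in turn.

```python
# SLIP-style byte stuffing (RFC 1055): in-band framing forces escaping.
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD

def slip_encode(data: bytes) -> bytes:
    out = bytearray()
    for b in data:
        if b == END:
            out += bytes([ESC, ESC_END])   # escape the frame delimiter
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])   # escape the escape byte itself
        else:
            out.append(b)
    out.append(END)                         # terminate the frame
    return bytes(out)

def slip_decode(frame: bytes) -> bytes:
    out, i = bytearray(), 0
    while i < len(frame) and frame[i] != END:
        if frame[i] == ESC:
            i += 1
            out.append(END if frame[i] == ESC_END else ESC)
        else:
            out.append(frame[i])
        i += 1
    return bytes(out)

# Data containing both control bytes survives the round trip.
payload = bytes([0x01, END, ESC, 0x02])
assert slip_decode(slip_encode(payload)) == payload
```

Networks with separate control channels, such as SS7, avoid this class of complexity entirely, at the cost of a second channel to build and secure.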
Learning lessons from non-technical disciplines such as economics and psychology would also be useful in defining future network goals. The original Internet was designed to share resources by interconnecting them; it was not designed to thwart the abusive exploitation of interconnected resources. Dave related the comments of students of sociology: "We've been studying human nature for 2000 years; you didn't think of spam?"
Under the heading of "original design principles that still work," there was a deep, shared belief that the decentralized nature of the Internet has been essential to its rapid expansion, inclusiveness, and usefulness. Upper-layer technologies modeled on this distributed nature have tended to be more successful than those that tried to impose a central authority or hierarchy. PKI's long struggle for traction is a good example of this. However, decentralized control can cause serious manageability and predictability problems.
6. CLEAN SLATE
The NSF recently approached Dave Clark, Senior Research Scientist at MIT, to serve as an advisor for the GENI initiative. This initiative intends to build a testbed to test revolutionary new networking technology in which nothing is presupposed. FIND is a separate, broader research effort which would benefit from the use of this infrastructure. Together, these projects form the biggest opportunity yet to challenge some of the basic assumptions on which the original Internet was built.
The FIND solicitation was issued in the weeks following the workshop, and it was unclear how the academic community would respond to the challenge. The NSF model is not designed to provide central organization and management for projects under its solicitations. The standard fire-and-forget funding model works so well because, in Dave's words, "your peers are far more savage than the NSF could ever be." While that leads to excellent competition and good research, it doesn't breed co-operative solutions. Issuance of a very large single grant is one approach used to encourage team formation, but no team was known to be naturally poised to respond to FIND.
The first few years of this solicitation will probably focus on the development of such a community and convergence towards a set of ideas. The group suggested that additional workshops and conferences would be one useful platform for this conversation. More sessions between the research and deployment communities, particularly including more researchers, were recommended.
Iterative or evolutionary change has a mixed track record. The design of IPv6 began in 1990 with a mandate of backward compatibility. After 15 years of the good ideas embedded in IPv6 leaking gradually into IPv4, IPv6 deployment is still confined to a relatively small portion of the Internet. Is that because the benefit does not exceed the cost of conversion (and/or work-arounds such as NAT and CIDR are deemed adequate), or is the evolutionary precept fundamentally flawed? Dave posed the question, "Would [IPv6] have gone better if it were a bigger change?" With that challenge in mind, the conference suggested some potentially radical changes.
7. RADICAL IDEAS
What if the protocol stack were compressed? Would that simplify the architecture, and thereby help make the network more manageable?
Could we do without layer-3 and its IP addresses? What if there were a way to directly route URLs? (And you thought current routing tables were big!) Even if topology-laden IP addresses were abandoned, there would still need to be end-point identifiers. Could MAC addresses satisfy that role? Could layer-2 networking be redefined to avoid its current scaling limitations? This would require replacing all broadcast protocols, e.g. ARP. If that happened, would the replacement protocols become as complex as the ones we sought to eliminate?
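To make the routing-on-URLs thought experiment concrete, here is a purely hypothetical sketch (all names invented) of a forwarding table keyed on URL components rather than IP prefixes. The lookup discipline is the familiar longest-prefix match from CIDR; what changes -- and what the parenthetical above warns about -- is the unbounded size of the key space.

```python
# Hypothetical name-based forwarding table: longest-prefix match on
# URL path components instead of on IP address prefixes.
class NameRouter:
    def __init__(self):
        self.routes = {}                    # name prefix -> next hop

    def add_route(self, prefix: str, next_hop: str):
        self.routes[tuple(prefix.strip("/").split("/"))] = next_hop

    def lookup(self, name: str):
        parts = tuple(name.strip("/").split("/"))
        # Longest matching prefix wins, exactly as with CIDR routes.
        for n in range(len(parts), 0, -1):
            hop = self.routes.get(parts[:n])
            if hop is not None:
                return hop
        return None

r = NameRouter()
r.add_route("edu/mit", "hop-A")
r.add_route("edu/mit/csail", "hop-B")
assert r.lookup("edu/mit/csail/www") == "hop-B"   # more specific route wins
assert r.lookup("edu/mit/physics") == "hop-A"
```

Unlike IP prefixes, names carry no topological information, so nothing bounds how many entries such a table would need -- which is the crux of the scaling objection.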
And what about TCP/UDP ports? Has perimeter defense already made them obsolete? Are we really headed for a two-port (80, 443) Internet? What do we need for the future? There obviously needs to be some form of flow multiplexing/demultiplexing for processes sharing the same end-point, so if the current trends in perimeter defense do render port multiplexing obsolete, this will result in additional host complexity to compensate -- or a radical shift in the use of network addresses for not just end-point identification, but process identification as well, presumably in combination with virtual machine concepts on the hosts.
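The flow demultiplexing that ports provide today can be sketched in a few lines: the host maps a (protocol, local address, local port, remote address, remote port) 5-tuple to a receiving process. Remove ports from the picture, and some equivalent key -- such as a per-process network address -- must take over exactly this job. This is an illustrative toy, not any real stack's code.

```python
# Minimal sketch of 5-tuple flow demultiplexing, the job ports do today.
flows = {}   # (proto, laddr, lport, raddr, rport) -> process id

def bind_flow(proto, laddr, lport, raddr, rport, pid):
    """Register which process receives segments matching this 5-tuple."""
    flows[(proto, laddr, lport, raddr, rport)] = pid

def deliver(proto, laddr, lport, raddr, rport):
    """Return the process that should receive this segment, if any."""
    return flows.get((proto, laddr, lport, raddr, rport))

bind_flow("tcp", "10.0.0.1", 443, "10.0.0.2", 55000, pid=1234)
assert deliver("tcp", "10.0.0.1", 443, "10.0.0.2", 55000) == 1234
assert deliver("tcp", "10.0.0.1", 80, "10.0.0.2", 55000) is None
```

In the address-as-process-identifier future the text speculates about, the port fields would simply disappear from this key and the address fields would grow to identify processes or virtual machines directly.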
Redefining the edge
Finally, what if it turned out that thin clients really were the right answer? Would complexity then be concentrated in a comparatively small number of servers, or perhaps even distributed throughout the core of the network? We already have some prototypes of this paradigm in the form of web-based application-service-providers (ASPs). How would this concept, were it to become fundamental, change the network?
The design principle of keeping the network core simple (and putting complexity in the edge systems) undoubtedly fueled the growth of the Internet, both because it helped moderate the cost of network devices, and because it encouraged innovation and new applications by end users (rather than network owners). On the other hand, we now have hundreds of millions of very complicated end-systems that are very hard to manage. The corresponding accommodation has been an industry-wide effort to improve patch management.
The delineation of the responsibilities of the network versus the devices using the network has become blurry. Michael Gettes of Duke pointed out that it may be desirable to remove much of the burden of networking properly from machines and applications. Dave expanded on this, saying that all the vulnerabilities in Windows and similar suites were not simply the result of corporate idiocy; tightly interlinked, enormously complicated systems are just virtually impossible to secure.
Should there be a security boundary between hosts and the network, such that network-borne attacks never make it to the host? Is such an idea feasible? Conversely, should end-systems be constrained so that they are incapable of harming the network, or do networks just become smarter about defending themselves from the infinitely capable and malleable host?
To do so implies moving current TCP/IP functionality to the other side of the host/network boundary. Abstracting the host interface to higher-level functions might bring us full-circle to the days of ARPANET host protocols. Currently, the network only responds to very low-level commands, leaving the details of communication and functional abstraction to the hosts themselves. Is that optimal? How should devices communicate connection needs to the network in the future? Can the network itself instead be asked to "retrieve that pile of data," or, "create a connection with a particular service out in cyberspace?" Could it be possible for the network to handle much of the functionality future devices need to communicate, allowing them to be simpler and cheaper?
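As a purely hypothetical illustration of the contrast (every name here is invented), compare the low-level connection management hosts perform today with an interface where the network itself is asked to "retrieve that pile of data":

```python
import socket

def fetch_today(host: str, port: int, request: bytes) -> bytes:
    """What hosts do now: open, drive, and close the connection themselves."""
    with socket.create_connection((host, port)) as s:
        s.sendall(request)
        return s.recv(4096)

class HypotheticalNetwork:
    """Invented stand-in for a network that handles transport on the
    host's behalf, exposing only high-level retrieval by name."""
    def __init__(self, store):
        self.store = store
    def retrieve(self, name):               # "retrieve that pile of data"
        return self.store.get(name)

net = HypotheticalNetwork({"dataset-42": b"payload"})
assert net.retrieve("dataset-42") == b"payload"
```

The point of the sketch is only the shape of the interface: in the second model the host names what it wants and the network supplies the how, which is roughly where the ARPANET host protocols once sat.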
Moving more functionality into the network sounds a lot like re-building a telephone network. This has some attractive security properties, but if end-devices retain the capability to run executable code of any kind, we could end up with the worst of both worlds. And if they don't, we may have totally destroyed the ability of end-users to innovate -- the very property that arguably made the Internet a success.
This workshop did not provide answers to the ultimate question of future Internet design, but it did shed light on the issues, and possible next steps.
The wide range of suggestions and presentations at Reconnections confirmed a) the need for some immediate efforts to fix certain aspects of the current Internet, and b) the enticing opportunities that result from throwing away the requirement for backward compatibility. It is hard to envision a strategy of incremental improvement that would lead to an elegant solution to today's conflicting connectivity requirements, especially since it is not yet clear that an elegant solution is possible even in a clean-slate context.
The Internet must change. To what extent that change is evolutionary vs. revolutionary remains to be seen, but even evolutionary efforts need to be guided by a better sense of the ideal goal. Hence the importance of approaching the problem along parallel tracks:
The clean-slate studies would be followed by an analysis to determine:
In any case, extensive further discussion is needed to select the core ideas for any new network paradigm. This must occur even before an appropriate testbed could be constructed. Similarly, an action plan must be created to address the pressing manageability problems of the current Internet.
Short-term action items
a. This report.
b. One or more whitepapers for CIOs and upper administration on philosophy, drivers, inverted economics, and fault blocks, as well as on firewalls, important regulations, and short-, medium-, and long-term issues.
c. Focused work with vendors and IETF to try to better address the problem of silent failure, including both inadequate real-time router diagnostics and policy-based silent failures.
d. A workshop on "Living with Lambdas," bringing together the science community who want Lambdas and the IT community who run today's networks.
e. A Reconnections writeup on diagnostics and telemetry.
f. Consider establishing or engaging a consortium focused on real-world manageability. (The CalConnect calendaring and scheduling consortium is a positive reference model.)
Medium-term action items
a. Collaborative work within R&E on the integration of identity and trust into networks and their openness. Particular challenges exist in testing.
b. Whitepapers on high-availability vs. convergence, and open vs. closed tensions, identifying long-term service requirements. In particular, assess the extent to which One-Size-Fits-All networks need to be supplanted by multiple service classes and/or networks that are dynamically reconfigurable by end-users or local administrators.
c. Build a corporate testbed/bridge.
d. Inform NSF, etc. on medium-term efforts.
Long-term (Clean Slate) action items
a. Maintain strong ties between the network research community, network architects, and network operators to collaborate over time on problem statements and solutions design.
b. Inform meetings next summer with GENI PIs on operational and manageability concerns.
c. Promptly communicate likely Clean Slate directions to campus network architects as they surface.
d. Alert GENI to campus traversal issues.