
Payload-Envelope-Transport (PET) model for peer-to-peer overlay networks

by David M. Doolin, PhD on October 13, 2009

Preface

Several years ago I did some design and programming work in the peer-to-peer (P2P) space. What follows are a few notes I wrote during that time but never pulled together into a conference paper. This is a good place to put them, and if you manage to get all the way through, you will understand a lot more about what a P2P network is and one way it does its business.

Note: this is a very high level overview! The details are difficult. If you’re not well-versed in P2P, consider just skipping this. It’s “For Record Only” as we used to say in the Indiana Cave Survey, back in the day.

Abstract

This note gives a concise description of the United States Postal Service as a time-proven architecture, one that provides physical transport on top of which overlay networks can be constructed.

A virtual network architecture

Consider the task of communication in a corporate organization, call it “BigCorp,” a hundred years ago, before telephony and fax were in common business use. We may assume that much of the internal and external communication was written, in the form of letters. BigCorp may be defined by a set of people occupying certain parcels of real estate and conducting business of some arbitrary nature. For example, a steel company of 100 years ago could have had offices in Pittsburgh, Youngstown, New York City and Duluth. The “company” itself is an abstraction, a virtual object, a set of ideas defined by an arrangement of physical entities such as people, places and things.

However, the company communicates by way of letters using the United States Postal Service (USPS) for delivery. By design, the architecture of the USPS system is independent of the identity of the recipients, and the nature of the payload. USPS cares not at all who the letter recipients are; the only thing that matters is how to find the mailbox. Conversely, the company is unconcerned about how letters come to be delivered. For example, personnel actions come from company communication: “Veeblefester, headquarters says you’re fired as of yesterday.” It is entirely irrelevant to Veeblefester whether the letter comes by locomotive, Pony Express or 20 Mule Team, because the letter is from HQ, not from the Post Office in Death Valley.

The brilliance of the modern mail system is to separate the transportation of messages (i.e., letters), from the routing of letters (addressed envelope), from the payload which is the letter itself inside the envelope. The payload/letter is what specifies the company, the virtual organization, the group of peers engaged in a common activity. In fact, the USPS could operate very efficiently delivering stamped, addressed envelopes that do not actually carry anything at all! Envelopes with letters, empty envelopes, it’s irrelevant to the USPS. As long as an envelope is stamped and addressed, it can be delivered.

Corporations need only be concerned with the payloads. The method of transport, whether carrier pigeon or Concorde supersonic jet, is completely irrelevant to the content of the payload. Routing is also irrelevant. Everything could be sent through Memphis, Tennessee (Federal Express) or Louisville, Kentucky (United Parcel Service). Customers don’t care, nor should they.

This model can be followed by peer-to-peer network application development. The network can be partitioned into a physical transport component that carries packets, an envelope component that allows routing in the physical network, and a payload component, which can be completely arbitrary. For example, the payload component could be used to construct “peer groups”.
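The three-way partition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Envelope` layout, the `seal`/`unseal` wire format, and the list standing in for the physical wire are all hypothetical, not part of any real P2P toolkit. The point it shows is that the transport sees only bytes, the envelope carries only addresses, and the payload can be anything, including nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Routing data plus an opaque payload (hypothetical layout)."""
    to_addr: str
    from_addr: str
    payload: bytes = b""  # arbitrary bytes; may be empty


def seal(envelope):
    """Serialize: two address lines, a blank line, then the raw payload."""
    header = f"{envelope.to_addr}\n{envelope.from_addr}\n\n".encode("utf-8")
    return header + envelope.payload


def unseal(data):
    """Inverse of seal; assumes addresses contain no newlines."""
    header, payload = data.split(b"\n\n", 1)
    to_addr, from_addr = header.decode("utf-8").split("\n")
    return Envelope(to_addr, from_addr, payload)


def carry(wire, data):
    """Transport stand-in: appends raw bytes to a list acting as the wire."""
    wire.append(data)


wire = []
letter = Envelope("pittsburgh", "duluth", b"join peer group: steel-sales")
empty = Envelope("pittsburgh", "duluth")  # a stamped, addressed, empty envelope
for env in (letter, empty):
    carry(wire, seal(env))
received = [unseal(data) for data in wire]
```

Note that `carry` never calls `unseal`, and nothing below the payload layer interprets the payload bytes. The empty envelope travels exactly like the full one.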

PET stack


The following figure shows the relationship of the payload-envelope-transport (PET) messaging stack to network activities, with supporting protocols on the far right:

The physical network

Generally speaking, carriers do not own the medium, or exert more than second order control. The USPS does not own the roadways, and airline routes are controlled by federal regulations. Airliners can’t just take off and fly anywhere they want. In the same way, the wires and the airwaves are not completely controlled by those using wires and airwaves to construct virtual organizations.

The virtual network

Virtual networks are constructed from the contents of payloads. The information content of payloads is completely arbitrary.

In extreme cases the payload may dictate the transport, but the two are not completely coupled. It is inconceivable to ship, say, a NordicTrack in an envelope, but UPS would be happy to deliver a NordicTrack box with a postcard inside. Or indeed, deliver an empty box.

The physical-virtual interface

Large corporate offices have more facilities to handle the chores of the physical network. They may be located in major urban areas well served by roads, planes, trains, etc. To some extent, the size of the virtual entity may depend on its physical capacity. One very interesting aspect of Gnutella networks is that “super peers” or “hubs” emerge by self-organization. Gnutella nodes with high physical capacity (bandwidth) ended up having a dominant presence on the Gnutella virtual network.

The primary coupling in a postal system is between the actual recipient of the payload and the network address (i.e., the street address). Many people can reside at one street address, and one person can have multiple mailing addresses. The USPS doesn’t care. It is the responsibility of the virtual organization to route from the street address (or P.O. Box, etc.) to actual recipients. Any further coupling creates more expense in the transport and physical routing, which quickly offsets any so-called optimizations induced by specializing the physical network or transport functions.

An example of poor coupling would be to require the USPS to open each envelope to find any part of the recipient’s physical address, or to require routing along a certain path. In the first case, having addresses inside the envelope requires opening and resealing the envelope at each hop in the routing path. In the second case, requiring a certain path relies on the existence of that path in the physical layer, which the virtual layer should have no knowledge of. That is, no one specifies that Mother’s Day cards be routed through, say, Peoria, Illinois on their way to Chattanooga, Tennessee.
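A minimal sketch of the well-coupled alternative, assuming a hypothetical next-hop table and a made-up `recipient:body` payload convention: each hop forwards using only the destination address, and only the receiving organization's "mailroom" ever looks inside the payload. The intermediate hop "chicago" is invented for illustration.

```python
def forward(next_hop, start, to_addr, payload, max_hops=16):
    """Walk hop by hop toward to_addr using only the envelope address.

    next_hop maps (current_node, destination) -> neighbor. The sender never
    names the path, and the payload is handed along without being parsed.
    """
    path = [start]
    node = start
    while node != to_addr:
        if len(path) > max_hops:
            raise RuntimeError("routing loop or path too long")
        node = next_hop[(node, to_addr)]
        path.append(node)
    return path, payload


def mailroom(payload):
    """Virtual-layer step: the organization, not the carrier, reads the
    payload to find the actual recipient (hypothetical 'name:body' format)."""
    recipient, _, body = payload.decode("utf-8").partition(":")
    return recipient, body


# Hypothetical physical topology, known only to the carrier.
next_hop = {
    ("duluth", "pittsburgh"): "chicago",
    ("chicago", "pittsburgh"): "pittsburgh",
}
path, delivered = forward(next_hop, "duluth", "pittsburgh",
                          b"veeblefester:report to HQ")
recipient, body = mailroom(delivered)
```

If the carrier later reroutes through a different hub, only `next_hop` changes. Neither the sender nor `mailroom` needs to know.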

Some coupling is required between the virtual network and the physical network, and part of this has to be out-of-band. For example, phone books are used to look up phone numbers for people, who can then be called to obtain addresses, etc. There are many such analogues in the physical world, and these should be exploited for building virtual networks.
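The phone book idea might look something like the following sketch. The `Directory` class and the address strings are hypothetical. The only point is that name-to-address lookup happens out of band, separately from the path messages take, and that one name can map to several addresses.

```python
class Directory:
    """Hypothetical out-of-band lookup service: the overlay's 'phone book'."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, address):
        # One name may be reachable at several addresses.
        self._entries.setdefault(name, set()).add(address)

    def lookup(self, name):
        # Sorted so callers get a stable order; unknown names yield [].
        return sorted(self._entries.get(name, set()))


directory = Directory()
directory.publish("veeblefester", "hq-pittsburgh")
directory.publish("veeblefester", "death-valley-po")
addresses = directory.lookup("veeblefester")
```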

Engineering

Hoare’s First Law: There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.
-C.A.R.Hoare

The obvious and overriding implication of the preceding observations is that the engineering effort would benefit from a split into three relatively independent teams. The first team would handle the transport infrastructure, that is, build the roads, string the wires from point to point, launch satellites, etc., corresponding to the bottom row in the stack figure above. The second team would handle the network infrastructure: message transport and routing in the physical domain of wires and airwaves, corresponding to the middle row in the figure. The third team would handle the virtual network infrastructure, creating toolkits that make it easy to form virtual networks, “peer groups,” etc., corresponding to the top row in the stack figure above.

Communication between “rows” is by interface. For example, the transport team may use TCP and specify an API, call it “sockets”, for use by the routing team.
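One way to picture such an interface, as a sketch only: the `Transport` protocol below is a hypothetical stand-in for a sockets-like API, `LoopbackTransport` is a toy implementation, and `Router` represents routing-team code written purely against the interface. None of these names come from a real library.

```python
from typing import Protocol


class Transport(Protocol):
    """Interface the transport team exposes (hypothetical, sockets-like)."""

    def send(self, address: str, data: bytes) -> None: ...

    def receive(self, address: str) -> list[bytes]: ...


class LoopbackTransport:
    """Toy implementation; a TCP-backed class could satisfy the same interface."""

    def __init__(self) -> None:
        self._queues: dict[str, list[bytes]] = {}

    def send(self, address: str, data: bytes) -> None:
        self._queues.setdefault(address, []).append(data)

    def receive(self, address: str) -> list[bytes]:
        # Return and clear everything queued for this address.
        return self._queues.pop(address, [])


class Router:
    """Routing-team code: depends on Transport, not on any implementation."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def post(self, to_addr: str, payload: bytes) -> None:
        self._transport.send(to_addr, payload)

    def collect(self, addr: str) -> list[bytes]:
        return self._transport.receive(addr)


router = Router(LoopbackTransport())
router.post("youngstown", b"hello")
first = router.collect("youngstown")
second = router.collect("youngstown")
```

Swapping `LoopbackTransport` for another class with the same two methods requires no change to `Router`, which is the sense in which the rows can be built by independent teams.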

At present, the Internet provides a largely complete transportation substrate. Building a virtual (P2P) network overlaying the Internet requires only teams 2 and 3. Just as the authors of the original Unix sockets implementation (e.g., Bill Joy) are not necessary participants for programming based on the sockets API, the team constructing a routing substrate for crossing firewalls and NATs need not include members of the team building virtual networks over the routing layer, and vice versa.

Bootstrapping the network

One easy-to-understand way to encourage adoption is to implement an underlying routing layer that provides an electronic envelope, allowing programmers to send messages without regard to topology, and to dedicate enough servers and bandwidth to make that piece transparent to programmers. The physical analogue is UPS or Federal Express, both of which use hubs. Provided that the developer released the source code for such a system and actively encouraged its use, it seems reasonable that the community would jump to provide “safe” alternatives to prevent one company from dominating the network.
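A hub-style routing layer of this kind could be sketched roughly as follows. The `Hub` class and its `register`/`send`/`poll` methods are hypothetical names for illustration, and everything runs in one process. A real relay would sit behind a network interface and would need authentication, persistence and so on.

```python
class Hub:
    """Hypothetical relay: peers talk only to the hub, never to each other."""

    def __init__(self):
        self._inboxes = {}

    def register(self, peer_id):
        self._inboxes.setdefault(peer_id, [])

    def send(self, from_id, to_id, payload):
        """Queue a payload for to_id; returns False if the peer is unknown."""
        if to_id not in self._inboxes:
            return False
        self._inboxes[to_id].append((from_id, payload))
        return True

    def poll(self, peer_id):
        """Hand over and clear everything waiting for peer_id."""
        if peer_id not in self._inboxes:
            return []
        waiting = self._inboxes[peer_id]
        self._inboxes[peer_id] = []
        return waiting


hub = Hub()
hub.register("alice")
hub.register("bob")
sent_ok = hub.send("alice", "bob", b"hi")
sent_unknown = hub.send("alice", "carol", b"anyone there?")
bob_mail = hub.poll("bob")
```

Because each peer only ever contacts the hub, the sender needs no knowledge of where the recipient sits in the network. That is the "without regard to topology" property, bought at the cost of concentrating traffic at the hub.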
