Peer-to-Peer (P2P) Networks

P2P Networks Book
P2P Tutorial Videos
Global Scale P2P Networks
Existing P2P Networks and P2P Technologies
P2P Network Topologies
- Unstructured P2P Topologies
- Structured P2P Topologies
Polymorph Polyring P2P Topology
P2P Network Challenges
P2P Tutorial Video

Jakob Jenkov
Last update: 2024-05-04

Peer-to-Peer (P2P) networks is a class of distributed systems in which peers (computers or "nodes") work together without a central server to both provide and use one or more services. Thus, in a peer-to-peer network the peers typically acts as both clients and servers - meaning clients of other peers, and servers for other peers. This is the reason the nodes in a peer-to-peer network are referred to as peers - because the nodes are each other's equals (peers), - unlike in a client-server topology where the client and server have different roles and responsibilities.

P2P networks are also decentralized systems. However, not all decentralized systems are P2P systems. For instance, a database such as Cassandra may use P2P techniques internally, where all nodes in a Cassandra cluster are both clients and servers of each other. However, a Cassandra cluster will also have a set of external clients which are only clients of the cluster, and not participants in providing the database service. Thus, a system of clients using a Cassandra database is not classified as a P2P network. However, P2P techniques can still be very useful within such systems.

P2P Networks Book

I am currently writing a book about P2P network technology. In a book I can present the topics as a longer, coherent stream - instead of the more independent nature of articles. You can find my P2P Networks book here:

P2P Networks

P2P Tutorial Videos

I have published a few videos about P2P networks too. You can find them here:

P2P Networks Video Playlist

Global Scale P2P Networks

In this P2P network tutorial I will focus on global scale P2P networks - on how to get millions of peers to work together - without a central server. Global scale P2P networks can typically also be used for smaller workgroups, but small workgroup P2P technologies are not always good for global scale P2P networking. That's why I focus on large scale P2P technologies.

Existing P2P Networks and P2P Technologies

There are already many active P2P networks out there, and several P2P technologies. Some of the products that are centered around a P2P network are:

LBRY
IPFS
Beaker Browser
Skype
BitTorrent
Gnutella

There is also an open source toolkit named libp2p which provides some implementations of the Kademlia P2P algorithm.

P2P Network Topologies

By the term P2P network topology I mean the structure peers in a P2P network use to connect to each other. There are several different P2P network topologies. I will get into those later.

The P2P network topologies tend to dictate the P2P network algorithms such as how peers find each other and route messages between each other, as described in the section P2P Network Challenges below.

P2P network algorithms are typically centered around a specific P2P topology. These P2P topologies can basically be categorized into two classes of P2P networks:

Unstructured P2P Topologies
Structured P2P Topologies

I will explain both in a bit more detail in the following sections.

Unstructured P2P Topologies

Unstructured P2P topologies do not attempt to organize all peers into a single, structured topology. Instead, each peer attempts to keep a "sensible" set of other known peers in the network in a routing table.

In an unstructured P2P topology there is no real system to which other peers a given peer keeps in its routing table. Thus, an unstructured P2P topology typically cannot give any guarantees about whether it is possible to find or reach a given peer in the network. Nor can it give any guarantees about how long time it will take to find or reach a given peer.

This lack of findability and reachability guarantee means that an unstructured P2P network could fragment into multiple separate networks, thus making findability and routability in the total network impossible.

The reason unstructured P2P topologies are still used sometimes is, that they are simple to implement, and it is reasonably simple to maintain each peer's routing table at runtime.

Some of commonly known unstructured P2P network topologies are:

Gnutella
Dandelion
Dandelion++
Swim

Structured P2P Topologies

Structured P2P topologies attempt to organize all peers in the network into a single, structured topology. This structured topology then provides reasonably predictable findability of peers and routability of messages. Structured P2P topologies attempt to avoid network fragmentation too.

Some of the structured P2P network topologies I know of, are:

N-Dimensional Grid
Tree
Uniring
Polyring

These structured P2P topologies are illustrated here:

There are several implementations of the above P2P topologies. Some of the more well-known implementations are listed in this table:

P2P Implementation	P2P Topology
Chord	Uniring
Kademlia	Uniring
Tapestry	Uniring
Pastry	Uniring
Polymorph Polyring	Polyring

While some of these P2P implementations may implement the same topology, they will typically vary in how they implement this topology.

The Chord, Kademlia, Tapestry and Pastry P2P implementations are all similar. They all organize the peers into a single virtual ring (uniring) with each peer holding a routing table pointing to a subset of the other peers in the network. In other words, all peers are part of the same ring. The main difference between each peer in the ring is which other peers it references in its routing table.

Chord and Kademlia use similar types of routing tables and routing / lookup algorithms. Also, Pastry and Tapestry use similar types of routing tables. More specifically, Pastry and Tapestry use a routing table design where you can adjust the number of peers kept in each peer's routing table. The more peers in a routing table, the fewer hubs it takes to find a specific peer, but the more memory / storage is needed to keep the routing table within each peer.

Other algorithms can be layered on top of the fundamental connection, location and routing algorithms. Algorithms such as distributed consensus, distributed hashtables (DHT), broadcast, multicast, caching etc. However, in this P2P tutorial I will focus mostly on the underlying P2P network organizing algorithms - meaning the algorithms used to organize the given P2P network topology.

The Polymorph polyring topology is different from the uniring topologies though - so I will talk a bit more about that in the following section.

Polymorph Polyring P2P Topology

I am currently working on an "art" project called Polymorph which which can also be used to implement P2P networks (among other network topologies). Here is my article about the Polymorph Polyring P2P Topology and Algorithms.

The Polymorph network topology is designed to enable greater control over network topology and performance than Chord, Kademlia, Pastry and Tapestry. Instead of striving for a uniform topology like Chord, Kademlia, Pastry and Tapestry does, Polymorph strives to enable an asymmetric polymorphic topology - meaning a topology that is able to morph to suit various different use cases.

P2P Network Challenges

A P2P network has to solve a set of challenges in order to be functional. The exact challenges depends on what service the P2P network is trying to provide. However, there are some basic challenges all P2P networks need to address. These challenges are:

Connectivity
Addressability
Findability and / or Routability

Each of these challenges are explained in more detail below. Once each of these challenge have been addressed you can start building a service on top of the P2P network. Depending on the service, other challenges may then emerge, such as:

Broadcast and multicast
Caching
Lookup (DHT)
Load balancing
Authentication and authorization
Privacy
etc.

My project Polymorph will eventually have to address several of these challenges too.

Connectivity

For peers in a P2P network to be able to communicate, they must be able to connect to each other. As many peers will be running behind NAT or firewalls, a P2P network must provide some solution to make it possible to still communicate with these peers behind NAT / firewall.

I cover peer connectivity in a bit more detail in this article: Peer Connectivity.

Addressability

For peers in a P2P network to be able to communicate they must also be able to address each other uniquely. Otherwise, a peer cannot specify which peer it wants to talk to. P2P networks typically address this through a peer address or GUID which is unique to each peer.

I cover peer addressability in more detail in this article: Peer Addressability

Findability and Routability

Once a peer has and address of a peer it wants to communicate with, it needs to be able to find that peer in the network so it can establish a connection to it. Alternatively the peer needs to be able to have messages routed through the P2P network to the destination peer.

Typically, P2P networks address these challenges via a combination of the peer addresses (GUIDs) and routing tables. The GUIDs are structured in a way that enables some system of organization of the GUIDs which then enables findability of peers and routability of messages. Therefore you will often see a close connection between the structure of peer GUIDs and the structure of the routing tables used to organize the network and provide findability and routability.

I cover peer findability and message routability in this article: Peer findability and message routability

P2P Tutorial Video

Some years ago I created a video explaining the basics of how P2P network algorithms such as Chord and Kademlia works. Here it is:

I will probably make a new set of videos for P2P network algorithms as I am updating this tutorial series (from 2021 and forward).

Next: Peer Connectivity

Tweet
	Jakob Jenkov