---
title: Distributed Hash Tables (DHTs)
---

# Distributed Hash Tables (DHTs)

## What is a DHT?

[Distributed Hash Tables](https://en.wikipedia.org/wiki/Distributed_hash_table) (DHTs) are distributed key-value stores where keys are [cryptographic hashes](/essentials/hashing).

DHTs are, by definition, distributed. Each "peer" (or "node") is responsible for a subset of the DHT.
When a peer receives a request, it either answers it or passes the request along until a peer that can answer it is found.
Depending on the implementation, a request not answered by the first node contacted can be:

- forwarded from peer to peer, with the last peer answering the requesting peer directly
- forwarded from peer to peer, with the answer forwarded back along the same path
- answered with the contact information of a node that has a better chance of being able to answer. **IPFS uses this strategy.**

DHTs' decentralization provides advantages compared to a traditional key-value store, including:

- _scalability_: lookups take a number of steps logarithmic in the size of the keyspace. For keys of length _n_ bits there are 2^n possible keys, and each step resolves at least one more bit of the key, so a lookup takes at most _n_ steps.
- _fault tolerance_ via redundancy, so that lookups are possible even if peers unexpectedly leave or join the DHT. Additionally, requests can be addressed to any peer if another peer is slow or unavailable.
- _load balancing_, since requests are made to different nodes and no single peer processes all the requests.

## How does the IPFS DHT work?

### Peer IDs

Each peer has a `peerID`, which is a hash with the same length _n_ as the DHT keys.

### Buckets

A subset of the DHT maintained by a peer is called a "bucket".
A bucket maps to hashes with the same prefix as the `peerID`, up to _m_ bits. There are 2^m possible prefixes, so 2^m buckets, and each bucket covers 2^(n-m) hashes.

For example, if _m_ = 16 and we use hexadecimal encoding (four bits per displayed character), the peer with `peerID` 'ABCDEF12345' maintains the mapping for hashes starting with 'ABCD'.
Some hashes falling into this bucket would be *ABCD*38E56, *ABCD*09CBA or *ABCD*17ABB, just as examples.
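
To make the prefix rule concrete, here is a minimal sketch in Go (our own illustrative helper, not IPFS code; it assumes hex-encoded IDs and an _m_ that is a multiple of four bits):

```go
package main

import (
	"fmt"
	"strings"
)

// inBucket reports whether a hex-encoded hash falls into a peer's bucket,
// i.e. whether it shares the peer's first m bits. Since every hexadecimal
// character encodes four bits, an m-bit prefix is m/4 characters long.
func inBucket(peerID, hash string, m int) bool {
	return strings.HasPrefix(hash, peerID[:m/4])
}

func main() {
	peerID := "ABCDEF12345"
	for _, h := range []string{"ABCD38E56", "ABCD09CBA", "ABCE17ABB"} {
		fmt.Printf("%s in bucket of %s: %v\n", h, peerID, inBucket(peerID, h, 16))
	}
}
```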

The size of a bucket is related to the size of the prefix: the longer the prefix, the fewer hashes each peer has to manage, and the more peers are needed.
Several peers can be in charge of the same bucket if they have the same prefix.

In most DHTs, including [IPFS's Kademlia implementation](https://github.com/libp2p/specs/blob/8b89dc2521b48bf6edab7c93e8129156a7f5f02c/kad-dht/README.md), the size of the buckets (and therefore the size of the prefix) is dynamic.

### Peer lists

Peers also keep connections to other peers in order to forward requests if the requested hash is not in their own bucket.

If hashes are of length _n_, a peer will keep _n_ lists of peers:

- the first list contains peers whose IDs have a different first bit
- the second list contains peers whose IDs have the same first bit as its own, but a different second bit
- ...
- the m-th list contains peers whose IDs have their first m-1 bits identical to its own, but a different m-th bit
- ...

The higher _m_ is, the harder it is to find peers that have the same ID up to _m_ bits, so the lists of "closest" peers typically remain empty.
"Close" here is defined by the XOR distance: the longer the prefix two IDs share, the closer they are.
Lists also have a maximum number of entries, _k_; otherwise the first list would contain half the network, the second a quarter of the network, and so on.
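
The XOR metric is easy to see on small toy IDs. A minimal sketch (our own helper names, 16-bit IDs for readability; real IPFS keys are 256 bits):

```go
package main

import (
	"fmt"
	"math/bits"
)

// xorDistance is the Kademlia distance between two IDs: their bitwise
// XOR, interpreted as an unsigned integer.
func xorDistance(a, b uint16) uint16 {
	return a ^ b
}

// sharedPrefixLen counts the leading bits two IDs have in common; the
// longer the shared prefix, the smaller the XOR distance.
func sharedPrefixLen(a, b uint16) int {
	return bits.LeadingZeros16(a ^ b)
}

func main() {
	self := uint16(0b1010110010101100)
	peers := []uint16{0b0010110010101100, 0b1010110010101101}
	for _, p := range peers {
		fmt.Printf("distance=%5d, shared prefix=%2d bits\n",
			xorDistance(self, p), sharedPrefixLen(self, p))
	}
}
```

A peer differing from us in the very first bit lands in the first (largest) list, while a peer matching all but the last bit would belong to the last, almost always empty, list.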

### Using the DHT

When a peer receives a lookup request, it either answers with a value, if the requested hash falls into its own bucket, or answers with the contact information (IP and port, `peerID`, etc.) of a closer peer. The requesting peer can then send its request to this closer peer, and the process goes on until a peer is able to answer it.
A lookup resolves at least one more bit of the prefix at each hop, so a request for a hash of length _n_ bits takes at most _n_ steps, and typically far fewer, since a single hop can skip over several bits at once.
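
Here is a schematic sketch of that iterative strategy in Go. Everything in it (`Peer`, `query`, `lookup`) is made up for illustration; it is not the libp2p API:

```go
package main

import "fmt"

// Peer is a toy stand-in for a DHT node: it stores some key/value
// pairs and knows the addresses of a few other peers.
type Peer struct {
	ID    uint16
	Store map[uint16]string
	Known []*Peer
}

// query implements the strategy IPFS uses: answer with the value if we
// have it, otherwise answer with the closest peer we know of.
func (p *Peer) query(key uint16) (value string, closer *Peer, found bool) {
	if v, ok := p.Store[key]; ok {
		return v, nil, true
	}
	best := p
	for _, q := range p.Known {
		if q.ID^key < best.ID^key { // compare XOR distances to the key
			best = q
		}
	}
	if best == p {
		return "", nil, false // dead end: we know no one closer
	}
	return "", best, false
}

// lookup drives the iteration from the requester's side: keep asking the
// closest peer heard of so far until one of them returns the value.
func lookup(start *Peer, key uint16) (string, bool) {
	for p := start; p != nil; {
		value, closer, found := p.query(key)
		if found {
			return value, true
		}
		p = closer
	}
	return "", false
}

func main() {
	far := &Peer{ID: 0x0F00, Store: map[uint16]string{0x0F42: "hello"}}
	mid := &Peer{ID: 0x4000, Known: []*Peer{far}}
	near := &Peer{ID: 0x8000, Known: []*Peer{mid}}
	fmt.Println(lookup(near, 0x0F42)) // hello true
}
```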

### Keys and hashes

In IPFS's Kademlia DHT, keys are SHA256 hashes.
[PeerIDs](https://docs.libp2p.io/concepts/peer-id/) are those of [libp2p](https://libp2p.io/), the networking library used by IPFS.

We use the DHT to look up two types of objects, both represented by SHA256 hashes:

- [Content IDs](/essentials/content-addressing) of the data added to IPFS. A lookup of this value will give the `peerID`s of the peers that have this immutable content.
- [IPNS records](/essentials/ipns). A lookup will give the latest Content ID associated with this IPNS address, enabling the routing of mutable content.
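
With the go-ipfs (kubo) CLI you can try both lookups yourself (assuming a running daemon; command names may vary across versions): `ipfs dht findprovs <CID>` returns the peers providing a given Content ID, and `ipfs name resolve <peerID>` resolves an IPNS address to its current Content ID.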

Consequently, IPFS's DHT is one of the ways to achieve mutable and immutable [content routing](https://docs.libp2p.io/concepts/content-routing/). It's currently the only one [implemented](https://libp2p.io/implementations/#peer-routing).

You can learn more in the [libp2p Kademlia DHT specification](https://github.com/libp2p/specs/blob/8b89dc2521b48bf6edab7c93e8129156a7f5f02c/kad-dht/README.md).

## Usage

### Adding an entry

Adding a blob of data to IPFS is the equivalent of advertising that you have it. Since the DHT is currently the only content-routing mechanism implemented, you can just use:
`ipfs add myData`
IPFS will automatically chunk your data and add a mapping on the DHT between the Content ID and your `peerID`. Note that there can be other `peerID`s already mapped to that value, so you will be added to the list. Also note that if the provided data is bigger than the chunk size (256 KiB by default), it will be chunked into "blocks", and both those blocks and the overall data will be mapped.
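
At the routing layer, what `ipfs add` advertises can be approximated directly with libp2p's Go DHT implementation. A sketch, assuming the go-libp2p, go-libp2p-kad-dht, go-cid and go-multihash APIs match recent releases (check the current documentation before relying on it); note that without bootstrap peers the provide call has no one to announce to:

```go
package main

import (
	"context"
	"fmt"

	"github.com/ipfs/go-cid"
	"github.com/libp2p/go-libp2p"
	dht "github.com/libp2p/go-libp2p-kad-dht"
	mh "github.com/multiformats/go-multihash"
)

func main() {
	ctx := context.Background()

	// A bare libp2p host; a real node would also connect to bootstrap
	// peers so the DHT routing tables get populated.
	host, err := libp2p.New()
	if err != nil {
		panic(err)
	}

	kad, err := dht.New(ctx, host)
	if err != nil {
		panic(err)
	}

	// Derive a CID from the raw bytes: SHA-256 them into a multihash.
	sum, _ := mh.Sum([]byte("myData"), mh.SHA2_256, -1)
	c := cid.NewCidV1(cid.Raw, sum)

	// Advertise ourselves as a provider of this CID...
	if err := kad.Provide(ctx, c, true); err != nil {
		fmt.Println("provide failed (expected without bootstrap peers):", err)
	}

	// ...and ask the DHT who provides it.
	provs, err := kad.FindProviders(ctx, c)
	fmt.Println(provs, err)
}
```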

You can publish an IPNS record using [`ipfs name publish`](/essentials/ipns).