---
title: "Distributed Hash Tables (DHT)"
menu:
    guides:
        parent: concepts
---

## What is a DHT?

[Distributed Hash Tables](https://en.wikipedia.org/wiki/Distributed_hash_table) (DHTs) are distributed key-value stores where keys are [cryptographic hashes](https://docs.ipfs.io/guides/concepts/hashes/).

DHTs are distributed. Each "peer" (or "node") is responsible for a subset of the DHT.
When a peer receives a request, it either answers it, or the request is passed to another peer until a peer that can answer it is found.
Depending on the implementation, a request not answered by the first node contacted can be:
- forwarded from peer to peer, with the last peer contacting the requesting peer
- forwarded from peer to peer, with the answer forwarded along the same path
- answered with the contact information of a node that has a better chance of being able to answer. **IPFS uses this strategy**, as the sketch below illustrates.
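
To make that last strategy concrete, here is a minimal Go sketch of the shape of a peer's answer. All of the names here are hypothetical, not the actual libp2p API: the idea is simply that a peer returns either the value itself or the contact information of closer peers.

```go
package dht

// PeerInfo is the contact information for a peer (hypothetical type).
type PeerInfo struct {
	ID   string // the peer's ID
	Addr string // IP and port
}

// Response to a lookup: either the value itself, or the contact
// information of peers with a better chance of answering.
type Response struct {
	Value       []byte
	CloserPeers []PeerInfo
}

// Node holds the keys a peer is responsible for and its known contacts.
type Node struct {
	store map[string][]byte
	peers []PeerInfo
}

// Handle answers a lookup if the key falls into this node's own subset;
// otherwise it refers the requester to other peers.
func (n *Node) Handle(key string) Response {
	if v, ok := n.store[key]; ok {
		return Response{Value: v}
	}
	return Response{CloserPeers: n.peers} // a real node would pick the closest contacts
}
```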

DHTs' decentralization provides advantages compared to a traditional key-value store, including:
- *scalability*, since a request for a hash of length *n* takes at most *log2(n)* steps to resolve.
- *fault tolerance* via redundancy, so that lookups are possible even if peers unexpectedly leave or join the DHT. Additionally, requests can be addressed to any peer if another peer is slow or unavailable.
- *load balancing*, since requests are made to different nodes and no single peer processes all the requests.

## How does the IPFS DHT work?

### Peer IDs
Each peer has a `peerID`, which is a hash with the same length *n* as the DHT keys.
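
As a toy illustration (real libp2p `peerID`s are multihashes derived from the peer's public key, with some extra framing), hashing a public key with SHA-256 gives an ID living in the same 256-bit space as the DHT keys:

```go
package dht

import "crypto/sha256"

// newPeerID derives a toy peer ID from a peer's public key. The point is
// that the ID has the same length n as the DHT keys, here 256 bits.
func newPeerID(publicKey []byte) [sha256.Size]byte {
	return sha256.Sum256(publicKey)
}
```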

### Buckets
A subset of the DHT maintained by a peer is called a 'bucket'.
A bucket maps to hashes with the same prefix as the `peerID`, up to *m* bits. There are 2^m buckets. Each bucket covers 2^(n-m) hashes.

For example, if *m* = 16 and we use hexadecimal encoding (four bits per displayed character), the peer with `peerID` 'ABCDEF12345' maintains the mapping for hashes starting with 'ABCD'.
Some hashes falling into this bucket would be *ABCD*38E56, *ABCD*09CBA or *ABCD*17ABB, just as examples.
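
A small sketch of that prefix test (hypothetical helper, using hex strings as above, four bits per character):

```go
package dht

import "strings"

// inBucket reports whether a hex-encoded hash falls into the bucket of the
// peer with the given ID, for an m-bit prefix. Each hex character encodes
// four bits, so the first m/4 characters must match.
func inBucket(hash, peerID string, m int) bool {
	chars := m / 4 // assumes m is a multiple of four, for simplicity
	return strings.HasPrefix(hash, peerID[:chars])
}
```

With the example above, `inBucket("ABCD38E56", "ABCDEF12345", 16)` returns `true`.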

The size of a bucket is related to the size of the prefix. The longer the prefix, the fewer hashes each peer has to manage, and the more peers are needed.
Several peers can be in charge of the same bucket if they have the same prefix.

In most DHTs, including [IPFS's Kademlia implementation](https://github.com/libp2p/specs/blob/8b89dc2521b48bf6edab7c93e8129156a7f5f02c/kad-dht/README.md), the size of the buckets (and the size of the prefix) is dynamic.

### Peer lists

Peers also keep connections to other peers in order to forward requests if the requested hash is not in their own bucket.

If hashes are of length *n*, a peer will keep *n* lists of peers:
- the first list contains peers whose IDs have a different first bit.
- the second list contains peers whose IDs have their first bit identical to its own, but a different second bit
- ...
- the m-th list contains peers whose IDs have their first m-1 bits identical, but a different m-th bit
- ...

The higher m is, the harder it is to find peers that have the same ID up to m bits. The lists of "closest" peers typically remain empty.
"Close" here is measured by the XOR distance, so the longer the prefix they share, the closer they are.
Lists also have a maximum number of entries, k; otherwise the first lists would contain half the network, then a fourth of the network, and so on.
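
As a sketch (hypothetical helpers, not the actual routing table code), the list a contact belongs to is determined by how many leading bits its ID shares with ours, which is exactly what the XOR distance captures:

```go
package dht

import "math/bits"

// commonPrefixLen counts the leading bits that two IDs share. XOR zeroes
// every bit that matches, so the leading zeros of a XOR b give the length
// of the shared prefix: the longer the prefix, the closer the peers.
func commonPrefixLen(a, b []byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return len(a) * 8 // identical IDs
}

// listIndex tells which peer list a contact falls into: sharing m-1
// leading bits but differing at the m-th bit puts it in the m-th list.
func listIndex(self, other []byte) int {
	return commonPrefixLen(self, other) + 1
}
```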

### Using the DHT

When a peer receives a lookup request, it will either answer with a value if it falls into its own bucket, or answer with the contact information (IP and port, `peerID`, etc.) of a closer peer. The requesting peer can then send its request to this closer peer. The process goes on until a peer is able to answer it.
A request for a hash of length *n* will take at most *log2(n)* steps, or even *log2m(n)*.
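
Putting the pieces together, here is a minimal sketch of the requesting side, reusing the hypothetical `PeerInfo` and `Response` types from the first sketch (the real Kademlia lookup queries several of the closest peers in parallel):

```go
package dht

import "errors"

const maxHops = 64 // safety bound; lookups should resolve in far fewer steps

// query sends a lookup for key to a peer; a real implementation would
// perform a network round trip here.
func query(p PeerInfo, key string) Response {
	return Response{} // elided
}

// Lookup walks the DHT: every contacted peer either answers with the
// value or refers us to a closer peer, until the value is found.
func Lookup(start PeerInfo, key string) ([]byte, error) {
	current := start
	for i := 0; i < maxHops; i++ {
		resp := query(current, key)
		if resp.Value != nil {
			return resp.Value, nil
		}
		if len(resp.CloserPeers) == 0 {
			return nil, errors.New("no closer peer and no value: not on the DHT")
		}
		current = resp.CloserPeers[0] // follow the closest referral
	}
	return nil, errors.New("lookup did not converge")
}
```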

### Keys and hashes

In IPFS's Kademlia DHT, keys are SHA256 hashes.
[PeerIDs](https://docs.libp2p.io/concepts/peer-id/) are those of [libp2p](https://libp2p.io/), the networking library used by IPFS.

We use the DHT to look up two types of objects, both represented by SHA256 hashes:
- [Content IDs](https://docs.ipfs.io/guides/concepts/cid/) of the data added to IPFS. A lookup of this value will give the `peerID`s of the peers that have this immutable content.
- [IPNS records](https://docs.ipfs.io/guides/concepts/ipns/). A lookup will give the last Content ID associated with this IPNS address, enabling the routing of mutable content.
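
In practice, you can issue both kinds of lookups from the command line: `ipfs dht findprovs <cid>` asks the DHT for peers providing a given Content ID, and `ipfs name resolve <peerID>` resolves an IPNS address to its current Content ID.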

Consequently, IPFS's DHT is one of the ways to achieve mutable and immutable [content routing](https://docs.libp2p.io/concepts/content-routing/). It's currently the only one [implemented](https://libp2p.io/implementations/#peer-routing).

You can learn more in the [libp2p Kademlia DHT specification](https://github.com/libp2p/specs/blob/8b89dc2521b48bf6edab7c93e8129156a7f5f02c/kad-dht/README.md).

## Usage

### Adding an entry

Adding a blob of data to IPFS is the equivalent of advertising that you have it. Since the DHT is the only content routing mechanism implemented, you can just use:
`ipfs add myData`
IPFS will automatically chunk your data and add a mapping on the DHT between the Content ID and your `peerID`. Note that there can be other `peerID`s already mapped to that value, so you will be added to the list. Also note that if the provided data is bigger than 256KB, it will be chunked into "blocks", and both those blocks and the overall data will be mapped.
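
Conceptually (a toy sketch, not the actual provider record format), the DHT entry for a Content ID is a list of providers, and adding data appends your `peerID` to that list:

```go
package dht

// providers maps a Content ID to the peers able to serve that content.
var providers = map[string][]string{}

// provide announces that a peer holds the content identified by cid.
// If other peers already provide it, the new peer simply joins the list.
func provide(cid, peerID string) {
	providers[cid] = append(providers[cid], peerID)
}
```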

You can publish an IPNS record using [`ipfs.name.publish`](https://docs.ipfs.io/guides/concepts/ipns/).
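
From the command line, the equivalent is `ipfs name publish /ipfs/<cid>`, which maps your `peerID`'s IPNS address to that Content ID.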
