This repository was archived by the owner on Feb 8, 2023. It is now read-only.

Comparison of IPFS and BitTorrent for Archives #208

Open
flyingzumwalt opened this issue Dec 30, 2016 · 10 comments


@flyingzumwalt

For a project that's looking to store a lot of data redundantly and validate it (e.g. #ClimateMirror), what's the best way to explain the differences between IPFS and BitTorrent? What advantages and weaknesses should a project like that consider?

As a starting point, there's this passage on page 4 of the IPFS whitepaper:

Unlike BitTorrent, BitSwap is not limited to the blocks in one torrent. BitSwap operates as a persistent marketplace where node can acquire the blocks they need, regardless of what files those blocks are part of. The blocks could come from completely unrelated files in the filesystem. Nodes come together to barter in the marketplace

@flyingzumwalt
Author

The main distinction I'm aware of is the fact that BitTorrent relies on torrent files, each of which contains a content-addressed manifest of the blocks that make up particular content. This has some ramifications:

  • forces you to choose what is in each torrent file, i.e. do you create one huge torrent file for all of your datasets, or one torrent file per dataset?
  • forces you to track the torrent files themselves with some other tool/system
  • requires you to create metadata about the torrent files
  • does not natively provide a way to identify torrent files themselves using cryptographic hashes
  • does not handle versioning of content

By contrast, IPFS lets you build a DAG of arbitrary size and structure.

Some advantages that occur to me:

  • You can track both the content and the metadata in the IPFS DAG
  • You can add multiple versions of a dataset to IPFS. Each version gets a unique hash and IPFS does its best to avoid storing duplicate blocks
  • You have complete control over which blocks are stored on which IPFS node -- this has huge advantages for distributing storage/backup (see ipfs-cluster)
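
To illustrate the deduplication point, here is a minimal sketch of a content-addressed block store. It is a toy model, not the IPFS implementation: the `BlockStore` class, the 4-byte `CHUNK_SIZE`, and the newline-joined root block are all assumptions made for the example, and real IPFS uses its own chunkers, block formats, and identifiers.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny chunks so the example is easy to follow


class BlockStore:
    """Toy content-addressed store: each block is keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data  # identical bytes always map to the same key
        return key

    def add_file(self, content: bytes) -> str:
        """Split content into fixed-size chunks and store a root block listing them."""
        chunk_keys = [
            self.put(content[i:i + CHUNK_SIZE])
            for i in range(0, len(content), CHUNK_SIZE)
        ]
        return self.put("\n".join(chunk_keys).encode())


store = BlockStore()
v1 = store.add_file(b"AAAABBBBCCCC")  # version 1 of a dataset
v2 = store.add_file(b"AAAABBBBDDDD")  # version 2: only the last chunk changed

print(v1 != v2)           # each version gets its own root hash
print(len(store.blocks))  # shared chunks are stored only once
```

In this sketch the two versions have different root hashes, yet the chunks they have in common occupy a single entry in the store.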

@flyingzumwalt
Author

Oh, and you can reference contents/files within a dataset using merkle paths and link to them with merkle links.
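
As a rough sketch of what resolving a merkle path involves, the toy below stores nodes as JSON keyed by their SHA-256 digest and follows named links one path segment at a time. The node layout (`"links"` and `"data"` keys), the helper names, and the placeholder file contents are assumptions for illustration only; the actual IPFS object formats differ.

```python
import hashlib
import json

blocks = {}  # hash -> encoded node


def put(node: dict) -> str:
    """Store a node under the SHA-256 of its canonical JSON encoding."""
    data = json.dumps(node, sort_keys=True).encode()
    key = hashlib.sha256(data).hexdigest()
    blocks[key] = data
    return key


def get(key: str) -> dict:
    return json.loads(blocks[key])


def resolve(root: str, path: str) -> dict:
    """Follow named links from the root node, one path segment at a time."""
    key = root
    for segment in (s for s in path.split("/") if s):
        key = get(key)["links"][segment]
    return get(key)


# Build a small tree: root/ -> station-a/ -> readings.csv (placeholder content)
readings = put({"data": "placeholder csv contents"})
station = put({"links": {"readings.csv": readings}})
root = put({"links": {"station-a": station}})

print(resolve(root, "station-a/readings.csv")["data"])
```

Because every link is itself a hash, `station` or `readings` can also be used directly as a starting point, without going through `root`.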

@meyerzinn

meyerzinn commented Jan 2, 2017

For Climate Mirror, the big advantages include:

  • Being able to access files in folders without downloading an entire dataset (especially for the researchers who need to use this data)
  • IPNS. Need I say more? We can host an index of both IPFS hashes and normal mirrors, and update it frequently. Thus, we have a content discovery mechanism. https://ipfs.io/ipns/QmRsCTmkqL35LZ7uBGDoPnLtgJuyiEDDXjLaFYmMWsmTaM
  • No duplicate blocks is huge.

That's among several other advantages, but those are some key points I've found.

NOTE: That index is simply a sampling for an explorer I'm building. The real index will have IPFS datasets, etc.

@yousefamar

yousefamar commented Aug 9, 2017

@flyingzumwalt, @20zinnm I'd be interested to hear your thoughts on how IPFS compares to BitTorrent v2 — it seems to me the gap has gotten smaller.

@flyingzumwalt
Author

The key distinguishing factor in my mind is the fact that IPFS allows you to use any hash, of any content or any subset of content, as an identifier. You can use that hash to ask the network who has that exact content. This makes the system much more flexible than BitTorrent, because you can identify exactly the content you are providing or requesting, regardless of whether it's a huge set of files, a single file, a part of a file, or a single entry from some dataset. Contrast this with BitTorrent's reliance on torrent files, which bundle data together according to however that torrent file was originally structured by its creator.

As far as I can tell, BitTorrent v2 does not decrease this reliance on torrent files.
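
A minimal sketch of the "ask the network who has this hash" idea follows. It is only a toy: the in-memory `providers` table stands in for the network's lookup mechanism, and the peer names and helper functions are hypothetical.

```python
import hashlib
from collections import defaultdict

providers = defaultdict(set)  # content hash -> names of peers that announced it


def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def announce(peer: str, data: bytes) -> str:
    """Record that `peer` can serve exactly these bytes."""
    cid = content_id(data)
    providers[cid].add(peer)
    return cid


def find_providers(cid: str) -> set:
    """Return the peers known to hold the content with this hash."""
    return set(providers.get(cid, set()))


dataset = b"row1\nrow2\nrow3\n"
record = b"row2\n"  # a single entry taken from the dataset

dataset_id = announce("peer-a", dataset)
record_id = announce("peer-a", record)
announce("peer-b", record)  # peer-b holds only the single record

print(sorted(find_providers(dataset_id)))
print(sorted(find_providers(record_id)))
```

The point of the sketch is that the whole dataset and a single record are addressed the same way, so a peer holding only the record can still serve requests for it.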

@DougAnderson444

or a single entry from some dataset

@flyingzumwalt This seems like a big advantage over BitTorrent. Can you point to a reference on how to do that in IPFS? Thanks!

@TUSF

TUSF commented Jan 12, 2020

@DougAnderson444 This is a consequence of the merkle tree structure of IPFS. BitTorrent breaks up a folder of files into equally sized blocks that cut across files, whereas IPFS treats each file as its own unit (and sometimes a collection of units). A "folder" in IPFS is a merkle node that contains links to other nodes, which IPFS then retrieves down the tree using content-addressed data.

For people more familiar with BitTorrent, one can think of IPFS as a single large swarm, where each folder is a link to another torrent (within the same swarm).
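
To make the piece-boundary point concrete, here is a small sketch that models a multi-file torrent as files concatenated and cut into fixed-length pieces, then reports which pieces contain bytes from more than one file. The function name and the sizes are made up for the example, and it assumes every file is non-empty.

```python
def pieces_spanning_files(file_sizes, piece_length):
    """Indices of pieces holding bytes from more than one file, when the
    files are concatenated and cut into fixed-length pieces.
    Assumes every file is non-empty."""
    spanning = set()
    offset = 0
    for size in file_sizes[:-1]:
        offset += size  # byte offset where the next file begins
        if offset % piece_length != 0:
            # The boundary falls inside a piece, so that piece mixes two files.
            spanning.add(offset // piece_length)
    return sorted(spanning)


# Three files of 10, 6 and 16 bytes, cut into 8-byte pieces.
print(pieces_spanning_files([10, 6, 16], 8))
```

In this model, verifying or fetching the first file alone also requires bytes belonging to the second file, because piece 1 covers the end of one and the start of the other.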

@dwiyatci

What about the content immutability?

@balupton
Member

balupton commented Apr 11, 2021

This is a consequence of the merkle tree structure of IPFS. BitTorrent breaks up a folder of files into equally sized blocks that cut across files, whereas IPFS treats each file as its own unit (and sometimes a collection of units).

BitTorrent can do per-file blocks by adding padding blocks that are then ignored by the clients:

https://en.wikipedia.org/wiki/BitComet#Padding_files

BitTorrent v2 also does per-file blocks.
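
As a sketch of how padding achieves this, the helper below computes how many padding bytes would follow each file so that the next file starts on a piece boundary. The function name and the sizes are hypothetical, and this models only the arithmetic, not how any particular client encodes padding files.

```python
def padding_after(file_sizes, piece_length):
    """Padding bytes to insert after each file (except the last) so that
    the following file starts on a piece boundary."""
    # Each file starts aligned, so its padding depends only on its own size.
    return [(-size) % piece_length for size in file_sizes[:-1]]


# Three files of 10, 6 and 16 bytes with 8-byte pieces.
print(padding_after([10, 6, 16], 8))
```

With that padding in place, no piece contains real data from two different files, at the cost of some wasted bytes in the piece layout.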

@sarvagyaa
Copy link

sarvagyaa commented Sep 27, 2021

Climate Mirror seems to have gone with torrents, not IPFS. @flyingzumwalt @meyerzinn Was this your decision? If yes, would you be able to share the rationale?
