
Implement multiple store backends #127


Closed
pdf wants to merge 1 commit into dgraph-io:master from pdf:multi_store

Conversation


@pdf pdf commented Jul 10, 2016

The first additional backend is BoltDB. A `store` flag has been added
to commands that accepts the name of a store (as of this commit, valid
values are `boltdb` and `rocksdb`).

Stores register their availability via `init()`, which means that new
stores should be imported for their side effects wherever they may be
required directly (predominantly in commands or tests).

Store implementations can be included via build tags.
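
To make the registration pattern concrete, here is a self-contained sketch of the `init()`-based registry described above. All names here (`Store`, `Register`, `New`, the method set) are illustrative assumptions, not the actual API in this PR:

```go
// Package store sketches a hypothetical backend registry.
package store

import "fmt"

// Store is an assumed minimal KV interface that each backend satisfies.
type Store interface {
	Get(key []byte) ([]byte, error)
	Set(key, val []byte) error
	Close()
}

// registry maps a --store flag value to a backend constructor.
var registry = map[string]func(dir string) (Store, error){}

// Register is called from each backend package's init().
func Register(name string, ctor func(dir string) (Store, error)) {
	registry[name] = ctor
}

// New opens the backend selected by the --store flag.
func New(name, dir string) (Store, error) {
	ctor, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown store %q", name)
	}
	return ctor(dir)
}
```

Each backend would then call `Register` from its own `init()`, commands would blank-import the packages they need compiled in (e.g. `import _ ".../store/boltdb"`), and the RocksDB file would carry a `// +build rocksdb` constraint so it is only built when the tag is supplied.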

BREAKING: The default store has been set to `boltdb`, primarily because
this store is native Go and can be built against even when CGO is not
available. To produce builds that include RocksDB support, the
`rocksdb` tag _must_ be supplied, i.e.:

```
go install -tags rocksdb
```

This initial implementation does not include a functional `merge` or
`dlist` command for boltdb.

I saw the comment on #71 regarding the reasoning for choosing RocksDB (it should be noted that while BoltDB only allows a single writer, it does support multiple concurrent readers), but the ability to build without CGO can be preferable for portability, and the important performance characteristics (read vs. write performance, for instance) will differ from project to project, so choice is also desirable.

I have not done significant testing or updated the documentation yet, but discussion would be welcomed.

The tests/benches should be updated with Go 1.7, as multiple backends are particularly suited to sub-tests/sub-benchmarks.
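
For illustration, a Go 1.7 sub-benchmark layout might look like the following. The backend names are this PR's, but the `kv` interface and `openStore` helper are hypothetical stand-ins:

```go
package store_test

import "testing"

// kv is a stand-in for the store interface exercised by the benchmarks.
type kv interface {
	Get(key []byte) ([]byte, error)
}

// openStore would open the named backend in a temporary directory;
// it is stubbed out here because this is a sketch, not the real suite.
func openStore(b *testing.B, backend string) (kv, func()) {
	b.Skipf("sketch only: no %s backend wired up", backend)
	return nil, func() {}
}

// BenchmarkGet runs the same workload once per backend, so results are
// reported per name, e.g. BenchmarkGet/boltdb-8 vs BenchmarkGet/rocksdb-8.
func BenchmarkGet(b *testing.B) {
	for _, backend := range []string{"boltdb", "rocksdb"} {
		b.Run(backend, func(b *testing.B) {
			s, cleanup := openStore(b, backend)
			defer cleanup()
			key := []byte("key")
			b.ResetTimer()
			for i := 0; i < b.N; i++ {
				if _, err := s.Get(key); err != nil {
					b.Fatal(err)
				}
			}
		})
	}
}
```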



@CLAassistant

CLAassistant commented Jul 10, 2016

CLA assistant check
All committers have signed the CLA.

@coveralls

coveralls commented Jul 10, 2016


Coverage decreased (-1.07%) to 51.568% when pulling 87279f8 on pdf:multi_store into 1df0d28 on dgraph-io:master.

@coveralls

coveralls commented Jul 10, 2016


Coverage decreased (-1.1%) to 51.539% when pulling a948b1b on pdf:multi_store into 1df0d28 on dgraph-io:master.

@coveralls

coveralls commented Jul 10, 2016


Coverage decreased (-1.2%) to 51.399% when pulling 60a77ae on pdf:multi_store into 1df0d28 on dgraph-io:master.

@manishrjain
Contributor

manishrjain commented Jul 10, 2016

Multiple backend stores are not something we intend to support. There's an expectation of tight coupling between Dgraph and whatever we use to store data on disk, which we've decided to be RocksDB, given the reasoning that you've rightly pointed out.

I don't buy the argument that each project is different, so read/write concurrency doesn't matter. To begin with, our bulk loader is so efficient purely because of the write concurrency that BoltDB doesn't provide. Also, in a live Dgraph instance, we periodically merge dirty posting lists; these merges currently go unnoticed performance-wise, but would be very negatively impacted if reads and writes became mutually exclusive. If we really wanted to kill concurrency, we could have just used Python.

Starting v0.4, which we're aiming to release this week, we'll be making it easier for people to use Dgraph by providing pre-built binaries, which can be readily run on Linux or Mac. That should tackle the main issue you're trying to solve here, which is ease of installation.

Sorry, I can't accept this pull request.

@pdf
Author

pdf commented Jul 10, 2016

> Multiple backend stores are not something we intend to support.

That could be a valid reason to reject this PR (maintenance complexity, though the surface area is quite small, had you looked at the code), but not for the reasons you've outlined.

> There's an expectation of tight coupling between Dgraph and whatever we use to store data on disk.

Why? That's certainly not evident from the code, except in a couple of the helper utilities. A KV store is pretty much a KV store, except in edge cases or performance-related areas.

> ...which we've decided to be RocksDB, given the reasoning that you've rightly pointed out.

What reasoning is that? BoltDB provides good read throughput, so for read-heavy applications it would possibly be an improvement.

> I don't buy the argument that each project is different, so read/write concurrency doesn't matter.

That's not what I said. I said that projects differ in their requirements for read or write throughput. For some projects, read throughput is vastly more important than write throughput.

> To begin with, our bulk loader is so efficient purely because of the write concurrency that BoltDB doesn't provide.

I guess that's good, if a project cares about that functionality.

> Also, in a live Dgraph instance, we periodically merge dirty posting lists; these merges currently go unnoticed performance-wise, but would be very negatively impacted if reads and writes became mutually exclusive.

Read operations in BoltDB occur concurrently; only writes are serialized. Do you have any benchmarks to back up your assertions here, or are you just making stuff up?
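
To illustrate the model with the actual bolt API (bucket and key names here are arbitrary): read-only `View` transactions run concurrently against an MVCC snapshot, while `Update` transactions queue behind a single writer lock.

```go
package main

import (
	"fmt"
	"log"
	"sync"

	"github.com/boltdb/bolt"
)

func main() {
	db, err := bolt.Open("demo.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Writes are serialized: Update holds the single writer lock.
	err = db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("postings"))
		if err != nil {
			return err
		}
		return b.Put([]byte("key"), []byte("value"))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Reads are concurrent: View transactions do not block each other,
	// and do not block (or get blocked by) a writer.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = db.View(func(tx *bolt.Tx) error {
				v := tx.Bucket([]byte("postings")).Get([]byte("key"))
				fmt.Printf("read: %s\n", v)
				return nil
			})
		}()
	}
	wg.Wait()
}
```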

> If we really wanted to kill concurrency, we could have just used Python.

Sick burn... or something.

> Starting v0.4, which we're aiming to release this week, we'll be making it easier for people to use Dgraph by providing pre-built binaries, which can be readily run on Linux or Mac. That should tackle the main issue you're trying to solve here, which is ease of installation.

Actually, I was looking at embedding Dgraph in a cross-platform, cross-architecture project, and did not want to carry a dependency on a C library.

> Sorry, I can't accept this pull request.

Yeah, it looks like Dgraph is not for us.

I have to say that, based on the poor attitude in your response here, I think I may be better off not contributing to this project anyway. I was going to clean up a bunch of other stuff in the code-base after this merge, but I guess you can keep it.

@manishrjain
Contributor

Sorry you feel that way, but it's understandable, given that you'd already put effort into something without talking to us about it. The best first step here would have been to bring up your intentions on discuss.dgraph.io, so we could have warned you about this up front, before you spent the effort.

Cgo is bad; I understand that you want to stay away from it. But RocksDB doesn't have a Go equivalent, and BoltDB is surely not it.

@pdf
Author

pdf commented Jul 10, 2016

> Sorry you feel that way, but it's understandable, given that you'd already put effort into something without talking to us about it. The best first step here would have been to bring up your intentions on discuss.dgraph.io, so we could have warned you about this up front, before you spent the effort.

My disappointment is not so much at the wasted effort (it was only about an hour, and I specifically mentioned in the PR that I was happy to discuss this), but at the dismissive and condescending tone of your response; this is not how you engage with potential community members. I notice you can't be bothered elaborating on your position with a response to my queries, either.

> Cgo is bad; I understand that you want to stay away from it. But RocksDB doesn't have a Go equivalent, and BoltDB is surely not it.

So you understand that cgo is problematic, but don't want to do anything about it? Once you have multiple backends, it's easy to add something that satisfies more use-cases, or is objectively better (by whatever metric the user measures 'better'), and the user gets to choose the characteristics that are important to them. In a description on your discuss board, you even specify that you are simply using RocksDB as a layer to read/write data to disk, so I don't know why you'd be so attached to it, especially without data to support your performance claims.

@manishrjain
Contributor

I don't think my tone was condescending.

Choosing a library which allows reads and writes concurrently over a library which makes reads and writes mutually exclusive is a pretty important design decision, particularly for something that's going to be the basis of Dgraph. I'm not sure what a benchmark is going to prove here; it's basic computer science.

On the other hand, you haven't given any logic for your PR, other than that you don't "want" to include any Cgo.

@mohitranka
Contributor

mohitranka commented Jul 10, 2016

@pdf Thanks for your PR and effort. I am sorry that you feel disappointed by the decision not to accept it, which is completely understandable given the effort you put into this PR.

Whether or not to support multiple backends is an important design decision, and there are successful projects with both ideologies -- e.g. Postgres and MySQL. At this point, Dgraph is very young, and we have decided to focus on a tightly coupled backend, as, apart from the engineering choices (which I am not going to get into, as I am not the most qualified when it comes to the Go ecosystem), it allows us to utilise the limited available bandwidth more effectively.

You can find more about the roadmap of Dgraph at #1 and discuss major ideas at discuss.dgraph.io.

Once again, we are thankful for your contribution, and sorry for your disappointment. I do, however, hope you continue to contribute.

@pdf
Author

pdf commented Jul 10, 2016

@mohitranka thanks, this is a sensible response. The implementation to support multiple backends is fairly trivial though, as you can see here.

To add some actual data with regard to performance, below are the Get/Set benchmarks that were already in the codebase, run on 8 cores, for each of the backends. By default you're using async writes on RocksDB, which may not be safe. I should also note that the RocksDB write benchmarks would not complete with less than ~12GB of RAM free, otherwise triggering the OOM killer; something is definitely not right in RocksDB-land (and obviously we can't use the Go memory profiling tools, because cgo).
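
For reference, the sync/async settings compared below correspond to toggles roughly like these (gorocksdb is shown as a stand-in Go binding; the actual wrapper in the codebase may differ, and the file names are arbitrary):

```go
package main

import (
	"log"

	"github.com/boltdb/bolt"
	"github.com/tecbot/gorocksdb"
)

func main() {
	// RocksDB: sync is a per-write option; async (false) is the default.
	wo := gorocksdb.NewDefaultWriteOptions()
	wo.SetSync(true) // fsync each write: the "RocksDB sync" rows

	// BoltDB: sync is a property of the DB handle; fsync-on-commit is
	// the default, so setting NoSync gives the "BoltDB async" rows.
	db, err := bolt.Open("bench.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	db.NoSync = true
}
```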

RocksDB vs BoltDB (default configuration: RocksDB async, BoltDB sync)

benchmark                       old ns/op     new ns/op     delta
BenchmarkGet_valsize1024-8      1473          1669          +13.31%
BenchmarkGet_valsize10KB-8      4468          1678          -62.44%
BenchmarkGet_valsize500KB-8     339189        1680          -99.50%
BenchmarkGet_valsize1MB-8       690710        1682          -99.76%
BenchmarkSet_valsize1024-8      8764          2496036       +28380.56%
BenchmarkSet_valsize10KB-8      116203        2970832       +2456.59%
BenchmarkSet_valsize500KB-8     11092719      12904272      +16.33%
BenchmarkSet_valsize1MB-8       12327737      21963188      +78.16%

RocksDB vs BoltDB (RocksDB async, BoltDB async)

benchmark                       old ns/op     new ns/op     delta
BenchmarkGet_valsize1024-8      1473          1662          +12.83%
BenchmarkGet_valsize10KB-8      4468          1750          -60.83%
BenchmarkGet_valsize500KB-8     339189        2110          -99.38%
BenchmarkGet_valsize1MB-8       690710        2211          -99.68%
BenchmarkSet_valsize1024-8      8764          64554         +636.58%
BenchmarkSet_valsize10KB-8      116203        180974        +55.74%
BenchmarkSet_valsize500KB-8     11092719      3853470       -65.26%
BenchmarkSet_valsize1MB-8       12327737      5952093       -51.72%

Frequent growth of the BoltDB backing store can hurt performance, but the growth size can be increased to reduce this churn (a sketch of that tuning follows the table below):

RocksDB vs BoltDB (RocksDB async, BoltDB async+AllocSize)

benchmark                       old ns/op     new ns/op     delta
BenchmarkGet_valsize1024-8      1473          1654          +12.29%
BenchmarkGet_valsize10KB-8      4468          1641          -63.27%
BenchmarkGet_valsize500KB-8     339189        2167          -99.36%
BenchmarkGet_valsize1MB-8       690710        2195          -99.68%
BenchmarkSet_valsize1024-8      8764          59262         +576.20%
BenchmarkSet_valsize10KB-8      116203        113899        -1.98%
BenchmarkSet_valsize500KB-8     11092719      2856296       -74.25%
BenchmarkSet_valsize1MB-8       12327737      5476866       -55.57%
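
The allocation-size tuning above amounts to a one-line change on the DB handle; the value chosen here is illustrative:

```go
package main

import (
	"log"

	"github.com/boltdb/bolt"
)

func main() {
	db, err := bolt.Open("bench.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Grow the data file in larger steps so less time is spent
	// extending and remapping it; bolt's default AllocSize is 16MB.
	db.AllocSize = 256 * 1024 * 1024
}
```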

And finally, the safe configuration for both options, using sync writes, without any tuning:

RocksDB vs BoltDB (RocksDB sync, BoltDB sync)

benchmark                       old ns/op     new ns/op     delta
BenchmarkGet_valsize1024-8      1516          1669          +10.09%
BenchmarkGet_valsize10KB-8      4592          1678          -63.46%
BenchmarkGet_valsize500KB-8     353271        1680          -99.52%
BenchmarkGet_valsize1MB-8       680144        1682          -99.75%
BenchmarkSet_valsize1024-8      1825995       2496036       +36.69%
BenchmarkSet_valsize10KB-8      2116578       2970832       +40.36%
BenchmarkSet_valsize500KB-8     6944236       12904272      +85.83%
BenchmarkSet_valsize1MB-8       12376387      21963188      +77.46%

@mohitranka
Contributor

@pdf Moving the discussion to Dgraph's Discourse.
