Resolve fields asynchronously #213
Conversation
Having a hard time trying to figure out why tests are failing. It seems to work fine in my own project.
executor.go (Outdated)

```go
finalResults[responseName] = resolved
go func(responseName string, fieldASTs []*ast.Field) {
	defer wg.Done()
	resolved, state := resolveField(p.ExecutionContext, p.ParentType, p.Source, fieldASTs)
```
Is `resolveField` goroutine-safe? At first glance, it mutates the passed context.
This is my bad for assuming `p.ExecutionContext` was just a regular `context.Context`.
executor.go (Outdated)

```go
	continue
}
finalResults[responseName] = resolved
go func(responseName string, fieldASTs []*ast.Field) {
```
Half a year ago I wanted to make similar changes, but then decided that spawning a goroutine on every resolve might be much more expensive than doing it in series. So now I wonder: have you checked the impact on execution time and resource allocation? Does it really make sense?
This undoubtedly uses more resources, because it is trying to do more at the same time. However, performance is immediately improved by parallelism when GOMAXPROCS > 1.

It especially makes sense when calling many long-running procedures where the majority of the work is just waiting (e.g. RPCs).
Also, not sure if you know this, but goroutines are incredibly cheap, and it's not uncommon to have thousands of goroutines at a time.
Although goroutines are cheap, they aren't free. To my knowledge, a goroutine allocates about 4KB of memory for its stack. So I believe that in a project handling 100 RPS, thousands of goroutines spawned for every request might lead to unexpected over-consumption of memory. It's also worth mentioning that such a load profile would affect how the data source behaves (e.g. a database or some external HTTP service).

Concurrency doesn't automatically make things faster. A silly benchmark that fills a slice of 1000 items one by one (serially vs. concurrently) gives almost two orders of magnitude difference in execution time. So in the case of resolving, everything might depend on how many such "long running procedures" the graph has, compared to the total number of resolvers.

Don't get me wrong, I'm not judging the changes or the PR. As I said, some time ago I decided that doing resolution concurrently (at least by simply spawning a goroutine per resolve) might bring extra issues in a high-load project, but I was too lazy to actually do benchmarks. And now I'm interested in whether I was wrong.
I see what you're getting at now, but I feel like it's reasonable to assume that someone handling hundreds of requests per second would allocate the necessary resources to do so, and that someone with such a large schema would have their own optimizations for hitting external resources.

On the other hand, perhaps just spawning a goroutine per resolve is a little lazy. Another solution could be to only spawn a goroutine when a `func() (interface{}, error)` is returned by the resolve function, similar to how it works in graphql-js with promises.
This commit should address the thread safety of `ExecutionContext`.
executor.go (Outdated)

```go
func (eCtx *ExecutionContext) addError(err gqlerrors.FormattedError) {
	eCtx.errorsMutex.Lock()
	eCtx.Errors = append(eCtx.Errors, err)
	eCtx.errorsMutex.Unlock()
}
```
I'd recommend, even though in this case its benefits are nil, a `defer eCtx.errorsMutex.Unlock()` immediately after the lock. The benefit here is a potential reduction in future errors if this code gets modified. This is just a style comment, though.
Using "deferred resolve functions", I've modified it to be opt-in concurrency, only running concurrently if a `func() (interface{}, error)` is returned.
Any update?
fix a race condition, make things faster, profit
This is working very well for me. I fixed a couple of bugs. I've had a look at the other concurrent loading PR (#132) and it's significantly more complex than this one. Granted, this implementation may not be as comprehensive, but it's much easier to understand its method of operation.
Does anybody know what the status is of this PR?
@dvic works fine for me - been in production for a month now with no problems. Not sure what the plan is regarding merging, though.
@deoxxa Works fine indeed with dataloader 😃 @chris-ramon Do you know if this will be merged in the near future? Thanks 👍
I took the changes from this branch and added the concurrent resolving of list values from #132 in https://github.com/dvic/graphql/tree/resolve-asynchronously; that was an important missing part for me. This is a perfect match with https://github.com/nicksrandall/dataloader. There were some data races, but I managed to fix these as well, and all tests are passing.
Some benchmarks comparing this branch (+ parallel list evaluation):
This PR looks really great, is there any reason in particular why it hasn't been merged?
@dvic can you help me understand when/why resolving list items in goroutines is useful? If they're scalars/enums they'll resolve very quickly, and if they're objects they'll go down the same
@gburt For me it's basically for libraries such as https://github.com/nicksrandall/dataloader, where one can do some smart batching/caching of queries.
@dvic oh I totally get that, I'm just struggling to see how goroutining within
This ensures that the `deferredResolveFunction` is returned by the executor and not a resolve function, i.e. it ensures that a resolved value of type `func() interface{}` will not be run concurrently.
Has there been any forward progress on this? Seems like there's a real want here.
Unfortunately, it doesn't seem like the owner of this repo has any interest in accepting this PR. I've left this PR open for when they change their mind, as well as for this solution to be more easily found by someone who may be looking for one. I'll also try to keep this patch fresh as updates come.
I pulled down @dvic's branch with this change (https://github.com/dvic/graphql/tree/resolve-asynchronously) and implemented a resolver function with Nick Randall's dataloader. It seems to work very well. My use case is to reduce the n+1 query problem on belongs-to relations, i.e. to replace n Get(id) calls with a single List(idList) call. To do that, in the resolver, I just return a `func() (interface{}, error)` — that's the key line in the resolve function.

Anyways, I mostly wanted to say thanks to the folks on this pull thread and ask whether this can be merged into the core. I would be open to the other async solutions as well, but the code in this one is very easy to understand and it's very easy to integrate with Nick Randall's dataloader. Or if this one can't be merged, I'd love to understand the criteria for merging one of the other ones. Thanks!!
Thanks a lot guys! 👍 — Closing this one in favor of #388; you might want to take a look at a real-use-case working example that shows the support for concurrent resolvers.
Modified `executeFields` to resolve each field in its own goroutine. I suspect that this may be the intended behaviour, because right now `executeFields` is just a copy of `executeFieldsSerially`.