Solver: Fix space leak in 'addlinking' (issue #2899) #3530

Closed
wants to merge 3 commits into from

grayjay
Collaborator

@grayjay grayjay commented Jul 10, 2016

I took a pass at fixing #2899. I created a tree type with subtrees that have dummy arguments, so that addLinking can copy the subtrees without causing them to share data. I'm not sure if this is the best approach, though. Another option would be to add the linked nodes during the "build" phase.

This isn't ready to be merged yet. I at least need to document and rename the tree type.

/cc @kosmikus @edsko

grayjay added 3 commits July 9, 2016 22:43
This commit creates linked nodes before recursing to the nodes' children.
The solver creates linked nodes by copying existing unlinked nodes. Previously,
the subtrees shared data, which caused a space leak. This commit gives the
subtrees dummy arguments to prevent sharing.
@ezyang
Contributor

ezyang commented Jul 11, 2016

Unfortunately, the dummy argument method is not guaranteed to thwart sharing, because inlining and floating can reintroduce the sharing. For example, try compiling:

module F where
g =
    let f () = [1..]
    in (f (), f ())

You might be OK in this case because your functions are big enough that not much inlining is present, but in cases like this it's very tricky.
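To make the failure mode concrete, here is a minimal sketch of the dummy-argument pattern and of where it can break. The names `freshNaturals` and `pairOfLists` are made up for illustration, and the comments describe what GHC's optimizer is allowed to do, not what it is guaranteed to do for any particular version or flag set.

```haskell
-- Dummy-argument trick: the intent is that every call `freshNaturals ()`
-- allocates its own copy of the list, so the two components of the pair
-- can be consumed and garbage-collected independently.
freshNaturals :: () -> [Integer]
freshNaturals () = [1 ..]
-- NOINLINE only addresses one route back to sharing (inlining both calls
-- and then merging the identical expressions). It does not stop the
-- full-laziness pass from floating `[1 ..]` out of the function body
-- into a single top-level value, which both calls would then share.
{-# NOINLINE freshNaturals #-}

pairOfLists :: ([Integer], [Integer])
pairOfLists = (freshNaturals (), freshNaturals ())
```

Compiling with `-fno-full-laziness` may make floating less likely, but as noted above that is a mitigation rather than a guarantee. The results are the same either way; only the memory behaviour differs.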

As far as I can tell, the only way to guarantee sharing (or lack thereof) is to work in the IO monad. But that is probably a bigger refactoring than you are willing to bear.

In any case, this sort of trick deserves a big honking comment. Maybe do a GHC style Note.

@phadej
Collaborator

phadej commented Jul 11, 2016

One way to prevent sharing is to use Church encoding, i.e. not carry a value, but a function to consume that value (e.g. \c n -> foldr c n xs).

data ListF a b
    = Nil
    | Cons a b

foldr :: (a -> b -> b) -> b -> [a] -> b
cata :: (ListF a b -> b) -> [a] -> b

i.e. Tree a ~ forall b. (TreeF a b -> b) -> b, but even then we have to be sure to produce the RHS variant in such a way that the big closure (instead of a big value) isn't persisted.
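A minimal sketch of that idea for lists, assuming the `ListF` functor from the comment above. `ChurchList`, `unfoldChurch`, `countdown`, `sumChurch` and `toList` are hypothetical names chosen for this example and are not taken from the universe package or from cabal.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Base functor of lists, as in the comment above.
data ListF a b
  = Nil
  | Cons a b

-- Hypothetical name: a list represented by its own fold.
newtype ChurchList a = ChurchList
  { foldChurch :: forall b. (ListF a b -> b) -> b }

-- Build the encoding from a seed and a step function, so that no
-- ordinary list value is stored inside the closure.
unfoldChurch :: (s -> ListF a s) -> s -> ChurchList a
unfoldChurch step seed = ChurchList (\alg ->
  let go s = case step s of
        Nil       -> alg Nil
        Cons x s' -> alg (Cons x (go s'))
  in go seed)

-- Example producer: n, n-1, ..., 1.
countdown :: Int -> ChurchList Int
countdown = unfoldChurch (\n -> if n <= 0 then Nil else Cons n (n - 1))

-- Two independent consumers; each one re-runs the producer.
sumChurch :: ChurchList Int -> Int
sumChurch xs = foldChurch xs alg
  where
    alg Nil        = 0
    alg (Cons x r) = x + r

toList :: ChurchList a -> [a]
toList xs = foldChurch xs alg
  where
    alg Nil        = []
    alg (Cons x r) = x : r
```

Because the structure is only produced while a consumer is folding over it, two consumers do not hold on to a common materialised list. The closure itself (here just the seed and the step function) is still retained, which is the caveat mentioned above.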

I'm exploring this way of preventing sharing for the universe package. I could try to prioritise that experimentation if this is an urgent problem. dmwit/universe#23

@edsko
Contributor

edsko commented Jul 12, 2016

Hmmm, @kosmikus has expressed a preference for adding the link options as part of the build phase; if that fixes this problem, perhaps that's a nicer solution than trying clever tricks to prevent sharing?

@grayjay
Collaborator Author

grayjay commented Jul 12, 2016

Thank you for the feedback, and thank you @phadej for offering to look into preventing sharing. I didn't realize that the dummy argument trick was unreliable. Combining the build and linking phases should work, because the linked and unlinked subtrees will differ from the start. I'll try that solution when I have more time.

@grayjay grayjay closed this Jul 12, 2016
@grayjay
Collaborator Author

grayjay commented Jul 17, 2016

I just realized that the linked and unlinked subtrees are still identical after the linking traversal, because linking at one level doesn't affect how nodes are linked farther down in the tree. So combining the build and linking phases may not be enough to guarantee that the subtrees are not shared. I don't think it would even reduce the likelihood of a space leak over this PR.

Currently, buildTree creates a search tree with only unlinked choices for each package. Then addLinking traverses the tree and adds linking choices under some of those packages by transforming the existing unlinked choices. addLinking adds a choice to link each package to each other instance of that package that has already been chosen, whether or not that choice is consistent with the choices that were made higher up in the tree. Since linking has no effect on the linking choices lower in the tree, the subtrees under linking choices are equal to the subtrees under the unlinked choices that they were created from. E.g., the tree under the choice to link package C to A-setup.C-1 is equal to the tree under the choice to link to B-setup.C-1, or the choice to not link at all.
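The behaviour described above can be sketched with a toy model. This is not cabal's actual solver code: `Tree`, `Choice`, `addLinkingToy` and `toyTree` are simplified, hypothetical stand-ins that keep only the part relevant here, namely that each linking choice reuses the subtree of the unlinked choice it was created from.

```haskell
type Qualifier = String
type PkgName   = String

data Choice
  = Unlinked Int          -- choose this version independently
  | LinkTo Qualifier Int  -- link to the instance chosen under another qualifier
  deriving (Eq, Show)

data Tree
  = Done
  | PChoice Qualifier PkgName [(Choice, Tree)]
  deriving (Eq, Show)

-- Toy version of the linking traversal: for every unlinked choice, add one
-- linking choice per instance of the same package and version that was
-- already chosen higher up in the tree.
addLinkingToy :: Tree -> Tree
addLinkingToy = go []
  where
    go :: [((PkgName, Int), Qualifier)] -> Tree -> Tree
    go _ Done = Done
    go env (PChoice q pn cs) = PChoice q pn (concatMap expand cs)
      where
        expand (Unlinked v, sub) =
          let -- The subtree is transformed once ...
              sub'  = go (((pn, v), q) : env) sub
              -- ... and then referenced by every linking choice as well,
              -- so all of these branches point at the same heap object.
              links = [ (LinkTo q' v, sub')
                      | ((pn', v'), q') <- env, pn' == pn, v' == v ]
          in (Unlinked v, sub') : links
        expand other = [other]

-- Package C needed under two setup qualifiers and at the top level.
toyTree :: Tree
toyTree =
  PChoice "A-setup" "C" [(Unlinked 1,
    PChoice "B-setup" "C" [(Unlinked 1,
      PChoice "" "C" [(Unlinked 1, Done)])])]
```

In `addLinkingToy toyTree`, the subtree under "link B-setup.C to A-setup.C-1" is equal to the subtree under the unlinked choice for B-setup.C, because both are the same `sub'`. That shared reference is what keeps the subtree alive after one branch has been explored.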

AFAICT, the subtrees only differ after the validateLinking step, when the solver finally prunes nodes whose linking is inconsistent with the choices that were made above. If we combined all steps up to validateLinking, then the subtrees of the initial tree would all be different. I hope that isn't necessary, though, because it would be significantly less modular.

That said, I would be surprised if the subtrees were shared if we only combined buildTree and addLinking. The logic that leads to identical subtrees isn't straightforward.

I'm not really sure what to do. We could make the subtrees differ in some way, such as only creating one new linking choice for each distinct group of linked packages. That also seems like a trick, though, even if it has its own benefits.

@grayjay
Collaborator Author

grayjay commented Jul 17, 2016

I created a test case with the solver DSL: grayjay@1f1b95d

4 participants