Skip to content

Change Graph.fromList to fromDistinctList and fix conseqeunces #4027

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from

Conversation

dcoutts
Copy link
Contributor

@dcoutts dcoutts commented Oct 22, 2016

It's really an error to try and build a graph where you have duplicate node keys, so remove Graph.fromList and add Graph.fromDistinctList. This check is always on, not just an assertion, because we get it for free given the way Maps can be constructed.

All uses of Graph.fromList are ok to convert to fromDistinctList except for the one taking the solver output and building the solver plan. It turns out the solver produces many duplicates. See issue #4026.

So we add a special duplicate check here, but nub out duplicates that are fully identical (ie == not just same keys). This should be removed when the solver is updated to not produce duplicates.

As part of that, add a ordNubBy to the Utils.

This builds on top of #4014 and #4022.

@mention-bot
Copy link

@dcoutts, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ezyang, @erikd and @BardurArantsson to be potential reviewers.

| let hackIdenticalInstances = filter ((>1) . length) . map nub
, dups <- hackIdenticalInstances --FIXME: should not have to do this
$ duplicatesBy (comparing Graph.nodeKey) pkgs ]
-}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grayjay if this check is enabled we get lots of

Exception: not yet configured: buildDepends

in the solver unit-tests. No doubt this is because of the nub above that forces the whole values, to check they're identical. But the bigger question is why we get duplicates in the first place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had a chance to look at the code yet, but, judging by the test failures, I think the duplicate nodes might be the linked packages from independent goals. For example, this test solves for A and B as indpendent goals, which both depend on the same instance of C. Then there is a duplicate node for C:

[__0] trying: 0.A-1.0.0 (user goal)
[__1] trying: 0.C-1.0.0 (dependency of 0.A-1.0.0)
[__2] trying: 1.B-1.0.0 (user goal)
[__3] next goal: 1.C (dependency of 1.B-1.0.0)
[__3] trying: 1.C~>0.C-1.0.0
[__4] done
Unit Tests
  UnitTests.Distribution.Solver.Modular.Solver
    Backjumping
      bj8: FAIL
        Exception: internal error: could not construct a valid install plan.
        The proposed (invalid) plan contained the following problems:
        Package C-1.0.0 has duplicate instances. There are 2 distinct instances.
        Proposed plan:
        Configured C-1.0.0
        Configured B-1.0.0
        Configured C-1.0.0
        Configured A-1.0.0

        CallStack (from HasCallStack):
          error, called at ./Distribution/Client/Dependency.hs:705:33 in main:Distribution.Client.Dependency

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@dcoutts
Copy link
Contributor Author

dcoutts commented Oct 24, 2016

@grayjay so there's no need to fix the solver immediately. Are you happy for this check to go in? And that the new disabled check would make sense, once it's true.

@ezyang @23Skidoo yay / na?

imho, it's relatively uncontroversial.

@ezyang
Copy link
Contributor

ezyang commented Oct 25, 2016

The invariant is checked so I don't see any problem with it. I would prefer it, though, if we passed in a CallStack to the error site (use WithCallStack from Distribution.Compat.Stack)

@grayjay
Copy link
Collaborator

grayjay commented Oct 25, 2016

@dcoutts I think the two checks are fine. I found a comment that explains the behavior. Linked packages cause duplicates, so the nodes should also be identical.

-- The graph will still contain all the installed packages, and it might
-- contain duplicates, because several variables might actually resolve to
-- the same package in the presence of qualified package names.

@grayjay
Copy link
Collaborator

grayjay commented Oct 26, 2016

I made PRs that fix the undefined fields (#4038) and duplicate packages (#4039). In #4039, I just removed packages with duplicate SolverIds immediately after the package qualifiers are removed. If my PR is the right solution to the duplicate instances, do you think it would be better to move the check in this PR to the same place (where I called nubBy in 9378165)? Then we could check for packages with duplicate SolverIds that aren't complete duplicates before we remove the duplicates.

@ezyang
Copy link
Contributor

ezyang commented Oct 27, 2016

Both PRs merged.

--FIXME: the solver produces large numbers of duplicate but identical
-- instances, so we currently eliminate duplicates here, and the
-- duplicate check below has to cope with there being identical instances.
graph = Graph.fromDistinctList (ordNubBy Graph.nodeKey pkgs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So given that the fixes to solver have landed, I guess that this can be removed now?

@dcoutts
Copy link
Contributor Author

dcoutts commented Oct 28, 2016

Hmm, not sure why github closed this. I just did a force push update to the branch. Well, nevermind, I've opened it as #4053.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants