Attempt to find optimal build plans #2860

jdnavarro opened this issue Oct 11, 2015 · 21 comments

@jdnavarro
Collaborator

This involves a weight/penalty system for measuring the quality of different build plans. The preferred version ranges from Hackage should be taken into account when scoring each build plan.

/cc @kosmikus

@kosmikus
Contributor

I also talked to @grayjay about this before.

@kosmikus
Contributor

Plan for doing this incrementally:

  1. Modify PSQ to take an extra parameter for weights / penalties, which generally should be an instance of Ord. Don't use it yet; instantiate it to () everywhere.
  2. Replace the current reorderings (mainly in Preference) so that they modify weights instead. Try to reproduce the current behaviour precisely. This will require using a lexicographic ordering, i.e., a weight type that is in essence a tuple.
  3. Either prior to or during exploration, calculate the total penalty of each visited node. When finding a success node, return the accumulated penalty so that it can, e.g., be printed together with the found solution.
  4. Add some functionality for pruning, again either immediately before or during exploration. This can come in multiple flavours. For example, we could allow a command-line flag to specify a "maximum" score and prune everything that exceeds that maximum. But we could also continue after finding the first solution and dynamically adjust the limit, trying to find incrementally better and ultimately optimal solutions.
  5. Tweak the actual penalties used, adjusting the preference phase so that we get what really feels like optimal build plans. This is mostly heuristics and experience.
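Steps 1 and 2 could be sketched roughly as follows. This is a standalone illustration with hypothetical names (`WPSQ`, `fromList`, etc.), not the solver's actual PSQ type:

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- A PSQ-like structure with an extra weight parameter w (step 1).
newtype WPSQ w k v = WPSQ [(w, k, v)] deriving (Eq, Show)

-- Keep entries sorted by weight so traversal order follows preference.
fromList :: Ord w => [(w, k, v)] -> WPSQ w k v
fromList = WPSQ . sortBy (comparing (\(w, _, _) -> w))

toList :: WPSQ w k v -> [(w, k, v)]
toList (WPSQ xs) = xs

-- Step 1: instantiating w to () changes nothing about the ordering.
unweighted :: [(k, v)] -> WPSQ () k v
unweighted kvs = WPSQ [((), k, v) | (k, v) <- kvs]

-- Step 2: a lexicographic weight is in essence a tuple; tuples already
-- compare lexicographically in Haskell, so sorting by a pair of weights
-- gives the required ordering for free.
type LexWeight = (Int, Double)
```

The point of the `()` instantiation is that it makes the type change mechanical: behaviour is provably unchanged until a later step actually assigns non-trivial weights.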

@grayjay
Collaborator

grayjay commented Oct 12, 2015

Should we compare solutions found with different goal orderings? The solutions might vary more than those found at one end of a large tree.

@kosmikus
Copy link
Contributor

@grayjay Goal reordering is somewhat independent. It can significantly speed up the overall search. It can also change the score of the "first few" solutions, although it will obviously not change the score of the optimal solution. Implementing the machinery to do the scoring is necessary in any case.

@grayjay
Collaborator

grayjay commented Oct 18, 2015

@kosmikus I've started working on this. Here is my design, so far:

  • I created a type newtype WeightedPSQ w k v = WeightedPSQ [(w, k, v)] with the invariant that the list is always sorted by weight. All choice nodes use WeightedPSQ, except for goal choices. GoalChoice still uses PSQ because goal order shouldn't affect the score of the install plan.
  • The weight type is a positive Double. Smaller values are better, with 0 being perfect. Each preferences-related traversal adds to the existing weight in the WeightedPSQ. Most traversals leave the weight of at least one child of each node unchanged. For example, in preferLinked, the solver adds 1 to non-linked packages, but only when they have at least one linked sibling. I'm sure I'll have to spend a lot of time adjusting the rules. I first tried [Double] as the weight type to ensure that I could reproduce cabal's current behavior. Then I switched to Double so that I could weight the different preferences more evenly.
  • A command-line flag, --max-score, exposes the feature. I haven't looked into any ways to automatically restrict the score.
  • I added a tree traversal, maxScorePhase, between heuristicsPhase and explorePhase. The traversal keeps track of the sum of the weights of the preceding nodes and prunes nodes where the sum exceeds the maximum. maxScorePhase also stores the total score on Done nodes.
  • When a node exceeds the max score, the conflict set is the union of all variables that contributed non-zero weight, and each of their QGoalReasonChains.
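A toy version of the pruning traversal described above might look like the following. The tree type and `pruneMax` are hypothetical simplifications, not the solver's actual Tree or maxScorePhase:

```haskell
-- A toy search tree: each edge to a child carries a weight (penalty).
data Tree
  = Done Double            -- success node; field holds the accumulated score
  | Fail
  | Choice [(Double, Tree)]
  deriving (Eq, Show)

-- Prune every subtree whose accumulated weight exceeds the maximum, and
-- store the total score on Done nodes, as maxScorePhase is described to do.
pruneMax :: Double -> Tree -> Tree
pruneMax maxScore = go 0
  where
    go acc (Choice cs) =
      Choice [ (w, go (acc + w) t) | (w, t) <- cs, acc + w <= maxScore ]
    go acc (Done _) = Done acc
    go _   Fail     = Fail
```

Because siblings are sorted by weight, pruning an interior node cuts off all heavier siblings' subtrees at once, which is where the tree-size reduction comes from.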

I tried two variations on pruning. The first prunes all nodes that exceed the max score, in order to reduce the tree size as much as possible. The second prunes only leaf-level nodes. I thought that might lead to smaller conflict sets in areas of the tree that don't even contain high-penalty solutions. I only did a small amount of testing, but I found one case where the performance was comparable and another where leaf-level pruning ran at least ten times longer.

I was disappointed by the difficulty of finding better-scoring solutions to real dependency problems. I tested by first running cabal install with a package from Hackage without a max score. Then I reran cabal with a score that was slightly lower than the previous solution's score. Usually, cabal was still running after several minutes, so I stopped it. I was eventually able to find two packages where my branch found a better-scoring install plan in a reasonable amount of time, hackage-server and yi. In both cases, finding the second solution took an order of magnitude more time than the first solution.

Working on this issue made me realize that returning the first solution after searching packages in order of preference is usually pretty effective. Maybe looking beyond the first solution is mainly useful as a fallback for cases where the user is unhappy with the install plan. It's also possible that my implementation has a bug, or that changing the scoring or other heuristics would significantly improve the usefulness of the feature.

@grayjay
Collaborator

grayjay commented Oct 19, 2015

I found one way to produce more solutions in a reasonable amount of time. In the cases that I let finish, the speedup ranged from 2x to 680x for finding solutions that are better than cabal's initial solution. Unfortunately, it's messy.

I realized that if a variable only appears in a conflict set because it contributed to a score that was above the maximum, there is no reason to try any other assignments at that level, since the values are sorted in increasing order of weight. I annotated the variables in conflict sets with whether or not the conflict can be resolved by trying a value farther to the right. When the solver calculates a node's conflict set from its children during backjumping, if it finds a child meeting the following requirements, it can simply use the union of the conflict sets of the children it has already traversed:

  • The child's conflict set contains the current level's variable.
  • The variable is marked as not resolving the conflict when given a value farther to the right.

Even with this improvement, the conflict sets can be very large, and cabal can run "forever" searching for a better solution. Also, this change relies on children not being rearranged between pruning high-penalty nodes and backjumping.
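The short-circuit rule described above can be sketched in isolation. The types here (`ConflictSet` as a set of annotated variables, `combine`) are hypothetical stand-ins, not the solver's actual conflict-set representation:

```haskell
import qualified Data.Set as Set

type Var = String

-- Each conflict-set entry is annotated with whether trying a value farther
-- to the right at that variable's level could still resolve the conflict.
type ConflictSet = Set.Set (Var, Bool)

-- Combine the conflict sets of a node's children during backjumping. If a
-- child conflicts on the current variable and marks it as not resolvable
-- farther to the right, later siblings cannot help (they only have higher
-- weights), so we stop early and return the union of the conflict sets
-- seen so far, including this child's.
combine :: Var -> [ConflictSet] -> ConflictSet
combine v = go Set.empty
  where
    go acc []       = acc
    go acc (c : cs)
      | Set.member (v, False) c = Set.union acc c
      | otherwise               = go (Set.union acc c) cs
```

The early stop is only sound because the children are sorted by increasing weight, which is why the change relies on children not being rearranged between pruning and backjumping.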

@kosmikus Do you think I'm on the right track? I can clean up the code later and then link to it.

@BardurArantsson
Collaborator

I know next to nothing about this area of the code, but... I wonder if this would be something that could "just"[1] be farmed out to an (industrial-strength) SMT solver by transforming the problem appropriately?

[1] Yeah, I know, right...?

@ttuegel
Member

ttuegel commented Oct 21, 2015

I just wanted to chime in here about using an off-the-shelf SMT solver for this sort of thing. I was recently trying to use Z3 (an industrial-strength solver if anything is) and I found that, in the event a solution cannot be found, it tends to generate rather large conflict sets full of terms that actually aren't relevant to the conflict. I think this would have a detrimental effect on our ability to generate good error messages.

@BardurArantsson
Collaborator

Oh, that's very interesting. Suggestion retracted. Though, perhaps an SMT solver could also be applied to minimize the conflict set? :)

@kosmikus
Contributor

@grayjay Thanks for working on this. I'd like to see the code if possible. And don't feel too bad if there's no immediate improvement. This is one of several things I'd like to do that, if used together, might have a chance of significantly improving overall speed and quality; the other main ingredient is dynamic goal reordering. In many or even most situations where we do find a solution, in particular if we find it quickly, the first solution is likely to be pretty near the optimum already. So you should indeed not be surprised if you do not find significantly better ones. The problem is that we cannot guarantee it's the global optimum, so it is good to have at least the option to improve things.

The whole business with conflict sets is very subtle indeed, and I'm not sure I understand your descriptions. It would be good to be able to try it myself and look at the code.

The approach you took even without the extra work you put in already sounds very close to what I had in mind, so it would be good to have it available, and if there's no big slowdown in general, we could plausibly merge it.

We should however also have a close look at the actual weights you're assigning.

Thanks so far.

@kosmikus
Contributor

@BardurArantsson Using an SMT solver or similar as a solver replacement has been suggested many times. It's a valid suggestion, and certainly very interesting. Together with a student, and with input from @hvr, I have developed an almost usable z3-based solver that can in principle be used as a drop-in replacement for the current modular solver. I'm also currently looking at CUDF solvers. There are significant tradeoffs, though, as @ttuegel also mentioned, error messages being one of them (not that those of the modular solver are generally good). It's not a silver bullet (yet), I'm afraid. The notion of scoring and measuring the quality of different install plans is very relevant even for, e.g., SMT solvers. An SMT solver is very good at finding a solution to a constraint problem; in my experience so far, it is much less good at finding a good solution to such a problem. There's no need to "retract" the suggestion; it remains valid. I'd ask that discussion of completely different solver approaches continue in separate tickets, though.

@BardurArantsson
Collaborator

Yeah, I was somewhat (but only peripherally) aware of the limitations of SMT solvers; I just didn't stop to think about error messages. Error messages are probably in the HUMAN-Hard set... and thus not yet in the realm of SMT solvers. :)

@kosmikus
Contributor

@BardurArantsson Well, error messages alone are also not a reason to dismiss SMT solvers completely. It's entirely conceivable that once we know a problem is unsolvable, we use a completely different algorithm to come up with a good explanation for the "why".

@grayjay
Collaborator

grayjay commented Oct 21, 2015

@kosmikus I haven't had a chance to clean up my code yet, but here is the branch, if you want to try it out: https://github.com/grayjay/cabal/tree/optimal-build-plans I'll probably have more time to work on it in the next few days. I actually switched back to ordering nodes the way cabal does currently, so that my branch should find the same solutions when --max-penalty is not used, and it's easier to compare performance.

@grayjay
Collaborator

grayjay commented Oct 26, 2015

@kosmikus I refactored my branch, so it's ready for review: https://github.com/grayjay/cabal/tree/optimal-build-plans . The flag is now --max-score. There are still a few things that I want to improve, including the scoring.

The weight type is currently [Double], and the weights are assigned in a way that should reproduce the ordering on master. Final scores are calculated only from the index of a node in its list of siblings. I would like to change the algorithm to use Double for weights and then sum the weights to calculate the score, though I don't know the best way to assign the weights. One problem is that scores must be non-negative in order for the solver to backtrack when an interior node exceeds the max. But non-negative scores excessively penalize install plans with higher numbers of packages or flags.
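The difference between the two weight types falls out of Haskell's standard orderings. This is just an illustration of the trade-off described above, with hypothetical helper names:

```haskell
-- [Double] as the weight type: lists compare lexicographically, so weights
-- added by earlier preference traversals strictly dominate later ones.
-- That is what allows reproducing master's node ordering exactly.
lexCompare :: [Double] -> [Double] -> Ordering
lexCompare = compare

-- Double as the weight type: each node contributes one number and the plan's
-- score is the sum, so different preferences trade off against each other.
-- With non-negative weights, every additional package or flag choice can
-- only raise the score, which over-penalises larger install plans.
sumScore :: [Double] -> Double
sumScore = sum
```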

I also added a comment to the commit.

@grayjay
Collaborator

grayjay commented Oct 29, 2015

@kosmikus I just fixed a space leak on my branch. I'm not sure if it's still using too much memory, though. During one test with GHC 7.10.2, the memory usage on optimal-build-plans increased at a rate that was 20% higher than on master. With GHC 7.6.3, the memory usage was almost twice as high as master. This branch also runs 2-30% slower on different tests. I'm hoping that switching the weight type to Double will help.

@grayjay
Collaborator

grayjay commented Nov 4, 2015

I looked into the solver's memory use and I think I found the problem. There are several space leaks in the existing logging code and another in backjumping. The backjumping bug is present on master and my branch, but my change probably increased the size of the data structures that are retained. The solution looks relatively difficult. I'll try to create a separate pull request or issue when I have time.

@kosmikus
Contributor

kosmikus commented Nov 6, 2015

Sorry for the long silence. I really appreciate all your work on this, but there's unfortunately no chance I'll have the time to take a serious look at this before next week. I hope I can get half a day free next week to try to work on my cabal-install backlog.

@grayjay
Collaborator

grayjay commented Nov 7, 2015

@kosmikus No problem. I'll also try to work on the remaining space leak.

@grayjay
Collaborator

grayjay commented Nov 9, 2015

Since I fixed the problem with memory usage, I rebased my branch onto #2916 and made a pull request (#2917).

@Blaisorblade
Collaborator

On using SAT/SMT: has anybody already tried using (weighted) MaxSAT to find a good solution? MaxSAT seems in vogue and I've seen some impressive applications. In particular, one could give (different) non-zero weights to flags (for defaulting them), installed packages, fixed constraints, and any preferences. Of course, tuning weights and testing this is nontrivial work, and error messages remain a problem.
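For a feel of what the weighted formulation means, here is a brute-force toy version of weighted MaxSAT over a handful of variables. Everything here (`Soft`, `best`, the predicates) is a hypothetical illustration of the idea, unrelated to any actual solver:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

type Var = String
type Assignment = [(Var, Bool)]

-- A soft constraint: the given weight is earned when the predicate holds.
-- Hard constraints are modelled separately as a filter below.
data Soft = Soft Double (Assignment -> Bool)

totalWeight :: [Soft] -> Assignment -> Double
totalWeight softs a = sum [w | Soft w p <- softs, p a]

-- Brute-force weighted MaxSAT: enumerate all assignments, keep those
-- satisfying the hard constraints, pick the one with maximum soft weight.
-- (Real MaxSAT solvers are vastly cleverer than this enumeration.)
best :: [Var] -> (Assignment -> Bool) -> [Soft] -> Assignment
best vars hard softs =
  maximumBy (comparing (totalWeight softs)) (filter hard assignments)
  where
    assignments =
      foldr (\v rest -> [(v, b) : a | b <- [False, True], a <- rest]) [[]] vars
```

Here "flags, installed packages, fixed constraints and preferences" would each contribute their own soft clauses with different weights, while the dependency constraints themselves stay hard.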

Sorry for commenting here, but this issue seems to contain the most complete SAT/SMT discussion.
(I'm following up on @hvr from #1783 (comment)).
