hls-graph speculatively builds all prior dependencies for a key even if they might have changed #3423

wz1000 · 2022-12-23T15:59:43Z

I ran into this bug while debugging a flaky test case -

A.hs is initially not an FOI, so GetModIface A.hs has two dependencies IsFileOfInterest A.hs and GetModIfaceFromDiskAndIndex A.hs. Then A.hs is opened, turning it into an FOI

haskell-language-server/ghcide/src/Development/IDE/Core/Rules.hs

Lines 934 to 957 in dfd8f0b

    
           getModIfaceRule recorder = defineEarlyCutoff (cmapWithPrio LogShake recorder) $ Rule $ \GetModIface f -> do 
        
             fileOfInterest <- use_ IsFileOfInterest f 
        
             res@(_,(_,mhmi)) <- case fileOfInterest of 
        
               IsFOI status -> do 
        
                 -- Never load from disk for files of interest 
        
                 tmr <- use_ TypeCheck f 
        
                 linkableType <- getLinkableType f 
        
                 hsc <- hscEnv <$> use_ GhcSessionDeps f 
        
                 let compile = fmap ([],) $ use GenerateCore f 
        
                 se <- getShakeExtras 
        
                 (diags, !hiFile) <- writeCoreFileIfNeeded se hsc linkableType compile tmr 
        
                 let fp = hiFileFingerPrint <$> hiFile 
        
                 hiDiags <- case hiFile of 
        
                   Just hiFile 
        
                     | OnDisk <- status 
        
                     , not (tmrDeferredError tmr) -> liftIO $ writeHiFile se hsc hiFile 
        
                   _ -> pure [] 
        
                 return (fp, (diags++hiDiags, hiFile)) 
        
               NotFOI -> do 
        
                 hiFile <- use GetModIfaceFromDiskAndIndex f 
        
                 let fp = hiFileFingerPrint <$> hiFile 
        
                 return (fp, ([], hiFile)) 
        
             pure res

At this point A.hs should have dependencies IsFileOfInterest A.hs, Typecheck A.hs ... (the IsFOI branch). However, I was seeing GetModIfaceFromDiskAndIndex still getting run for A.hs.

Investigating further, it turns out to be the implementation of refresh in hls-graph:

haskell-language-server/hls-graph/src/Development/IDE/Graph/Internal/Database.hs

Lines 145 to 148 in dfd8f0b

    
           refresh db stack key result = case (addStack key stack, result) of 
        
               (Left e, _) -> throw e 
        
               (Right stack, Just me@Result{resultDeps = ResultDeps (toListKeySet -> deps)}) -> do 
        
                   res <- builder db stack deps

It seems like it speculatively builds all the previous dependencies of the key even if they might have changed. This is wrong as rules can have side effects or require preconditions (GetModIfaceFromDiskAndIndex has both). This also can lead to spurious diagnostics from these pseudo-dependencies.

We should instead build dependencies in the order that they occurred (which we currently do not maintain), and recompute with RunDependenciesChanged if any of them are dirty, short circuiting the evaluation of later dependencies (since they might not be true dependencies). However, this might have a negative impact on performance.

In the above case, that would mean skipping GetModIfaceFromDiskAndIndex after we discover that IsFOI has changed.

The text was updated successfully, but these errors were encountered:

wz1000 · 2022-12-23T16:02:44Z

@pepeiborra thoughts?

pepeiborra · 2022-12-23T18:52:48Z

It seems like it speculatively builds all the previous dependencies of the key even if they might have changed.

It doesn't build the previous dependencies, only recomputes them. This can be a no-op if the dependencies are not dirty.

Either way, this behaviour is by design, to implement early cutoff. That's my understanding at least. And as far as I remember, Shake does the same thing.

This is wrong as rules can have side effects or require preconditions (GetModIfaceFromDiskAndIndex has both). This also can lead to spurious diagnostics from these pseudo-dependencies.

Yup, I can see how the side effects and the spurious diagnostics can be annoying. Maybe GetModIfaceFromDiskAndIndex needs to check whether the file is a FOI.

Not sure what you mean about preconditions?

We should instead build dependencies in the order that they occurred (which we currently do not maintain),

The build system makes no promises around evaluation order, and for good reason since we want maximal parallelism. The only promise is that if rule B depends on rule A the build system guarantees that A will build before B.

and recompute with RunDependenciesChanged if any of them are dirty, short circuiting the evaluation of later dependencies (since they might not be true dependencies). However, this might have a negative impact on performance.

So you want to give up early cutoff? I suspect the performance impact will be massive.

wz1000 · 2022-12-23T19:07:31Z

It doesn't build the previous dependencies, only recomputes them. This can be a no-op if the dependencies are not dirty.

What is the distinction between 'build' and 'compute' here? It seems like all the code in the GetModIfaceFromDiskAndIndex rule is being executed.

Not sure what you mean about preconditions?

GetModIfaceFromDiskAndIndex is only supposed to be called on non-FOIs.

The build system makes no promises around evaluation order, and for good reason since we want maximal parallelism. The only promise is that if rule B depends on rule A the build system guarantees that A will build before B.

We could add an ordering for dependencies so that dependencies using applicative style or uses can be built in parallel, but monadic bind imposes a sequencing order on them.

So you want to give up early cutoff? I suspect the performance impact will be massive.

I'm not sure why this means we have to give up early cutoff. After recomputing previous dependencies, since the rule has finished execution, we should have the result and be able to do early cutoff using that?

pepeiborra · 2022-12-23T19:43:51Z

What is the distinction between 'build' and 'compute' here? It seems like all the code in the GetModIfaceFromDiskAndIndex rule is being executed.

Computing a build target involves checking for dirtiness, computing its dependencies, and then running the rule giving it a RunMode:

If the target is not dirty, the rule won't be run
If the RunMode is RunDependenciesSame, defineEarlyCutoff will return the last cached value (this is what implements early cutoff)

GetModIfaceFromDiskAndIndex is only supposed to be called on non-FOIs.

This is not a property of the build rule, but an invariant that we want to maintain. It seems difficult to encode this invariant in the build system though

pepeiborra · 2022-12-23T19:48:36Z

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

I am unconvinced that this will prevent running GetModIfaceFromDisk for FOIs. Can you illustrate with an example?

It will also make the build system less performant in subtle ways, specially when combined with ApplicativeDo.

wz1000 · 2022-12-26T09:56:34Z

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

Yes.

I also noticed in profiles that hls-graph spawns essentially a new thread for each build target, which seems excessive and probably not very good for performance (see our old friend https://gitlab.haskell.org/ghc/ghc/issues/18224). Of course the exact improvement depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

I am unconvinced that this will prevent running GetModIfaceFromDisk for FOIs. Can you illustrate with an example?

At least in the case I mentioned in the ticket body, the second time GetModIface is built, hls-graph would first check if IsFileOfInterest has changed (which it has), and then skip computing the rest of the dependencies.

wz1000 · 2022-12-26T10:00:18Z

On a project with 3000 modules, it looks like hls-graph spawns and hold on to 31000 TSOs for the duration of the entire build.

pepeiborra · 2022-12-26T10:51:33Z

I also noticed in profiles that hls-graph spawns essentially a new thread for each build target, which seems excessive and probably not very good for performance (see our old friend https://gitlab.haskell.org/ghc/ghc/issues/18224). Of course the exact improvement depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

This is by design, hls-graph is an evolution of Shake, removing all the unessential including the fancy scheduler, which is replaced by the RTS scheduler.

depends on how many threads are blocked on MVars or IO, versus have actual work that can be performed.

Build target threads block on blackholes, IIRC, see splitIO. Of course those blackholes can be blocking on other concurrency primitives but that's up to the build rule.

On a project with 3000 modules, it looks like hls-graph spawns and hold on to 31000 TSOs for the duration of the entire build.

Sounds about right. Rebuilds after edits should spawn much fewer TSOs though, have you measured that?

pepeiborra · 2022-12-26T21:06:39Z

So the gist of your proposal is to not evaluate all the dependencies in parallel. Instead, evaluate them "only as much as needed", and stop as early as possible, i.e. short-circuit. Is that right?

Yes.

This makes sense to me, but I still have concerns about the performance impact. Your suggestion of grouping dependencies and evaluating the group in parallel would go a long way here. So I agree that this would be a good change.

soulomoon · 2024-02-18T17:54:13Z

In summary.
The current approach pros:
1 easy to implement
2 can kick all dependencies concurrently.

Cons:
1 wrong semantic.
2 also performance penalty of running
unnecessary rules which might be computational heavy.(that is why we branch FOI in the first place, some non-FOI only rules would actually be running even the file is opened... as in the issue head)

Some thoughts:
A:

Introduce a small dependency dsl with branching semantic to hls-graph so the dependencies only runs when it meet the preconditions, use it to branch the runs of dependencies.
This actually also gives easy fix to a potential dual problem about missing dependencies(missing since it's parent does not run it the first time) that might not kick it's parent when dirty even if it's precondition is met. This can happen if the branching condition can change without the change of some rule.

B:
Alternatively we rearrange the rules with branching dependencies which might be nasty.

soulomoon · 2024-02-20T21:42:43Z

I am doing a proof of concept draft in #4087 (comment)
Branching the deps with meta in the rule as the way alwaysrun do, to see if performance is working ok.

wz1000 added type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc.. status: needs triage labels Dec 23, 2022

wz1000 mentioned this issue Jan 10, 2023

Use retry action, don't retry otherwise #3408

Open

fendor added performance Issues about memory consumption, responsiveness, etc. status: in discussion Not actionable, because discussion is still ongoing or there's no decision yet and removed status: needs triage labels Feb 22, 2023

wz1000 mentioned this issue Mar 13, 2023

9.6 support for HLS #3480

Merged

This was referenced Feb 20, 2024

Fix hls-graph: phantom dependencies invoke in branching deps (resolve #3423) #4087

Merged

Flaky test failure result in error of GetLinkable #4093

Closed

soulomoon closed this as completed in 82148dc Mar 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

hls-graph speculatively builds all prior dependencies for a key even if they might have changed #3423

hls-graph speculatively builds all prior dependencies for a key even if they might have changed #3423

wz1000 commented Dec 23, 2022 •

edited

Loading

wz1000 commented Dec 23, 2022

pepeiborra commented Dec 23, 2022 •

edited

Loading

wz1000 commented Dec 23, 2022 •

edited

Loading

pepeiborra commented Dec 23, 2022

pepeiborra commented Dec 23, 2022

wz1000 commented Dec 26, 2022 •

edited

Loading

wz1000 commented Dec 26, 2022

pepeiborra commented Dec 26, 2022

pepeiborra commented Dec 26, 2022

soulomoon commented Feb 18, 2024 •

edited

Loading

soulomoon commented Feb 20, 2024 •

edited

Loading

hls-graph speculatively builds all prior dependencies for a key even if they might have changed #3423

hls-graph speculatively builds all prior dependencies for a key even if they might have changed #3423

Comments

wz1000 commented Dec 23, 2022 • edited Loading

wz1000 commented Dec 23, 2022

pepeiborra commented Dec 23, 2022 • edited Loading

wz1000 commented Dec 23, 2022 • edited Loading

pepeiborra commented Dec 23, 2022

pepeiborra commented Dec 23, 2022

wz1000 commented Dec 26, 2022 • edited Loading

wz1000 commented Dec 26, 2022

pepeiborra commented Dec 26, 2022

pepeiborra commented Dec 26, 2022

soulomoon commented Feb 18, 2024 • edited Loading

soulomoon commented Feb 20, 2024 • edited Loading

wz1000 commented Dec 23, 2022 •

edited

Loading

pepeiborra commented Dec 23, 2022 •

edited

Loading

wz1000 commented Dec 23, 2022 •

edited

Loading

wz1000 commented Dec 26, 2022 •

edited

Loading

soulomoon commented Feb 18, 2024 •

edited

Loading

soulomoon commented Feb 20, 2024 •

edited

Loading