Can re-linking on cabal build be avoided? #1177
Comments
What's the structure of your project? I guess that you have a library defined in the same .cabal file that both ex-1 and ex-2 depend on, otherwise relinking shouldn't happen. |
Yes, exactly. Why does (should?) it happen in this case? |
My hypothesis is that the library is reinstalled into the in-place package db each time, which changes the timestamp. |
Yes, that happens:
Do we have to treat the library as new if it hasn't changed? |
Cabal is dumb in this regard: there is no mechanism to tell whether the library hasn't changed and doesn't need to be reinstalled. This hurts |
So you would say it is a lacking feature that would be useful in other places as well? In |
It'd be useful for
Possibly - need to look closer at the implementation. Perhaps this can be done as an |
GHC does check the timestamps on the libraries it is linking against when deciding whether to relink. So the best solution would be for Cabal to check whether it is installing a library file that is identical to the existing file, and not touch the file if so. |
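To illustrate the idea of not touching an identical file, here is a minimal sketch in Python rather than in Cabal's own Haskell. The function name `install_if_changed` is hypothetical and not part of Cabal; it only shows the compare-then-copy step, assuming both files fit a plain byte-by-byte comparison.

```python
import filecmp
import os
import shutil


def install_if_changed(src: str, dst: str) -> bool:
    """Copy src to dst only when the contents differ.

    Returns True if dst was written, False if it was left untouched,
    in which case its modification time is preserved.
    (Hypothetical helper for illustration, not Cabal's API.)
    """
    # shallow=False forces a comparison of the file contents instead of
    # trusting the size and mtime reported by os.stat().
    if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
        return False
    shutil.copyfile(src, dst)
    return True
```

Because the destination's timestamp is left alone when nothing changed, a timestamp-based check such as the one GHC performs before linking would then see no newer library and could skip the relink.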
Related: #1121 |
@simonmar You are right, I tried that, ghc itself avoids it and that is ~10 times faster than actual linking. Unfortunately, I will have a look at that. |
Cabal also calls ghc twice for linking: Once with |
Done! The change is quite small and works incredibly well for me. For my project with 40 executables, my no-op The link time is now dominated by the two ghc invocations I mentioned above. So the remaining points are:
And independent of my change:
|
For 4: With help of merijn from |
This is great, but it needs to be checked individually on each platform sadly. I remember the pain I went through to work out which ar flags we have to use on each platform. It's not ok to just assume. (You've mentioned BSD, there's also Solaris, whatever older binutils mingw uses on Windows etc) So my suggestion is that we set flags to include -D only for the specific platforms we can confirm it works on, and leave the |
I'm working on an alternative solution: The |
As for the Is there really a measurable performance difference there? I'd be surprised if there's that much extra startup overhead. Oh, hmm, perhaps if we're still calling ghc in |
Wait a moment! I already have code to generate Ar format files, and yes that bit is easy. The problem is you still need to use the "real" ar program because it's the symbol index that is the tricky bit, and that is sadly different on each platform, you don't want to write code to generate that. So you'd still need to run ranlib (which is basically ar) to generate the index, and it may go and insert timestamps again... |
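For readers unfamiliar with where those timestamps live: in the common ar layout, each archive member is preceded by a 60-byte header, and bytes 16 to 28 of that header hold the modification time as a decimal string. The sketch below (Python, with the hypothetical name `ar_member_mtimes`) only lists those fields. It does not interpret the symbol index or extended-name tables, which are the platform-specific parts mentioned above.

```python
import io
from typing import BinaryIO, Iterator, Tuple

AR_MAGIC = b"!<arch>\n"   # global header at the start of every ar archive
HEADER_SIZE = 60          # fixed size of each member header


def ar_member_mtimes(f: BinaryIO) -> Iterator[Tuple[str, int]]:
    """Yield (raw member name, mtime) for each member of an ar archive.

    Illustrative helper only. Names are returned as stored in the header,
    so GNU-style trailing slashes or BSD "#1/<len>" names are not decoded.
    """
    if f.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = f.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            return  # end of archive
        if header[58:60] != b"`\n":
            raise ValueError("bad member header terminator")
        name = header[0:16].decode("ascii").rstrip()
        mtime_field = header[16:28].decode("ascii").strip()
        mtime = int(mtime_field) if mtime_field else 0
        size = int(header[48:58].decode("ascii").strip())
        # Member data is padded to an even offset; skip data plus padding.
        f.seek(size + (size % 2), io.SEEK_CUR)
        yield name, mtime
```

If two builds of the same objects produce archives whose member headers differ only in these fields, a byte-for-byte comparison of the library files will report a change even though nothing relevant to linking did.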
I wouldn't call it "overhead" - in a no-op build, the two ghc invocations are the only things that take time, so using only one saves me half the time. Your argument for why this is done makes sense. Although I think the two links are done regardless of that - I don't think my executables generate any
I'm not suggesting writing the |
I'd expect to see a performance difference. Re-linking test suite executables that haven't changed takes a considerable amount of time when building e.g. containers (which has many test suites, one for each data structure). |
Any update on this? We've been running it for a few weeks now and it works very well. |
@dcoutts could you please have a look? |
This change is very non-invasive, so it would be great if it could make it into 1.18. |
@tibbe Which platforms do we support? |
Note: Re-linking still happens every time when |
When I build my project repeatedly, I always get
even though no code has changed, and it takes significant time.
Is it possible to detect that no input files have changed so that re-linking can be skipped?
This would speed up no-op builds a lot (e.g. factor 3 for 3 executables, and it scales up) and be very useful for integration into editors where a quick rebuild response is helpful.
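The kind of check being asked for can be sketched as a make-style timestamp comparison: relink only if the executable is missing or some input is newer than it. This is a generic illustration in Python with a hypothetical function name, not a description of what Cabal or GHC actually implement.

```python
import os
from typing import Iterable


def needs_relink(output: str, inputs: Iterable[str]) -> bool:
    """Return True if output is missing or older than any of its inputs.

    Illustrative only: inputs would be the object files and libraries
    the executable is linked from.
    """
    if not os.path.exists(output):
        return True
    out_mtime = os.stat(output).st_mtime_ns
    # A single input with a newer timestamp is enough to force a relink.
    return any(os.stat(p).st_mtime_ns > out_mtime for p in inputs)
```

A check like this is only as good as the timestamps it reads: if a build step rewrites an unchanged library on every run, that library always looks newer and the relink is triggered anyway.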