Skip to content

Last step in cabal-nix sometimes takes a loooooooooong time #3019

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
edsko opened this issue Jan 2, 2016 · 12 comments
Closed

Last step in cabal-nix sometimes takes a loooooooooong time #3019

edsko opened this issue Jan 2, 2016 · 12 comments

Comments

@edsko
Copy link
Contributor

edsko commented Jan 2, 2016

Building hackage-server:

cabal-nix new-build -v3
...
link: linkables are ...
LinkableM (2016-01-02 08:32:56 UTC) main:Main
   [DotO /Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests-tmp/Main.o]
LinkableM (2016-01-02 08:32:55 UTC) main:Distribution.Server.Packages.Unpack
   [DotO /Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests-tmp/Distribution/Server/Packages/Unpack.o]
LinkableM (2016-01-02 08:32:56 UTC) main:Distribution.Server.Packages.UnpackTest
   [DotO /Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests-tmp/Distribution/Server/Packages/UnpackTest.o]
LinkableM (2015-12-30 15:01:50 UTC) main:Distribution.Server.Util.Parse
   [DotO /Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests-tmp/Distribution/Server/Util/Parse.o]
LinkableM (2015-12-30 15:01:49 UTC) main:Distribution.Server.Util.GZip
   [DotO /Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests-tmp/Distribution/Server/Util/GZip.o]
/Users/e/wt/projects/tuf/hackage-server/dist-newstyle/build/hackage-server-0.5.1/build/PackageTests/PackageTests is up to date, linking not required.
*** Deleting temp files:
Deleting: 
*** Deleting temp dirs:
Deleting: /var/folders/94/vkm1c1l17cddyvzk2252hnp40000gs/T/ghc81573_0

and then.. nothing.. The cabal-nix process is using some 25% CPU at this point, and I have no idea what it's doing. That temp directory mentioned on the last line is already gone, so it's not just deleting. What it is doing is anyone's guess.. I CTRL-C'ed it after 12 minutes. Retried, same thing. Wiped dist-newbuild, same thing. I've seen long times before but never something like this. No idea what's happening. Will reboot machine and see if that resolved anything. (Incidentally, I thought maybe it was busy waiting for the hackage-server executable, but the server was not running at this point.)

@edsko
Copy link
Contributor Author

edsko commented Jan 2, 2016

.. rebooted machine, same problem. Started with a completely blank ~/.cabal directory, same problem. Started with blank ~/.cabal directory, blank dist-newstyle directory, and blank ~/.ghc directory, still same problem. Now I don't know what to try next anymore.. What is going on??

@ezyang
Copy link
Contributor

ezyang commented Jan 2, 2016

What commit ID? (I hope I didn't push a regression!)

@edsko
Copy link
Contributor Author

edsko commented Jan 2, 2016

f8d8f85 . I haven't updated my nix branch in a while so I doubt it's your doing :)

@ezyang
Copy link
Contributor

ezyang commented Jan 2, 2016

If you can, try strace'ing the process and see what it's churning on.

@edsko
Copy link
Contributor Author

edsko commented Jan 2, 2016

Hmmm, that was a good suggestion. dtruss (on OSX here) provided an important clue:

stat64("/Users/e/wt/projects/tuf/hackage-server/state/blobs/0b/0ba03d1a00d06795579f1d6eb5ace1f7\0", 0x109341F50, 0x7FF0)         = 0 0
open("/Users/e/wt/projects/tuf/hackage-server/state/blobs/0b/0ba03d1a00d06795579f1d6eb5ace1f7\0", 0x20004, 0x1B6)        = 12 0
fstat64(0xC, 0x1093403E0, 0x1B6)         = 0 0
ioctl(0xC, 0x4004667A, 0x70000030BD8C)       = -1 Err#25
ioctl(0xC, 0x40487413, 0x70000030BD90)       = -1 Err#25
read(0xC, "\0", 0x7FF0)      = 273 0
read(0xC, "lass='hs-varid'>swap</span><span class='hs-layout'>)</span>\n<a name=\"line-26\"></a>\n<a name=\"line-27\"></a><span class='hs-keyword'>import</span> <span class='hs-conid'>Language</span><span class='hs-varop'>.</span><span class='hs-conid'>Qux</span><span clas", 0x7FF0)      = 0 0
close(0xC)       = 0 0

and so on and so forth for lots of files. I have no idea what's going on, but if it's somehow trying to read all the blobs in the server state/ directory for whatever reason then that certainly explains why it's taking so long (and why it didn't before -- I just moved to an actual Hackage snapshot).

So that leaves the question, why is it even looking at those files? My project file doesn't have a single wildcard:

packages: hackage-server.cabal

constraint: hackage-security == 0.1.0.0
constraint: hackage-security-HTTP == 0.1.0.0

package hackage-server
  tests: True

package text-icu
  extra-include-dirs: /Users/e/homebrew/Cellar/icu4c/54.1/include
  extra-lib-dirs: /Users/e/homebrew/Cellar/icu4c/54.1/lib

@edsko
Copy link
Contributor Author

edsko commented Jan 2, 2016

I'm also somewhat confused by that dtruss output; as far as I can tell, that particular file (.../blobs...) does not contain that data (lass='hs-valid'>...)... But perhaps I'm reading the output wrong.

@dcoutts
Copy link
Contributor

dcoutts commented Feb 5, 2016

The issue here is we need to track the files that are part of the package as specified by the .cabal file, not just look at all files in the package dir (as there can be lots of them like in this example and we don't want to hash them all).

dcoutts added a commit that referenced this issue Feb 9, 2016
We know the previous mtime and content hash of lots of files, so reuse
these content hashes when the file mtime is the same.

Should fix issue #3019.
@dcoutts
Copy link
Contributor

dcoutts commented Feb 9, 2016

Actually there was another issue. We were not taking advantage of the fact that we know the previous content hashes when we're updating the file monitor state and so were re-hashing every time. Of course we only need to rehash if the mtime has changed. So now we do that.

But for this specific example we do also need to be properly selective in which files we monitor. So I'll leave this open for that aspect.

dcoutts added a commit to dcoutts/cabal that referenced this issue Feb 10, 2016
We know the previous mtime and content hash for all the files in the
previous file set, so reuse these content hashes when we determine they
are still valid (which is when the mtime is the same). This saves
re-hashing files that have not changed.

No test for this as such since it's a performance rather than
correctness thing, but checked ad-hoc that the cache does get a perfect
hit rate.

Fixes one of two parts of issue haskell#3019.
dcoutts added a commit that referenced this issue Feb 19, 2016
We know the previous mtime and content hash for all the files in the
previous file set, so reuse these content hashes when we determine they
are still valid (which is when the mtime is the same). This saves
re-hashing files that have not changed.

No test for this as such since it's a performance rather than
correctness thing, but checked ad-hoc that the cache does get a perfect
hit rate.

Fixes one of two parts of issue #3019.
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
We know the previous mtime and content hash for all the files in the
previous file set, so reuse these content hashes when we determine they
are still valid (which is when the mtime is the same). This saves
re-hashing files that have not changed.

No test for this as such since it's a performance rather than
correctness thing, but checked ad-hoc that the cache does get a perfect
hit rate.

Fixes one of two parts of issue haskell#3019.
@ezyang
Copy link
Contributor

ezyang commented Mar 30, 2016

I can still confirm that this is a problem for me on latest nix-local-build branch. strace'ing Cabal shows it is statting things it has no business statting.

@ezyang
Copy link
Contributor

ezyang commented Apr 1, 2016

There is an extra problem here, which is that if I'm edit anything in my package directory, nix-local-build will then decide that the package needs to be rebuilt, even if it's completely irrelevant to compilation. For example, I was dumping out build profiles, and Cabal kept deciding that packages needed to be rebuilt because this file had changed.

@ezyang
Copy link
Contributor

ezyang commented Apr 1, 2016

Relevant code in ProjectBuilding (took me a while to find!!)

          --TODO: [required eventually] temporary hack. We need to look at the package description
          -- and work out the exact file monitors to use
          allSrcFiles <- filter (not . ("dist-newstyle" `isPrefixOf`))
                     <$> getDirectoryContentsRecursive srcdir

ezyang added a commit to ezyang/cabal that referenced this issue Apr 1, 2016
@ezyang
Copy link
Contributor

ezyang commented Apr 1, 2016

@edsko, I have a provisional fix for this problem, at least for non-Custom Setup scripts. I did this by piggy backing off of sdist: any files which would get sdisted, we'll track; any files which aren't, we don't track.

There is one problem with this (as well as the previous code): if a user ADDs a new file which gets picked up on a search, the rebuilder won't notice. I don't think we can solve this problem without (1) duplicating all of the sdist code into cabal-install, rewritten to use the rebuild probing functions, or (2) dropping the rebuilder functionality into Cabal so that SrcDist can be written in the correct way.

ezyang added a commit to ezyang/cabal that referenced this issue Apr 2, 2016
As a refactoring, this moves allPackageSourceFiles from
Distribution.Client.Sandbox.Timestamp to
Distribution.Client.SrcDist, makes it thread safe, and has
it return files relative to the source directory.

Signed-off-by: Edward Z. Yang <[email protected]>
ezyang added a commit to ezyang/cabal that referenced this issue Apr 2, 2016
As a refactoring, this moves allPackageSourceFiles from
Distribution.Client.Sandbox.Timestamp to
Distribution.Client.SrcDist, makes it thread safe, and has
it return files relative to the source directory.

Signed-off-by: Edward Z. Yang <[email protected]>
ezyang added a commit to ezyang/cabal that referenced this issue Apr 5, 2016
As a refactoring, this moves allPackageSourceFiles from
Distribution.Client.Sandbox.Timestamp to
Distribution.Client.SrcDist, makes it thread safe, and has
it return files relative to the source directory.

Signed-off-by: Edward Z. Yang <[email protected]>
dcoutts pushed a commit to dcoutts/cabal that referenced this issue Apr 8, 2016
As a refactoring, this moves allPackageSourceFiles from
Distribution.Client.Sandbox.Timestamp to
Distribution.Client.SrcDist, makes it thread safe, and has
it return files relative to the source directory.

Signed-off-by: Edward Z. Yang <[email protected]>
dcoutts pushed a commit to dcoutts/cabal that referenced this issue Apr 10, 2016
As a refactoring, this moves allPackageSourceFiles from
Distribution.Client.Sandbox.Timestamp to
Distribution.Client.SrcDist, makes it thread safe, and has
it return files relative to the source directory.

Signed-off-by: Edward Z. Yang <[email protected]>
@ezyang ezyang closed this as completed in 1f66339 Apr 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants