Add incremental builds by increasing isolation #928

Open
dagood opened this issue Dec 12, 2018 · 2 comments
Labels
area-build Improvements in source-build's own build process

Comments

@dagood
Member

dagood commented Dec 12, 2018

I think with a few infra changes (borrowing some ideas from the Microsoft build ProdCon infra), we can enable incremental builds in source-build:

1. Use a unique blob feed output directory for each repo build.

This allows us to understand which artifacts came from which repos and easily delete (or ignore) stale assets.

2. Give each repo build its required inputs only.

Do this by either merging the upstream blob feeds into an input blob feed (like ProdCon) or pointing repos to multiple blob feeds.

Providing only the required inputs improves reliability: if repos can only use the artifacts they explicitly depend on via RepositoryReference, we will likely detect and fix any unknown dependencies and end up with a more correct graph.
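As a rough illustration of item 2, the sketch below maps a repo to the per-repo blob feed directories it would be allowed to read, based only on its declared references. The graph contents, the repo names, the `input_feeds` helper, and the feed directory layout are all hypothetical; in source-build the references would come from RepositoryReference, and whether transitive upstreams should also be included is left open here.

```python
from pathlib import Path

# Hypothetical dependency graph: repo -> repos it explicitly references.
# Assumption: this stands in for the RepositoryReference data.
REPO_REFERENCES = {
    "repo-a": [],
    "repo-b": ["repo-a"],
    "repo-c": ["repo-a", "repo-b"],
}


def input_feeds(repo, references, feed_root):
    """Return the blob feed directories for the repos that `repo` references.

    Only explicitly declared references are used, so an undeclared
    dependency fails to restore instead of being satisfied by accident.
    """
    return [Path(feed_root) / name for name in references.get(repo, [])]
```

A repo with no references gets an empty list, which makes any hidden dependency show up as a restore failure rather than a silent success.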

3. Use an isolated, fresh package cache directory for each repo build.

This lets us clear the cache to consume new packages from upstream repos. Currently, we can't simply delete it because the source-build infra also shares the single package cache.

4. After each build, make cached packages available to downstream builds.

Without this, each submodule would download a fresh copy of every package even if an earlier repo already downloaded it, slowing the build down considerably and increasing the impact of network flakiness.

5. After harvesting a package cache directory, delete it.

The duplicate packages in each isolated cache would take a lot of space, and our build's already too large for some CI infra now.

For diagnosability, when we delete a package cache, we should note which packages were in it. If any changes were made to a specific package in-place (e.g. by BuildTools init), we should leave it alone in case it's important to understand a problem. (There are likely very few instances of this.)
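Items 4 and 5 could be combined into a single harvest step, sketched below. This assumes each top-level directory in the isolated cache is one package and that a plain text manifest is enough for diagnosability; the `harvest_cache` name, the manifest format, and the directory layout are illustrative, not existing source-build infra. The sketch does not try to detect packages that were modified in place, which the proposal says should be left alone.

```python
import shutil
from pathlib import Path


def harvest_cache(isolated_cache, shared_cache, manifest_path):
    """Copy packages to the shared cache, record them, then delete the isolated cache."""
    isolated_cache = Path(isolated_cache)
    shared_cache = Path(shared_cache)
    shared_cache.mkdir(parents=True, exist_ok=True)

    harvested = []
    for package_dir in sorted(p for p in isolated_cache.iterdir() if p.is_dir()):
        target = shared_cache / package_dir.name
        # Skip packages an earlier repo build already contributed.
        if not target.exists():
            shutil.copytree(package_dir, target)
        harvested.append(package_dir.name)

    # Note which packages were in the cache before it is removed.
    Path(manifest_path).write_text("\n".join(harvested) + "\n")
    shutil.rmtree(isolated_cache)
    return harvested
```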

6. Add a semaphore for each repo build completion.

This way, if a build fails, you can fix it and rerun the default build command to pick up where it left off.

When a repo starts building, all downstream repo semaphores should be removed so the new build's dependencies flow through to the end without question. This will help when making changes to e.g. CoreFX and expecting them to show up in the Runtime without fully cleaning the repo first.
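The semaphore behavior described in item 6 might look roughly like the following. The marker file naming (`<repo>.complete`), the `build_repo` wrapper, and the `run_build` callback are assumptions made for illustration; the actual mechanism would presumably live in the MSBuild infra rather than in a script like this.

```python
from pathlib import Path


def semaphore_path(semaphore_dir, repo):
    """Location of the hypothetical completion marker for a repo."""
    return Path(semaphore_dir) / f"{repo}.complete"


def downstream_repos(repo, references):
    """Return every repo that depends on `repo`, directly or transitively."""
    result = set()
    stack = [repo]
    while stack:
        current = stack.pop()
        for candidate, deps in references.items():
            if current in deps and candidate not in result:
                result.add(candidate)
                stack.append(candidate)
    return result


def build_repo(repo, references, semaphore_dir, run_build):
    """Build `repo` unless its semaphore exists; return True if it was built."""
    marker = semaphore_path(semaphore_dir, repo)
    if marker.exists():
        return False  # completed in an earlier run, so skip it
    # Starting a build invalidates everything downstream of this repo.
    for downstream in downstream_repos(repo, references):
        semaphore_path(semaphore_dir, downstream).unlink(missing_ok=True)
    run_build(repo)  # expected to raise on failure, leaving no semaphore behind
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

With this shape, forcing a rebuild of one repo means deleting its marker file; everything downstream is then rebuilt on the next run because its markers are cleared when the upstream build starts.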

7. Stop misusing /tmp as general staging.

prep.sh extracts PackageVersions.props to /tmp explicitly. This causes problems when re-running the build or when running it as a different user than the one who built it the first time. We probably have other such abuses of /tmp and should use mktemp() or similar instead.
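To show why a unique staging directory avoids the problem in item 7, here is a minimal sketch using Python's `tempfile.mkdtemp`; in a shell script such as prep.sh the analogous tool would be `mktemp -d`. The `stage_package_versions` helper and the `source-build-` prefix are made up for this example, and the sketch copies a file rather than extracting it from an archive.

```python
import shutil
import tempfile
from pathlib import Path


def stage_package_versions(source_props):
    """Copy a props file into a fresh per-run staging directory and return the new path."""
    # mkdtemp creates a uniquely named directory readable only by the current
    # user, so reruns and other users never collide on a fixed /tmp path.
    staging_dir = Path(tempfile.mkdtemp(prefix="source-build-"))
    staged = staging_dir / Path(source_props).name
    shutil.copyfile(source_props, staged)
    return staged
```

The caller is responsible for removing the staging directory when the build no longer needs it.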


Non-goals

I don't think we should try to detect submodule changes and automatically rebuild. A false positive would potentially rebuild a lot and be very frustrating. Deleting the semaphore doesn't seem like a terrible experience. We can provide utility props to invalidate builds if that seems easier than deleting semaphores.

Future benefits

This would help a lot with the source-build CI legs that we plan to add to each official repo: the runs would go much faster by downloading a cache of the upstream dependencies rather than building all upstreams every single time. Blob feed output isolation would let upstreams zip up their source-built outputs and put them somewhere without any fuss, transferring them through BAR/Maestro++/Darc.

@MichaelSimons
Member

@dseefeld - What is the status of this issue? Why is it in the ArPow epic?

@MichaelSimons
Member

[Triage] Item 6 (the semaphore work) is the only item completed. The rest of the items become much simpler to implement once ArPow is in place.

Projects
Status: Backlog
4 participants