I think with a few infra changes (borrowing some ideas from the Microsoft build ProdCon infra), we can enable incremental builds in source-build:
1. Use a unique blob feed output directory for each repo build.
This allows us to understand which artifacts came from which repos and easily delete (or ignore) stale assets.
2. Give each repo build its required inputs only.
Do this by either merging the upstream blob feeds into an input blob feed (like ProdCon) or pointing repos to multiple blob feeds.
Providing only the required inputs improves reliability: if repos can only use the artifacts they explicitly depend on via RepositoryReference, we will likely detect and fix any unknown dependencies and end up with a more correct graph.
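To make the "merge upstream feeds into one input feed" option concrete, here is a minimal Python sketch. It assumes each repo's blob feed output lives at `<feeds_root>/<repo>` and that the `RepositoryReference` graph has already been read into a plain dict; both the layout and the `merge_input_feed` name are hypothetical, not something the current infra provides.

```python
import shutil
from pathlib import Path


def merge_input_feed(repo, references, feeds_root, input_dir):
    """Copy only the declared upstream feeds into a per-repo input feed.

    references: dict mapping a repo name to the upstream repo names it
    declares (a stand-in for RepositoryReference; hypothetical shape).
    feeds_root: directory holding one output feed directory per repo
    (assumed layout: <feeds_root>/<repo>).
    """
    input_dir = Path(input_dir)
    input_dir.mkdir(parents=True, exist_ok=True)
    for upstream in references.get(repo, []):
        source = Path(feeds_root) / upstream
        if source.is_dir():
            # Undeclared upstreams are never copied, so an unknown
            # dependency fails to restore instead of silently working.
            shutil.copytree(source, input_dir, dirs_exist_ok=True)
    return input_dir
```

Because a feed that isn't declared never reaches the input directory, a repo that quietly relies on it would break at restore time, which is the detection described above.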
3. Use an isolated, fresh package cache directory for each repo build.
This lets us clear the cache to consume new packages from upstream repos. Currently, we can't simply delete it because the source-build infra also shares the single package cache.
4. After each build, make cached packages available to downstream builds.
Without this, each submodule would download a fresh copy of every package even if an earlier repo already downloaded it, slowing the build down considerably and increasing the impact of network flakiness.
5. After harvesting a package cache directory, delete it.
The duplicate packages in each isolated cache would take a lot of space, and our build's already too large for some CI infra now.
For diagnosability, when we delete a package cache, we should note which packages were in it. If any changes were made to a specific package in-place (e.g. by BuildTools init), we should leave it alone in case it's important to understand a problem. (There are likely very few instances of this.)
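Steps 3 to 5 could look roughly like the Python sketch below. It assumes the isolated cache uses a `<package id>/<version>/` directory layout, that the caller already knows which packages were modified in place (detecting that is out of scope here), and that a plain text manifest is enough for diagnosability. `harvest_cache` and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def harvest_cache(isolated, shared, manifest, keep=frozenset()):
    """Copy packages from an isolated cache to a shared one, then clean up.

    keep: "<id>/<version>" names of packages modified in place; these are
    left in the isolated cache so a problem can still be investigated.
    Returns the list of package names found, which is also written to
    `manifest`.
    """
    isolated, shared = Path(isolated), Path(shared)
    harvested = []
    # sorted() materializes the listing before anything is deleted.
    for version_dir in sorted(isolated.glob("*/*")):
        if not version_dir.is_dir():
            continue
        name = f"{version_dir.parent.name}/{version_dir.name}"
        target = shared / version_dir.parent.name / version_dir.name
        if not target.exists():
            shutil.copytree(version_dir, target)
        harvested.append(name)
        if name not in keep:
            shutil.rmtree(version_dir)
    # Drop package id directories that are now empty.
    for package_dir in list(isolated.iterdir()):
        if package_dir.is_dir() and not any(package_dir.iterdir()):
            package_dir.rmdir()
    Path(manifest).write_text("\n".join(harvested) + "\n")
    return harvested
```

This version still harvests packages listed in `keep`; whether a package modified in place should flow downstream at all is an open question, and skipping the copy for those would be a small change.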
6. Add a semaphore for each repo build completion.
This way, if a build fails, you can fix it and rerun the default build command to pick up where it left off.
When a repo starts building, all downstream repo semaphores should be removed so the new build's dependencies flow through to the end without question. This will help when making changes to e.g. CoreFX and expecting them to show up in the Runtime without fully cleaning the repo first.
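A rough Python sketch of the semaphore behavior, assuming one marker file per repo and a dict that maps each repo to its direct upstreams. The `.complete` file name, the `build` wrapper, and the `run` callback are all hypothetical; the real build would be driven from MSBuild.

```python
from pathlib import Path


def downstream_of(repo, upstreams):
    """Return every repo that depends on `repo`, directly or transitively."""
    found, stack = set(), [repo]
    while stack:
        current = stack.pop()
        for candidate, deps in upstreams.items():
            if current in deps and candidate not in found:
                found.add(candidate)
                stack.append(candidate)
    return found


def semaphore(sem_dir, repo):
    """Path of the completion marker for a repo (hypothetical naming)."""
    return Path(sem_dir) / f"{repo}.complete"


def build(repo, upstreams, sem_dir, run):
    """Build `repo` unless its semaphore exists. Returns True if it ran."""
    if semaphore(sem_dir, repo).exists():
        return False
    # Starting a build invalidates everything downstream of it.
    for dependent in downstream_of(repo, upstreams):
        semaphore(sem_dir, dependent).unlink(missing_ok=True)
    run(repo)  # expected to raise on failure, so no semaphore is written
    Path(sem_dir).mkdir(parents=True, exist_ok=True)
    semaphore(sem_dir, repo).touch()
    return True
```

Calling `build` for each repo in dependency order gives the "pick up where it left off" behavior: repos with a semaphore are skipped, and deleting one repo's semaphore by hand forces that repo and everything downstream of it to rebuild on the next run.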
7. We misuse /tmp as general staging.
prep.sh extracts PackageVersions.props to /tmp explicitly. This creates a problem when re-running the build or when running as a different user than the one who built it the first time. We probably have other such abuses of /tmp and should use mktemp() or similar instead.
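As an illustration of the "mktemp() or similar" idea, here is a small Python sketch that stages a file in a fresh, uniquely named directory instead of a fixed path under /tmp. It isn't how prep.sh works today; `stage_file` and the `source-build-` prefix are made up for the example, and in the shell script itself the equivalent would be `mktemp -d`.

```python
import shutil
import tempfile
from pathlib import Path


def stage_file(source, prefix="source-build-"):
    """Copy `source` into a new private temp directory and return the copy.

    tempfile.mkdtemp creates a uniquely named directory readable and
    writable only by the current user, so reruns and builds by another
    user never collide on a shared fixed path.
    """
    staging = Path(tempfile.mkdtemp(prefix=prefix))
    return Path(shutil.copy2(source, staging))
```

The caller owns the returned directory and should delete it when done, for example with `shutil.rmtree(path.parent)`.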
Non-goals
I don't think we should try to detect submodule changes and automatically rebuild. A false positive would potentially rebuild a lot and be very frustrating. Deleting the semaphore doesn't seem like a terrible experience. We can provide utility props to invalidate builds if that seems easier than deleting semaphores.
Future benefits
This would help a lot with the source-build CI legs that we plan to add to each official repo: the runs would go much faster by downloading a cache of the upstream dependencies rather than building all upstreams every single time. Blob feed output isolation would let upstreams zip up their source-built outputs and put them somewhere without any fuss, transferring them through BAR/Maestro++/Darc.