Implement incremental prebuilds #4167

Merged
merged 4 commits into from
May 21, 2021

Conversation

jankeromnes
Contributor

@jankeromnes jankeromnes commented May 7, 2021

Design doc

https://www.notion.so/gitpod/Incremental-Prebuilds-49c0a840e54348acba05677bdc450841

How to test

  • Use an io-dev preview (e.g. https://jx-incremental-prebuilds.staging.gitpod-io-dev.com/workspaces/)

  • Use the #prebuild/<repo> URL prefix to trigger a full prebuild for some commit on some repository

  • Use #incremental-prebuild/<repo> to trigger an incremental prebuild for some (different) commit (this will automatically select a suitable parent prebuild, or fall back to a full prebuild)

  • Verify that both full and incremental prebuilds work as expected (e.g. the project is prebuilt & starts as usual)

  • Testing with https://github.com/jankeromnes/gitpod-staging-prebuilds may be helpful (the .gitpod.yml logs every task run, along with timestamp and workspace ID, and prints a summary -- e.g. 2 different workspace IDs mean "full prebuild", 3 IDs mean "incremental prebuild")

  • Check incremental prebuilds in the DB (e.g. parents, commits, durations, states) like so:

mysql> select p.cloneUrl, substr(p.id,1,8) id, substr(w.basedOnPrebuildId,1,8) parentId, p.creationTime, substr(p.commit,1,8) commit, p.state, time_to_sec(timediff(cast(i.stoppedTime as datetime), cast(i.creationTime as datetime))) seconds
  from d_b_prebuilt_workspace p
  left join d_b_workspace w on w.id = p.buildWorkspaceId
  left join d_b_workspace_instance i on i.workspaceId = w.id
  where w.basedOnPrebuildId is not null or p.id in (select basedOnPrebuildId from d_b_workspace where id in (select buildWorkspaceId from d_b_prebuilt_workspace))
  order by p.cloneUrl, p.creationTime desc;
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+
| cloneUrl                                                    | id       | parentId | creationTime               | commit   | state     | seconds |
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+
| https://github.com/gitpod-io/gitpod.git                     | 676df827 | c9b671bc | 2021-05-11 14:48:54.034660 | 066200cd | building  |    NULL |
| https://github.com/gitpod-io/gitpod.git                     | c9b671bc | NULL     | 2021-05-11 14:43:29.295607 | a0a5017b | available |     450 |
| https://github.com/gitpod-io/gitpod.git                     | 0d45a1a1 | cb7566b8 | 2021-05-11 14:40:39.234595 | 79db9d8b | available |     213 |
| https://github.com/gitpod-io/gitpod.git                     | b4a03621 | cb7566b8 | 2021-05-11 14:28:54.040767 | 3e3a6cb8 | available |     262 |
| https://github.com/gitpod-io/gitpod.git                     | 142ddbf2 | cb7566b8 | 2021-05-11 14:28:16.082448 | a0a5017b | available |     234 |
| https://github.com/gitpod-io/gitpod.git                     | 62a62628 | cb7566b8 | 2021-05-11 14:20:16.920090 | 70097b6d | available |     248 |
| https://github.com/gitpod-io/gitpod.git                     | cb7566b8 | NULL     | 2021-05-11 13:46:32.152943 | 2b2702f3 | available |     612 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 92979128 | 088a5082 | 2021-05-11 13:59:07.449600 | 243426ad | available |      24 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 964730fb | 088a5082 | 2021-05-11 13:56:44.305709 | 6f2aff3e | available |      32 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 088a5082 | NULL     | 2021-05-11 13:49:50.546575 | 98b4bfb6 | available |      46 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 2e015621 | 8eef45ed | 2021-05-11 14:25:43.513453 | af8249ab | available |     429 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 76e903e8 | 8eef45ed | 2021-05-11 14:15:08.733923 | c7e05e5a | available |     393 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 8eef45ed | NULL     | 2021-05-11 13:51:22.791788 | 7972bd89 | available |     545 |
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+
13 rows in set, 25 warnings (0.01 sec)
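
To make the timings in the table above easier to compare, here is a small Go sketch that averages the `seconds` column per repository, split into full prebuilds (`parentId` is NULL) and incremental ones. The `row` type, the shortened repository names, and the use of `-1` for a NULL duration are illustration-only assumptions; the numbers are transcribed from the query output.

```go
package main

import "fmt"

// row keeps only the columns needed here. parentID is empty for a full
// prebuild (parentId NULL); seconds is -1 when the duration is NULL.
type row struct {
	repo     string
	parentID string
	seconds  int
}

// tableRows is transcribed from the query output above.
var tableRows = []row{
	{"gitpod", "c9b671bc", -1}, // still building, no duration yet
	{"gitpod", "", 450},
	{"gitpod", "cb7566b8", 213},
	{"gitpod", "cb7566b8", 262},
	{"gitpod", "cb7566b8", 234},
	{"gitpod", "cb7566b8", 248},
	{"gitpod", "", 612},
	{"gitpod-staging-prebuilds", "088a5082", 24},
	{"gitpod-staging-prebuilds", "088a5082", 32},
	{"gitpod-staging-prebuilds", "", 46},
	{"gitlab", "8eef45ed", 429},
	{"gitlab", "8eef45ed", 393},
	{"gitlab", "", 545},
}

// meanSeconds averages the duration of finished prebuilds for one repo,
// separately for full and incremental prebuilds.
func meanSeconds(rows []row, repo string) (full, incremental float64) {
	var fullSum, fullN, incSum, incN int
	for _, r := range rows {
		if r.repo != repo || r.seconds < 0 {
			continue // other repo, or prebuild not finished
		}
		if r.parentID == "" {
			fullSum += r.seconds
			fullN++
		} else {
			incSum += r.seconds
			incN++
		}
	}
	if fullN > 0 {
		full = float64(fullSum) / float64(fullN)
	}
	if incN > 0 {
		incremental = float64(incSum) / float64(incN)
	}
	return full, incremental
}

func main() {
	for _, repo := range []string{"gitpod", "gitpod-staging-prebuilds", "gitlab"} {
		full, inc := meanSeconds(tableRows, repo)
		fmt.Printf("%-26s full=%.2fs incremental=%.2fs\n", repo, full, inc)
	}
}
```

For these rows the averages are 531 s (full) vs 239.25 s (incremental) for gitpod-io/gitpod, 46 s vs 28 s for the staging test repository, and 545 s vs 411 s for GitLab. This is a small sample from a single dev environment, so it is indicative at best.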

@jankeromnes
Contributor Author

jankeromnes commented May 7, 2021

/werft run

👍 started the job as gitpod-build-jx-incremental-prebuilds.17

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch 4 times, most recently from 4973977 to 32fb208 Compare May 10, 2021 16:19
@jankeromnes
Contributor Author

Note: The #prebuild/<repo> manual prefix always triggers a non-incremental (i.e. full) prebuild.

So, in order to facilitate testing, I'm temporarily introducing a #incremental-prebuild/<repo> manual prefix.
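
The actual rule for picking a "suitable parent prebuild" lives in the design doc and is not spelled out in this thread. Purely as an illustration, the sketch below assumes it means "the nearest ancestor commit that already has an available prebuild". The `Prebuild` type, `selectParent`, `plan`, and all IDs are hypothetical names, not the server's real API.

```go
package main

import "fmt"

// Prebuild is a hypothetical, trimmed-down view of a prebuild record.
type Prebuild struct {
	ID     string
	Commit string
	State  string // e.g. "available" or "building"
}

// selectParent walks the ancestor commits (nearest first) and returns the
// first one that has an available prebuild. ok is false when none matches.
func selectParent(ancestors []string, prebuilds []Prebuild) (parent Prebuild, ok bool) {
	byCommit := make(map[string]Prebuild)
	for _, p := range prebuilds {
		if p.State != "available" {
			continue // unfinished prebuilds cannot serve as a base
		}
		if _, seen := byCommit[p.Commit]; !seen {
			byCommit[p.Commit] = p
		}
	}
	for _, commit := range ancestors {
		if p, found := byCommit[commit]; found {
			return p, true
		}
	}
	return Prebuild{}, false
}

// plan mirrors the two manual prefixes: "prebuild" is always full, while
// "incremental-prebuild" uses a parent when one exists and otherwise falls
// back to a full prebuild.
func plan(prefix string, ancestors []string, prebuilds []Prebuild) string {
	if prefix == "incremental-prebuild" {
		if p, ok := selectParent(ancestors, prebuilds); ok {
			return "incremental from " + p.ID
		}
	}
	return "full"
}

func main() {
	// Hypothetical history: c2 is the direct parent of the target commit.
	ancestors := []string{"c2", "c1"}
	prebuilds := []Prebuild{
		{ID: "p-building", Commit: "c2", State: "building"},
		{ID: "p-old", Commit: "c1", State: "available"},
	}
	fmt.Println(plan("prebuild", ancestors, prebuilds))
	fmt.Println(plan("incremental-prebuild", ancestors, prebuilds))
	fmt.Println(plan("incremental-prebuild", ancestors, nil))
}
```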

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch 2 times, most recently from 6220005 to 6996cec Compare May 11, 2021 12:58
@jankeromnes
Contributor Author

jankeromnes commented May 11, 2021

Works generally well (both full and incremental prebuilds), but my io-dev environment seems to have produced one bad prebuild (676df827, the latest one for the Gitpod repo, an incremental prebuild based on c9b671bc):

(view incremental prebuilds table)
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+
| cloneUrl                                                    | id       | parentId | creationTime               | commit   | state     | seconds |
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+
| https://github.com/gitpod-io/gitpod.git                     | 676df827 | c9b671bc | 2021-05-11 14:52:25.171952 | 066200cd | available |     234 |
| https://github.com/gitpod-io/gitpod.git                     | c9b671bc | NULL     | 2021-05-11 14:43:29.295607 | a0a5017b | available |     450 |
| https://github.com/gitpod-io/gitpod.git                     | 0d45a1a1 | cb7566b8 | 2021-05-11 14:40:39.234595 | 79db9d8b | available |     213 |
| https://github.com/gitpod-io/gitpod.git                     | b4a03621 | cb7566b8 | 2021-05-11 14:28:54.040767 | 3e3a6cb8 | available |     262 |
| https://github.com/gitpod-io/gitpod.git                     | 142ddbf2 | cb7566b8 | 2021-05-11 14:28:16.082448 | a0a5017b | available |     234 |
| https://github.com/gitpod-io/gitpod.git                     | 62a62628 | cb7566b8 | 2021-05-11 14:20:16.920090 | 70097b6d | available |     248 |
| https://github.com/gitpod-io/gitpod.git                     | cb7566b8 | NULL     | 2021-05-11 13:46:32.152943 | 2b2702f3 | available |     612 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 92979128 | 088a5082 | 2021-05-11 13:59:07.449600 | 243426ad | available |      24 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 964730fb | 088a5082 | 2021-05-11 13:56:44.305709 | 6f2aff3e | available |      32 |
| https://github.com/jankeromnes/gitpod-staging-prebuilds.git | 088a5082 | NULL     | 2021-05-11 13:49:50.546575 | 98b4bfb6 | available |      46 |
| https://gitlab.com/gitlab-org/gitlab.git                    | ed2dfcde | 1aaf4ba0 | 2021-05-11 15:22:33.835236 | dc61d15f | building  |    NULL |
| https://gitlab.com/gitlab-org/gitlab.git                    | 1aaf4ba0 | NULL     | 2021-05-11 15:18:33.386248 | 74d486b7 | available |     603 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 2e015621 | 8eef45ed | 2021-05-11 14:25:43.513453 | af8249ab | available |     429 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 76e903e8 | 8eef45ed | 2021-05-11 14:15:08.733923 | c7e05e5a | available |     393 |
| https://gitlab.com/gitlab-org/gitlab.git                    | 8eef45ed | NULL     | 2021-05-11 13:51:22.791788 | 7972bd89 | available |     545 |
+-------------------------------------------------------------+----------+----------+----------------------------+----------+-----------+---------+

Opening the specific commit context URL for that prebuild always fails with this error:

cannot initialize workspace: cannot initialize workspace: content initializer failed
(view screenshot) Screenshot 2021-05-11 at 17 22 51

Also, when such a pod fails, ws-daemon says:

(view ws-daemon logs)

One ws-daemon pod says:

$ kubectl logs ws-daemon-65jcs | grep copper-beaver-4xnmft2p
{"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"error","message":"received pod deletion for a workspace, but have not seen it before. Ignoring update.","serviceContext":{"service":"ws-daemon","version":""},"severity":"ERROR","time":"2021-05-11T15:28:49Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}

and the other says:

$ kubectl logs ws-daemon-sj2kg | grep --binary-file=text copper-beaver-4xnmft2p
{"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"info","message":"InitWorkspace called","serviceContext":{"service":"ws-daemon","version":""},"severity":"INFO","time":"2021-05-11T15:28:20Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"hooks":2,"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"debug","message":"running lifecycle hooks","serviceContext":{"service":"ws-daemon","version":""},"severity":"DEBUG","state":"initializing","time":"2021-05-11T15:28:20Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"info","message":"established IWS server","serviceContext":{"service":"ws-daemon","version":""},"severity":"INFO","time":"2021-05-11T15:28:20Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"debug","message":"found sandbox - adding to label cache","podname":"ws-2f687ceb-eabb-4bfb-bd94-777ae9e027c3","serviceContext":{"service":"ws-daemon","version":""},"severity":"DEBUG","time":"2021-05-11T15:28:21Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"hooks":1,"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"debug","message":"running lifecycle hooks","serviceContext":{"service":"ws-daemon","version":""},"severity":"DEBUG","state":"disposing","time":"2021-05-11T15:28:43Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"info","message":"stopped IWS server","serviceContext":{"service":"ws-daemon","version":""},"severity":"INFO","time":"2021-05-11T15:28:43Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"hooks":0,"instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"debug","message":"running lifecycle hooks","serviceContext":{"service":"ws-daemon","version":""},"severity":"DEBUG","state":"disposed","time":"2021-05-11T15:28:44Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}
{"ID":"8d0eeb4dc007dd785664cc7010567c3af840da0abdcdc2a89cfd9dbfc3dc0ab9","instanceId":"2f687ceb-eabb-4bfb-bd94-777ae9e027c3","level":"debug","message":"found workspace container - updating label cache","podname":"ws-2f687ceb-eabb-4bfb-bd94-777ae9e027c3","serviceContext":{"service":"ws-daemon","version":""},"severity":"DEBUG","time":"2021-05-11T15:28:45Z","userId":"d8cdd84b-5c9f-44e1-9971-3f11a36ce6b5","workspaceId":"copper-beaver-4xnmft2p"}

Not sure what that's about, but should be investigated if it can be reproduced.
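
The JSON log lines above are easier to follow as a compact timeline. The following Go sketch decodes newline-delimited log output and keeps only time, severity, message and lifecycle state for one workspace. The field names are taken from the logs shown; `logEntry` and `timeline` are hypothetical helper names, and lines that fail to parse are counted rather than dropped silently.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry lists only the fields used below; other keys are ignored.
type logEntry struct {
	Time        string `json:"time"`
	Severity    string `json:"severity"`
	Message     string `json:"message"`
	State       string `json:"state"`
	WorkspaceID string `json:"workspaceId"`
}

// timeline returns one summary line per log entry of the given workspace,
// plus the number of lines that were not valid JSON.
func timeline(logs, workspaceID string) (lines []string, malformed int) {
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		raw := strings.TrimSpace(sc.Text())
		if raw == "" {
			continue
		}
		var e logEntry
		if err := json.Unmarshal([]byte(raw), &e); err != nil {
			malformed++
			continue
		}
		if e.WorkspaceID != workspaceID {
			continue
		}
		line := e.Time + " " + e.Severity + " " + e.Message
		if e.State != "" {
			line += " (state=" + e.State + ")"
		}
		lines = append(lines, line)
	}
	return lines, malformed
}

// sample holds two entries shortened from the ws-daemon output above.
const sample = `{"message":"InitWorkspace called","severity":"INFO","time":"2021-05-11T15:28:20Z","workspaceId":"copper-beaver-4xnmft2p"}
{"hooks":1,"message":"running lifecycle hooks","severity":"DEBUG","state":"disposing","time":"2021-05-11T15:28:43Z","workspaceId":"copper-beaver-4xnmft2p"}`

func main() {
	lines, malformed := timeline(sample, "copper-beaver-4xnmft2p")
	for _, l := range lines {
		fmt.Println(l)
	}
	fmt.Println("malformed lines:", malformed)
}
```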

EDIT: Here is the complete error message from Google Cloud. Problem seems to be caused by merge conflicts due to an older prebuild:

cannot initialize workspace:
    github.com/gitpod-io/gitpod/content-service/pkg/initializer.InitializeWorkspace
        github.com/gitpod-io/gitpod/[email protected]/pkg/initializer/initializer.go:344
  - prebuild initializer:
    github.com/gitpod-io/gitpod/content-service/pkg/initializer.(*PrebuildInitializer).Run
        github.com/gitpod-io/gitpod/[email protected]/pkg/initializer/prebuild.go:96
  - git stash push -u failed (exit status 1): components/supervisor/go.sum: needs merge
components/workspacekit/go.sum: needs merge
components/ws-daemon/go.sum: needs merge
components/ws-manager/go.sum: needs merge
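
The error above names each conflicted file as `<path>: needs merge`. As a rough aid for triage, this Go sketch pulls those paths out of such an error message. `unmergedPaths` is a hypothetical helper and assumes paths do not themselves contain `": "`; inside a live repository, asking git directly (for example `git diff --name-only --diff-filter=U`) is likely more robust.

```go
package main

import (
	"fmt"
	"strings"
)

// unmergedPaths extracts the file paths from lines ending in
// ": needs merge". Any prefix before the path, such as
// "git stash push -u failed (exit status 1): ", is cut off.
func unmergedPaths(gitOutput string) []string {
	const suffix = ": needs merge"
	var paths []string
	for _, line := range strings.Split(gitOutput, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasSuffix(line, suffix) {
			continue
		}
		line = strings.TrimSuffix(line, suffix)
		if i := strings.LastIndex(line, ": "); i >= 0 {
			line = line[i+2:] // keep only the part after the last prefix
		}
		paths = append(paths, line)
	}
	return paths
}

func main() {
	// Tail of the error message quoted above.
	msg := `  - git stash push -u failed (exit status 1): components/supervisor/go.sum: needs merge
components/workspacekit/go.sum: needs merge
components/ws-daemon/go.sum: needs merge
components/ws-manager/go.sum: needs merge`
	for _, p := range unmergedPaths(msg) {
		fmt.Println(p)
	}
}
```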

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from 30f4a9d to 295497f Compare May 11, 2021 15:49
@jankeromnes
Contributor Author

jankeromnes commented May 11, 2021

TODO:

@svenefftinge
Member

svenefftinge commented May 12, 2021

> Make content initializer resilient to merge conflicts from older prebuilds (see #4167 (comment))

Oh, that reveals a more interesting problem with our init approach. I think we should reset --hard to the context's branch. Also looking at the code for local branch context (i.e. issue context) we don't pass the corresponding remote branch, which is needed now.

@jankeromnes
Contributor Author

> Make content initializer resilient to merge conflicts from older prebuilds (see #4167 (comment))
>
> Oh, that reveals a more interesting problem with our init approach. I think we should reset --hard to the context's branch.

Indeed. I believe what happened was:

  • the first (full) prebuild c9b671bc checked out commit a0a5017b and ran a full build, producing some untracked go.sum changes
  • the second (incremental) prebuild loaded the first prebuild, then stashed the go.sum changes, checked out commit 066200cd, then attempted to pop the stash, but this caused merge conflicts because the go.sum files had been updated between commits a0a5017b and 066200cd
  • then, whenever I tried to use the incremental prebuild in an interactive workspace, it would try to stash changes again, but this now fails due to the unresolved conflicts (i.e. the incremental prebuild "succeeded" but can never be loaded / used by any workspaces)

The problem is that we intentionally leave "stash pop" conflicts in place when loading a prebuild, delegating the resolution to the user. This no longer makes sense when we load an older prebuild to produce a newer prebuild.

I agree we might want to git reset --hard to the context's Git ref, at least for incremental prebuilds. Or maybe we undo a stash pop that leads to conflicts.
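
A minimal sketch of the second option ("undo a stash pop that leads to conflicts"), not necessarily what was merged. It runs the stash / checkout / pop sequence described above and, if the pop fails, throws the stashed changes away instead of leaving conflicts behind. `gitRunner`, `execGit` and `checkoutDiscardingConflicts` are hypothetical names. It assumes the work tree is dirty, so that `git stash push -u` really creates an entry; a clean tree would need a check first, e.g. via `git status --porcelain`.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// gitRunner runs one git command. It is injected so that the sequence can
// be exercised without a real repository.
type gitRunner func(args ...string) error

// errSimulatedConflict is used by the dry run in main.
var errSimulatedConflict = errors.New("simulated stash pop conflict")

// execGit returns a runner that shells out to git inside dir.
func execGit(dir string) gitRunner {
	return func(args ...string) error {
		cmd := exec.Command("git", args...)
		cmd.Dir = dir
		if out, err := cmd.CombinedOutput(); err != nil {
			return fmt.Errorf("git %v: %w: %s", args, err, out)
		}
		return nil
	}
}

// checkoutDiscardingConflicts stashes local changes, checks out rev and
// tries to re-apply the stash. restored reports whether the changes were
// re-applied; on a failed pop they are discarded.
func checkoutDiscardingConflicts(git gitRunner, rev string) (restored bool, err error) {
	if err := git("stash", "push", "-u"); err != nil {
		return false, err
	}
	if err := git("checkout", rev); err != nil {
		return false, err
	}
	if err := git("stash", "pop"); err == nil {
		return true, nil
	}
	// A failed pop keeps the stash entry. Reset the index and tracked
	// files, then drop the entry. Untracked files that the failed pop may
	// have restored are left in place.
	if err := git("reset", "--hard"); err != nil {
		return false, err
	}
	if err := git("stash", "drop"); err != nil {
		return false, err
	}
	return false, nil
}

func main() {
	// Dry run: print each command and simulate a conflict on "stash pop".
	dryRun := func(args ...string) error {
		fmt.Println("git", args)
		if len(args) == 2 && args[0] == "stash" && args[1] == "pop" {
			return errSimulatedConflict
		}
		return nil
	}
	restored, err := checkoutDiscardingConflicts(dryRun, "066200cd")
	fmt.Println("restored:", restored, "err:", err)
}
```

To run this against a real checkout, pass `execGit("/path/to/repo")` instead of the dry-run runner. Whether discarding the changes is acceptable depends on the caller: it seems reasonable when an older prebuild is only a cache for a newer one, less so for a user's interactive workspace.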

@jankeromnes
Contributor Author

> Also looking at the code for local branch context (i.e. issue context) we don't pass the corresponding remote branch, which is needed now.

Indeed, great catch! Since origin is fetched prior to the checkout -b, I believe we can achieve correctness by simply basing the new local branch on origin/HEAD, e.g. like so:

ws.Git(ctx, "checkout", "-B", ws.CloneTarget, "origin/HEAD")
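
Spelled out as a sequence, the idea is: fetch `origin` first, then create (or reset) the local branch at `origin/HEAD`. The Go sketch below uses a hypothetical `run` callback as a stand-in for `ws.Git`, and assumes `origin/HEAD` resolves in the workspace clone.

```go
package main

import "fmt"

// runLocalBranchCheckout fetches origin and then (re)creates the local
// branch cloneTarget at origin/HEAD. run is a hypothetical stand-in for
// the real git helper.
func runLocalBranchCheckout(run func(args ...string) error, cloneTarget string) error {
	steps := [][]string{
		{"fetch", "origin"},
		// -B creates the branch, or resets it if it already exists.
		{"checkout", "-B", cloneTarget, "origin/HEAD"},
	}
	for _, args := range steps {
		if err := run(args...); err != nil {
			return fmt.Errorf("git %v: %w", args, err)
		}
	}
	return nil
}

func main() {
	// Print the commands instead of executing them.
	printOnly := func(args ...string) error {
		fmt.Println("git", args)
		return nil
	}
	if err := runLocalBranchCheckout(printOnly, "my-issue-branch"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Using `-B` rather than `-b` matters when the prebuild snapshot already contains a local branch of that name pointing at an older commit: `-b` would fail, while `-B` moves the branch to the freshly fetched `origin/HEAD`.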

jankeromnes added a commit that referenced this pull request May 14, 2021
…base it on origin/HEAD

We previously assumed that a prebuild snapshot holds the latest commits,
but this changes with incremental prebuilds (where an older prebuild
doesn't have the latest commits).

Fixes #4167 (comment)
@jankeromnes
Contributor Author

jankeromnes commented May 14, 2021

🎰

/werft run

👍 started the job as gitpod-build-jx-incremental-prebuilds.33

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch 3 times, most recently from fb98941 to 211b7d7 Compare May 17, 2021 14:18
@svenefftinge
Member

svenefftinge commented May 17, 2021

Looking at the code it seems like there is no git fetch happening before the realizeCloneTarget.

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from 211b7d7 to b3be5bc Compare May 17, 2021 16:52
@jankeromnes
Contributor Author

jankeromnes commented May 17, 2021

> Looking at the code it seems like there is no git fetch happening before the realizeCloneTarget.

@svenefftinge Which code do you mean specifically? In #4167 (comment) I pointed out:

err = p.Git.Fetch(ctx)
if err != nil {
	return src, xerrors.Errorf("prebuild initializer: %w", err)
}
err = p.Git.realizeCloneTarget(ctx)
if err != nil {
	return src, xerrors.Errorf("prebuild initializer: %w", err)
}

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch 2 times, most recently from 274e508 to debcb34 Compare May 18, 2021 16:47
@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from debcb34 to acf816e Compare May 19, 2021 15:05
@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from acf816e to 9196637 Compare May 20, 2021 03:05
@jankeromnes jankeromnes marked this pull request as ready for review May 20, 2021 03:07
@jankeromnes
Contributor Author

jankeromnes commented May 20, 2021

Logging a small issue I noticed in testing: Occasionally, the #incremental-prebuild/ prefix fails and tells me "Cannot (re-)start irregular workspace". Refreshing once resolves it. Maybe there is a secondary code path that doesn't like incremental prebuilds? (E.g. when create workspace doesn't start the instance fast enough, and we call start again explicitly?)

@jankeromnes
Contributor Author

jankeromnes commented May 20, 2021

Alright, this is now ready for review! 🚀 @svenefftinge @geropl @csweichel please take a look when convenient. 🙏

Notes:

Please ping me if you have any questions! 🌻

@svenefftinge
Member

svenefftinge commented May 21, 2021

/werft run

👍 started the job as gitpod-build-jx-incremental-prebuilds.44

@svenefftinge
Member

svenefftinge commented May 21, 2021

Works very well using the prebuild and incremental-prebuild context URLs.
It doesn't use a previous prebuild when starting a workspace on a non-prebuild commit as described here
https://www.notion.so/gitpod/Incremental-Prebuilds-49c0a840e54348acba05677bdc450841#de09ae678ff84779940e3237c0483c04

But I think it is fine to do this later when we've gained more experience with incremental prebuilds.

@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from 9196637 to cd46302 Compare May 21, 2021 08:38
@jankeromnes
Contributor Author

jankeromnes commented May 21, 2021

Rebased and squashed as hard as I could! 😁

@jankeromnes
Contributor Author

jankeromnes commented May 21, 2021

🎰

/werft run

👍 started the job as gitpod-build-jx-incremental-prebuilds.46

@jankeromnes
Contributor Author

jankeromnes commented May 21, 2021

🎰

/werft run

👍 started the job as gitpod-build-jx-incremental-prebuilds.47

@jankeromnes
Contributor Author

jankeromnes commented May 21, 2021

Update: Simple re-deploy resolved the awkward content init / ws-daemon / registry-facade errors from #4167 (comment) 👍 everything works again.

Contributor

@csweichel csweichel left a comment

Code looks good to me (sans some style things) :)

…ut the correct revision, even for older clones (e.g. incremental prebuild)

We previously assumed that a prebuild snapshot holds the latest commits,
but this changes with incremental prebuilds (where an older prebuild
doesn't have the latest commits, but 'origin' does).

Fixes #4167 (comment)
…d after checkout fails, throw them away instead of leaving merge conflicts in the workspace
@jankeromnes jankeromnes force-pushed the jx/incremental-prebuilds branch from cd46302 to da6dee0 Compare May 21, 2021 13:20
@jankeromnes jankeromnes merged commit 6e5b4c7 into main May 21, 2021
@jankeromnes jankeromnes deleted the jx/incremental-prebuilds branch May 21, 2021 13:53
jankeromnes added a commit that referenced this pull request May 21, 2021
…ut the correct revision, even for older clones (e.g. incremental prebuild)

We previously assumed that a prebuild snapshot holds the latest commits,
but this changes with incremental prebuilds (where an older prebuild
doesn't have the latest commits, but 'origin' does).

Fixes #4167 (comment)
MatthewFagan pushed a commit to trilogy-group/gitpod that referenced this pull request Nov 17, 2021
MatthewFagan pushed a commit to trilogy-group/gitpod that referenced this pull request Nov 18, 2021