Skip to content

prefer HEAD to master in update tasks #1502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 11, 2020
Merged

prefer HEAD to master in update tasks #1502

merged 3 commits into from
Sep 11, 2020

Conversation

peff
Copy link
Member

@peff peff commented Sep 10, 2020

When we fetch book and manpage data from other repositories, we hard-code master as the branch name to use. Let's switch instead to the default-branch HEAD, which will do the right thing if those repositories ever change their default branch (e.g., as discussed in #1501).

This isn't quite just s/master/HEAD/ because of the way GitHub's API works. See 0029905 for details. On the plus side, that change helped me identify and fix a possible race condition in the update code (fixed in cee3bf1).

@HonkingGoose
Copy link
Contributor

Hi @peff you might want to amend the commit message for commit 0029905, it contains a small spelling error:

Note that we can't ust swap out "HEAD" here.

I think you meant Note that we can't just swap out "HEAD' here.

When running our nightly book update job, we assume the default branch
for the progit2 repo (and any translation repo) is "master". This may be
changing in the future. Best case that would cause us to stop getting
updates. Worst case it may actually cause subtle bugs, since we pull the
tree sha by asking for "HEAD".

Note that we can't just swap out "HEAD" here. GitHub's refs API expects
us to ask for "heads/foo" or "tags/foo", and passing "HEAD" seems to
imply "heads/HEAD". I didn't check if that's Octokit being clever, or is
how GitHub's API behaves, but either way it doesn't work.

We can hack around it by using the "commit" endpoint, which takes any
resolvable name. And since "HEAD" always has to point to a commit (by
Git's rules), we're in no danger of being confused by another object
type. It does get us a bit of extra information we don't care about (we
just need the sha, not the actual commit message), but the extra cost
should be negligible in practice.
When we index book content, we use the sha1 of the HEAD commit as a
cache key to see whether anything has been updated. But we fetch the
tree content by asking for HEAD separately. It's unlikely, but the
repository could be updated between the two, meaning we'd store a
mis-matched cache key (and because we get the tree first, we'd be stuck
with the old data under the new key, and thus wouldn't even notice on
the next run).

We can fix this by getting the commit first, and then using the tree
sha1 it mentions, which is atomic. This is easy now that we're grabbing
the whole commit (and this also matches how we do documentation
indexing).
When there are no tags, we look for documentation in the hard-coded
master branch. This is unlikely for git.git, but presumably triggers for
translation repos. We should be more flexible and just use HEAD in this
case, which will do the right thing if the translation repos switch
their default branch.

Note that we don't to do anything clever here like we did for the
"book2" case. We're already using the commit API for the remote index,
so we're free to just look up the name "HEAD". And likewise the local
case is just running Git commands, so that name is fine.
@peff peff temporarily deployed to git-scm-pr-1502 September 11, 2020 09:28 Inactive
@peff peff merged commit a92db37 into master Sep 11, 2020
@peff peff deleted the prefer-head branch September 11, 2020 09:29
@peff
Copy link
Member Author

peff commented Sep 11, 2020

Thanks, fixed! I had run this locally when I wrote it yesterday, but I just double-checked by running the tasks manually on the production site, too. Everything seems good.

@peff
Copy link
Member Author

peff commented Sep 11, 2020

I just double-checked by running the tasks manually on the production site, too. Everything seems good.

...and this wasn't quite true. It did build everything correctly, but it broke the commit-sha caching in a way that would blow up next time it tried to run. I pushed the fix straight to master (f945dfa) and am re-running the book task manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants