Skip to content

Plan structure of October 2020 PR to Microsoft #314

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mattmccutchen-cci opened this issue Oct 30, 2020 · 8 comments
Closed

Plan structure of October 2020 PR to Microsoft #314

mattmccutchen-cci opened this issue Oct 30, 2020 · 8 comments
Assignees

Comments

@mattmccutchen-cci
Copy link
Member

mattmccutchen-cci commented Oct 30, 2020

I think we should start drafting the PR to Microsoft now to flush out any surprises. For example, we didn't realize until a week ago that we needed to avoid sending our .github/workflows/main.yml to Microsoft, and fortunately, we found a workaround to do so without making the repositories diverge yet. I've taken a quick look and I don't think there's anything more like that, but it would be good to take a closer look sooner rather than later.

First question: Is it OK for me to go ahead and create the PR in Microsoft's repository in draft status, or do we need to talk to them first?

As previously discussed, a PR with a ton of files that are both renamed and heavily edited (due to #299) is awful to review. We should give Microsoft a sequence of stage diffs to review, with one stage containing all the file renames from #299 and no edits, much as we did for our own review of #299. The other stages would be mostly edits but might contain file renames unrelated to #299.

The first idea (by Mike) was to break the overall change into stages between (0) Microsoft's master at the time we submit the PR, (1) the beginning of the file renames in #299, (2) the end of the file renames in #299, and (3) our master (after #313), and then send 0->1, 1->2, and 2->3 as a sequence of PRs. However, we can't send a PR of 2 to Microsoft's master because 2 doesn't build (files have been renamed but references to them haven't yet been updated) and mustn't go onto Microsoft's mainline. (I think 1 builds but haven't tried it.)

The second idea (by me) was to send a single PR of 0->3 but provide links to the diffs for 0->1, 1->2, and 2->3 so that Microsoft can view each of them before accepting the PR. However, 2->3 will include at least one merge from Microsoft (#313), so when Microsoft reviews the 2->3 diff, they will see their own changes to non-3C stuff in there, which will be confusing.

The third idea (by me) was to generate temporary commits 1' and 2' as the merges of 1 and 2, respectively, with 0, and provide diff links for 0->1', 1'->2', and 2'->3. However, we would still have (for example) .github/workflows/main.yml appearing in 0->1' and disappearing again in 2'->3. It would be better if none of the stages contained changes unacceptable to Microsoft.

The fourth idea (by me) is to re-run the rename script on 0 (having removed any parts of the script that would fail due to things not existing in 0). This will produce edits resulting in A, followed by file renames resulting in B, followed by edits resulting in C. Then we discard C and provide diff links for 0->A, A->B, and B->3. The alternate history containing A and B would be generated only for review purposes; it could be archived but wouldn't be in the PR to be incorporated into the permanent history, since the parallel sequences of commits would be confusing. In fact, for a slight simplification, I could modify the rename process so that the file edits between 0 and A are instead performed after B, so we have only 0->B and B->3. I couldn't do that in #299 because I wanted to keep all the automated commits (file renames and edits) consecutive and I needed manual edits both before and after the automated edits, but for the PR to Microsoft, I don't think we need to separate automated changes from manual changes.

So far, I haven't found anything wrong with the fourth idea. Shall I give it a try?

@mattmccutchen-cci mattmccutchen-cci self-assigned this Oct 30, 2020
@mwhicks1
Copy link
Member

Answers:

  1. Yes, go ahead and make a draft PR.
  2. Yes, give it a try.

@mattmccutchen-cci
Copy link
Member Author

Whee! checkedc#930

@mattmccutchen-cci
Copy link
Member Author

Microsoft's bot has already alerted me to one issue: we need to find out the proper way to agree to Microsoft's CLA for the entire PR (and make sure we have the authority to do so). @mwhicks1 How have you handled this in the past? I don't see any precedent on checkedc#891.

@mwhicks1
Copy link
Member

We have done PRs to Microsoft before. I have signed the CLA for that to be possible.

@mattmccutchen-cci
Copy link
Member Author

OK, as per the fourth plan, here are:

@mattmccutchen-cci
Copy link
Member Author

Verifying that we have the authority to submit everything under the CLA: (I don't know if we did this in the past, but we probably should from now on at least.)

Hopefully we can just look at the commit author names:

$ git shortlog --summary --email microsoft/master..origin/master
   154  Aaron Eline <[email protected]>
     3  Aaron Eline <[email protected]>
**   1  Abel Nieto <[email protected]>
   397  Aravind Machiry <[email protected]>
    21  Aravind Machiry <[email protected]>
**   1  David Tarditi <[email protected]>
     1  Ian Sweet <[email protected]>
     1  Ian Sweet <[email protected]>
     7  John Kastner <[email protected]>
   243  John Kastner <[email protected]>
    39  John Kastner <[email protected]>
**   1  Katherine Kjeer <[email protected]>
    53  Machiry Aravind Kumar <[email protected]>
     1  Machiry Aravind Kumar <[email protected]>
**   3  Mandeep Singh Grang <[email protected]>
    60  Matt McCutchen (Correct Computation) <[email protected]>
     1  Michael Hicks <[email protected]>
    76  Michael Hicks <[email protected]>
     4  Shilpa Roy <[email protected]>
    86  Shilpa Roy <[email protected]>
     7  aeline <[email protected]>
**   2  hasantouma <[email protected]>
   205  mwhicks1 <[email protected]>
     1  sroy4899 <[email protected]>

I'm going to assume that everyone with a CCI email address (and Aravind: ping @mwhicks1 again to get him one!!) has assigned copyright to CCI on their other contributions too. That leaves the authors marked **. Their commits:

$ git shortlog --format='* [%h] %s' --email -E --author='<([email protected]|[email protected]|6687333\+kkjeer@users\.noreply\.github\.com|[email protected]|[email protected])>' microsoft/master..origin/master
Abel Nieto <[email protected]> (1):
      * [e273f315be3e] [WIP] Add existential structs to Checked C (#683)

David Tarditi <[email protected]> (1):
      * [1fb6dc9fb3b0] Fix release build failures for existential types. (#685)

Katherine Kjeer <[email protected]> (1):
      * [76091100bcd3] Remove -fno-checkedc-null-ptr-arith flag from null ptr unit tests (#687)

Mandeep Singh Grang <[email protected]> (3):
      * [bffb3010ba67] Fix a few build warnings (#688)
      * [cbebe7ae98cc] Fix a build warning with checked-c-convert (#689)
      * [aa65e7d4566e] Add monorepo announcement to README.md (#690)

hasantouma <[email protected]> (2):
      * [5c803d86be8d] fixed variable name error
      * [eb70a26372aa] added 'dump-intermediate' to default arguments

The first five commits appear to be squashes of the merged PRs still accessible in Microsoft's repository (which were presumably produced by merging the PRs on GitHub), although Microsoft seems to have rewritten their history to contain cherry-picks of the squashes. A terrible script to verify this:

cat >picks <<EOF
e273f315be3e 683 86
1fb6dc9fb3b0 685 1
76091100bcd3 687 2
bffb3010ba67 688 1
cbebe7ae98cc 689 1
EOF
(set -e; while read c p n; do c2=$(git log --format='%h' --grep="Cherry-picked from commit $c" microsoft/master); git fetch microsoft refs/pull/$p/head:microsoft-pull/$p; git diff $h microsoft-pull/$p{~$n,} >$p-diff-pull; git diff $c{^,} >$p-diff-ours; git diff $c2{^,} >$p-diff-cp; diff --color=always -u $p-diff-pull $p-diff-ours || :; diff --color=always -u $p-diff-cp $p-diff-ours || :; done <picks) | less -R

For the sixth commit, we again have a squash of the merged PR, but Microsoft's history instead contains a different commit 4e6d682 that seems to have the changes from the PR and some others:

(set -e; c=aa65e7d4566e; p=690; c2=4e6d6820d1471d8ccae8e381ab21014229fc0dd5; git fetch microsoft refs/pull/$p/head:microsoft-pull/$p; case $p in (683) n=86;; (*) n=1;; esac; git diff $h microsoft-pull/$p{~$n,} >$p-diff-pull; git diff $c{^,} >$p-diff-ours; git diff $c2{^,} >$p-diff-ms; diff --color=always -u $p-diff-pull $p-diff-ours || :; diff --color=always -u $p-diff-ms $p-diff-ours || :) | less -R

I think this is OK too.

That leaves the two small commits from hasantouma. @mwhicks1 What happened there and what should we do?

@mwhicks1
Copy link
Member

mwhicks1 commented Nov 4, 2020

Hasan was working for CCI when he made these changes, so we are OK (he was an intern last summer).

@mattmccutchen-cci
Copy link
Member Author

If a problem comes up later, I'll deal with it; I don't think the possibility merits holding this issue open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants