Skip to content

Add CLANGARM64 support to mingit and archive, various other ARM64-related optimizations #452

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 23, 2023

Conversation

dennisameling
Copy link
Contributor

Some final bits before I can set up a full pipeline for creating ARM64 artifacts! 🎉

output_path=
sed_makepkg_e=
force=
src_pkg=
while case "$1" in
--only-32-bit)
--only-i686|--only-32-bit)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left --only-32-bit in place to not break existing CI pipelines and scripts. Ideally these should move to --only-i686 too.

MINGW_ARCH=mingw32
export MINGW_ARCH
;;
--only-64-bit)
--only-x86_64|--only-64-bit)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left --only-64-bit in place to not break existing CI pipelines and scripts. Ideally these should move to --only-x86_64 too.

inno_defines="$inno_defines$LF#define INSTALLER_FILENAME_SUFFIX 'arm64'"
inno_defines="$inno_defines$LF#define INSTALLER_FILENAME_SUFFIX 'ARM64'"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git-update-git-for-windows also expects uppercase ARM64.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about that. The git-update-git-for-windows script only does this if /arm64/ exists, which we're getting away from.

So it's not much of a precedent/requirement.

And if I look at the file names Git-2.39.0-arm64.exe or Git-2.39.0-ARM64.exe, I am not so sure that I prefer the latter.

@dennisameling do you have a strong opinion on this? I don't really have a strong opinion, but I want to "get it right".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about that. The git-update-git-for-windows script only does this if /arm64/ exists, which we're getting away from.

So it's not much of a precedent/requirement.

I mentioned it because $arch_bit is used to filter the release artifacts later on in the script. So it will be looking for Git-2.39.0-ARM64.exe, for example, if that's the suffix of the installer executable.

And if I look at the file names Git-2.39.0-arm64.exe or Git-2.39.0-ARM64.exe, I am not so sure that I prefer the latter.

@dennisameling do you have a strong opinion on this? I don't really have a strong opinion, but I want to "get it right".

I also don't prefer the latter, but (incorrectly) assumed that you preferred this based on some earlier conversations we had, which I can only vaguely remember. It looks like Microsoft themselves are also inconsistent and use ARM64 (uppercase), Arm and Arm64. Oh, well. 😅

The only strong opinion I have is that I'd like to take this opportunity to start moving away from $BITNESS and use full architecture suffixes everywhere. I'm fully aware that this has implications all over the place, but I think it's definitely worth it, and we're already slowly moving towards it with things like the recent changes to make-file-list.sh.

While I think Option 1 in the table below is the easiest to read, I guess Option 2 is a bit easier to implement as it's consistent with the architectures MSYS2 uses. And I quite prefer staying as close as possible to MSYS2's naming and way of working.

Here's some examples of what consistency across the board could look like:

Current Option 1 Option 2 (MSYS2) Option 3 (Golang)
Git-2.39.0-64-bit.exe Git-2.39.0-x64.exe Git-2.39.0-x86_64.exe Git-2.39.0-amd64.exe
Git-2.39.0-32-bit.exe Git-2.39.0-x86.exe Git-2.39.0-i686.exe Git-2.39.0-386.exe
n/a Git-2.39.0-arm64.exe Git-2.39.0-aarch64.exe Git-2.39.0-arm64.exe
PortableGit-2.39.0-64-bit.7z.exe PortableGit-2.39.0-x64.7z.exe PortableGit-2.39.0-x86_64.7z.exe PortableGit-2.39.0-amd64.7z.exe
PortableGit-2.39.0-32-bit.7z.exe PortableGit-2.39.0-x86.7z.exe PortableGit-2.39.0-i686.7z.exe PortableGit-2.39.0-386.7z.exe
n/a PortableGit-2.39.0-arm64.7z.exe PortableGit-2.39.0-aarch64.7z.exe PortableGit-2.39.0-arm64.7z.exe

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if I look at the file names Git-2.39.0-arm64.exe or Git-2.39.0-ARM64.exe, I am not so sure that I prefer the latter.

@dennisameling do you have a strong opinion on this? I don't really have a strong opinion, but I want to "get it right".

I also don't prefer the latter, but (incorrectly) assumed that you preferred this based on some earlier conversations we had, which I can only vaguely remember. It looks like Microsoft themselves are also inconsistent and use ARM64 (uppercase), Arm and Arm64. Oh, well. 😅

ARM is the canonical spelling of the company name, but I agree that lowercase looks better in filenames. Keeping the 64 makes sense to distiguish this from Microsofts earlier short lived 32bit ARM endeavours.

The only strong opinion I have is that I'd like to take this opportunity to start moving away from $BITNESS and use full architecture suffixes everywhere. I'm fully aware that this has implications all over the place, but I think it's definitely worth it, and we're already slowly moving towards it with things like the recent changes to make-file-list.sh.

We'll definitely need to move away from $BITNESS in the long run. It made sense when x86 and AMD64 where the only options we had to care about, but it makes less sense with ARM64 on the table and it'll completely stop making any sense when we at some point in the future drop x86 support.

While I think Option 1 in the table below is the easiest to read, I guess Option 2 is a bit easier to implement as it's consistent with the architectures MSYS2 uses. And I quite prefer staying as close as possible to MSYS2's naming and way of working.

I like the approach of staying close to MSYS2, because it'll probably make our lives easier in the future.

Here's some examples of what consistency across the board could look like:

[…]

What do you think?

If I could mix and match, I'd choose x86 over i686 and 386, amd64 or maybe x86_64 over x64, and have no real preference between aarch64 and arm64.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the users' perspective, I guess x86 and amd64 and arm64 make most sense.

From the correctness' point of view, i686, x86_64 and aarch64 look the best, but I don't believe that this would serve our users well, but instead only the retentive type of people.

I do not have any strong opinion about what we end up with, though.

The thing I do care about is that we announce this properly and leave at least a cycle or two, where we ship both -<bitness>-bit as well as the new artifacts (essentially just adding a copy under the old name), before we drop the old names.

We may want to go the whole nine yards and also rename git-sdk-32 and git-sdk-64, once we decided on the new names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let's go with x86, amd64 and arm64 then, or do you want to discuss with a broader audience first? If the proposed names look good to you, shall I start creating PRs in various places to:

  • continue moving away from $BITNESS in favor of the full architectures
  • update the release scripts to use the new names and ensure that the binaries are shipped under the old and the new names like you mentioned

Is this release function in please.sh still the function you're using for putting all the binaries together prior to uploading them to GitHub? Where can I find the script that actually generates the GitHub release? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with x86, amd64 and arm64 then

Sounds good to me!

Is this release function in please.sh still the function you're using for putting all the binaries together prior to uploading them to GitHub?

No, that was only for the local VM-based builds I used starting in 2015, before using automation to build official Git for Windows releases.

These days I am using the Azure Pipeline build "Git artifacts" instead. The definition is unfortunately not really visible, but I ported it over into a GitHub workflow at https://github.com/git-for-windows/git/blob/main/.github/workflows/git-artifacts.yml, which is relatively close to the Azure Pipeline definition (that is not tracked in any Git directory but versioned independently).

The workflow does not yet have the logic to build official releases, though. The short version is that the Pipeline uses:

I fear that there might be a lot of cruft in please.sh that I accumulated over the years that were long replaced by other methods and were kept only for the transitional period, but then the removal was readily forgotten.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dennisameling FWIW my long-term plan was to move over from the Azure Pipeline to the GitHub workflow. With recent developments, I am inclined to accelerate that move. One thing we will definitely want to do is to move git-artifacts.yml into git-for-windows-automation, so that we can really restrict our ARM64 runners to that repository.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with x86, amd64 and arm64 then

Sounds good to me!

Here's a discussion to announce the change, as well as the implications on GfW's tooling: git-for-windows/git#4203

@dennisameling FWIW my long-term plan was to move over from the Azure Pipeline to the GitHub workflow. With recent developments, I am inclined to accelerate that move. One thing we will definitely want to do is to move git-artifacts.yml into git-for-windows-automation, so that we can really restrict our ARM64 runners to that repository.

Sounds good to me. Will create a PR.

@dennisameling dennisameling changed the title mingit + archive: add CLANGARM64 support Add CLANGARM64 support to mingit and archive, various other ARM64-related optimizations Dec 18, 2022
@dennisameling dennisameling force-pushed the more-arm64-work branch 2 times, most recently from b8abc63 to 3431df0 Compare December 18, 2022 20:35
inno_defines="$inno_defines$LF#define INSTALLER_FILENAME_SUFFIX 'arm64'"
inno_defines="$inno_defines$LF#define INSTALLER_FILENAME_SUFFIX 'ARM64'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure about that. The git-update-git-for-windows script only does this if /arm64/ exists, which we're getting away from.

So it's not much of a precedent/requirement.

And if I look at the file names Git-2.39.0-arm64.exe or Git-2.39.0-ARM64.exe, I am not so sure that I prefer the latter.

@dennisameling do you have a strong opinion on this? I don't really have a strong opinion, but I want to "get it right".

@@ -5,7 +5,7 @@ _ver_base=1.1
pkgver=1.1.614.d551cf0cc
pkgrel=1
pkgdesc="Git for Windows extra files"
arch=('i686' 'x86_64')
arch=('any')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the architecture to any is probably the right thing to do.

However, we might need to do much more than that. git-extra really is misnamed:: It is a MINGW package and therefore it should be mingw-w64-git-extra and encode the architecture just like the other MINGW packages. In other words, the full name should be mingw-w64-x86_64-git-extra, mingw-w64-aarch64-git-extra and mingw-w64-i686-git-extra, respectively.

The only reason why I did not do that yet is because I dreaded the extra work this would require e.g. in the various files in build-extra, as well as needing to introduce provides, replaces and conflicts sections that ensure that, say, mingw-w64-x86_64-git-extra replaces (and "provides") git-extra and conflicts with the i686 and aarch64 variants of the same package.

The reason why we need those conflicts entries is that the git-extra package installs /git-bash.exe and friends, and those files cannot be provided in different shapes by multiple, concurrently-installed package flavors (you can install, say, mingw-w64-i686-gcc and mingw-w64-x86_64-gcc, but the same will not be true for mingw-w64-i686-git-extra and mingw-w64-x86_64-git-extra).

I guess that the proposed change will not make much of a difference, as the git-extra package versions are hosted together with the MSYS packages, i.e. git-sdk-32 and git-sdk-64 won't be able to see both git-extra variants, but we will have to make that mingw-w64-* change eventually.

One thing that is technically required already is a change to the pkg_files function in please.sh, which is used indirectly e.g. in please.sh build-and-copy-artifacts git-extra (which I don't think we use in the current automation, but we definitely used please.sh upgrade git-extra in the Azure Pipeline that is still kept in working order just in case the current branch-deploy mechanism is still buggy or something). A patch like this is probably required:

diff --git a/please.sh b/please.sh
index 86d90b7dd..94c8e1681 100755
--- a/please.sh
+++ b/please.sh
@@ -2068,11 +2068,8 @@ pkg_files () {
 				printf " $prefix-i686$suffix"
 				printf " $prefix-x86_64$suffix"
 				;;
-			--i686|--x86_64)
-				printf " $prefix${1#-}$suffix"
-				;;
-			'')
-				printf " $prefix-$arch$suffix"
+			--i686|--x86_64|'')
+				printf " $prefix-any$suffix"
 				;;
 			*)
 				die "Whoops: unknown option %s\n" "$1"

tl;dr I think this is fine, with the proposed amendment, and am willing to accept this change as part of this PR, but we will have to do some quite involved follow-up work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the git-extra package versions are hosted together with the MSYS packages

I just realized that this is a real problem. The reason is that we currently have this setup:

graph TD;
    A[git-sdk-64] -->|gets Pacman packages from| C[x86_64];
    A[git-sdk-64] -->|gets Pacman packages from| E["i686 (MINGW packages only)"];
    B[git-sdk-32] -->|gets Pacman packages from| F["x86_64 (MINGW packages only)"];
    B[git-sdk-32] -->|gets Pacman packages from| D[i686];
    C -->|hosted in| G[x86_64/ subdir of wingit];
    F -->|hosted in| G[x86_64/ subdir of wingit];
    D -->|hosted in| H[i686/ subdir of wingit];
    E -->|hosted in| H[i686/ subdir of wingit];
Loading

In other words, all of the i686-flavored packages are hosted in the same subdirectory of the wingit Pacman repository, and all the x86_64-flavored packages in another subdirectory, and each subdirectory has two package indices, one with the full list of packages and the other one with the MINGW-only packages.

Now, it's relatively easy to imagine ARM64 to come in this way:

graph TD;
    I[git-sdk-arm64] -->|gets Pacman packages from| J["aarch64 (CLANG packages only)"];
    I[git-sdk-arm64] -->|gets Pacman packages from| C[x86_64];
    A[git-sdk-64] -->|gets Pacman packages from| C[x86_64];
    A[git-sdk-64] -->|gets Pacman packages from| E["i686 (MINGW packages only)"];
    B[git-sdk-32] -->|gets Pacman packages from| F["x86_64 (MINGW packages only)"];
    B[git-sdk-32] -->|gets Pacman packages from| D[i686];
    J -->|hosted in| K[aarch64/ subdir of wingit];
    C -->|hosted in| G[x86_64/ subdir of wingit];
    F -->|hosted in| G[x86_64/ subdir of wingit];
    D -->|hosted in| H[i686/ subdir of wingit];
    E -->|hosted in| H[i686/ subdir of wingit];
Loading

But if we keep hosting git-extra in the MSYS parts of the Pacman repositories, then the aarch64/ subdirectory will never get its own git-extra package!

Therefore, we will have to embark on this rather involved package name change first :-(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like switching git-extra to any. There are some parts of git-extra that are compiled natively. We could theoretically abuse the x86 build as any, but that feels like a big clutch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like switching git-extra to any. There are some parts of git-extra that are compiled natively. We could theoretically abuse the x86 build as any, but that feels like a big clutch.

Right.

The correct thing to do really is to rename it to mingw-w64-<arch>-git-extra at the same time as using the any suffix. It'll be a ton of work, but I fear that we will have to do that first, before merging this here PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it 🔥 My productivity levels have been really high lately, so I'd love to keep up the pace. Apologies if this sometimes results in lower accuracy - I really appreciate all your extensive reviews and comments 👍🏼

I think I found a solution here: #455

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My productivity levels have been really high lately, so I'd love to keep up the pace.

Excellent! I just hope that my time off won't lower your productivity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! I just hope that my time off won't lower your productivity.

Haha, I'll be slowing down quite a bit too next week. But I'm super happy that we were able to get some major blockers out of the way before Christmas - that gives some peace of mind and motivation to pick things up further in the new year 😊

I might still be creating some PRs, but will not expect you or anyone to review them before the new year. If that spams your inbox too much, let me know and I'll just create some branches without creating the actual PRs 👍🏼

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your PRs will always be welcome in my inbox. Even if I may let them sit there until after the holidays ;-)

$(uname -m) returns the architecture of the MSYS2 shell. This works for
x64 and x86 on Windows, but not for ARM64, for which we'll add support
in a follow-up PR. The reason is that MSYS2 can only be run using x64
emulation on ARM64 due to lacking Cygwin support for that platform.

To prepare for ARM64 support and to be more consistent with other MSYS2
logic, let's leverage $MSYSTEM instead of $(uname -m).

Ref: https://www.msys2.org/wiki/arm64/
Signed-off-by: Dennis Ameling <[email protected]>
This replaces our earlier efforts to support arm64 for Git for Windows.
It uses the new CLANGARM64 MSYSTEM, so arm64 now becomes a first-class
citizen.

Signed-off-by: Dennis Ameling <[email protected]>
$(uname -m) returns the architecture of the MSYS2 shell. This works for
x64 and x86 on Windows, but not for ARM64, for which we'll add support
in a follow-up PR. The reason is that MSYS2 can only be run using x64
emulation on ARM64 due to lacking Cygwin support for that platform.

To prepare for ARM64 support and to be more consistent with other MSYS2
logic, let's leverage $MSYSTEM instead of $(uname -m).

Ref: https://www.msys2.org/wiki/arm64/
Signed-off-by: Dennis Ameling <[email protected]>
This replaces our earlier efforts to support arm64 for Git for Windows.
It uses the new CLANGARM64 MSYSTEM, so arm64 now becomes a first-class
citizen.

Signed-off-by: Dennis Ameling <[email protected]>
Now that we can build mingw-w64-git for the clangarm64 MSYSTEM too,
let's add it to please.sh so that CI pipelines can build for this
architecture in a way that's consistent with the other architectures.

Signed-off-by: Dennis Ameling <[email protected]>
In contrast to an older effort to add ARM64 support to Git for Windows,
we now use the /clangarm64 prefix that is used by upstream MSYS2 as
well.

Signed-off-by: Dennis Ameling <[email protected]>
There was a typo in the artifact suffix which resulted in
"PortableGit-2.39.0-ARM64-bit.7z.exe" instead of the expected
"PortableGit-2.39.0-ARM64.7z.exe" (without -bit). This commit fixes
that.

Signed-off-by: Dennis Ameling <[email protected]>
@dennisameling dennisameling marked this pull request as ready for review January 23, 2023 09:57
@dennisameling dennisameling requested a review from dscho January 23, 2023 10:17
@dennisameling
Copy link
Contributor Author

This should be good to go now. git-extra keeps complaining about its version, though. Bumped the version already but it doesn't seem to help 🤷🏼‍♂️

@rimrul
Copy link
Member

rimrul commented Jan 23, 2023

Looks like both bump commits missed the hash of HEAD and instead picked the hash of HEAD~1.

Signed-off-by: Dennis Ameling <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho merged commit 5b43e7a into git-for-windows:main Jan 23, 2023
@dennisameling dennisameling deleted the more-arm64-work branch January 23, 2023 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants