Skip to content

[CIR][Dialect] Remove 22 prefix and suffix from type aliases #826

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 10, 2024

Conversation

smeenai
Copy link
Collaborator

@smeenai smeenai commented Sep 9, 2024

We were previously printing the type alias for struct S as !ty_22S22
instead of just !ty_S. This was because our alias computation for a
StructType did the following:

os << "ty_" << structType.getName()

structType.getName() is a StringAttr, and writing a StringAttr to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. ::) will still be
escaped as before. In other words:

struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside clang/test/CIR, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the -i to -i ''.)

find . -type f | xargs sed -i -E -e 's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e 's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for 22 to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve them via
smeenai@715f061

[1] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nice! Because of recently merged commits there are a few conflicts, but I'll merge as soon as they get fixed

smeenai added a commit to smeenai/clangir that referenced this pull request Sep 10, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

    find . -type f | xargs sed -i -E -e 's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e 's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

[1] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4] https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
@smeenai
Copy link
Collaborator Author

smeenai commented Sep 10, 2024

smeenai@715f061 is a merge driver setup for anyone running into merge conflicts after this change.

@bcardosolopes bcardosolopes merged commit fc73bd2 into llvm:main Sep 10, 2024
5 of 6 checks passed
@smeenai smeenai deleted the fix-type-alias branch September 10, 2024 20:04
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

smeenai@715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
smeenai added a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
smeenai added a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

smeenai@715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
lanza pushed a commit that referenced this pull request Nov 5, 2024
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

smeenai@715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
lanza pushed a commit that referenced this pull request Mar 18, 2025
We were previously printing the type alias for `struct S` as `!ty_22S22`
instead of just `!ty_S`. This was because our alias computation for a
StructType did the following:

    os << "ty_" << structType.getName()

`structType.getName()` is a `StringAttr`, and writing a `StringAttr` to
an output stream adds double quotes around the actual string [1][2].
These double quotes then get hex-encoded as 22 [3].

We can fix this by writing the actual string value instead. Aliases that
would end in a number will now receive a trailing underscore because of
MLIR's alias sanitization not allowing a trailing digit [4] (which
ironically didn't kick in even though our aliases were always previously
ending with a number, which might be a bug in the sanitization logic).
Aliases containing other special characters (e.g. `::`) will still be
escaped as before. In other words:

```
struct S {};
// before: !ty_22S22 = ...
// after:  !ty_S = ...

struct S1 {};
// before: !ty_22S122 = ...
// after:  !ty_S1_ = ...

struct std::string {};
// before: !ty_22std3A3Astring22
// after:  !ty_std3A3Astring
```

I'm not a big fan of the trailing underscore special-case, but I also
don't want to touch core MLIR logic, and I think the end result is still
nicer than before.

The tests were mechanically updated with the following command run
inside `clang/test/CIR`, and the same commands can be run to update the
tests for any in-flight patches. (These are for GNU sed; for macOS
change the `-i` to `-i ''`.)

find . -type f | xargs sed -i -E -e
's/ty_22([A-Za-z0-9_$]+[0-9])22/ty_\1_/g' -e
's/ty_22([A-Za-z0-9_$]+)22/ty_\1/g'

clang/test/CIR/CodeGen/stmtexpr-init.c needed an additional minor fix to
swap the expected order of two type aliases in the CIR output.
clang/test/CIR/CodeGen/coro-task.cpp needed some surgery because it was
searching for `22` to find the end of a type alias; I changed it to
search for the actual alias instead.

If you run into merge conflicts after this change, you can auto-resolve
them via

smeenai@715f061

[1]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2295
[2]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L2763
[3]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1014
[4]
https://github.com/llvm/llvm-project/blob/b3d2d5039b9b8aa10a86c593387f200b15c02aef/mlir/lib/IR/AsmPrinter.cpp#L1154
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants