Skip to content

[mlir] mergeIdenticalBlocks is blowing up on a conditional to identical CFGs #63230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Mogball opened this issue Jun 10, 2023 · 7 comments
Closed
Assignees
Labels
good first issue https://github.com/llvm/llvm-project/contribute mlir:bufferization Bufferization infrastructure

Comments

@Mogball
Copy link
Contributor

Mogball commented Jun 10, 2023

This example shows the beginnings of algorithmic explosion when run through mlir-opt -canonicalize

module {
  llvm.func @foo(%arg0: i64)

  llvm.func @rand() -> i1

  llvm.func @top(%arg0: i64) {
    %0 = llvm.mlir.constant(0 : i64) : i64
    %1 = llvm.mlir.constant(8 : i64) : i64
    %2 = llvm.mlir.constant(7 : i64) : i64
    %3 = llvm.mlir.constant(6 : i64) : i64
    %4 = llvm.mlir.constant(5 : i64) : i64
    %5 = llvm.mlir.constant(4 : i64) : i64
    %6 = llvm.mlir.constant(3 : i64) : i64
    %7 = llvm.mlir.constant(1 : i64) : i64
    %8 = llvm.mlir.constant(2 : i64) : i64
    %10 = llvm.icmp "eq" %arg0, %0 : i64
    llvm.cond_br %10, ^bb1, ^bb14
  ^bb1:  // pred: ^bb0
    %11 = llvm.call @rand() : () -> i1
    llvm.cond_br %11, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    llvm.call @foo(%7) : (i64) -> ()
    llvm.br ^bb4
  ^bb3:  // pred: ^bb1
    llvm.call @foo(%8) : (i64) -> ()
    llvm.br ^bb4
  ^bb4:  // 2 preds: ^bb2, ^bb3
    %14 = llvm.call @rand() : () -> i1
    llvm.cond_br %14, ^bb5, ^bb6
  ^bb5:  // pred: ^bb4
    llvm.call @foo(%6) : (i64) -> ()
    llvm.br ^bb7
  ^bb6:  // pred: ^bb4
    llvm.call @foo(%5) : (i64) -> ()
    llvm.br ^bb7
  ^bb7:  // 2 preds: ^bb5, ^bb6
    %17 = llvm.call @rand() : () -> i1
    llvm.cond_br %17, ^bb8, ^bb9
  ^bb8:  // pred: ^bb7
    llvm.call @foo(%4) : (i64) -> ()
    llvm.br ^bb10
  ^bb9:  // pred: ^bb7
    llvm.call @foo(%3) : (i64) -> ()
    llvm.br ^bb10
  ^bb10:  // 2 preds: ^bb8, ^bb9
    %20 = llvm.call @rand() : () -> i1
    llvm.cond_br %20, ^bb11, ^bb12
  ^bb11:  // pred: ^bb10
    llvm.call @foo(%2) : (i64) -> ()
    llvm.br ^bb13
  ^bb12:  // pred: ^bb10
    llvm.call @foo(%1) : (i64) -> ()
    llvm.br ^bb13
  ^bb13:  // 2 preds: ^bb11, ^bb12
    llvm.br ^bb27
  ^bb14:  // pred: ^bb0
    %23 = llvm.call @rand() : () -> i1
    llvm.cond_br %23, ^bb15, ^bb16
  ^bb15:  // pred: ^bb14
    llvm.call @foo(%8) : (i64) -> ()
    llvm.br ^bb17
  ^bb16:  // pred: ^bb14
    llvm.call @foo(%7) : (i64) -> ()
    llvm.br ^bb17
  ^bb17:  // 2 preds: ^bb15, ^bb16
    %26 = llvm.call @rand() : () -> i1
    llvm.cond_br %26, ^bb18, ^bb19
  ^bb18:  // pred: ^bb17
    llvm.call @foo(%5) : (i64) -> ()
    llvm.br ^bb20
  ^bb19:  // pred: ^bb17
    llvm.call @foo(%6) : (i64) -> ()
    llvm.br ^bb20
  ^bb20:  // 2 preds: ^bb18, ^bb19
    %29 = llvm.call @rand() : () -> i1
    llvm.cond_br %29, ^bb21, ^bb22
  ^bb21:  // pred: ^bb20
    llvm.call @foo(%3) : (i64) -> ()
    llvm.br ^bb23
  ^bb22:  // pred: ^bb20
    llvm.call @foo(%4) : (i64) -> ()
    llvm.br ^bb23
  ^bb23:  // 2 preds: ^bb21, ^bb22
    %32 = llvm.call @rand() : () -> i1
    llvm.cond_br %32, ^bb24, ^bb25
  ^bb24:  // pred: ^bb23
    llvm.call @foo(%1) : (i64) -> ()
    llvm.br ^bb26
  ^bb25:  // pred: ^bb23
    llvm.call @foo(%2) : (i64) -> ()
    llvm.br ^bb26
  ^bb26:  // 2 preds: ^bb24, ^bb25
    llvm.br ^bb27
  ^bb27:  // 2 preds: ^bb13, ^bb26
    llvm.return
  }
}

It corresponds approximately to

if (something) {
  foo(1, 2)
  foo(3, 4)
  foo(5, 6)
  foo(7, 8)
} else {
  foo(2, 1)
  foo(4, 3)
  foo(6, 5)
  foo(8, 7)
}

The block merger is trying to merge both branches together, since they share similar sub-CFGs, but it's blowing up on the number of block operands. Adding just a few more branches causes 💥

@Mogball Mogball added the mlir label Jun 10, 2023
@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2023

@llvm/issue-subscribers-mlir

@River707
Copy link
Contributor

I took a look into this a tiny bit on friday. There seem to be a few deficiencies:
First, we should add a region simplification that drops unnecessary arguments (i.e. arguments whose incoming value is the same for all predecessors). Secondly, block merging should avoid adding new arguments when there are existing arguments with the same incoming predecessor/values. The explosion happens as we iteratively the two halves of the CFG and need to pass in values at each merge step.

@joker-eph joker-eph added the good first issue https://github.com/llvm/llvm-project/contribute label Jun 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2024

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2024

@llvm/issue-subscribers-good-first-issue

Author: Jeff Niu (Mogball)

This example shows the beginnings of algorithmic explosion when run through `mlir-opt -canonicalize`
module {
  llvm.func @<!-- -->foo(%arg0: i64)

  llvm.func @<!-- -->rand() -&gt; i1

  llvm.func @<!-- -->top(%arg0: i64) {
    %0 = llvm.mlir.constant(0 : i64) : i64
    %1 = llvm.mlir.constant(8 : i64) : i64
    %2 = llvm.mlir.constant(7 : i64) : i64
    %3 = llvm.mlir.constant(6 : i64) : i64
    %4 = llvm.mlir.constant(5 : i64) : i64
    %5 = llvm.mlir.constant(4 : i64) : i64
    %6 = llvm.mlir.constant(3 : i64) : i64
    %7 = llvm.mlir.constant(1 : i64) : i64
    %8 = llvm.mlir.constant(2 : i64) : i64
    %10 = llvm.icmp "eq" %arg0, %0 : i64
    llvm.cond_br %10, ^bb1, ^bb14
  ^bb1:  // pred: ^bb0
    %11 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %11, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    llvm.call @<!-- -->foo(%7) : (i64) -&gt; ()
    llvm.br ^bb4
  ^bb3:  // pred: ^bb1
    llvm.call @<!-- -->foo(%8) : (i64) -&gt; ()
    llvm.br ^bb4
  ^bb4:  // 2 preds: ^bb2, ^bb3
    %14 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %14, ^bb5, ^bb6
  ^bb5:  // pred: ^bb4
    llvm.call @<!-- -->foo(%6) : (i64) -&gt; ()
    llvm.br ^bb7
  ^bb6:  // pred: ^bb4
    llvm.call @<!-- -->foo(%5) : (i64) -&gt; ()
    llvm.br ^bb7
  ^bb7:  // 2 preds: ^bb5, ^bb6
    %17 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %17, ^bb8, ^bb9
  ^bb8:  // pred: ^bb7
    llvm.call @<!-- -->foo(%4) : (i64) -&gt; ()
    llvm.br ^bb10
  ^bb9:  // pred: ^bb7
    llvm.call @<!-- -->foo(%3) : (i64) -&gt; ()
    llvm.br ^bb10
  ^bb10:  // 2 preds: ^bb8, ^bb9
    %20 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %20, ^bb11, ^bb12
  ^bb11:  // pred: ^bb10
    llvm.call @<!-- -->foo(%2) : (i64) -&gt; ()
    llvm.br ^bb13
  ^bb12:  // pred: ^bb10
    llvm.call @<!-- -->foo(%1) : (i64) -&gt; ()
    llvm.br ^bb13
  ^bb13:  // 2 preds: ^bb11, ^bb12
    llvm.br ^bb27
  ^bb14:  // pred: ^bb0
    %23 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %23, ^bb15, ^bb16
  ^bb15:  // pred: ^bb14
    llvm.call @<!-- -->foo(%8) : (i64) -&gt; ()
    llvm.br ^bb17
  ^bb16:  // pred: ^bb14
    llvm.call @<!-- -->foo(%7) : (i64) -&gt; ()
    llvm.br ^bb17
  ^bb17:  // 2 preds: ^bb15, ^bb16
    %26 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %26, ^bb18, ^bb19
  ^bb18:  // pred: ^bb17
    llvm.call @<!-- -->foo(%5) : (i64) -&gt; ()
    llvm.br ^bb20
  ^bb19:  // pred: ^bb17
    llvm.call @<!-- -->foo(%6) : (i64) -&gt; ()
    llvm.br ^bb20
  ^bb20:  // 2 preds: ^bb18, ^bb19
    %29 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %29, ^bb21, ^bb22
  ^bb21:  // pred: ^bb20
    llvm.call @<!-- -->foo(%3) : (i64) -&gt; ()
    llvm.br ^bb23
  ^bb22:  // pred: ^bb20
    llvm.call @<!-- -->foo(%4) : (i64) -&gt; ()
    llvm.br ^bb23
  ^bb23:  // 2 preds: ^bb21, ^bb22
    %32 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %32, ^bb24, ^bb25
  ^bb24:  // pred: ^bb23
    llvm.call @<!-- -->foo(%1) : (i64) -&gt; ()
    llvm.br ^bb26
  ^bb25:  // pred: ^bb23
    llvm.call @<!-- -->foo(%2) : (i64) -&gt; ()
    llvm.br ^bb26
  ^bb26:  // 2 preds: ^bb24, ^bb25
    llvm.br ^bb27
  ^bb27:  // 2 preds: ^bb13, ^bb26
    llvm.return
  }
}

It corresponds approximately to

if (something) {
  foo(1, 2)
  foo(3, 4)
  foo(5, 6)
  foo(7, 8)
} else {
  foo(2, 1)
  foo(4, 3)
  foo(6, 5)
  foo(8, 7)
}

The block merger is trying to merge both branches together, since they share similar sub-CFGs, but it's blowing up on the number of block operands. Adding just a few more branches causes 💥

@joker-eph
Copy link
Collaborator

I think @River707 's suggestions here can be a good first issue! (just added the tag)

@giuseros
Copy link
Contributor

Hi @joker-eph , do you mind if I assign this ticket to me? Or someone else is working on it?

antiagainst pushed a commit to triton-lang/triton that referenced this issue Jun 21, 2024
This PR disable block merging when running
`convert-builtin-func-to-llvm`.

The reason behind this is that for now block merging can double the
arguments of the blocks. This means that after a while we can start
witnessing a block argument "explosion" which hangs the compiler.

I am working on this ticket:
llvm/llvm-project#63230 to make block merging
better, but in the meantime, we should stop merging blocks to avoid
compiler hangs.

I added the minimal test to reproduce the explosion. The test for now is
checking that we don't try to merge blocks.
giuseros added a commit that referenced this issue Jul 2, 2024
With this PR I am trying to address:
#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note**: many tests are still not passing. But I wanted to submit the
code before changing all the tests (and probably adding a couple), so
that we can agree in principle on the algorithm/design.
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this issue Jul 3, 2024
With this PR I am trying to address:
llvm#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note**: many tests are still not passing. But I wanted to submit the
code before changing all the tests (and probably adding a couple), so
that we can agree in principle on the algorithm/design.
kbluck pushed a commit to kbluck/llvm-project that referenced this issue Jul 6, 2024
With this PR I am trying to address:
llvm#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note**: many tests are still not passing. But I wanted to submit the
code before changing all the tests (and probably adding a couple), so
that we can agree in principle on the algorithm/design.
giuseros added a commit that referenced this issue Jul 17, 2024
With this PR I am trying to address:
#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted. This needs to happen to all
the arguments we pass to the different successors of the parent block
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note**: this a rework of #96871 . I ran all the integration tests
(`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.
yuxuanchen1997 pushed a commit that referenced this issue Jul 25, 2024
Summary:
With this PR I am trying to address:
#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted. This needs to happen to all
the arguments we pass to the different successors of the parent block
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note**: this a rework of #96871 . I ran all the integration tests
(`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250916
giuseros added a commit that referenced this issue Aug 7, 2024
With this PR I am trying to address:
#63230.

What changed:
- While merging identical blocks, don't add a block argument if it is
"identical" to another block argument. I.e., if the two block arguments
refer to the same `Value`. The operations operands in the block will
point to the argument we already inserted. This needs to happen to all
the arguments we pass to the different successors of the parent block
- After merged the blocks, get rid of "unnecessary" arguments. I.e., if
all the predecessors pass the same block argument, there is no need to
pass it as an argument.
- This last simplification clashed with
`BufferDeallocationSimplification`. The reason, I think, is that the two
simplifications are clashing. I.e., `BufferDeallocationSimplification`
contains an analysis based on the block structure. If we simplify the
block structure (by merging and/or dropping block arguments) the
analysis is invalid . The solution I found is to do a more prudent
simplification when running that pass.

**Note-1**: I ran all the integration tests
(`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.
**Note-2**: I fixed a bug found by @Dinistro in #97697 . The issue was
that, when looking for redundant arguments, I was not considering that
the block might have already some arguments. So the index (in the block
args list) of the i-th `newArgument` is `i+numOfOldArguments`.
@giuseros
Copy link
Contributor

This has been fixed in #102038

@EugeneZelenko EugeneZelenko added the mlir:bufferization Bufferization infrastructure label Aug 28, 2024
bertmaher pushed a commit to bertmaher/triton that referenced this issue Dec 10, 2024
…-lang#4176)

This PR disable block merging when running
`convert-builtin-func-to-llvm`.

The reason behind this is that for now block merging can double the
arguments of the blocks. This means that after a while we can start
witnessing a block argument "explosion" which hangs the compiler.

I am working on this ticket:
llvm/llvm-project#63230 to make block merging
better, but in the meantime, we should stop merging blocks to avoid
compiler hangs.

I added the minimal test to reproduce the explosion. The test for now is
checking that we don't try to merge blocks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue https://github.com/llvm/llvm-project/contribute mlir:bufferization Bufferization infrastructure
Projects
None yet
Development

No branches or pull requests

6 participants