[mlir] `mergeIdenticalBlocks` is blowing up on a conditional to identical CFGs #63230

Mogball · 2023-06-10T00:15:49Z

This example shows the beginnings of algorithmic explosion when run through mlir-opt -canonicalize

module {
  llvm.func @foo(%arg0: i64)

  llvm.func @rand() -> i1

  llvm.func @top(%arg0: i64) {
    %0 = llvm.mlir.constant(0 : i64) : i64
    %1 = llvm.mlir.constant(8 : i64) : i64
    %2 = llvm.mlir.constant(7 : i64) : i64
    %3 = llvm.mlir.constant(6 : i64) : i64
    %4 = llvm.mlir.constant(5 : i64) : i64
    %5 = llvm.mlir.constant(4 : i64) : i64
    %6 = llvm.mlir.constant(3 : i64) : i64
    %7 = llvm.mlir.constant(1 : i64) : i64
    %8 = llvm.mlir.constant(2 : i64) : i64
    %10 = llvm.icmp "eq" %arg0, %0 : i64
    llvm.cond_br %10, ^bb1, ^bb14
  ^bb1:  // pred: ^bb0
    %11 = llvm.call @rand() : () -> i1
    llvm.cond_br %11, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    llvm.call @foo(%7) : (i64) -> ()
    llvm.br ^bb4
  ^bb3:  // pred: ^bb1
    llvm.call @foo(%8) : (i64) -> ()
    llvm.br ^bb4
  ^bb4:  // 2 preds: ^bb2, ^bb3
    %14 = llvm.call @rand() : () -> i1
    llvm.cond_br %14, ^bb5, ^bb6
  ^bb5:  // pred: ^bb4
    llvm.call @foo(%6) : (i64) -> ()
    llvm.br ^bb7
  ^bb6:  // pred: ^bb4
    llvm.call @foo(%5) : (i64) -> ()
    llvm.br ^bb7
  ^bb7:  // 2 preds: ^bb5, ^bb6
    %17 = llvm.call @rand() : () -> i1
    llvm.cond_br %17, ^bb8, ^bb9
  ^bb8:  // pred: ^bb7
    llvm.call @foo(%4) : (i64) -> ()
    llvm.br ^bb10
  ^bb9:  // pred: ^bb7
    llvm.call @foo(%3) : (i64) -> ()
    llvm.br ^bb10
  ^bb10:  // 2 preds: ^bb8, ^bb9
    %20 = llvm.call @rand() : () -> i1
    llvm.cond_br %20, ^bb11, ^bb12
  ^bb11:  // pred: ^bb10
    llvm.call @foo(%2) : (i64) -> ()
    llvm.br ^bb13
  ^bb12:  // pred: ^bb10
    llvm.call @foo(%1) : (i64) -> ()
    llvm.br ^bb13
  ^bb13:  // 2 preds: ^bb11, ^bb12
    llvm.br ^bb27
  ^bb14:  // pred: ^bb0
    %23 = llvm.call @rand() : () -> i1
    llvm.cond_br %23, ^bb15, ^bb16
  ^bb15:  // pred: ^bb14
    llvm.call @foo(%8) : (i64) -> ()
    llvm.br ^bb17
  ^bb16:  // pred: ^bb14
    llvm.call @foo(%7) : (i64) -> ()
    llvm.br ^bb17
  ^bb17:  // 2 preds: ^bb15, ^bb16
    %26 = llvm.call @rand() : () -> i1
    llvm.cond_br %26, ^bb18, ^bb19
  ^bb18:  // pred: ^bb17
    llvm.call @foo(%5) : (i64) -> ()
    llvm.br ^bb20
  ^bb19:  // pred: ^bb17
    llvm.call @foo(%6) : (i64) -> ()
    llvm.br ^bb20
  ^bb20:  // 2 preds: ^bb18, ^bb19
    %29 = llvm.call @rand() : () -> i1
    llvm.cond_br %29, ^bb21, ^bb22
  ^bb21:  // pred: ^bb20
    llvm.call @foo(%3) : (i64) -> ()
    llvm.br ^bb23
  ^bb22:  // pred: ^bb20
    llvm.call @foo(%4) : (i64) -> ()
    llvm.br ^bb23
  ^bb23:  // 2 preds: ^bb21, ^bb22
    %32 = llvm.call @rand() : () -> i1
    llvm.cond_br %32, ^bb24, ^bb25
  ^bb24:  // pred: ^bb23
    llvm.call @foo(%1) : (i64) -> ()
    llvm.br ^bb26
  ^bb25:  // pred: ^bb23
    llvm.call @foo(%2) : (i64) -> ()
    llvm.br ^bb26
  ^bb26:  // 2 preds: ^bb24, ^bb25
    llvm.br ^bb27
  ^bb27:  // 2 preds: ^bb13, ^bb26
    llvm.return
  }
}

It corresponds approximately to

if (something) {
  foo(1, 2)
  foo(3, 4)
  foo(5, 6)
  foo(7, 8)
} else {
  foo(2, 1)
  foo(4, 3)
  foo(6, 5)
  foo(8, 7)
}

The block merger is trying to merge both branches together, since they share similar sub-CFGs, but it's blowing up on the number of block operands. Adding just a few more branches causes 💥

The text was updated successfully, but these errors were encountered:

llvmbot · 2023-06-10T00:16:04Z

@llvm/issue-subscribers-mlir

River707 · 2023-06-13T06:46:57Z

I took a look into this a tiny bit on friday. There seem to be a few deficiencies:
First, we should add a region simplification that drops unnecessary arguments (i.e. arguments whose incoming value is the same for all predecessors). Secondly, block merging should avoid adding new arguments when there are existing arguments with the same incoming predecessor/values. The explosion happens as we iteratively the two halves of the CFG and need to pass in values at each merge step.

llvmbot · 2024-06-11T15:19:15Z

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
Fix the issue locally.
Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
Create a Git commit.
Run git clang-format HEAD~1 to format your changes.
Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

llvmbot · 2024-06-11T15:19:16Z

@llvm/issue-subscribers-good-first-issue

Author: Jeff Niu (Mogball)

This example shows the beginnings of algorithmic explosion when run through `mlir-opt -canonicalize`

module {
  llvm.func @<!-- -->foo(%arg0: i64)

  llvm.func @<!-- -->rand() -&gt; i1

  llvm.func @<!-- -->top(%arg0: i64) {
    %0 = llvm.mlir.constant(0 : i64) : i64
    %1 = llvm.mlir.constant(8 : i64) : i64
    %2 = llvm.mlir.constant(7 : i64) : i64
    %3 = llvm.mlir.constant(6 : i64) : i64
    %4 = llvm.mlir.constant(5 : i64) : i64
    %5 = llvm.mlir.constant(4 : i64) : i64
    %6 = llvm.mlir.constant(3 : i64) : i64
    %7 = llvm.mlir.constant(1 : i64) : i64
    %8 = llvm.mlir.constant(2 : i64) : i64
    %10 = llvm.icmp "eq" %arg0, %0 : i64
    llvm.cond_br %10, ^bb1, ^bb14
  ^bb1:  // pred: ^bb0
    %11 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %11, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    llvm.call @<!-- -->foo(%7) : (i64) -&gt; ()
    llvm.br ^bb4
  ^bb3:  // pred: ^bb1
    llvm.call @<!-- -->foo(%8) : (i64) -&gt; ()
    llvm.br ^bb4
  ^bb4:  // 2 preds: ^bb2, ^bb3
    %14 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %14, ^bb5, ^bb6
  ^bb5:  // pred: ^bb4
    llvm.call @<!-- -->foo(%6) : (i64) -&gt; ()
    llvm.br ^bb7
  ^bb6:  // pred: ^bb4
    llvm.call @<!-- -->foo(%5) : (i64) -&gt; ()
    llvm.br ^bb7
  ^bb7:  // 2 preds: ^bb5, ^bb6
    %17 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %17, ^bb8, ^bb9
  ^bb8:  // pred: ^bb7
    llvm.call @<!-- -->foo(%4) : (i64) -&gt; ()
    llvm.br ^bb10
  ^bb9:  // pred: ^bb7
    llvm.call @<!-- -->foo(%3) : (i64) -&gt; ()
    llvm.br ^bb10
  ^bb10:  // 2 preds: ^bb8, ^bb9
    %20 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %20, ^bb11, ^bb12
  ^bb11:  // pred: ^bb10
    llvm.call @<!-- -->foo(%2) : (i64) -&gt; ()
    llvm.br ^bb13
  ^bb12:  // pred: ^bb10
    llvm.call @<!-- -->foo(%1) : (i64) -&gt; ()
    llvm.br ^bb13
  ^bb13:  // 2 preds: ^bb11, ^bb12
    llvm.br ^bb27
  ^bb14:  // pred: ^bb0
    %23 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %23, ^bb15, ^bb16
  ^bb15:  // pred: ^bb14
    llvm.call @<!-- -->foo(%8) : (i64) -&gt; ()
    llvm.br ^bb17
  ^bb16:  // pred: ^bb14
    llvm.call @<!-- -->foo(%7) : (i64) -&gt; ()
    llvm.br ^bb17
  ^bb17:  // 2 preds: ^bb15, ^bb16
    %26 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %26, ^bb18, ^bb19
  ^bb18:  // pred: ^bb17
    llvm.call @<!-- -->foo(%5) : (i64) -&gt; ()
    llvm.br ^bb20
  ^bb19:  // pred: ^bb17
    llvm.call @<!-- -->foo(%6) : (i64) -&gt; ()
    llvm.br ^bb20
  ^bb20:  // 2 preds: ^bb18, ^bb19
    %29 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %29, ^bb21, ^bb22
  ^bb21:  // pred: ^bb20
    llvm.call @<!-- -->foo(%3) : (i64) -&gt; ()
    llvm.br ^bb23
  ^bb22:  // pred: ^bb20
    llvm.call @<!-- -->foo(%4) : (i64) -&gt; ()
    llvm.br ^bb23
  ^bb23:  // 2 preds: ^bb21, ^bb22
    %32 = llvm.call @<!-- -->rand() : () -&gt; i1
    llvm.cond_br %32, ^bb24, ^bb25
  ^bb24:  // pred: ^bb23
    llvm.call @<!-- -->foo(%1) : (i64) -&gt; ()
    llvm.br ^bb26
  ^bb25:  // pred: ^bb23
    llvm.call @<!-- -->foo(%2) : (i64) -&gt; ()
    llvm.br ^bb26
  ^bb26:  // 2 preds: ^bb24, ^bb25
    llvm.br ^bb27
  ^bb27:  // 2 preds: ^bb13, ^bb26
    llvm.return
  }
}

It corresponds approximately to

if (something) {
  foo(1, 2)
  foo(3, 4)
  foo(5, 6)
  foo(7, 8)
} else {
  foo(2, 1)
  foo(4, 3)
  foo(6, 5)
  foo(8, 7)
}

The block merger is trying to merge both branches together, since they share similar sub-CFGs, but it's blowing up on the number of block operands. Adding just a few more branches causes 💥

joker-eph · 2024-06-11T15:19:20Z

I think @River707 's suggestions here can be a good first issue! (just added the tag)

giuseros · 2024-06-12T12:49:50Z

Hi @joker-eph , do you mind if I assign this ticket to me? Or someone else is working on it?

This PR disable block merging when running `convert-builtin-func-to-llvm`. The reason behind this is that for now block merging can double the arguments of the blocks. This means that after a while we can start witnessing a block argument "explosion" which hangs the compiler. I am working on this ticket: llvm/llvm-project#63230 to make block merging better, but in the meantime, we should stop merging blocks to avoid compiler hangs. I added the minimal test to reproduce the explosion. The test for now is checking that we don't try to merge blocks.

With this PR I am trying to address: #63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. **Note**: many tests are still not passing. But I wanted to submit the code before changing all the tests (and probably adding a couple), so that we can agree in principle on the algorithm/design.

With this PR I am trying to address: llvm#63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. **Note**: many tests are still not passing. But I wanted to submit the code before changing all the tests (and probably adding a couple), so that we can agree in principle on the algorithm/design.

With this PR I am trying to address: #63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. **Note**: this a rework of #96871 . I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed.

Summary: With this PR I am trying to address: #63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. **Note**: this a rework of #96871 . I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250916

@Dinistro

With this PR I am trying to address: #63230. What changed: - While merging identical blocks, don't add a block argument if it is "identical" to another block argument. I.e., if the two block arguments refer to the same `Value`. The operations operands in the block will point to the argument we already inserted. This needs to happen to all the arguments we pass to the different successors of the parent block - After merged the blocks, get rid of "unnecessary" arguments. I.e., if all the predecessors pass the same block argument, there is no need to pass it as an argument. - This last simplification clashed with `BufferDeallocationSimplification`. The reason, I think, is that the two simplifications are clashing. I.e., `BufferDeallocationSimplification` contains an analysis based on the block structure. If we simplify the block structure (by merging and/or dropping block arguments) the analysis is invalid . The solution I found is to do a more prudent simplification when running that pass. **Note-1**: I ran all the integration tests (`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`) and they passed. **Note-2**: I fixed a bug found by @Dinistro in #97697 . The issue was that, when looking for redundant arguments, I was not considering that the block might have already some arguments. So the index (in the block args list) of the i-th `newArgument` is `i+numOfOldArguments`.

giuseros · 2024-08-28T14:56:02Z

This has been fixed in #102038

…-lang#4176) This PR disable block merging when running `convert-builtin-func-to-llvm`. The reason behind this is that for now block merging can double the arguments of the blocks. This means that after a while we can start witnessing a block argument "explosion" which hangs the compiler. I am working on this ticket: llvm/llvm-project#63230 to make block merging better, but in the meantime, we should stop merging blocks to avoid compiler hangs. I added the minimal test to reproduce the explosion. The test for now is checking that we don't try to merge blocks.

Mogball added the mlir label Jun 10, 2023

Mogball mentioned this issue Jun 10, 2024

[mlir] Do not merge blocks during canonicalization by default #95057

Merged

joker-eph added the good first issue https://github.com/llvm/llvm-project/contribute label Jun 11, 2024

giuseros mentioned this issue Jun 12, 2024

[Issue]: Triton Compiler Takes Indefinite Time in ttgir -> llir Stage. ROCm/triton#596

Closed

joker-eph assigned giuseros Jun 12, 2024

giuseros mentioned this issue Jun 20, 2024

[AMD] Disable block merging to avoid block argument explosion triton-lang/triton#4176

Merged

giuseros mentioned this issue Jun 27, 2024

Fix block merging #96871

Merged

giuseros mentioned this issue Jul 4, 2024

[mlir] Fix block merging #97697

Merged

giuseros mentioned this issue Aug 5, 2024

[mlir] Fix block merging #102038

Merged

giuseros closed this as completed Aug 28, 2024

EugeneZelenko removed the mlir label Aug 28, 2024

EugeneZelenko added the mlir:bufferization Bufferization infrastructure label Aug 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[mlir] `mergeIdenticalBlocks` is blowing up on a conditional to identical CFGs #63230

[mlir] `mergeIdenticalBlocks` is blowing up on a conditional to identical CFGs #63230

Mogball commented Jun 10, 2023

llvmbot commented Jun 10, 2023

River707 commented Jun 13, 2023

llvmbot commented Jun 11, 2024

llvmbot commented Jun 11, 2024

joker-eph commented Jun 11, 2024

giuseros commented Jun 12, 2024

giuseros commented Aug 28, 2024

[mlir] mergeIdenticalBlocks is blowing up on a conditional to identical CFGs #63230

[mlir] mergeIdenticalBlocks is blowing up on a conditional to identical CFGs #63230

Comments

Mogball commented Jun 10, 2023

llvmbot commented Jun 10, 2023

River707 commented Jun 13, 2023

llvmbot commented Jun 11, 2024

llvmbot commented Jun 11, 2024

joker-eph commented Jun 11, 2024

giuseros commented Jun 12, 2024

giuseros commented Aug 28, 2024

[mlir] `mergeIdenticalBlocks` is blowing up on a conditional to identical CFGs #63230

[mlir] `mergeIdenticalBlocks` is blowing up on a conditional to identical CFGs #63230