Add optimized op_where #8866

swolchok · 2025-03-01T01:16:25Z

It materializes separate kernels for the cases where the two input data tensors have the same dtype and the third one has dtype bool.
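As a rough illustration of that fast path (not the code from this PR), the sketch below checks that both data inputs share a dtype and that the condition is bool before selecting a specialized loop. `DType`, `TensorView`, `where_same_dtype`, and `try_where_fast_path` are hypothetical names invented for this example, and broadcasting is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a runtime's dtype enum and tensor type.
enum class DType { Bool, Float, Int32 };

struct TensorView {
  DType dtype;
  void* data;
  std::size_t numel;
};

// One tight loop is instantiated per element type T.
template <typename T>
void where_same_dtype(const bool* cond, const T* a, const T* b, T* out,
                      std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = cond[i] ? a[i] : b[i];
  }
}

// Returns true if a specialized kernel ran; false means the caller should
// fall back to a generic (slower) implementation.
bool try_where_fast_path(const TensorView& cond, const TensorView& a,
                         const TensorView& b, TensorView& out) {
  if (cond.dtype != DType::Bool || a.dtype != b.dtype ||
      out.dtype != a.dtype) {
    return false;
  }
  // This sketch handles only equal-sized inputs (no broadcasting).
  if (cond.numel != out.numel || a.numel != out.numel ||
      b.numel != out.numel) {
    return false;
  }
  const bool* c = static_cast<const bool*>(cond.data);
  switch (a.dtype) {
    case DType::Float:
      where_same_dtype(c, static_cast<const float*>(a.data),
                       static_cast<const float*>(b.data),
                       static_cast<float*>(out.data), out.numel);
      return true;
    case DType::Int32:
      where_same_dtype(c, static_cast<const std::int32_t*>(a.data),
                       static_cast<const std::int32_t*>(b.data),
                       static_cast<std::int32_t*>(out.data), out.numel);
      return true;
    default:
      return false;
  }
}
```

Returning `false` instead of failing keeps the mixed-dtype cases on whatever generic path already exists.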

[ghstack-poisoned]

swolchok · 2025-03-01T01:16:26Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2025-03-01T01:16:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8866

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit e52508d with merge base cca6917:

NEW FAILURE - The following job has failed:

pull / unittest-editable / linux / linux-job (gh)
backends/xnnpack/test/ops/test_conv1d.py::TestConv1d::test_qs8_conv1d_batchnorm_seq

This comment was automatically generated by Dr. CI and updates every 15 minutes.

digantdesai · 2025-03-04T03:40:59Z

kernels/optimized/cpu/op_where.cpp

+          data_out[out_index] =
+              data_cond[cond_index] ? data_a[a_index] : data_b[b_index];
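
The four indices in the hunk above differ because each input may be broadcast against the output shape. The following is a minimal sketch of how such indices could be derived, assuming contiguous row-major tensors and right-aligned shapes; `broadcast_index` and `where_broadcast` are hypothetical helpers written for this note, not the `BroadcastIndexesRange` utility from this stack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a linear output index to the linear index of the
// matching element in an input that is broadcast to out_shape. Shapes are
// right-aligned, and in_shape must not have more dims than out_shape.
std::size_t broadcast_index(std::size_t out_index,
                            const std::vector<std::size_t>& out_shape,
                            const std::vector<std::size_t>& in_shape) {
  std::size_t in_index = 0;
  std::size_t in_stride = 1;
  const std::size_t offset = out_shape.size() - in_shape.size();
  // Walk dimensions from innermost to outermost.
  for (std::size_t d = out_shape.size(); d-- > 0;) {
    const std::size_t coord = out_index % out_shape[d];
    out_index /= out_shape[d];
    if (d >= offset) {
      const std::size_t in_dim = in_shape[d - offset];
      // A size-1 input dim is repeated, so it contributes no offset.
      if (in_dim != 1) {
        in_index += coord * in_stride;
      }
      in_stride *= in_dim;
    }
  }
  return in_index;
}

template <typename T>
void where_broadcast(const bool* cond,
                     const std::vector<std::size_t>& cond_shape, const T* a,
                     const std::vector<std::size_t>& a_shape, const T* b,
                     const std::vector<std::size_t>& b_shape, T* out,
                     const std::vector<std::size_t>& out_shape) {
  std::size_t numel = 1;
  for (std::size_t dim : out_shape) {
    numel *= dim;
  }
  for (std::size_t out_index = 0; out_index < numel; ++out_index) {
    const std::size_t cond_index =
        broadcast_index(out_index, out_shape, cond_shape);
    const std::size_t a_index = broadcast_index(out_index, out_shape, a_shape);
    const std::size_t b_index = broadcast_index(out_index, out_shape, b_shape);
    out[out_index] = cond[cond_index] ? a[a_index] : b[b_index];
  }
}
```

Recomputing every index with div/mod per element is simple but slow; an iterator that advances the indices incrementally would avoid that cost.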


Love how clean it reads :)

A couple of high-level comments:

I'm not too familiar with how people use this op (i.e., whether there is a common case we see often in the wild), but are there any short-circuits we can take to avoid reaching this loop?

Similarly, depending on the condition, if we know we are biased towards A over B, we can first copy A to the result and then walk data_cond and pick B where the condition is false.

Lastly, I am assuming we want to add SIMD later. If we do, we can use predicates, which should make it less "branchy". It may not help much with loads/stores, though.

swolchok · 2025-03-04

> if we know we are biased towards A vs. B

We can't possibly know this in general.

> SIMD later

Not currently on my agenda, but I may have to come back to it.
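
For readers following the thread, the two ideas raised in the review can be sketched for the simplest case: contiguous inputs of equal size with no broadcasting. Both functions are hypothetical illustrations rather than code from this PR, and whether either one beats the plain ternary loop would have to be measured on the target hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Idea 1: copy `a` wholesale, then overwrite only the elements where the
// condition is false. This pays off only when cond is mostly true.
void where_copy_then_patch(const bool* cond, const float* a, const float* b,
                           float* out, std::size_t n) {
  std::memcpy(out, a, n * sizeof(float));
  for (std::size_t i = 0; i < n; ++i) {
    if (!cond[i]) {
      out[i] = b[i];
    }
  }
}

// Idea 2: branch-free select with an all-ones / all-zeros mask, the scalar
// analogue of a SIMD predicate or blend instruction.
void where_masked(const bool* cond, const std::int32_t* a,
                  const std::int32_t* b, std::int32_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    // mask is 0xFFFFFFFF when cond[i] is true and 0 otherwise.
    const std::uint32_t mask =
        std::uint32_t{0} - static_cast<std::uint32_t>(cond[i]);
    const std::uint32_t bits = (static_cast<std::uint32_t>(a[i]) & mask) |
                               (static_cast<std::uint32_t>(b[i]) & ~mask);
    out[i] = static_cast<std::int32_t>(bits);
  }
}
```

As noted in the reply, the bias towards A is not known in general, so the first variant is a gamble without profiling data; modern compilers may also already turn the ternary loop into a conditional move or blend.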

digantdesai

Looks good to me.

swolchok · 2025-03-06T16:09:35Z

unittest-editable failure is a known flake, so noting that we have green CI. I think this will cause -Wunused -Werror builds to fail though, so I need to fix that before merging.

swolchok · 2025-03-06T18:37:31Z

> I think this will cause -Wunused -Werror builds to fail though

Checked, works fine, I misread my own code.

swolchok added 4 commits February 28, 2025 17:16

Update

f1ace77

[ghstack-poisoned]

Update

d3a0f67

[ghstack-poisoned]

Update

a78277d

[ghstack-poisoned]

Update

6b6180c

[ghstack-poisoned]

swolchok requested review from manuelcandales, digantdesai and mcr229 as code owners March 1, 2025 01:16

This was referenced Mar 1, 2025

portable arg{max,min}: optimize update check #8863

Merged

add BroadcastIndexesRange #8864

Merged

Deploy BroadcastIndexesRange #8865

Merged

swolchok added a commit that referenced this pull request Mar 1, 2025

first crack at optimized op_where

5978c31

ghstack-comment-id: 2691805026 ghstack-source-id: ebe8bee4e3ca4184e91058bd7033e69e130644da ghstack-comment-id: 2691808920 Pull Request resolved: #8866

facebook-github-bot added the CLA Signed label Mar 1, 2025

swolchok added 4 commits March 3, 2025 10:03

Update

7cbe7a1

[ghstack-poisoned]

Update

8e8ce33

[ghstack-poisoned]

Update

093a6d3

[ghstack-poisoned]

Update

77f039c

[ghstack-poisoned]

This was referenced Mar 3, 2025

portable arg{max,min}: optimize update check #8755

Closed

add DelinearizedIndexesRange #8859

Closed

deploy delinearized_indexes_range -- didn't work #8860

Closed

first crack at optimized op_where #8861

Closed

swolchok added a commit that referenced this pull request Mar 3, 2025

first crack at optimized op_where

18748af

ghstack-comment-id: 2691805026 ghstack-source-id: aaf66178b1902763ebaaaa2f0b3a312505722ea8 ghstack-comment-id: 2691808920 Pull Request resolved: #8866

Update

c4e4541

[ghstack-poisoned]

This was referenced Mar 3, 2025

Link xnn_executor_runner with optimized op library #8901

Merged

Add cpu_thread setting logic to xnn_executor_runner #8902

Merged

swolchok added 2 commits March 3, 2025 15:31

Update

d3edbcb

[ghstack-poisoned]

Update

bf96b84

[ghstack-poisoned]

digantdesai reviewed Mar 4, 2025

digantdesai approved these changes Mar 4, 2025

swolchok added 2 commits March 4, 2025 10:27

Update

3e20161

[ghstack-poisoned]

Update

1497adb

[ghstack-poisoned]

swolchok added a commit that referenced this pull request Mar 4, 2025

first crack at optimized op_where

541c73a

ghstack-comment-id: 2691805026 ghstack-source-id: 7e5ac3a26e2728bc9e7fca1f37e0368efce5186e ghstack-comment-id: 2691808920 Pull Request resolved: #8866

manuelcandales approved these changes Mar 4, 2025

swolchok added 3 commits March 4, 2025 15:35

Update

8ade738

[ghstack-poisoned]

Update

0a18dab

[ghstack-poisoned]

Update

62ab1e7

[ghstack-poisoned]

swolchok added a commit that referenced this pull request Mar 4, 2025

first crack at optimized op_where

f1090b0

ghstack-comment-id: 2691805026 ghstack-source-id: c88f9387a18951f40fffb1cc9971daafe7b82122 ghstack-comment-id: 2691808920 Pull Request resolved: #8866

swolchok added 2 commits March 4, 2025 21:32

Update

a9bbae4

[ghstack-poisoned]

Update

7bce689

[ghstack-poisoned]

swolchok added a commit that referenced this pull request Mar 5, 2025

first crack at optimized op_where

b7cabfa

ghstack-comment-id: 2691805026 ghstack-source-id: 1a0f6d2c788778fcb6fcb132f7ef452cd048e3d3 ghstack-comment-id: 2691808920 Pull Request resolved: #8866

swolchok added 2 commits March 5, 2025 09:50

Update

d208351

[ghstack-poisoned]

Update

289d53c

[ghstack-poisoned]

swolchok mentioned this pull request Mar 5, 2025

Add BroadcastIndexesRange tests with dims of size 1 in output #8964

Merged

swolchok added 4 commits March 5, 2025 10:04

Update

6893c27

[ghstack-poisoned]

Update

7cc62d2

[ghstack-poisoned]

Update

570845e

[ghstack-poisoned]

Update

e52508d

[ghstack-poisoned]

Base automatically changed from gh/swolchok/301/head to main March 6, 2025 03:06

swolchok changed the title ~~first crack at optimized op_where~~ Add optimized op_where Mar 6, 2025

swolchok merged commit a2c0b59 into main Mar 6, 2025
50 of 51 checks passed

swolchok deleted the gh/swolchok/302/head branch March 6, 2025 18:37

zonglinpeng pushed a commit that referenced this pull request Mar 6, 2025

Add optimized op_where (#8866)

5297d98

It materializes separate kernels for the cases where the two input data tensors have the same dtype and the third one has dtype bool.

github-actions bot mentioned this pull request Mar 10, 2025

Weekly pr metrics report - 2025-03-01..2025-03-07 wdvr/pytorch#16

Open

This was referenced Mar 17, 2025

Weekly pr metrics report - 2025-03-01..2025-03-07 wdvr/pytorch#18

Open

Weekly pr metrics report - 2025-03-01..2025-03-07 wdvr/pytorch#20

Open

github-actions bot mentioned this pull request Mar 31, 2025

Weekly pr metrics report - 2025-03-01..2025-03-07 wdvr/pytorch#22

Open

github-actions bot mentioned this pull request Apr 7, 2025

Weekly pr metrics report - 2025-03-01..2025-03-07 wdvr/pytorch#26

Open
