forked from notaz/mesa
Rebase to asahi-20250221 upstream tag #4
Open
kaazoo wants to merge 3,726 commits into UbuntuAsahi:ubuntu/noble from kaazoo:ubuntu/noble
Conversation
The values of some builtins are known at compile time when the application creates pipelines with static state.

Stats for graphics pipelines:

Totals from 568 (0.71% of 80536) affected shaders:
MaxWaves: 12364 -> 12502 (+1.12%); split: +1.26%, -0.15%
Instrs: 515696 -> 501182 (-2.81%); split: -2.85%, +0.04%
CodeSize: 2815736 -> 2741464 (-2.64%); split: -2.69%, +0.05%
VGPRs: 29528 -> 29160 (-1.25%); split: -1.71%, +0.46%
SpillSGPRs: 212 -> 215 (+1.42%)
Latency: 5515421 -> 5409125 (-1.93%); split: -2.05%, +0.13%
InvThroughput: 1293512 -> 1277913 (-1.21%); split: -1.27%, +0.06%
VClause: 10570 -> 10295 (-2.60%); split: -2.74%, +0.14%
SClause: 19040 -> 18531 (-2.67%); split: -2.83%, +0.16%
Copies: 37189 -> 35431 (-4.73%); split: -5.31%, +0.58%
Branches: 11391 -> 11070 (-2.82%); split: -2.92%, +0.11%
PreSGPRs: 27848 -> 27313 (-1.92%); split: -1.95%, +0.03%
PreVGPRs: 24847 -> 24106 (-2.98%); split: -3.00%, +0.02%
VALU: 359356 -> 348779 (-2.94%); split: -2.97%, +0.03%
SALU: 59135 -> 57448 (-2.85%); split: -3.11%, +0.26%
VMEM: 14674 -> 14313 (-2.46%)
SMEM: 30901 -> 30342 (-1.81%); split: -1.84%, +0.03%

Reviewed-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32793>
Signed-off-by: Yiwei Zhang <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33567>
Signed-off-by: Yiwei Zhang <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33567>
Signed-off-by: Yiwei Zhang <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33567>
This is no longer required. Signed-off-by: Valentine Burley <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33578>
This is no longer required. Also document a flake seen recently. Signed-off-by: Valentine Burley <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33578>
Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33255>
Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33255>
If the program writes to shared variables after all reads, in the last block of the program, no one will ever read the value we write. We can just eliminate these dead writes. (Thanks to Faith Ekstrand for improving the ends_program() conditions.) Reviewed-by: Faith Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33452>
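The dead-write elimination described in this commit can be illustrated on a toy instruction list. This is a Python model, not Mesa code: the tuple-based IR and the name `remove_trailing_shared_writes` are hypothetical, and the real pass works on NIR and has to check stricter `ends_program()` conditions that this sketch ignores.

```python
def remove_trailing_shared_writes(last_block):
    """Drop shared-memory stores in the final block that no later load can observe.

    `last_block` is a list of (op, address) tuples. Toy assumption: the block is
    straight-line code and is the last thing the program executes.
    """
    # Find the last shared load; any shared store after it is never read.
    last_load = -1
    for i, (op, _) in enumerate(last_block):
        if op == "load_shared":
            last_load = i

    return [
        instr
        for i, instr in enumerate(last_block)
        if not (instr[0] == "store_shared" and i > last_load)
    ]


block = [
    ("store_shared", 0),
    ("load_shared", 0),
    ("alu", None),
    ("store_shared", 4),
    ("store_shared", 8),
]
optimized = remove_trailing_shared_writes(block)
```

Here the first store survives because a load follows it, while the two stores after the final load are removed.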
Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33505>
The main advantage is to use BDA for texel buffer descriptors. It might also be slightly faster on the CPU. Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33505>
Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33505>
Signed-off-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33505>
Fixes: a104a7c ("tu: Handle non-identity GMEM swaps when resolving") Signed-off-by: Danylo Piliaiev <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
Signed-off-by: Danylo Piliaiev <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
Apparently fast path cannot handle mismatched mutability and we should use CP_BLIT which has SP_PS_2D_SRC_INFO.MUTABLEEN to signal src mutability. Previously it was partially handled by tu_attachment_store_mismatched_swap. Fixes: a104a7c ("tu: Handle non-identity GMEM swaps when resolving") Signed-off-by: Danylo Piliaiev <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Faith Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33548>
…fraction The fraction was making it run for only 10-12 seconds, which is wasteful considering the huge overhead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33581>
Improve readability of ppir representation dump
- adopt "dest = op src1[, srcN]"
- use symbolic names of pipeline registers
- print destination writemask
- print destination modifier (if any)
- print source(s) swizzle
- print constants
- print load node base index
- print branch condition(s)

With these modifications it's actually possible to follow the program

-------block 0-------
$0008 = mov ^texture ($0005) // NIR: new
($0005) ^texture = ld_tex ^discard ($0006).xyzx, $0004.xxxx // NIR: ssa4
($0006) ^discard = ld_coords_reg $0002.xyzx // NIR: new
$0004.x = ld_var 0 // NIR: ssa6
$0002.xyz = mov $0001.yzwx // NIR: ssa5
$0001 = ld_var 0 // NIR: ssa7

Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Fixup src node when replacing src for select and load_reg It doesn't affect compiler functionality, but affects printing ppir representation. Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Print index of the node that breaks node_to_instr to make debugging easier Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Fix multiple issues with atan in disassembler:
- arg1_en field in combiner unit actually seems to be a bit indicating that one of sources is vector (e.g. for atan_pt2, or multiplication)
- atan2 has 2 arguments, not one
- properly handle all instruction variants

Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
Assert on unexpected pipeline dest for fmul and vmul to catch scheduler bugs early Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33540>
The combiner unit supports scalar-by-vector multiplication and scalar mov. Implement it for codegen. Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33568>
Combiner unit runs after fmul/smul/fadd/sadd units and it can consume the results that previous units wrote to the registers. So prefer placing scalar mul into combiner unit and predecessors (if any) into other units.

shader-db:
total instructions in shared programs: 29072 -> 27698 (-4.73%)
instructions in affected programs: 11237 -> 9863 (-12.23%)
helped: 163
HURT: 0
helped stats (abs) min: 1 max: 42 x̄: 8.43 x̃: 4
helped stats (rel) min: 0.64% max: 30.00% x̄: 13.03% x̃: 11.76%
95% mean confidence interval for instructions value: -9.89 -6.96
95% mean confidence interval for instructions %-change: -14.09% -11.97%
Instructions are helped.

total loops in shared programs: 2 -> 2 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 367 -> 372 (1.36%)
spills in affected programs: 16 -> 21 (31.25%)
helped: 1
HURT: 2

total fills in shared programs: 1208 -> 1224 (1.32%)
fills in affected programs: 51 -> 67 (31.37%)
helped: 2
HURT: 2

LOST: 0
GAINED: 0

Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33568>
Reviewed-by: Lionel Landwerlin <[email protected]> Signed-off-by: José Roberto de Souza <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33585>
… code Avoid backends duplication. Reviewed-by: Lionel Landwerlin <[email protected]> Signed-off-by: José Roberto de Souza <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33585>
The only thing we really need to do here is to make sure we don't try to use the EDB path for push descriptors since those aren't really descriptor buffers. Backport-to: 25.0 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33589>
fneg and fabs are folded later in ppir, but having them in nir as separate instructions prevents duplicate_intrinsics pass from duplicating load_input and load_uniform. Duplicate fneg and fabs, so subsequent duplicate_intrinsic pass can duplicate the loads.

shader-db:
total instructions in shared programs: 27698 -> 27675 (-0.08%)
instructions in affected programs: 2752 -> 2729 (-0.84%)
helped: 21
HURT: 2
helped stats (abs) min: 1 max: 4 x̄: 1.19 x̃: 1
helped stats (rel) min: 0.38% max: 6.67% x̄: 2.75% x̃: 0.75%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.89% max: 1.89% x̄: 1.89% x̃: 1.89%
95% mean confidence interval for instructions value: -1.39 -0.61
95% mean confidence interval for instructions %-change: -3.67% -1.03%
Instructions are helped.

total loops in shared programs: 2 -> 2 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 372 -> 368 (-1.08%)
spills in affected programs: 27 -> 23 (-14.81%)
helped: 4
HURT: 0

total fills in shared programs: 1224 -> 1205 (-1.55%)
fills in affected programs: 81 -> 62 (-23.46%)
helped: 4
HURT: 0

LOST: 0
GAINED: 0

Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33569>
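The idea of giving every user its own copy of a modifier can be sketched on a toy def/use table. This is a Python illustration under stated assumptions, not the lima pass: the dictionary IR, the `_dupN` naming, and the function name `duplicate_modifiers` are all hypothetical.

```python
def duplicate_modifiers(defs, uses):
    """Give each user of an fneg/fabs its own copy of that modifier.

    defs: dict mapping a value name to (op, source name or None).
    uses: list of (user name, operand name); every operand must be in defs.
    Returns the extended defs and the rewritten uses.
    """
    new_defs = dict(defs)
    new_uses = []
    seen = {}  # how many users of each modifier were visited so far
    for user, operand in uses:
        op, src = defs[operand]
        if op in ("fneg", "fabs"):
            n = seen.get(operand, 0)
            seen[operand] = n + 1
            if n > 0:
                # Second and later users get a fresh clone of the modifier.
                clone = f"{operand}_dup{n}"
                new_defs[clone] = (op, src)
                operand = clone
        new_uses.append((user, operand))
    return new_defs, new_uses


defs = {"in0": ("load_input", None), "neg": ("fneg", "in0")}
uses = [("mul_a", "neg"), ("mul_b", "neg")]
new_defs, new_uses = duplicate_modifiers(defs, uses)
```

After this step each fneg has a single user, so a later pass that duplicates loads per use can hand every consumer its own load and modifier chain.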
Assertion (or attempting the layout change) is causing crash when launching Steel Rats. Tighten the condition for change so that it should affect only when runtime has made changes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12602 Fixes: eed7882 ("anv: ensure consistent layout transitions in render passes") Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33523>
cleaner. Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
fixes new CTS dEQP-VK.query_pool.statistics_query*input_assembly_primitives.*_patch_list_* Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Per meson docs: This posarg is optional since 0.60.0. It defaults to the basename of the first output. Signed-off-by: Alyssa Rosenzweig <[email protected]>
meson can infer since these are inputs. Signed-off-by: Alyssa Rosenzweig <[email protected]>
would've saved me a lot of dbg trouble.. Signed-off-by: Alyssa Rosenzweig <[email protected]>
identified in dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.img.samples_1.2d.comp. owwie. Signed-off-by: Alyssa Rosenzweig <[email protected]>
save a few instrs. Signed-off-by: Alyssa Rosenzweig <[email protected]>
lets use the hw address mode. oops! Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
this gets us to fl12_0. Signed-off-by: Alyssa Rosenzweig <[email protected]>
so silly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
dumb corner. Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
copy what RADV does. This is a CTS bug. Merging pending the CTS fixes (in progress). Signed-off-by: Alyssa Rosenzweig <[email protected]>
An even bigger pile of tests... This adds general miptree tests for some compressed formats, and even more comprehensive miptree and tile size tests for more formats and size combinations. Since the tests are based on observing tiling done by the Metal driver, we don't know the actual tile size, but rather we can just identify which tile sizes logically have the same result (since several sizes can be equivalent). This is encoded as a bit mask, split into two halves to handle the case of stride padding for compressed textures (which tile sizes are valid with and without stride padding can vary, and sometimes you can either change the tile size or add padding and end up with the same result). As long as whatever configuration the layout code comes up with has its corresponding bit set in the bit mask, the tiling should be correct. Signed-off-by: Asahi Lina <[email protected]>
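The acceptance check described above can be sketched as a bit test. This is a minimal Python illustration, not the test code from the commit: the 16-bit half width, the choice of the upper half for stride padding, the tile-size indices, and the name `tile_config_ok` are all assumptions made for the example.

```python
def tile_config_ok(mask, tile_index, padded, half_bits=16):
    """Return True if the chosen configuration has its bit set in `mask`.

    Assumed layout: the low `half_bits` bits list tile sizes that are valid
    without stride padding, the next `half_bits` bits list those valid with it.
    """
    bit = tile_index + (half_bits if padded else 0)
    return bool((mask >> bit) & 1)


# Hypothetical expectation: tile sizes 1 and 2 are equivalent without padding,
# and tile size 0 gives the same result once stride padding is added.
expected = 0b0110 | (0b0001 << 16)

ok_unpadded = tile_config_ok(expected, 1, padded=False)
ok_padded = tile_config_ok(expected, 0, padded=True)
bad_unpadded = tile_config_ok(expected, 0, padded=False)
```

Encoding equivalent configurations as set bits lets a test accept any of several layouts without knowing which tile size the reference driver actually used.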
Looks good, two nits:
No description provided.