Combine shuffle(fneg(x),fneg(y)) -> fneg(shuffle(x,y)) #45631

Closed
@RKSimon

Description

Bugzilla Link: 46286
Version: trunk
OS: Windows NT
CC: @rotateright

Extended Description

https://godbolt.org/z/cHgY_S

For cases such as:

define <4 x float> @fneg_concat_v2f32(<2 x float> %a0, <2 x float> %a1) {
  %1 = fneg <2 x float> %a0
  %2 = fneg <2 x float> %a1
  %3 = shufflevector <2 x float> %1, <2 x float> %2, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  ret <4 x float> %3
}
define <4 x float> @fneg_concat_v4f32(<4 x float> %a0, <4 x float> %a1) {
  %1 = fneg <4 x float> %a0
  %2 = fneg <4 x float> %a1
  %3 = shufflevector <4 x float> %1, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  ret <4 x float> %3
}

we are almost certainly better off moving the fneg after the shuffle:

define <4 x float> @concat_fneg_v2f32(<2 x float> %a0, <2 x float> %a1) {
  %1 = shufflevector <2 x float> %a0, <2 x float> %a1, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %2 = fneg <4 x float> %1
  ret <4 x float> %2
}
define <4 x float> @concat_fneg_v4f32(<4 x float> %a0, <4 x float> %a1) {
  %1 = shufflevector <4 x float> %a0, <4 x float> %a1, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  %2 = fneg <4 x float> %1
  ret <4 x float> %2
}
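
As a sanity check on the rewrite above, here is a minimal Python model of the two forms. It is only a sketch: `fneg`, `shufflevector` and `same_bits` are hypothetical helpers written for this illustration (not LLVM APIs), and vectors are plain Python lists of floats. Since fneg acts on each lane independently, selecting lanes before or after negating should give the same result, including the sign of zero.

```python
import math

def fneg(v):
    """Element-wise floating-point negation (models LLVM's fneg on a vector)."""
    return [-x for x in v]

def shufflevector(a, b, mask):
    """Model of shufflevector: each mask index selects a lane from concat(a, b)."""
    src = a + b
    return [src[i] for i in mask]

def same_bits(u, v):
    """Lane-wise equality that also distinguishes +0.0 from -0.0 (no NaNs assumed)."""
    return len(u) == len(v) and all(
        x == y and math.copysign(1.0, x) == math.copysign(1.0, y)
        for x, y in zip(u, v)
    )

# Inputs chosen to include both signed zeros.
a0 = [1.0, -2.0, 0.0, 4.5]
a1 = [-0.0, 8.0, -16.0, 32.0]
mask = [0, 1, 4, 5]  # same mask as the v4f32 example

before = shufflevector(fneg(a0), fneg(a1), mask)  # two fnegs, then shuffle
after = fneg(shufflevector(a0, a1, mask))         # shuffle, then one fneg
assert same_bits(before, after)
```

The same check passes for the v2f32 concat case with mask `[0, 1, 2, 3]` and two-lane inputs.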

Binops would probably benefit in some cases as well (e.g. when one operand is a constant?).
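
One possible reading of the constant-operand case, sketched in Python: if both binops have a constant second operand, the shuffle could be applied to the variable operands and to the constants separately, and the shuffled constant would fold at compile time. This is an assumption about what the remark means, not something the report spells out. `binop` and `shufflevector` are hypothetical helpers for this illustration, and fmul is picked arbitrarily as the example binop.

```python
import operator

def shufflevector(a, b, mask):
    """Model of shufflevector: each mask index selects a lane from concat(a, b)."""
    src = a + b
    return [src[i] for i in mask]

def binop(op, lhs, rhs):
    """Apply a two-operand operation lane by lane."""
    return [op(l, r) for l, r in zip(lhs, rhs)]

x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 6.0, 7.0, 8.0]
c0 = [2.0, 2.0, 2.0, 2.0]  # constant operand of the first fmul
c1 = [0.5, 0.5, 0.5, 0.5]  # constant operand of the second fmul
mask = [0, 1, 4, 5]

# Before: two binops feeding one shuffle.
before = shufflevector(binop(operator.mul, x, c0),
                       binop(operator.mul, y, c1), mask)

# After: shuffle the variable operands, shuffle the constants (foldable),
# then perform a single binop.
folded_const = shufflevector(c0, c1, mask)
after = binop(operator.mul, shufflevector(x, y, mask), folded_const)

assert before == after
```

With two non-constant operands per binop the rewrite would need a second real shuffle, which is presumably why the benefit is limited to some cases.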

One issue that VectorCombine might encounter, though, is that we fail to get costs for most length-changing shuffles, so the 'concat_vectors' shuffle pattern returns an 'Unknown' cost.
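
To make the cost concern concrete, here is a toy Python model of a cost-gated decision. It is not how VectorCombine is implemented: the helper names and the unit costs are hypothetical, and an 'Unknown' cost is modelled as `None`. The point is only that a profitability comparison cannot be made once either side contains an unknown term.

```python
def total(costs):
    """Sum a list of costs; the total is unknown (None) if any term is unknown."""
    if any(c is None for c in costs):
        return None
    return sum(costs)

def should_sink_fneg(old_costs, new_costs):
    """Allow the rewrite only if both totals are known and the new one is cheaper."""
    old_total, new_total = total(old_costs), total(new_costs)
    if old_total is None or new_total is None:
        return False  # cannot prove a win without a cost
    return new_total < old_total

# Hypothetical unit costs, for illustration only.
FNEG_V2 = 1
FNEG_V4 = 1
SHUFFLE_V4 = 1          # same-length shuffle: cost available
CONCAT_SHUFFLE = None   # length-changing shuffle: 'Unknown' cost

# v4f32 case: (fneg, fneg, shuffle) versus (shuffle, fneg).
assert should_sink_fneg([FNEG_V4, FNEG_V4, SHUFFLE_V4], [SHUFFLE_V4, FNEG_V4])

# v2f32 concat case: the unknown shuffle cost blocks the decision.
assert not should_sink_fneg([FNEG_V2, FNEG_V2, CONCAT_SHUFFLE],
                            [CONCAT_SHUFFLE, FNEG_V4])
```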
