[sse4a] implement all non-immediate-mode intrinsics #249

Merged: 1 commit, Dec 22, 2017

gnzlbg
Contributor

gnzlbg commented Dec 22, 2017

No description provided.

gnzlbg requested a review from alexcrichton on December 22, 2017 15:07
@gnzlbg
Contributor Author

gnzlbg commented Dec 22, 2017

Does anybody have access to an AMD machine and could check whether the tests pass there (running with --nocapture to verify that sse4a is detected as enabled)?

@alexcrichton
Member

I don't personally have access but they look good to me, thanks!

alexcrichton merged commit da7ca5f into rust-lang:master on Dec 22, 2017
@gnzlbg
Contributor Author

gnzlbg commented Dec 22, 2017 via email

@goodmanjonathan
Contributor

I get assertion failures for the _mm_stream_ss and _mm_stream_sd tests:

running 1725 tests
test runtime::macros::tests::test_macros ... ok
test runtime::x86::tests::compare_with_cupid ... ok
test runtime::x86::tests::dump ... sse: true
sse2: true
sse3: true
ssse3: false
sse4.1: false
sse4.2: false
sse4a: true

<==== snip ====>

test x86::i686::sse4a::assert__mm_extract_si64_extrq ... ok
test x86::i686::sse4a::assert__mm_insert_si64_insertq ... ok
test x86::i686::sse4a::assert__mm_stream_sd_movntsd ... ok
test x86::i686::sse4a::assert__mm_stream_ss_movntss ... ok
test x86::i686::sse4a::tests::_mm_extract_si64 ... ok
test x86::i686::sse4a::tests::_mm_insert_si64 ... ok
test x86::i686::sse4a::tests::_mm_stream_sd ... thread 'x86::i686::sse4a::tests::_mm_stream_sd' panicked at 'assertion failed: (left == right)
left: 3.0,
right: 4.0', src/x86/i686/sse4a.rs:128:8
note: Run with RUST_BACKTRACE=1 for a backtrace.
FAILED
test x86::i686::sse4a::tests::_mm_stream_ss ... thread 'x86::i686::sse4a::tests::_mm_stream_ss' panicked at 'assertion failed: (left == right)
left: 5.0,
right: 8.0', src/x86/i686/sse4a.rs:150:8
FAILED

<==== snip ====>

If I'm correct, the first f32/f64 in an f32x4/f64x2 is the least significant element, not the last one, as confirmed by this equivalent C code:

#include <assert.h>
#include <x86intrin.h>

__attribute__((target("sse4a"))) void f(void) {
    double d[] = { 1.0, 2.0 };
    __m128d v1 = { 3.0, 4.0 };
    _mm_stream_sd(d, v1);
    assert(d[0] == 3.0); // not 4.0
    assert(d[1] == 2.0);

    float f[] = { 1.0, 2.0, 3.0, 4.0 };
    __m128 v2 = { 5.0, 6.0, 7.0, 8.0 };
    _mm_stream_ss(f, v2);
    assert(f[0] == 5.0); // not 8.0
    assert(f[1] == 2.0);
    assert(f[2] == 3.0);
    assert(f[3] == 4.0);
}

int main(void) {
    f();
}
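
The C check above needs SSE4a hardware. As a rough cross-check that runs on any x86_64 machine, here is a sketch of the same layout test in Rust, with `_mm_store_sd` and `_mm_store_ss` (the ordinary scalar stores) standing in for `_mm_stream_sd` and `_mm_stream_ss`. That substitution is an assumption made for illustration: it only shows which element a low-lane store writes, not the non-temporal behaviour. The function name `store_low_lanes` is made up for this example.

```rust
// Assumes an x86_64 target, where SSE and SSE2 are part of the baseline.
use std::arch::x86_64::*;

/// Hypothetical helper: stores the lowest lane of each vector into the
/// first element of a destination array and returns both arrays.
fn store_low_lanes() -> ([f64; 2], [f32; 4]) {
    let mut d = [1.0_f64, 2.0];
    let mut f = [1.0_f32, 2.0, 3.0, 4.0];
    let src_d = [3.0_f64, 4.0];
    let src_f = [5.0_f32, 6.0, 7.0, 8.0];

    // SAFETY: SSE/SSE2 are always available on x86_64, every pointer refers
    // to a live array of sufficient length, and unaligned loads and scalar
    // stores have no alignment requirement.
    unsafe {
        // Element 0 in memory becomes lane 0 (the least significant lane).
        let v1 = _mm_loadu_pd(src_d.as_ptr());
        _mm_store_sd(d.as_mut_ptr(), v1); // writes lane 0 only

        let v2 = _mm_loadu_ps(src_f.as_ptr());
        _mm_store_ss(f.as_mut_ptr(), v2); // writes lane 0 only
    }
    (d, f)
}
```

Calling `store_low_lanes()` should give `[3.0, 2.0]` and `[5.0, 2.0, 3.0, 4.0]`, which matches the values the C assertions expect: only the first destination element is overwritten, and it receives the first source element.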

@gnzlbg
Contributor Author

gnzlbg commented Dec 27, 2017

@goodmanjonathan thanks for looking into this! Would you be willing to submit a patch that fixes the tests and the docs of the intrinsics?

@goodmanjonathan
Contributor

I'd be happy to!
