[SYCL] VS2026 support#21990
Conversation
…is not available on GPU. This fix masks and works around it.
…deal with this, so adjusting there.
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
|
ping @intel/dpcpp-cfe-reviewers |
|
ping to reviewers. @intel/dpcpp-cfe-reviewers , @sergey-semenov |
Fznamznon
left a comment
There was a problem hiding this comment.
Sorry, I don't understand why SYCL device code needs to detect CPU features. I don't think that force pushing sse features for SPIR target which is a generic target independent architecture is right. I don't see it done anywhere else. Also, this seems like a problem that doesn't require SemaSYCL kernel building and therefore should be done in the upstream. cc @tahonermann
tahonermann
left a comment
There was a problem hiding this comment.
Thanks for tagging me on this, @Fznamznon. I'm likewise concerned about what this PR is trying to do. It seems to be adding a workaround on top of a prior workaround that perhaps should have been addressed a different way.
| // Mirror X86TargetInfo's "x86_64 always has SSE2" baseline: the matching | ||
| // _M_X64 macro makes MSVC STL headers take the x86 intrinsics path, whose | ||
| // _mm_* intrinsics require sse/sse2 in the target feature set. | ||
| Features["sse"] = true; | ||
| Features["sse2"] = true; | ||
| return SPIR64TargetInfo::initFeatureMap(Features, Diags, CPU, FeaturesVec); |
There was a problem hiding this comment.
This doesn't make sense to me. Why would we want to claim SSE and SSE2 availability for a target for which those are not relevant?
There was a problem hiding this comment.
It is confusing.
The _M_X64 macro is already defined for this target, which causes MSVC STL headers to take the x86 intrinsics path. ( yay! )
VS2026's <complex> introduces inline functions like _Sqr_error_x86_x64_fma annotated [[__gnu__::__target__("fma")]], whose bodies call _mm_set_sd / _mm_store_sd. The target attribute layers fma on top of the existing baseline, it doesn't replace it, so if the baseline lacks sse/sse2, those mm* builtins fail their feature-compatibility check at parse time, even though the function is never called from device code.
So we add SSE/SSE2 just to get past that check and to get things move forward again.
To my mind, since this target already claims _M_X64 then it's really a matter of being consistent. In for a penny...
There was a problem hiding this comment.
I see, thanks for the detailed explanation.
| // When SYCL device code is compiled with a Windows-MSVC host, the device | ||
| // target (spir64) defines _M_X64 so that MSVC STL headers take the x86 | ||
| // intrinsics path. The device target feature set must correspondingly carry | ||
| // sse/sse2 so that function-level __target__ attributes (e.g. VS2026 | ||
| // <complex>'s [[gnu::target("fma")]] on _Sqr_error_x86_x64_fma) don't strip | ||
| // the baseline, which would break intrinsic calls like _mm_set_sd / _mm_store_sd. |
There was a problem hiding this comment.
I don't understand this either. Why would _M_X64 be defined for device compilation when the device target is not x86-64? The issue this PR seeks to address seems to be the result of a prior workaround that might not have been well motivated.
There was a problem hiding this comment.
Correct. _M_X64 is defined so MSVC's STL headers will compile at all. <yvals_core.h> (transitively included by nearly every STL header) requires one of _M_X64 / _M_IX86 / _M_ARM64 / etc. to be defined. Without it, the header errors out before any SYCL device TU using , , etc. can be parsed.
The MicrosoftX86_64_SPIR64TargetInfo class predates this PR. I'm not sure of the original motivation beyond "STL won't open otherwise." Removing it would require either patching the MSVC STL headers (not really an option) or maintaining a substantial set of SYCL-side wrappers for any STL header a device TU might transitively pull in.
There was a problem hiding this comment.
Thanks, I wasn't aware of the MicrosoftX86_64_SPIR64TargetInfo target. That's ... interesting. It makes sense that certain accommodations are required to use Microsoft's standard library headers given that we're using them for a purpose they weren't at all intended for.
Pondering, I wonder how much success we might have in trying to convince Microsoft to accept a series of PRs that add checks for SYCL_DEVICE_ONLY throughout their headers. That might require some executive level discussion.
| // which matches a spir64 device's reality (no x86 ISA), and so selects the | ||
| // STL's scalar fallback paths if these dispatches are ever reached. | ||
| extern "C" int __isa_available __attribute__((sycl_global_var)) | ||
| __attribute__((weak)) = 0; |
There was a problem hiding this comment.
I don't have a VS2026 installation available to look at. How is __isa_available declared in <isa_availability.h>? Is its declaration protected by a guard macro? Can we substitute our own header for <isa_availability.h> that defines this variable as constexpr with a value of 0?
There was a problem hiding this comment.
Good question. __isa_available isn't actually declared in <isa_availability.h>. It's declared inline in stl/inc/__msvc_bit_utils.hpp as:
extern "C" {
extern int __isa_available;
}
with no guard macro. pulls this in. So substituting <isa_availability.h> doesn't intercept the declaration, and shadowing __msvc_bit_utils.hpp to redefine it feels heavier than just providing a definition the way this PR does.
There was a problem hiding this comment.
It looks like Microsoft declares __isa_available within the std namespace. We presumably should too.
I imagine this could be a problem outside of just the <complex> header. A quick search of __msvc_bit_utils.hpp indicates it is included by <bit>, <vector>, <numeric>, and <bitset> as well as anything that includes <__msvc_int128.hpp>. Ideally, I think we would predefine this variable (either within the compiler or, preferably, in a pre-included header file. I don't think we have a solution for that at the moment though.
|
We don't have to go the route of this PR if you have a different tack you'd prefer to take. And likely I'm not the right person for this anyway - I'm just looking to quickly work around an error. My goal here is unblocking VS2026 users. I'm happy to defer to whatever approach you prefer for the longer-term shape Predating this PR, we have this here: https://github.com/intel/llvm/blob/sycl/clang/lib/Basic/Targets/SPIR.h#L402-L417 In this PR, I'm just extending this by claiming SSE/SSE2 on top. And that's not really a novel invention, the baseline rule is "x86_64 always has SSE2" . Since the target claims to be |
tahonermann
left a comment
There was a problem hiding this comment.
Thanks for the detailed explanations @cperkinsintel. I admit I don't see much for better options available in the short term. Two notes:
- There are a number of other Microsoft headers that include
__msvc_bit_utils.hpp. I understand that it might be that only<complex>is causing trouble for us now, but I suspect it won't be long before we see issues elsewhere. Can the solution be generalized some to be agnostic to the including header file? - It looks like Microsoft declares
__isa_availablein thestdnamespace when compiling for C++.
| // Mirror X86TargetInfo's "x86_64 always has SSE2" baseline: the matching | ||
| // _M_X64 macro makes MSVC STL headers take the x86 intrinsics path, whose | ||
| // _mm_* intrinsics require sse/sse2 in the target feature set. | ||
| Features["sse"] = true; | ||
| Features["sse2"] = true; | ||
| return SPIR64TargetInfo::initFeatureMap(Features, Diags, CPU, FeaturesVec); |
There was a problem hiding this comment.
I see, thanks for the detailed explanation.
| // When SYCL device code is compiled with a Windows-MSVC host, the device | ||
| // target (spir64) defines _M_X64 so that MSVC STL headers take the x86 | ||
| // intrinsics path. The device target feature set must correspondingly carry | ||
| // sse/sse2 so that function-level __target__ attributes (e.g. VS2026 | ||
| // <complex>'s [[gnu::target("fma")]] on _Sqr_error_x86_x64_fma) don't strip | ||
| // the baseline, which would break intrinsic calls like _mm_set_sd / _mm_store_sd. |
There was a problem hiding this comment.
Thanks, I wasn't aware of the MicrosoftX86_64_SPIR64TargetInfo target. That's ... interesting. It makes sense that certain accommodations are required to use Microsoft's standard library headers given that we're using them for a purpose they weren't at all intended for.
Pondering, I wonder how much success we might have in trying to convince Microsoft to accept a series of PRs that add checks for SYCL_DEVICE_ONLY throughout their headers. That might require some executive level discussion.
| // which matches a spir64 device's reality (no x86 ISA), and so selects the | ||
| // STL's scalar fallback paths if these dispatches are ever reached. | ||
| extern "C" int __isa_available __attribute__((sycl_global_var)) | ||
| __attribute__((weak)) = 0; |
There was a problem hiding this comment.
It looks like Microsoft declares __isa_available within the std namespace. We presumably should too.
I imagine this could be a problem outside of just the <complex> header. A quick search of __msvc_bit_utils.hpp indicates it is included by <bit>, <vector>, <numeric>, and <bitset> as well as anything that includes <__msvc_int128.hpp>. Ideally, I think we would predefine this variable (either within the compiler or, preferably, in a pre-included header file. I don't think we have a solution for that at the moment though.
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
tahonermann
left a comment
There was a problem hiding this comment.
Given the constraints, this seems pretty reasonable. I can't think of a better option. Thanks for generalizing the change for the __msvc_bit_utils.hpp header file. I left one comment for the new test; up to you if you want to follow up on it.
| } | ||
|
|
||
| // CHECK: spir_kernel void @{{.*}}TestK{{.*}}() [[ATTRS:#[0-9]+]] | ||
| // CHECK: attributes [[ATTRS]] = {{.*}}"target-features"="+sse,+sse2" |
There was a problem hiding this comment.
I'm surprised we don't have an existing test to validate target-features for SPIR targets somewhere. I looked, but I wasn't able to find one. This change isn't specific to SYCL, so this test could be generalized. There are quite a few tests for other targets that validate target-features under clang/test/Driver.
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
Fznamznon
left a comment
There was a problem hiding this comment.
Well since this is another example where SYCL "design" meets real world and we can't do anything else about it except define target features for targetless triple (which still sounds horrible), so be it.
Thank you @cperkinsintel for work and explanations.
|
@intel/llvm-gatekeepers please consider merging |
guard inclusion of __msvc_bit_utils.hpp The test introduced by #21990 can trip if using older MSVC. This should fix it.
The VS2026 STL headers use
__isa_available(a runtime global variable) to detect CPU features. SYCL device code can't access host runtime globals, and that leads to failures. ( "SYCL kernel cannot use a non-const global variable." ) But we can define it easily and then things move forward.But once we are moving forward again, we encounter a second issue, the feature map is empty. The solution here is to just init the feature map like we already do in
SPIRV64AMDGCNTargetInfo::initFeatureMap.The JIRA has reproducer, etc. CMPLRLLVM-75293