Skip to content

Commit cb568fa

Browse files
pytorchbotSS-JIA
andauthored
[ET-VK] Fix metadata UBO VVL warnings (#7484)
* [ET-VK][ez] Fix undefined behaviour in ambiguous `ParamsBuffer` constructor ## Context I discovered this bug when trying to execute the `vulkan_compute_api_test` binary on Windows. Almost all the tests were failing, with compute shaders producing incorrect results. After bisecting the change, it turns out the culprit is #7015. The diff introduced an alternative templated constructor for `ParamsBuffer` which would initialize an empty UBO with a specified size instead of wrapping a pre-existing object. The issue is that these constructors are ambiguous because they both are template constructors and both only accept one argument. Therefore, the original constructor would be called when certain callsites intended to call the new constructor. This results in a UBO being created with an incorrect size, and resulted in the tensor's metadata being passed incorrectly into a compute shader. To fix, I added a dummy argument into the new constructor for disambiguation purposes. I also changed it so that it's not templated, since there's no reason for it to be templated. Differential Revision: [D67770791](https://our.internmc.facebook.com/intern/diff/D67770791/) ghstack-source-id: 260031108 Pull Request resolved: #7478 * [ET-VK] Create Pipeline layouts with push constant ranges when required ## Context #7223 added the ability to use push constants in shaders. However, one thing the diff missed was not specifying that the compute pipeline layout needed to include a push constant upon creation. The Vulkan validation layers warns against this, and on certain GPUs such as the integrated Intel GPU on my windows laptop compute shaders will produce incorrect output. This diff makes the change such that the compute pipeline layout will be created with a push constant block if necessary. ## Solution Change the key of the pipeline layout cache to accept an additional push constant size field. The push constant size will be used to create the pipeline layout with a push constant block of the specified size. Differential Revision: [D67770793](https://our.internmc.facebook.com/intern/diff/D67770793/) ghstack-source-id: 260031109 Pull Request resolved: #7479 * [ET-VK] Fix metadata UBO VVL warnings ## Context Recently #7015 was implemented so that all tensor metadata (e.g. sizes, strides) would be stored in a single UBO instead of with separate UBO objects. This helps with memory savings presumably due to defragmentation of memory allocations. However, once the change was introduced, I noticed two new warnings produced by the Vulkan Validation Layer. The first complains that the offset of a UBO descriptor is not a multiple of the `minUniformBufferOffsetAlignment` field reported by the physical device properties. The second complains that the range of a UBO descriptor exceeds the offset + range of the underlying UBO object. # Solution To address the first one, instead of using `sizeof(utils::ivec4)` to determine the offset per metadata field, check the `minUniformBufferOffsetAlignment` field of reported by the device and use that instead. The second warning arises because the logic in the constructor of `BufferBindInfo` had a mistake; instead of using the range of the underlying UBO object, it should use the range subtracted by the user specified offset. Differential Revision: [D67770792](https://our.internmc.facebook.com/intern/diff/D67770792/) ghstack-source-id: 260031110 Pull Request resolved: #7480 --------- Co-authored-by: Stephen Jia <[email protected]>
1 parent 7001ac7 commit cb568fa

File tree

7 files changed

+63
-29
lines changed

7 files changed

+63
-29
lines changed

backends/vulkan/runtime/api/containers/Tensor.cpp

Lines changed: 27 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -658,66 +658,77 @@ utils::GPUMemoryLayout vTensor::estimate_memory_layout() const {
658658
}
659659

660660
const vkapi::BufferBindInfo vTensor::sizes_ubo() {
661+
const size_t size_per_ubo = context()->adapter_ptr()->min_ubo_alignment();
662+
const size_t max_ubo_size = kMaxMetadataFieldCount * size_per_ubo;
661663
if (!uniforms_.buffer()) {
662-
uniforms_ = ParamsBuffer(storage_.context_, kMaxUniformBufferSize, true);
664+
uniforms_ = ParamsBuffer(storage_.context_, max_ubo_size, true);
663665
}
664666
if (sizes_uniform_offset_ == kUniformOffsetUnset) {
665667
VK_CHECK_COND(
666-
(uniforms_size_ + kSizePerUniform) <= kMaxUniformBufferSize,
668+
(uniforms_size_ + size_per_ubo) <= max_ubo_size,
667669
"Uniform data allocation has exceeded Tensor uniform buffer size");
668670
sizes_uniform_offset_ = uniforms_size_;
669-
uniforms_size_ += kSizePerUniform;
671+
uniforms_size_ += size_per_ubo;
670672
uniforms_.update(utils::make_whcn_ivec4(sizes_), sizes_uniform_offset_);
671673
}
672-
return vkapi::BufferBindInfo(uniforms_.buffer(), sizes_uniform_offset_);
674+
return vkapi::BufferBindInfo(
675+
uniforms_.buffer(), sizes_uniform_offset_, size_per_ubo);
673676
}
674677

675678
const vkapi::BufferBindInfo vTensor::strides_ubo() {
679+
const size_t size_per_ubo = context()->adapter_ptr()->min_ubo_alignment();
680+
const size_t max_ubo_size = kMaxMetadataFieldCount * size_per_ubo;
676681
if (!uniforms_.buffer()) {
677-
uniforms_ = ParamsBuffer(storage_.context_, kMaxUniformBufferSize, true);
682+
uniforms_ = ParamsBuffer(storage_.context_, max_ubo_size, true);
678683
}
679684
if (unsqueezed_strides_offset_ == kUniformOffsetUnset) {
680685
VK_CHECK_COND(
681-
(uniforms_size_ + kSizePerUniform) <= kMaxUniformBufferSize,
686+
(uniforms_size_ + size_per_ubo) <= max_ubo_size,
682687
"Uniform data allocation has exceeded Tensor uniform buffer size");
683688
unsqueezed_strides_offset_ = uniforms_size_;
684-
uniforms_size_ += kSizePerUniform;
689+
uniforms_size_ += size_per_ubo;
685690
uniforms_.update(
686691
utils::make_whcn_ivec4(unsqueezed_strides_),
687692
unsqueezed_strides_offset_);
688693
}
689-
return vkapi::BufferBindInfo(uniforms_.buffer(), unsqueezed_strides_offset_);
694+
return vkapi::BufferBindInfo(
695+
uniforms_.buffer(), unsqueezed_strides_offset_, size_per_ubo);
690696
}
691697

692698
const vkapi::BufferBindInfo vTensor::logical_limits_ubo() {
699+
const size_t size_per_ubo = context()->adapter_ptr()->min_ubo_alignment();
700+
const size_t max_ubo_size = kMaxMetadataFieldCount * size_per_ubo;
693701
if (!uniforms_.buffer()) {
694-
uniforms_ = ParamsBuffer(storage_.context_, kMaxUniformBufferSize, true);
702+
uniforms_ = ParamsBuffer(storage_.context_, max_ubo_size, true);
695703
}
696704
if (logical_limits_uniform_offset_ == kUniformOffsetUnset) {
697705
VK_CHECK_COND(
698-
(uniforms_size_ + kSizePerUniform) <= kMaxUniformBufferSize,
706+
(uniforms_size_ + size_per_ubo) <= max_ubo_size,
699707
"Uniform data allocation has exceeded Tensor uniform buffer size");
700708
logical_limits_uniform_offset_ = uniforms_size_;
701-
uniforms_size_ += kSizePerUniform;
709+
uniforms_size_ += size_per_ubo;
702710
uniforms_.update(logical_limits(), logical_limits_uniform_offset_);
703711
}
704712
return vkapi::BufferBindInfo(
705-
uniforms_.buffer(), logical_limits_uniform_offset_);
713+
uniforms_.buffer(), logical_limits_uniform_offset_, size_per_ubo);
706714
}
707715

708716
const vkapi::BufferBindInfo vTensor::numel_ubo() {
717+
const size_t size_per_ubo = context()->adapter_ptr()->min_ubo_alignment();
718+
const size_t max_ubo_size = kMaxMetadataFieldCount * size_per_ubo;
709719
if (!uniforms_.buffer()) {
710-
uniforms_ = ParamsBuffer(storage_.context_, kMaxUniformBufferSize, true);
720+
uniforms_ = ParamsBuffer(storage_.context_, max_ubo_size, true);
711721
}
712722
if (numel_uniform_offset_ == kUniformOffsetUnset) {
713723
VK_CHECK_COND(
714-
(uniforms_size_ + kSizePerUniform) <= kMaxUniformBufferSize,
724+
(uniforms_size_ + size_per_ubo) <= max_ubo_size,
715725
"Uniform data allocation has exceeded Tensor uniform buffer size");
716726
numel_uniform_offset_ = uniforms_size_;
717-
uniforms_size_ += kSizePerUniform;
727+
uniforms_size_ += size_per_ubo;
718728
uniforms_.update(numel(), numel_uniform_offset_);
719729
}
720-
return vkapi::BufferBindInfo(uniforms_.buffer(), numel_uniform_offset_);
730+
return vkapi::BufferBindInfo(
731+
uniforms_.buffer(), numel_uniform_offset_, size_per_ubo);
721732
}
722733

723734
size_t vTensor::staging_buffer_numel() const {

backends/vulkan/runtime/api/containers/Tensor.h

Lines changed: 7 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -348,16 +348,13 @@ class vTensor final {
348348
uint32_t numel_uniform_offset_;
349349
uint32_t logical_limits_uniform_offset_;
350350

351-
// Size allocated for each uniform
352-
// each uniform is assumed to be a vec of 4 ints to maintain 16 byte alignemnt
353-
constexpr static size_t kSizePerUniform = sizeof(utils::ivec4);
354-
// Total size of tensor's uniform buffer
355-
constexpr static size_t kMaxUniformBufferSize =
356-
4 * // we have 4 uniforms that are passed on to shaders
357-
kSizePerUniform;
358-
359-
// Initial value of uniform buffer offsets
360-
constexpr static uint32_t kUniformOffsetUnset = kMaxUniformBufferSize;
351+
// Maximum number of metadata fields that can be stored in the metadata UBO.
352+
// This is used to calculate the size of the UBO that should be allocated.
353+
constexpr static size_t kMaxMetadataFieldCount = 4;
354+
355+
// Initial value of uniform buffer offsets. 1 is selected as it is essentially
356+
// impossible for a ubo to have an offset of 1.
357+
constexpr static uint32_t kUniformOffsetUnset = 1;
361358

362359
vTensorStorage storage_;
363360

backends/vulkan/runtime/vk_api/Adapter.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -207,6 +207,10 @@ class Adapter final {
207207
return supports_8bit_storage_buffers() && supports_int8_shader_types();
208208
}
209209

210+
inline size_t min_ubo_alignment() const {
211+
return physical_device_.min_ubo_alignment;
212+
}
213+
210214
// Command Buffer Submission
211215

212216
void

backends/vulkan/runtime/vk_api/Descriptor.cpp

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,17 @@ BufferBindInfo::BufferBindInfo(
2828
const uint32_t offset_p)
2929
: handle(buffer_p.handle()),
3030
offset(buffer_p.mem_offset() + offset_p),
31-
range(buffer_p.mem_range()) {}
31+
range(buffer_p.mem_range() - offset_p) {}
32+
33+
BufferBindInfo::BufferBindInfo(
34+
const VulkanBuffer& buffer_p,
35+
const uint32_t offset_p,
36+
const uint32_t range_p)
37+
: handle(buffer_p.handle()),
38+
offset(buffer_p.mem_offset() + offset_p),
39+
range(range_p) {
40+
VK_CHECK_COND(range_p <= (buffer_p.mem_range() - offset_p));
41+
}
3242

3343
//
3444
// ParamsBindList

backends/vulkan/runtime/vk_api/Descriptor.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,10 @@ struct BufferBindInfo final {
3434

3535
BufferBindInfo();
3636
BufferBindInfo(const VulkanBuffer& buffer_p, const uint32_t offset_p = 0u);
37+
BufferBindInfo(
38+
const VulkanBuffer& buffer_p,
39+
const uint32_t offset_p,
40+
const uint32_t range_p);
3741
};
3842

3943
struct ParamsBindList final {

backends/vulkan/runtime/vk_api/Device.cpp

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,10 +39,17 @@ PhysicalDevice::PhysicalDevice(VkPhysicalDevice physical_device_handle)
3939
num_compute_queues(0),
4040
supports_int16_shader_types(false),
4141
has_unified_memory(false),
42-
has_timestamps(properties.limits.timestampComputeAndGraphics),
43-
timestamp_period(properties.limits.timestampPeriod) {
42+
has_timestamps(false),
43+
timestamp_period(0),
44+
min_ubo_alignment(0) {
4445
// Extract physical device properties
4546
vkGetPhysicalDeviceProperties(handle, &properties);
47+
48+
// Extract fields of interest
49+
has_timestamps = properties.limits.timestampComputeAndGraphics;
50+
timestamp_period = properties.limits.timestampPeriod;
51+
min_ubo_alignment = properties.limits.minUniformBufferOffsetAlignment;
52+
4653
vkGetPhysicalDeviceMemoryProperties(handle, &memory_properties);
4754

4855
VkPhysicalDeviceFeatures2 features2{

backends/vulkan/runtime/vk_api/Device.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,7 @@ struct PhysicalDevice final {
4949
bool has_unified_memory;
5050
bool has_timestamps;
5151
float timestamp_period;
52+
size_t min_ubo_alignment;
5253

5354
explicit PhysicalDevice(VkPhysicalDevice);
5455
};

0 commit comments

Comments
 (0)