Commit 552a31f

ssjia committed
[ET-VK] Fix staging buffer allocation to check all memory types for HOST_CACHED
`test_host_cached_available()` only checked `memoryTypes[0]` to determine if HOST_CACHED memory was available. On Pixel devices, `memoryTypes[0]` is DEVICE_LOCAL without HOST_CACHED, so the function incorrectly returned `SEQUENTIAL_WRITE_BIT`. This caused DEVICE_TO_HOST staging buffers to be allocated in write-combining (uncached) memory, making CPU reads during COPY_OUTPUTS ~170x slower than necessary (~40ms vs ~237us on S24).

The fix iterates over all memory types to correctly detect HOST_CACHED support.

On-device profiling of edgetam_first_frame_fp16_vulkan.pte confirms the fix:
- Pixel 8 Pro COPY_OUTPUTS: 40ms -> 6.3ms (-84%)
- Pixel 9 Pro XL COPY_OUTPUTS: 40ms -> 2.5ms (-94%)
- Pixel 8 Pro Method::execute: 492ms -> 464ms (-5.7%)
- Pixel 9 Pro XL Method::execute: 445ms -> 411ms (-7.6%)

Differential Revision: [D97058156](https://our.internmc.facebook.com/intern/diff/D97058156/)

[ghstack-poisoned]
1 parent ed57040 commit 552a31f

1 file changed, +6 -7 lines changed
backends/vulkan/runtime/vk_api/memory/Allocator.cpp

Lines changed: 6 additions & 7 deletions
@@ -16,13 +16,12 @@ VmaAllocationCreateFlags test_host_cached_available(
   VkPhysicalDeviceMemoryProperties mem_props;
   vkGetPhysicalDeviceMemoryProperties(physical_device, &mem_props);
 
-  VkMemoryPropertyFlags const flags = mem_props.memoryTypes->propertyFlags;
-
-  bool const host_visible = flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;
-  bool const host_cached = flags & VK_MEMORY_PROPERTY_HOST_CACHED_BIT;
-
-  if (host_visible && host_cached) {
-    return VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT;
+  for (uint32_t i = 0; i < mem_props.memoryTypeCount; i++) {
+    VkMemoryPropertyFlags flags = mem_props.memoryTypes[i].propertyFlags;
+    if ((flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) &&
+        (flags & VK_MEMORY_PROPERTY_HOST_CACHED_BIT)) {
+      return VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT;
+    }
   }
 
   return VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT;
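
The difference between the old and new checks can be sketched without the Vulkan headers. The following is an illustration only, not code from this commit: the constants `kDeviceLocal`, `kHostVisible`, `kHostCoherent`, and `kHostCached` are hypothetical stand-ins that mirror the numeric values of the corresponding `VK_MEMORY_PROPERTY_*` bits, the helper names are invented, and `example_layout()` is an assumed memory-type ordering (the commit only states that `memoryTypes[0]` on Pixel is DEVICE_LOCAL without HOST_CACHED; the remaining entries are made up).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for VK_MEMORY_PROPERTY_*_BIT, so this sketch
// compiles without the Vulkan headers.
constexpr uint32_t kDeviceLocal = 0x1;
constexpr uint32_t kHostVisible = 0x2;
constexpr uint32_t kHostCoherent = 0x4;
constexpr uint32_t kHostCached = 0x8;

// Old behavior (sketch): only the first memory type is inspected.
bool first_type_is_host_cached(const std::vector<uint32_t>& types) {
  if (types.empty()) {
    return false;
  }
  const uint32_t flags = types[0];
  return (flags & kHostVisible) && (flags & kHostCached);
}

// New behavior (sketch): every memory type is inspected, and the check
// succeeds as soon as one is both host-visible and host-cached.
bool any_type_is_host_cached(const std::vector<uint32_t>& types) {
  for (const uint32_t flags : types) {
    if ((flags & kHostVisible) && (flags & kHostCached)) {
      return true;
    }
  }
  return false;
}

// Assumed layout for illustration: index 0 is device-local only, and a
// host-cached type appears later in the list.
std::vector<uint32_t> example_layout() {
  return {
      kDeviceLocal,
      kHostVisible | kHostCoherent,
      kHostVisible | kHostCoherent | kHostCached,
  };
}
```

With this assumed layout, `first_type_is_host_cached(example_layout())` is false while `any_type_is_host_cached(example_layout())` is true, which corresponds to the case where the old code fell through to `VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT`.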