Implement prefetch hints for aarch64 #918
Merged
6 commits
79a66f2 Implement prefetch hints for aarch64 (dgbo)
aa6c0f4 fix failed style check (dgbo)
2c41b38 add rustc_args_required_const and match (rw, locality) pairs (dgbo)
e4d43dd add const_fn_transmute feature to lib.rs (dgbo)
a506b1c fix failed style check in pref macro match (dgbo)
1bef1ee fix failed style check in pref macro match (dgbo)
@@ -0,0 +1,89 @@
#[cfg(test)]
use stdarch_test::assert_instr;

extern "C" {
    #[link_name = "llvm.prefetch"]
    fn prefetch(p: *const i8, rw: i32, loc: i32, ty: i32);
}

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_READ: i32 = 0;

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_WRITE: i32 = 1;

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_LOCALITY0: i32 = 0;

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_LOCALITY1: i32 = 1;

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_LOCALITY2: i32 = 2;

/// See [`_prefetch`](fn._prefetch.html).
pub const _PREFETCH_LOCALITY3: i32 = 3;

/// Fetch the cache line that contains address `p` using the given `rw` and `locality`.
///
/// The `rw` must be one of:
///
/// * [`_PREFETCH_READ`](constant._PREFETCH_READ.html): the prefetch is preparing
///   for a read.
///
/// * [`_PREFETCH_WRITE`](constant._PREFETCH_WRITE.html): the prefetch is preparing
///   for a write.
///
/// The `locality` must be one of:
///
/// * [`_PREFETCH_LOCALITY0`](constant._PREFETCH_LOCALITY0.html): Streaming or
///   non-temporal prefetch, for data that is used only once.
///
/// * [`_PREFETCH_LOCALITY1`](constant._PREFETCH_LOCALITY1.html): Fetch into level 3 cache.
///
/// * [`_PREFETCH_LOCALITY2`](constant._PREFETCH_LOCALITY2.html): Fetch into level 2 cache.
///
/// * [`_PREFETCH_LOCALITY3`](constant._PREFETCH_LOCALITY3.html): Fetch into level 1 cache.
///
/// The prefetch memory instructions signal to the memory system that memory accesses
/// from a specified address are likely to occur in the near future. The memory system
/// can respond by taking actions that are expected to speed up the memory accesses when
/// they do occur, such as preloading the specified address into one or more caches.
/// Because these signals are only hints, it is valid for a particular CPU to treat
/// any or all prefetch instructions as a NOP.
///
/// [Arm's documentation](https://developer.arm.com/documentation/den0024/a/the-a64-instruction-set/memory-access-instructions/prefetching-memory?lang=en)
#[inline(always)]
#[cfg_attr(test, assert_instr("prfm pldl1strm", rw = _PREFETCH_READ, locality = _PREFETCH_LOCALITY0))]
#[cfg_attr(test, assert_instr("prfm pldl3keep", rw = _PREFETCH_READ, locality = _PREFETCH_LOCALITY1))]
#[cfg_attr(test, assert_instr("prfm pldl2keep", rw = _PREFETCH_READ, locality = _PREFETCH_LOCALITY2))]
#[cfg_attr(test, assert_instr("prfm pldl1keep", rw = _PREFETCH_READ, locality = _PREFETCH_LOCALITY3))]
#[cfg_attr(test, assert_instr("prfm pstl1strm", rw = _PREFETCH_WRITE, locality = _PREFETCH_LOCALITY0))]
#[cfg_attr(test, assert_instr("prfm pstl3keep", rw = _PREFETCH_WRITE, locality = _PREFETCH_LOCALITY1))]
#[cfg_attr(test, assert_instr("prfm pstl2keep", rw = _PREFETCH_WRITE, locality = _PREFETCH_LOCALITY2))]
#[cfg_attr(test, assert_instr("prfm pstl1keep", rw = _PREFETCH_WRITE, locality = _PREFETCH_LOCALITY3))]
#[rustc_args_required_const(1, 2)]
pub unsafe fn _prefetch(p: *const i8, rw: i32, locality: i32) {
    // We use the `llvm.prefetch` intrinsic with `cache type` = 1 (data cache).
    // `rw` and `locality` are based on the function parameters.
    macro_rules! pref {
        ($rdwr:expr, $local:expr) => {
            match ($rdwr, $local) {
                (0, 0) => prefetch(p, 0, 0, 1),
                (0, 1) => prefetch(p, 0, 1, 1),
                (0, 2) => prefetch(p, 0, 2, 1),
                (0, 3) => prefetch(p, 0, 3, 1),
                (1, 0) => prefetch(p, 1, 0, 1),
                (1, 1) => prefetch(p, 1, 1, 1),
                (1, 2) => prefetch(p, 1, 2, 1),
                (1, 3) => prefetch(p, 1, 3, 1),
                (_, _) => panic!(
                    "Illegal (rw, locality) pair in prefetch, value ({}, {}).",
                    $rdwr, $local
                ),
            }
        };
    }
    pref!(rw, locality);
}
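For reference, the `assert_instr` attributes above pair each `(rw, locality)` combination with a `prfm` operand. The sketch below restates that table as a plain function; `prfm_operand` is a hypothetical helper written for illustration and is not part of the PR.

```rust
/// Hypothetical helper (not part of the PR): returns the `prfm` operand that
/// the `assert_instr` attributes in the diff expect for a given pair.
/// `rw`: 0 = read, 1 = write. `locality`: 0..=3, as in the `_PREFETCH_*` constants.
pub fn prfm_operand(rw: i32, locality: i32) -> Option<&'static str> {
    match (rw, locality) {
        // Reads use the `pld` (prefetch for load) family.
        (0, 0) => Some("pldl1strm"),
        (0, 1) => Some("pldl3keep"),
        (0, 2) => Some("pldl2keep"),
        (0, 3) => Some("pldl1keep"),
        // Writes use the `pst` (prefetch for store) family.
        (1, 0) => Some("pstl1strm"),
        (1, 1) => Some("pstl3keep"),
        (1, 2) => Some("pstl2keep"),
        (1, 3) => Some("pstl1keep"),
        // Any other pair has no entry in the table.
        _ => None,
    }
}
```

Note that locality 0 maps to the streaming (`strm`) form targeting L1, while localities 1 to 3 map to the `keep` forms for L3, L2 and L1 respectively.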
Use `#[rustc_args_required_const(1, 2)]` to ensure that `rw` and `locality` are compile-time constants.
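`rustc_args_required_const` is a compiler-internal attribute, so it cannot be tried out in ordinary code. As a rough analogue only, the sketch below uses const generics to get a similar "arguments must be compile-time constants" guarantee. `HintArgs` and `prefetch_args` are hypothetical names, and the function merely returns the argument triple that would be handed to the intrinsic rather than issuing a prefetch.

```rust
/// Hypothetical carrier type used only to validate the constants.
struct HintArgs<const RW: i32, const LOCALITY: i32>;

impl<const RW: i32, const LOCALITY: i32> HintArgs<RW, LOCALITY> {
    /// Evaluated per instantiation; an out-of-range pair fails the build.
    const VALID: () = assert!(
        (RW == 0 || RW == 1) && (LOCALITY >= 0 && LOCALITY <= 3),
        "illegal (rw, locality) pair"
    );
}

/// Returns `(rw, locality, cache_type)` with `cache_type` fixed to 1
/// (data cache), mirroring the arguments used in the diff.
pub fn prefetch_args<const RW: i32, const LOCALITY: i32>() -> (i32, i32, i32) {
    // Referencing the constant forces the check for this instantiation.
    let () = HintArgs::<RW, LOCALITY>::VALID;
    (RW, LOCALITY, 1)
}
```

With this shape, `prefetch_args::<0, 3>()` yields `(0, 3, 1)`, and an instantiation such as `prefetch_args::<2, 0>()` should be rejected when the crate is built instead of panicking at run time.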
You also need to match on every valid `(rw, locality)` pair so that the `prefetch` call always gets constant values as arguments, and you need to handle invalid values.
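The pattern being requested can be sketched in isolation: each match arm calls the target with literal arguments, so every call site sees constants even though the selector values arrive as variables. In the sketch below, `prefetch_stub` is a hypothetical stand-in that only echoes its arguments, and `dispatch` returns an `Err` for invalid pairs instead of panicking; both choices are assumptions made so that the example runs anywhere.

```rust
/// Hypothetical stand-in for the `llvm.prefetch` binding; it only echoes
/// its arguments so the dispatch can be observed.
fn prefetch_stub(rw: i32, locality: i32, cache_type: i32) -> (i32, i32, i32) {
    (rw, locality, cache_type)
}

/// Maps a `(rw, locality)` pair to a call with literal arguments.
/// The third argument is always 1 (data cache), as in the diff.
pub fn dispatch(rw: i32, locality: i32) -> Result<(i32, i32, i32), String> {
    match (rw, locality) {
        (0, 0) => Ok(prefetch_stub(0, 0, 1)),
        (0, 1) => Ok(prefetch_stub(0, 1, 1)),
        (0, 2) => Ok(prefetch_stub(0, 2, 1)),
        (0, 3) => Ok(prefetch_stub(0, 3, 1)),
        (1, 0) => Ok(prefetch_stub(1, 0, 1)),
        (1, 1) => Ok(prefetch_stub(1, 1, 1)),
        (1, 2) => Ok(prefetch_stub(1, 2, 1)),
        (1, 3) => Ok(prefetch_stub(1, 3, 1)),
        // Invalid pairs are reported instead of being forwarded.
        _ => Err(format!(
            "Illegal (rw, locality) pair in prefetch, value ({}, {}).",
            rw, locality
        )),
    }
}
```

Returning a `Result` is a design variation for the sketch; the PR's `pref` macro panics on an invalid pair.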
Thanks a lot for the quick review.
`#[rustc_args_required_const(1, 2)]` is now added just before the function `_prefetch`.
I created a macro `pref` to match all valid `(rw, locality)` pairs.
An invalid pair now causes a panic.
Suggestions?