-
Notifications
You must be signed in to change notification settings - Fork 59
Specification of inline assembly #422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
One option I think would be to say that for an interpreter the user has to provide a specification of all effects the asm block has on the rust AM state. This specification would remain unchanged even if the code bytes change. For example if the expected changes to the code bytes would be to change a constant that will be returned, then the specification could say that the returned value is non-deterministically chosen or list the exact conditions in which self-modifying code will do a specific modification. And for instructions that are implemented differently on different cpu's the specification would list all possible implementations a cpu is allowed to have. For a compiler the compiler would have to assume any specification expressible in terms of AM operations would be the actual one without requiring the user to explicitly provide it. |
IMO it makes little sense to discuss FFI and inline assembly separately, so this should be merged with #421. Inline assembly is just a particularly intertwined form of FFI with assembly. There might be some things that apply to inline asm only but we can only start discussing them after having laid the foundations of general FFI. Also, what we're discussing here isn't going to be very "operational". There's no way to give an operational semantics to FFI/asm in general (you can only do that for FFI with a particular other language, by using that language's / ISA's opsem, but that's not even what we want to do here). The question is more, what are the reasoning principles we allow the compiler and what are the obligations we put on the programmer. |
I'm not entirely sure that they should be discussed together - asm simultaneously allows more and less reasoning than FFI. As for the opsem side, we can still define the opsem of the operation in rust (either the call to the extern function, or the asm invocation), and in some ways limit what the result can do to the AM state. That is what I'm specifically asking in #421 and this issue. If we were just discussing the informal constraints, then the reference would already answer this - my question is "How does opsem get as close as possible to the constraints specified by T-lang and project-inline-asm". |
Hm, okay I guess there will be some differences, but the overall architecture is going to (have to) be extremely similar I think. |
Answering the question operationally also it makes it easier to expand to future extensions of inline-assembly. For example, if the "proof" specification is used, then an option like transparent could extend that specification by allowing the implementation to validate that the contents of the assembly block is the proof it requires for the operation.
Yeah, and IMO those differences should be discussed on separate issues. |
There is no answering this question operationally though, at least not under any useful interpretation of the term "operational". MiniRust won't be able to interpret inline asm or FFI. The formal spec will take an axiomatically described transition on the AM state, but it's not something one can operationalize. |
I've corrected the title of this issue and #421, to say "Specification" rather than "Opsem". |
I don't think this is a coherent goal. The behavior of That said, if Miri ever somehow grows partial support for modelling FFI, it can implement As for Anything you do to the object file afterwards is up to you, or whatever build system is manipulating the file. Optimization of the object file (e.g. wasm-opt) isn't in violation of the Rust semantics of This does, however, probably put a minor bound on the kinds of cross-language optimization that a multiple-language toolchain can do by default and still correctly implement
The behavior of Rust/rustc on a given target is, at some point, defined in terms of the target and target tooling. Footnotes
|
Well, that's why I added the escape hatch of allowing some kind of AM termination. IMO it should be possible to implement a particular instance of the Rust Abstract Machine directly (like miri does), and have that be a compliant implementation of rust, regardless of whether it is fully useful in all cases
Note that this has more interesting implications on MCE (machine code emission) opts. Although, some MCE opts are definitely allowed, by fiat, otherwise gnu as and llvm-as wouldn't correctly implement it. For example, the translation of SSE instructions to 128-bit AVX instructions by prepending a VEX prefix is definitely allowed.
I'm not sure this is true of Rust as a language. It is possible for the compiler to ship an entirely closed toolchain and exclusively use that. In fact, rustc-llvm does this on some targets for the inline assembler. lccc goes a step further and also acts as the linker frontend, and the first external tool it invokes is ld (not even ar for rlibs/staticlibs but rustc already doesn't do that). Rust is also not unwilling to place constraints on target tooling (e.g. |
Perhaps, but rustc supports IMO, an implementation that is missing some or all of those features, including Miri, can still be a Rust implementation, but it would be a Rust implementation for a different target. In other words, 'target' means not just 'what computer it runs on' but 'how it interacts with a particular execution environment' and 'what target-specific functionality it provides'. Side rant: I also think that such functionality theoretically belongs in the Rust Specification, even if in practice specifying Rust is hard enough as it is without getting into target-specific details. In the C and C++ world, the language specification dismisses anything target-specific as 'implementation-defined behavior', yet often there are actually multiple implementations trying to be compatible with each other on the same target, but with no formal process for achieving consensus. In theory that could be the responsibility of the OS vendor or some other organization, but in practice it tends to be disregarded. Admittedly there does tend be a formal spec of the 'ABI' (struct layout and function calling), but I'm thinking of things like assembler directives or GCC command-line options. Or even the ELF and Mach-O formats, some aspects of which are specified only by 'an old mailing list post' or 'comments in a header file'. Anyway, in Rust's case, to the extent there are Rust-specific target-specific behaviors, the OS vendors are certainly not going to document them, so the responsibility falls to the Rust Project. At minimum that includes things like rustc command-line arguments (which are necessarily 'target-specific' at least in the sense that you could theoretically port the Rust compiler to a system that doesn't even have a command line). |
I think this is resolved by what you wrote in #421: if these optimizations truly preserve all behavior that is "observable" by these asm blocks, then the optimizations are allowed.
This sounds like even inline asm blocks are not fully opaque, and the notion of "observable" is restricted to not let code read itself? |
To be clear, if this was not allowed, then the cranelift codegen at least is non-compliant and likely cannot ever be compliant, as it runs an external assembler, GNU LD usualyl, which is known to perform this transformation in some cases. It is also possible that LLVM's internal assembler does the same, but I have not verifed this behavior or the lack thereof. The alternative is that rustc is permanently non-compliant on all x86 targets, which I assume is a non-starter. |
What are the operational semantics of the following program:
Specifically, how do we guarantee (and can we even guarantee) the following properties from the reference:
asm
in a "pure AM" implementation, such as miri.@rustbot label +S-pending-design
Note: This should get an A-label, but no label suitable for inline-assembly curently exists. Could someone with maintain access add such a label and then apply it to this issue?
The text was updated successfully, but these errors were encountered: