-
Notifications
You must be signed in to change notification settings - Fork 13.5k
ARM Assembler problem with ADR instruction #13613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
ARM ARM says below in A4.2.2 Use of labels in UAL instruction syntax, "When the assembler calculates an offset of 0 for the normal syntax of this instruction, it must assemble the encoding that adds 0 to the Align(PC,4) value of the instruction. The encoding that subtracts 0 from the Align(PC,4) value cannot be specified by the normal syntax." So the decoding of "0x00 0x00 0x4f 0xe2" is "sub r0, pc, #0" follows the rule. |
For thumb2, ARM ARM says, It is recommended that the alternative syntax forms are avoided where possible. However, the only possible syntax for encoding T2 with all immediate bits zero is |
Come up with an even general case, $ echo 'adr r0,#0x4' | llvm-mc -triple=armv7 -show-inst -show-encoding |
/* the following example is incorrect / /* the following example is correct */ This result is quite wired. There are two issues.
|
Hi Richard, is this still a problem? |
mentioned in issue llvm/llvm-bugzilla-archive#18926 |
#66343) The FileCheck string `LLVMFuzzerCustomMutatorLongSequence: {{.*}} MS: {{[0-9]*}} {{(([a-zA-Z]*-){11,})}} {{.*}}` is too restrictive and may fail the test in some case. If we look at the commit that added this check(66df989), This check is for printing out the long mutation sequence, such as this one ``` #53552 REDUCE cov: 6 ft: 6 corp: 5/9b lim: 4096 exec/s: 0 rss: 37Mb L: 2/3 MS: 54 ChangeByte-PersAutoDict-ChangeBit-ChangeBinInt-ChangeBit-ChangeBit-ChangeByte-CMP-EraseBytes-EraseBytes-CrossOver-InsertRepeatedBytes-ChangeByte-EraseBytes-InsertRepeatedBytes-ShuffleBytes-ChangeByte-ShuffleBytes-ChangeBit-CrossOver-ChangeBit-ShuffleBytes-ChangeBinInt-ShuffleBytes-EraseBytes-InsertByte-Custom-ShuffleBytes-CopyPart-InsertRepeatedBytes-PersAutoDict-InsertRepeatedBytes-ChangeByte-CrossOver-CrossOver-PersAutoDict-PersAutoDict-EraseBytes-ChangeBit-CopyPart-ChangeByte-CopyPart-InsertRepeatedBytes-CrossOver-CrossOver-CrossOver-CrossOver-ShuffleBytes-EraseBytes-InsertByte-InsertRepeatedBytes-CrossOver-EraseBytes-Custom- DE: "\377\377"-"\001\000"-"\001\000"-"\000\000\000\000\000\000\000\000"-"\001\000\000\000"- ``` But if we look at the code doing the printing ```cpp void MutationDispatcher::PrintMutationSequence(bool Verbose) { Printf("MS: %zd ", CurrentMutatorSequence.size()); size_t EntriesToPrint = Verbose ? CurrentMutatorSequence.size() : std::min(kMaxMutationsToPrint, CurrentMutatorSequence.size()); for (size_t i = 0; i < EntriesToPrint; i++) Printf("%s-", CurrentMutatorSequence[i].Name); if (!CurrentDictionaryEntrySequence.empty()) { Printf(" DE: "); EntriesToPrint = Verbose ? CurrentDictionaryEntrySequence.size() : std::min(kMaxMutationsToPrint, CurrentDictionaryEntrySequence.size()); for (size_t i = 0; i < EntriesToPrint; i++) { Printf("\""); PrintASCII(CurrentDictionaryEntrySequence[i]->GetW(), "\"-"); } } } ``` We can see that the `DE: XXX` is not always printed. So the following output is possible(and is from real-life failure), notince the missing of `DE: XXX`. ``` #13613 NEW cov: 5 ft: 5 corp: 4/6b lim: 4096 exec/s: 0 rss: 32Mb L: 2/2 MS: 27 InsertByte-ChangeBinInt-ChangeBinInt-CrossOver-ShuffleBytes-ChangeBit-EraseBytes-ShuffleBytes-InsertByte-InsertRepeatedBytes-CopyPart-InsertByte-ChangeByte-ChangeBit-InsertByte-CrossOver-EraseBytes-CopyPart-ShuffleBytes-EraseBytes-InsertByte-InsertRepeatedBytes-CrossOver-CrossOver-ShuffleBytes-ChangeBit-Custom- #13765 ...... ``` This output is totally legit and will fail that check. So I remove the check for the following strings, I think `MS: {{[0-9]*}} {{(([a-zA-Z]*-){11,})}}` is sufficient for checking the long mutation sequence. This should help resolve the flaky failure of fuzzer-custommutator.test.
Extended Description
The MCInst for the ARM ADR instruction as created by the MC assembler is re-encoded into an ADD/SUB instruction (Reproducers below)
It seems to me that there has been some re-engineering of the ADD/SUB instructions and the ADR instruction has not been taken into account in this.
Reproduce with:
(A1 encoding)
echo 'ADR r0,#0x40000000' | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r0, #1073741824 @ encoding: [0x00,0x00,0x8f,0xe2]
@ <MCInst #30 ADR
@
@
@
@ >
echo 0x00 0x00 0x8f 0xe2 | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
add r0, pc, #0 @ encoding: [0x00,0x00,0x8f,0xe2]
@ <MCInst #24 ADDri
@
@
@
@
@
@ >
(A2 encoding)
echo 'ADR r0,#-0x0' | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r0, #-2147483648 @ encoding: [0x00,0x00,0x4f,0xe2]
@ <MCInst #30 ADR
@
@
@
@ >
echo 0x00 0x00 0x4f 0xe2 | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
sub r0, pc, #0 @ encoding: [0x00,0x00,0x4f,0xe2]
@ <MCInst #456 SUBri
@
@
@
@
@
@ >
The text was updated successfully, but these errors were encountered: