Closed
Description
Background
For the extreme code model, we materialize the address of a symbol (either data or code) with:
pcalau12i $t0, %pc_hi20(sym)
addi.d $t1, $zero, %pc_lo12(sym)
lu32i.d $t1, %pc64_lo20(sym)
lu52i.d $t1, $t1, %pc64_hi12(sym)
add.d $t0, $t0, $t1
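To make the roles of the four relocations concrete, here is a minimal Python sketch that models the sequence above. It assumes the usual LoongArch semantics of the five instructions and computes all four immediates relative to a single PC (that of the pcalau12i). The helper names (sext, pcrel64_immediates, run_sequence) are illustrative and not taken from any toolchain source.

```python
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pcrel64_immediates(sym, pc):
    """Return (hi20, lo12, lo20, hi12), all computed relative to `pc`.

    Assumed model: page difference plus carry corrections for the
    sign-extending instructions in the sequence.
    """
    lo12 = sym & 0xFFF
    page_delta = ((sym & ~0xFFF) - (pc & ~0xFFF)) & MASK64
    # addi.d sign-extends lo12, so pcalau12i overshoots by one page
    # whenever lo12 >= 0x800.
    hi = (page_delta + (0x1000 if lo12 > 0x7FF else 0)) & MASK64
    hi20 = (hi >> 12) & 0xFFFFF
    upper = page_delta
    if lo12 > 0x7FF:
        # The low 32 bits of $t1 hold 0xFFFFFxxx after addi.d.
        upper += 0x1000 - 0x1_0000_0000
    if upper & 0x8000_0000:
        # pcalau12i sign-extends hi20 into the upper 32 bits.
        upper += 0x1_0000_0000
    upper &= MASK64
    return hi20, lo12, (upper >> 32) & 0xFFFFF, (upper >> 52) & 0xFFF


def run_sequence(pc, hi20, lo12, lo20, hi12):
    """Emulate the five-instruction sequence and return the final $t0."""
    t0 = ((pc + (sext(hi20, 20) << 12)) & ~0xFFF) & MASK64       # pcalau12i
    t1 = sext(lo12, 12) & MASK64                                  # addi.d $t1, $zero, lo12
    t1 = ((sext(lo20, 20) << 32) | (t1 & 0xFFFF_FFFF)) & MASK64   # lu32i.d
    t1 = ((hi12 << 52) | (t1 & ((1 << 52) - 1))) & MASK64         # lu52i.d
    return (t0 + t1) & MASK64                                     # add.d


# Round trip: the sequence reproduces the symbol address.
for sym in (0x1000000000, 0x1000000800, 0x1000, 0x123456789ABC):
    pc = 0x180000FF8
    assert run_sequence(pc, *pcrel64_immediates(sym, pc)) == sym
```

The round trip only holds because every immediate is derived from the same PC; that single-PC assumption is exactly what the rest of this issue is about.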
Consider this example:
.text
.globl load_addr
load_addr:
la.pcrel $a0, $t0, sym
jr $ra
.data
sym:
.dword 0
Assembling and linking with cc bug.s -Ttext=0x180000ff8 -Tdata=0x1000000000 -shared -nostdlib, we get:
0000000180000ff8 <load_addr>:
180000ff8: 1b000004 pcalau12i $a0, -524288
180000ffc: 02c0000c li.d $t0, 0
180001000: 160001cc lu32i.d $t0, 14
180001004: 0300018c lu52i.d $t0, $t0, 0
180001008: 0010b084 add.d $a0, $a0, $t0
18000100c: 4c000020 ret
But this is wrong: the correct immediate for lu32i.d is 15.
The 14 was computed using the PC of the lu32i.d instruction (0x180001000), whereas the PC of the pcalau12i instruction (0x180000ff8) must be used. The two instructions straddle a 4 KiB page boundary at 0x180001000, so the two PCs yield different page-relative offsets.
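The two values can be reproduced with a short Python sketch. The formula below is an assumption about how the linker evaluates %pc64_lo20 (page difference, plus carry corrections for the sign-extended lo12 and hi20 parts); it is chosen because it yields the observed 14 and the expected 15 for this example. The function name pc64_lo20 is illustrative.

```python
def pc64_lo20(sym, pc):
    """Assumed model of the %pc64_lo20 value (bits 32..51 of the offset)."""
    lo12 = sym & 0xFFF
    delta = (sym & ~0xFFF) - (pc & ~0xFFF)   # page-to-page difference
    if lo12 > 0x7FF:
        # Compensate for the sign-extended lo12 left in the low 32 bits.
        delta += 0x1000 - 0x1_0000_0000
    if delta & 0x8000_0000:
        # Compensate for pcalau12i sign-extending its 20-bit immediate.
        delta += 0x1_0000_0000
    return (delta >> 32) & 0xFFFFF


SYM = 0x1000000000           # address of sym (-Tdata)
PC_PCALAU12I = 0x180000FF8   # first instruction of load_addr
PC_LU32I_D = 0x180001000     # third instruction, already in the next page

print(pc64_lo20(SYM, PC_LU32I_D))    # 14, what the linker emitted
print(pc64_lo20(SYM, PC_PCALAU12I))  # 15, what the sequence needs
```

With the pcalau12i PC, the page difference is 0xE80000000, whose bit 31 is set, so the upper part is bumped from 14 to 15. With the lu32i.d PC, the difference is 0xE7FFFF000 and no bump happens.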
Possible solutions
Easy solution (limiting scheduling)
In GAS, emit 64-bit la.pcrel as-is:
pcalau12i $t0, %pc_hi20(sym)
addi.d $t1, $zero, %pc_lo12(sym)
lu32i.d $t1, %pc64_lo20(sym + 8)
lu52i.d $t1, $t1, %pc64_hi12(sym + 12)
add.d $t0, $t0, $t1
In GCC, if -mexplicit-relocs=always, emit it as:
addi.d $t1, $zero, %pc_lo12(sym)
# The following three instructions must stay adjacent; the scheduler must not insert anything between them
pcalau12i $t0, %pc_hi20(sym)
lu32i.d $t1, %pc64_lo20(sym + 4)
lu52i.d $t1, $t1, %pc64_hi12(sym + 8)
# End of the fixed group
add.d $t0, $t0, $t1
Hard solution (allowing scheduling)
For GAS, use the easy solution.
For GCC, introduce a new reloc type "R_LARCH_EFFECTIVE_PC" and do something like:
1: pcalau12i $t0, %pc_hi20(sym)
addi.d $t1, $zero, %pc_lo12(sym)
.reloc 0, R_LARCH_EFFECTIVE_PC, 1b
lu32i.d $t1, %pc64_lo20(sym)
.reloc 0, R_LARCH_EFFECTIVE_PC, 1b
lu52i.d $t1, $t1, %pc64_hi12(sym)
add.d $t0, $t0, $t1