
do_nop is not negligible #177

Closed
@jserv

Description


According to perf report, do_nop consumes a non-negligible share of CPU cycles (about 3% in the profile below). These nops are generated by macro-operation fusion, so we should eliminate their overhead as early as possible.
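To illustrate how fusion can leave nops behind, here is a minimal C sketch. It is not rv32emu's actual code: the opcode names, the insn_t layout, and the fuse() and compact() helpers are all hypothetical, and the idea that fused-away instructions are rewritten into nop placeholders is an assumption made for illustration. The sketch fuses a lui/addi pair into one operation, marks the second slot as a nop, and then drops nop slots so they are never dispatched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only (not rv32emu's real enum). */
typedef enum { OP_NOP, OP_LUI, OP_ADDI, OP_FUSE_LUI_ADDI } op_t;

/* Hypothetical decoded-instruction record. */
typedef struct {
    op_t op;
    int rd;      /* destination register */
    int rs1;     /* source register (unused by OP_LUI) */
    int32_t imm; /* immediate; for OP_LUI it is already shifted left by 12 */
} insn_t;

/* Fuse "lui rd, hi ; addi rd, rd, lo" into a single load-constant op.
 * The second slot is rewritten to OP_NOP, which is the assumed source of
 * the extra nop dispatches. */
static void fuse(insn_t *ir, size_t n)
{
    assert(ir != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        if (ir[i].op == OP_LUI && ir[i + 1].op == OP_ADDI &&
            ir[i + 1].rd == ir[i].rd && ir[i + 1].rs1 == ir[i].rd) {
            ir[i].op = OP_FUSE_LUI_ADDI;
            /* Add in unsigned arithmetic to avoid signed-overflow UB. */
            ir[i].imm = (int32_t) ((uint32_t) ir[i].imm +
                                   (uint32_t) ir[i + 1].imm);
            ir[i + 1].op = OP_NOP;
        }
    }
}

/* Remove nop slots in place and return the new instruction count, so the
 * dispatch loop never calls a nop handler. */
static size_t compact(insn_t *ir, size_t n)
{
    assert(ir != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (ir[i].op != OP_NOP)
            ir[out++] = ir[i];
    }
    return out;
}
```

For a three-instruction block (lui, addi, addi), fuse() followed by compact() leaves two slots to dispatch instead of three. A real emulator would also have to keep program-counter accounting correct after compaction, for example by having the fused operation advance the PC past every instruction it replaced; that part is omitted here.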

Reproduce:

  1. Apply the following patch:
--- a/Makefile
+++ b/Makefile
@@ -5,6 +5,7 @@ OUT ?= build
 BIN := $(OUT)/rv32emu
 
 CFLAGS = -std=gnu99 -O2 -Wall -Wextra
+CFLAGS += -g -fno-omit-frame-pointer
 CFLAGS += -Wno-unused-label
 CFLAGS += -include src/common.h
 
@@ -80,7 +81,7 @@ endif
 
 # For tail-call elimination, we need a specific set of build flags applied.
 # FIXME: On macOS + Apple Silicon, -fno-stack-protector might have a negative impact.
-$(OUT)/emulate.o: CFLAGS += -fomit-frame-pointer -fno-stack-check -fno-stack-protector
+$(OUT)/emulate.o: CFLAGS += -fno-stack-check -fno-stack-protector
 
 # Clear the .DEFAULT_GOAL special variable, so that the following turns
 # to the first target after .DEFAULT_GOAL is not set.
  2. Rebuild and run the dhrystone benchmark:
$ make clean all
$ perf record -g build/rv32emu build/dhrystone.elf
  3. Check the report with perf report -g and note the percentage attributed to do_nop:
-   99.87%     0.16%  rv32emu  rv32emu             [.] main
   - 99.71% main
      + 68.62% rv_step
        8.11% do_addi
        3.05% do_nop
        3.05% do_auipc.part.0
        2.26% block_find
        1.96% do_bne
        1.65% do_sw
        1.60% do_lw
        1.24% do_bgeu
        1.18% do_beq
        1.05% do_or
        0.95% do_fuse3
        0.83% do_andi
        0.82% do_lbu
        0.72% do_jal
