
Asynchronous interrupts #532

Merged
tilk merged 14 commits into master from async-interrupts on Dec 16, 2023

Conversation

@awariac
Member

@awariac awariac commented Dec 3, 2023

Implementation of asynchronous interrupts, based on an idea that @lekcyjna123, @Arusekk and I came up with.

Async interrupts are handled just like exceptions (via the internal ExceptionCauseRegister) and are inserted only at branch instructions (which already know their PC in the FU), and conditionally also immediately after (at) MRET and CSR instructions, because that is required by the spec.
This way we don't need to store the PC in the ROB or use it in all FUs. Additionally, the same logic could be used for handling mispredictions (very easy to do)!
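The insertion rule described above can be sketched in plain Python (a hypothetical model, not the actual Amaranth implementation; the optype names and function names are made up for illustration):

```python
# Toy model of the insertion rule: an async interrupt is reported through
# the same path as a synchronous exception, but only at instructions
# whose PC the FU already knows.

def may_report_interrupt(optype):
    """Interrupts piggyback on instructions that carry their own PC:
    branches/jumps (which compute it anyway), plus MRET and CSR
    instructions, where the spec requires prompt interrupt handling."""
    return optype in {"branch", "jump", "mret", "csr"}

def execute(optype, interrupt_pending):
    # When allowed, report the interrupt via the existing exception flow
    # (modeling the ExceptionCauseRegister report).
    if interrupt_pending and may_report_interrupt(optype):
        return "async_interrupt"
    return None

assert execute("branch", True) == "async_interrupt"
assert execute("add", True) is None  # plain ALU ops never report interrupts
```

This is why no FU other than the ones above ever needs the instruction's PC.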

InterruptController is a temporary solution and will be replaced with a proper CSR-based controller with the same interface in another PR (in progress).

Marked as draft because I want to add some simple tests (and probably port the ones from feature/interrupts).

Of course this depends on #523 (which is also needed to see a nice diff :) )

@awariac awariac added the enhancement New feature or request label Dec 3, 2023
@awariac awariac added this to the Implement machine mode ISA milestone Dec 3, 2023
@awariac awariac linked an issue Dec 3, 2023 that may be closed by this pull request
@awariac awariac added the microarch Involves the processor's microarchitecture label Dec 6, 2023
Contributor

@lekcyjna123 lekcyjna123 left a comment

lgtm

@awariac awariac marked this pull request as ready for review December 13, 2023 19:50
@awariac
Member Author

awariac commented Dec 13, 2023

PR is ready - I integrated @Kristopher38's tests from feature/interrupts and extended them to cover pending interrupts during ISR execution.

Two bugs were found and fixed:

  • Missing conflict between exception_stall and fetch_verify in the same cycle (which probably could also happen on exceptions) - fetch was stalling due to invalid state
  • Core not resumed after committing an instruction with an async interrupt when it is the only instruction in the core (introduced by, and only related to, this PR)

I added Zicsr to basic_core_config, because it already includes the ExceptionUnit, and handling exceptions wouldn't make too much sense without CSR support (mepc etc.)

@awariac awariac requested review from lekcyjna123 and tilk December 13, 2023 20:01
@Kristopher38
Member

Does this mean we're really close to getting speculative execution?

Member

@Kristopher38 Kristopher38 left a comment

(Un)healthy dose of bikeshedding - I finished reviewing this, and it turns out almost all of my comments are about comments, but that's probably a good sign if anything.

Comment thread coreblocks/core.py
Comment thread coreblocks/stages/retirement.py Outdated
Comment thread test/stages/test_retirement.py
Comment thread test/test_core.py Outdated
Comment on lines +340 to +341
# 10-15 is the smallest feasible cycle count between interrupts to provide forward progress
("interrupt.asm", 300, {4: 21, 8: 9349}, {2: 21, 7: 9349, 31: 0xDE}, 10, 15),
Member

Have you tried to lower the hi and lo params? This comment was true in my implementation, but it might no longer hold.

Member Author

It turns out we always have forward progress! Updated the values.

Member

Aren't we required by the spec to handle the next interrupt immediately after the previous one returns?

Member Author

Yes, but hi/lo refers only to execution cycles while not in the ISR. Interrupts that happen during the ISR and are handled immediately at MRET are triggered by an independent random() < 0.4 condition in the test.
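As a hedged illustration, the test's interrupt timing could be modeled like this (plain Python; next_interrupt_delay is a made-up name, not the actual test helper):

```python
import random

def next_interrupt_delay(in_isr, lo=10, hi=15):
    """Toy model of the test's interrupt generator: outside the ISR,
    interrupts are spaced by lo..hi execution cycles; inside the ISR,
    each cycle independently raises one with probability 0.4, so a
    pending interrupt is typically taken right at MRET."""
    if in_isr:
        # geometric waiting time for the per-cycle `random() < 0.4` trigger
        delay = 1
        while random.random() >= 0.4:
            delay += 1
        return delay
    return random.randint(lo, hi)

# Outside the ISR, the delay always stays within [lo, hi].
assert all(10 <= next_interrupt_delay(False) <= 15 for _ in range(100))
```

So lowering hi/lo only tightens the spacing outside the ISR; the at-MRET case is exercised by the independent in-ISR trigger.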

Comment thread test/test_core.py Outdated
Comment thread coreblocks/structs_common/csr.py
Comment thread coreblocks/fu/priv.py Outdated
@awariac
Member Author

awariac commented Dec 16, 2023

Does this mean we're really close to getting speculative execution?

Yes! And I plan to integrate it (at least partially) via reporting branch mispredictions as exceptions in the next, or next-next, PR.
Things that will still be TODO:

  • Support for a branch predictor (for now I think I will hard-code the pc+4 prediction in the JB unit)
  • Experimenting with the performance of different branch predictors

Things TODO for the future (that we forgot about):

  • Our exception handling method/core flushing is the slowest possible one - maybe optimize it/write a new one (or make a faster one for jumps, see the next point)
  • Further optimization, or an alternative - make checkpoints on speculative jumps, so on a misprediction we won't need to wait until the jump is retired to clear the whole ROB; instead we can discard only the mis-speculated path and start fetching and executing the proper path immediately. This can be done with a separate structure, and only for jumps - real exceptions/interrupts could keep using the slower method.
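A toy sketch of the checkpointing idea (the class and method names are hypothetical, not the planned implementation): snapshot the ROB tail at each speculative jump, and on a misprediction drop only the wrong-path suffix instead of flushing everything.

```python
# Hypothetical model of ROB checkpointing for speculative jumps.

class ROB:
    def __init__(self):
        self.entries = []
        self.checkpoints = {}

    def push(self, instr):
        self.entries.append(instr)

    def checkpoint(self, tag):
        # Taken at a speculative jump, before fetching the predicted path.
        self.checkpoints[tag] = len(self.entries)

    def rollback(self, tag):
        # Misprediction: discard only the wrong-path suffix; the prefix
        # (including the jump itself) keeps executing and retiring.
        self.entries = self.entries[: self.checkpoints.pop(tag)]

rob = ROB()
rob.push("add")
rob.push("beq")          # speculative jump
rob.checkpoint("beq0")
rob.push("wrong1")       # wrong-path instructions
rob.push("wrong2")
rob.rollback("beq0")     # no full flush, no wait for the jump to retire
assert rob.entries == ["add", "beq"]
```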

Member

@Kristopher38 Kristopher38 left a comment

Yes! And I plan to integrate it (at least partially) via reporting branch mispredictions as exceptions in the next, or next-next, PR.

Awesome, can't wait to see it happen! I might even bang out some nontrivial branch predictor if my time constraints allow it.

Member

@tilk tilk left a comment

LGTM

@tilk tilk merged commit c63beb4 into master Dec 16, 2023
@tilk tilk deleted the async-interrupts branch December 16, 2023 18:50
github-actions Bot pushed a commit that referenced this pull request Dec 16, 2023
@tilk
Member

tilk commented Dec 16, 2023

Unfortunately, Fmax went down a lot, this needs to be investigated.

@awariac
Member Author

awariac commented Dec 16, 2023

@tilk
I managed to fix the Fmax problem by inserting a FIFO in front of the ExceptionCauseRegister report method (back to 52 MHz). I will post the PR tomorrow, but I don't think this is the real problem that we are trying to solve.
Before this PR (logs from the previous commit), the JB unit was on the critical path, with some weird dependency RS_Optype->Decoder->JB_taken->RS_Funct3(???)->FIFO_branch. Maybe I've read it incorrectly, but in any case it doesn't seem good to me that the JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it became visible. I think it needs further investigation.
Also, none of the above explains the gigantic spike in Carry LUT usage and the IPC loss on some benchmarks.
What do you think about it?
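A rough cycle-level model of that fix (hypothetical names; the real ExceptionCauseRegister is an Amaranth/Transactron module): buffering reports in a FIFO moves the register update off the reporting FU's combinational path, at the cost of one cycle of latency - which is also where a race-condition concern could come from.

```python
from collections import deque

class BufferedCauseRegister:
    """Toy model: report() only enqueues (a short combinational path);
    the actual cause-register update happens a cycle later in tick()."""

    def __init__(self):
        self.fifo = deque()
        self.cause = None

    def report(self, cause):
        # Short path seen by the reporting FU: just an enqueue.
        self.fifo.append(cause)

    def tick(self):
        # Register update happens in a later cycle, off the critical path.
        if self.fifo and self.cause is None:
            self.cause = self.fifo.popleft()

r = BufferedCauseRegister()
r.report("illegal_instruction")
assert r.cause is None   # not visible in the reporting cycle any more
r.tick()
assert r.cause == "illegal_instruction"
```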

@tilk
Member

tilk commented Dec 16, 2023

I managed to fix the Fmax problem by inserting a FIFO in front of the ExceptionCauseRegister report method (back to 52 MHz).

Is the additional delay to ExceptionCauseRegister safe? I fear a race condition can happen.

Maybe I've read it incorrectly, but in any case it doesn't seem good to me that the JB FU takes our entire timing margin. This PR increased the already critical path in JBUnit by "only" 2 ns, so it became visible. I think it needs further investigation.

Indeed, this looks fishy. I wonder what will happen when jumps no longer stall the fetcher.

Also, none of the above explains the gigantic spike in Carry LUT usage and the IPC loss on some benchmarks.

I wouldn't worry about the Carry LUT spike on the basic core, a corresponding spike is not seen on the full core. IPC loss is possibly due to more transaction conflicts. I want to introduce a simple Transactron profiler soon, which will hopefully help find such bottlenecks.

@awariac
Member Author

awariac commented Dec 17, 2023

Haha, I know now.

  • Carry LUT and RAM LUT usage increased because I added the CSRUnit to the basic core config. Without it, all the current CSR registers were optimized away. This is also why there is no spike on the full core config. What's interesting is that device utilization decreased, but that has happened before for no reason too.
  • IPC problem - yes, it is caused by adding method conflicts in Fetch. When implementing it, I looked at the waveforms and was sure I saw that the Retirement transaction could run at the same time as the Fetcher's verify - so there should be no degradation. But it looks like I was wrong. Anyway, adding conflicts was only a nicer-looking alternative to adding a run condition to the Verify transaction in Retirement. It is equivalent, because stall_exception is also only used in Retirement. I only need to revert to the previous solution and IPC should be the same.
  • Fmax as in my previous comment (I didn't look at the JBUnit yet). It is probably safe, but I need to think it through for a moment.
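The difference between the two alternatives in the second point can be sketched as follows (a toy model, not Transactron semantics; the function name is made up): a hard conflict blocks Fetch's verify whenever Retirement runs, while a run condition blocks it only while stall_exception is actually set.

```python
def can_run_verify(retirement_runs, stall_exception, use_conflict):
    """Toy model of when Fetch's verify transaction may run."""
    if use_conflict:
        # Method conflict: never in the same cycle as Retirement.
        return not retirement_runs
    # Run condition: blocked only while an exception stall is active.
    return not stall_exception

# With no exception pending, the conflict still serializes the two
# transactions (costing IPC), while the run condition does not.
assert can_run_verify(True, False, use_conflict=True) is False
assert can_run_verify(True, False, use_conflict=False) is True
```

When an exception stall is actually active, both variants block verify, which is the sense in which the two are intended to be equivalent.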

tilk pushed a commit that referenced this pull request Jan 4, 2024
github-actions Bot pushed a commit that referenced this pull request Jan 4, 2024


Development

Successfully merging this pull request may close these issues.

Interrupt implementation

4 participants