Use a linked list of background tasks to perform #2879

jepler · 2020-05-11T13:54:33Z

The "lower power" branch removed a previous optimization that allowed the background tasks to run just once every tick. Restore this, for things that run on the basis of time; and where possible move things to happening based on interrupts.

The foundation for this is a new linked list of background callbacks. A background callback takes a function pointer and an optional object pointer. Usually, such callbacks are registered from interrupts. For instance, on atmel sam, the audio DMA interrupt is used to schedule a fill of the new sample data, rather than polling a flag every 1ms.

Current state of this PR:

Builds on a range of boards
transitions all boards back to running background tasks just once per ms, though only when ticks are enabled
transitions sam d20/d5x/e5x to using interrupts for audio and usb

Testing performed: on pygamer and grand central m4, did general tire kicking of related code:

audio code
usb
filesystem
eink display
lcd display
gamepad

Benchmarks: On Grand Central M4, I ran the following code

import time
t0 = time.monotonic_ns()
k = 0
for i in range(1000):
    for j in range(100):
        k += j
t1 = time.monotonic_ns()
print(k)
print((t1-t0) / 1e9)

The regression from 5.3.0 to 6.0.0.alpha.1 is almost entirely fixed. However, the difference between the original timing of 1.02s and the new timing of 1.04s seems "real" and is not within the usual variability of execution time. (that the above loop takes about 1s on samd51 is a total coincidence, I didn't choose the numbers specially or anything!)

version	time
5.3.0	1.02s
6.0.0.a1	1.54s
this PR	1.04s

What's next:

test & benchmark more boards
get a green from Actions
possible strategic conversion of more stuff to interrupt-driven

tannewt

Thanks for this! My brain will start thinking through this approach. It's not what I was thinking which means it is probably better. :-)

supervisor/shared/background_callback.c

jepler · 2020-06-25T18:53:31Z

Discussion with @tannewt

go ahead with this approach
make sure that the case of background callbacks becoming invalid during GC heap reset

Which modules to do next

the rest of audio (it's "the hardest")
displayio second
anything that has open issues about it not working in main

jepler · 2020-07-09T16:45:26Z

The initial comment has been heavily revised to reflect the current status of this PR. Things are progressing nicely. SAM D5x/E5x is probably done, though as always there may be problems that only become obvious after testing, such as in the next alpha release.

jepler · 2020-07-09T20:22:37Z

The performance gained back on the CLUE is much smaller... but it has not been given specific attention yet.

version	time
5.3.0	2.65s
6.0.0.a1	4.66s
this PR	4.112s

jepler · 2020-07-10T20:22:32Z

With the latest changes performance on nRF is a bit better than the 5.3.0 baseline (now testing on Particle Xenon).

version	time
5.3.0	2.24s
6.0.0.a1	3.04s
this PR	2.22s

I suspect the 5.3.0 performance of the Xenon is better than the CLUE because the latter has a displayio display enabled. However, I didn't test the theory.

The motivation for doing this is so that we can allow common_hal_mcu_disable_interrupts in IRQ context, something that works on other ports, but not on nRF with SD enabled. This is because when SD is enabled, calling sd_softdevice_is_enabled in the context of an interrupt with priority 2 or 3 causes a HardFault. We have chosen to give the USB interrupt priority 2 on nRF, the highest priority that is compatible with SD. Since at least SoftDevice s130 v2.0.1, sd_nvic_critical_region_enter/exit have been implemented as inline functions and are safe to call even if softdevice is not enabled. Reference kindly provided by danh: https://devzone.nordicsemi.com/f/nordic-q-a/29553/sd_nvic_critical_region_enter-exit-missing-in-s130-v2 Switching to these as the default/only way to enable/disable interrupts simplifies things, and fixes several problems and potential problems: * Interrupts at priority 2 or 3 could not call common_hal_mcu_disable_interrupts because the call to sd_softdevice_is_enabled would HardFault * Hypothetically, the state of sd_softdevice_is_enabled could change from the disable to the enable call, meaning the calls would not match (__disable_irq() could be balanced with sd_nvic_critical_region_exit). This also fixes a problem I believe would exist if disable() were called twice when SD is enabled. There is a single "is_nested_critical_region" flag, and the second call would set it to 1. Both of the enable() calls that followed would call critical_region_exit(1), and interrupts would not properly be reenabled. In the new version of the code, we use our own nesting_count value to track the intended state, so now nested disable()s only call critical_region_enter() once, only updating is_nested_critical_region once; and only the second enable() call will call critical_region_exit, with the right value of i_n_c_r. Finally, in port_sleep_until_interrupt, if !sd_enabled, we really do need to __disable_irq, rather than using the common_hal_mcu routines; the reason why is documented in a comment.

In time, we should transition interrupt driven background tasks out of the overall run_background_tasks into distinct background callbacks, so that the number of checks that occur with each tick is reduced.

Testing performed: Played half of the Bartlebeats album :) :)

Before this, the mp3 file would be read into the in-memory buffer only when new samples were actually needed. This meant that the time to read mp3 content always counted against the ~22ms audio buffer length. Now, when there's at least 1 full disk block of free space in the input buffer, we can request that the buffer be filled _after_ returning from audiomp3_mp3file_get_buffer and actually filling the DMA pointers. In this way, the time taken for reading MP3 data from flash/SD is less likely to cause an underrun of audio DMA. The existing calls to fill the inbuf remain, but in most cases during streaming these become no-ops because the buffer will be over half full.

…to-refresh This is a step towards restoring the efficiency of the background tasks

CALLBACK_CRITICAL_BEGIN is heavyweight, but we can be confident we do not have work to do as long as callback_head is NULL. This gives back performance on nRF.

jepler · 2020-07-15T16:56:22Z

I think this is ready for testing and review. I have tested SAM D51, nRF52840, esp32s2, stm32f405 all at various points during this process and I'm no longer aware of any regressions.

dhalbert

Thanks for this careful work. The nrf changes make sense to me. @jepler and I went over them in some detail.

tannewt

This is an awesome cleanup! Thank you!

I believe there is one case you don't handle that is probably very rare and maybe not an issue. It is the case where you queue up background work like filling an mp3 buffer but delete the mp3 object before you do the background work. Think that is an issue?

A background callback must never outlive its related object. By collecting the head of the linked list of background tasks, this will not happen. One hypothetical case where this could happen is if an MP3Decoder is deleted while its callback to fill its buffer is scheduled.

jepler · 2020-07-17T13:37:02Z

@tannewt thanks for reminding me about that case. By making the callback list a GC root, this should not be possible. Please re-review.

tannewt · 2020-07-17T18:56:03Z

@jepler are all callbacks allocated on the heap or is it a mix? If it's a mix then the collect won't work because it'll stop at the first pointer off the heap I believe.

Looks like uchip is 4 bytes too big now unfortunately.

dhalbert

@jepler and I discussed some things in discord, new code looks good to me.

tannewt

Looks good to me too! Excited to hear how it works for folks.

tannewt reviewed May 11, 2020

View reviewed changes

supervisor/shared/background_callback.c Outdated Show resolved Hide resolved

jepler commented May 12, 2020

View reviewed changes

supervisor/shared/background_callback.c Show resolved Hide resolved

This was referenced May 19, 2020

CP performance decrease since 5.3 #2929

Closed

Fix usb sleep for NRF #2868

Merged

tannewt changed the base branch from master to main June 9, 2020 20:05

jepler force-pushed the background-callback branch from bbd9436 to 7944f50 Compare July 9, 2020 16:33

jepler changed the title ~~Add, demonstrate use of new background task list functionality~~ Use a linked list of background tasks to perform Jul 9, 2020

jepler mentioned this pull request Jul 14, 2020

Issue#2949 Put in a check to prevent USB starvation by long calculations #3149

Merged

jepler force-pushed the background-callback branch from 0823ad7 to 2056ff6 Compare July 14, 2020 19:30

jepler added 8 commits July 15, 2020 09:26

tick.c: adjust whitespace

51b9a1a

supervisor: Add a linked list of background callbacks

1474fcc

In time, we should transition interrupt driven background tasks out of the overall run_background_tasks into distinct background callbacks, so that the number of checks that occur with each tick is reduced.

supervisor: tick: only run background tasks once per tick

8c4a9f6

samd: audio: Move to background callback

742aa74

Testing performed: Played half of the Bartlebeats album :) :)

displayio, framebufferio: Enable supervisor tick when a display is au…

af52072

…to-refresh This is a step towards restoring the efficiency of the background tasks

background_callback: Avoid CALLBACK_CRITICAL_BEGIN with nothing to do

36b4646

CALLBACK_CRITICAL_BEGIN is heavyweight, but we can be confident we do not have work to do as long as callback_head is NULL. This gives back performance on nRF.

jepler force-pushed the background-callback branch from 2056ff6 to b445814 Compare July 15, 2020 14:33

jepler added 3 commits July 15, 2020 11:49

supervisor: factor out, Handle USB via background callback

6160d11

supervisor: factor supervisor_background_tasks from sundry ports

1df4817

supervisor: usb: note that it's unusual to need to call usb_background

81105cb

jepler requested a review from tannewt July 15, 2020 16:55

jepler marked this pull request as ready for review July 15, 2020 16:55

jepler force-pushed the background-callback branch from b445814 to 81105cb Compare July 15, 2020 16:58

dhalbert previously approved these changes Jul 16, 2020

View reviewed changes

tannewt added this to the 6.0.0 milestone Jul 16, 2020

tannewt added performance supervisor labels Jul 16, 2020

tannewt requested changes Jul 16, 2020

View reviewed changes

jepler dismissed dhalbert’s stale review via a18a392 July 17, 2020 13:36

jepler requested review from tannewt and dhalbert July 17, 2020 13:37

jepler added 2 commits July 17, 2020 14:32

uchip: reclaim some flash space

6912d31

background_callback_gc_collect: We must traverse the whole list

98eef79

dhalbert approved these changes Jul 17, 2020

View reviewed changes

tannewt approved these changes Jul 17, 2020

View reviewed changes

jepler merged commit 9cdf5e1 into adafruit:main Jul 17, 2020

kamtom480 mentioned this pull request Aug 20, 2020

spresense: call usb_background function #3306

Merged

dhalbert mentioned this pull request Jan 20, 2021

Copying multiple larger files can hang the copy if autoreload is on #3986

Closed

jepler deleted the background-callback branch November 3, 2021 21:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use a linked list of background tasks to perform #2879

Use a linked list of background tasks to perform #2879

jepler commented May 11, 2020 •

edited

Loading

tannewt left a comment

jepler commented Jun 25, 2020

jepler commented Jul 9, 2020

jepler commented Jul 9, 2020

jepler commented Jul 10, 2020 •

edited

Loading

jepler commented Jul 15, 2020

dhalbert left a comment

tannewt left a comment

jepler commented Jul 17, 2020

tannewt commented Jul 17, 2020

dhalbert left a comment

tannewt left a comment

Use a linked list of background tasks to perform #2879

Use a linked list of background tasks to perform #2879

Conversation

jepler commented May 11, 2020 • edited Loading

tannewt left a comment

Choose a reason for hiding this comment

jepler commented Jun 25, 2020

jepler commented Jul 9, 2020

jepler commented Jul 9, 2020

jepler commented Jul 10, 2020 • edited Loading

jepler commented Jul 15, 2020

dhalbert left a comment

Choose a reason for hiding this comment

tannewt left a comment

Choose a reason for hiding this comment

jepler commented Jul 17, 2020

tannewt commented Jul 17, 2020

dhalbert left a comment

Choose a reason for hiding this comment

tannewt left a comment

Choose a reason for hiding this comment

jepler commented May 11, 2020 •

edited

Loading

jepler commented Jul 10, 2020 •

edited

Loading