MagTag Vaccination tracker crash in magtag.network.connect() between 7alpha3 and 7alpha4 #5021
Comments
Could you test this with a build that includes #5017? That fixed a bug in |
I tested with adafruit-circuitpython-adafruit_magtag_2.9_grayscale-en_US-20210719-1b17b08.uf2, an artifact of #5017, and it reset-loops in the same call as alpha.4. Please note that it crashes very early in the program, before doing the background part.
(Having something from the Learn guide makes it easy to reproduce; so far, though, no one has confirmed that I am not alone with this.) |
Maybe I should rename the title to:
Now I have a very small piece of code, and adding one line makes it crash...
|
If you have the time and the willingness, try digging down into the library to narrow the problem further, emulating what the library is doing without using the library itself. A lot is going on in the constructor |
I re-tested with beta.6 loaded via uf2 with:
from adafruit_magtag.magtag import MagTag
from adafruit_progressbar import ProgressBar

DATA_SOURCE = "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/country_data/United%20States.csv"
magtag = MagTag(url=DATA_SOURCE)
print("Before magtag.network.connect()")
magtag.network.connect()
print("After magtag.network.connect()")
and the crash still occurs. Initially I thought it was fixed, because the code sample a few comments above had the problematic line commented out by default. |
debug uart messages seem to indicate the watchdog is biting...
|
I didn't get any traction on this bug, so giving it back for someone else to try. |
Commenting out the
I tested this with the simple magtag example in
Adafruit CircuitPython 7.0.0-beta.0 on 2021-08-24; Adafruit MagTag with ESP32S2
# SPDX-FileCopyrightText: 2017 Scott Shawcroft, written for Adafruit Industries
#
# SPDX-License-Identifier: Unlicense
import time
import terminalio
from adafruit_magtag.magtag import MagTag

magtag = MagTag()
# added this line to the example
magtag.network.connect()
magtag.add_text(
    text_font=terminalio.FONT,
    text_position=(
        50,
        (magtag.graphics.display.height // 2) - 1,
    ),
    text_scale=3,
)
magtag.set_text("Hello World")
buttons = magtag.peripherals.buttons
button_colors = ((255, 0, 0), (255, 150, 0), (0, 255, 255), (180, 0, 255))
button_tones = (1047, 1318, 1568, 2093)
timestamp = time.monotonic()
while True:
    for i, b in enumerate(buttons):
        if not b.value:
            print("Button %c pressed" % chr((ord("A") + i)))
            magtag.peripherals.neopixel_disable = False
            magtag.peripherals.neopixels.fill(button_colors[i])
            magtag.peripherals.play_tone(button_tones[i], 0.25)
            break
    else:
        magtag.peripherals.neopixel_disable = True
    time.sleep(0.01)
|
I think this bug extends to the Metro Express ESP32-S2; however, I can no longer reproduce it. Similarities:
I was using the .bin nightly build from August 21st. This morning I moved to the 7.0 beta .bin and experienced the same issue.
In an effort to find the minimum amount of code needed to reproduce the issue, I was using a large string, putting it into a variable and that variable inside an empty list, and multiplying it to take up space, so that within around 10 steps the board would need to free up its memory. After the variables were populated, I'd reset them to empty strings and lists with the exception of the base,
Partway through, while pruning the program, I accidentally caused a hard crash, and the board booted into safe mode. I think this was because I added to the memory hog string a few unescaped
I deleted those and hit reset on the board to exit safe mode. Since that point, I can no longer cause a reset to occur when it frees up the memory. It now frees the memory without issue and without resetting, performing as one would expect. To try to get the bug to occur again, I rolled back to the nightly mentioned above and could not get it to occur, even using the full original code. |
Unfortunately, #5220 did NOT fix this |
It must be some kind of object that is moved but can't be moved safely; making this change causes the crash to go away for me:
|
Here is the minimum code that still caused the crash:
from adafruit_portalbase.wifi_esp32s2 import WiFi
from secrets import secrets
import gc

wifi = WiFi()
wifi.connect(secrets["ssid"], secrets["password"])
print("wifi connected")
# crashes here
gc.collect()
print("gc.collect() finished")
However, this code doesn't crash:
from secrets import secrets
import gc
import wifi
import ssl
import socketpool
import adafruit_requests

wifi.radio.connect(secrets["ssid"], secrets["password"])
pool = socketpool.SocketPool(wifi.radio)
requests = adafruit_requests.Session(pool, ssl.create_default_context())
print("wifi connected")
gc.collect()
print("gc.collect() finished")
Maybe an issue with native wifi that is moved? |
This change makes little sense, but it makes the crash go away for me. I tested with 7098d4c plus only this change, using the script from #5021 (comment) saved to
The reason I say it makes little sense is that the ESP32-S2 implementation of socketpool has NOTHING in it. The use of |
I cannot give a satisfactory account of _why_, but the crash in adafruit#5021 goes away when this object is initially allocated in the long-lived portion of the heap. Closes: adafruit#5021
I can think of two reasons why allocating this as long-lived accidentally fixes a problem:
|
Scott & I think we figured it out. It's a super interesting bug. Scott will put together a patch for us soon, but it was way past lunchtime for him when we finished chatting & diagnosing. 🌮 |
🌮🌮🌮 |
@jfurcean Thanks for the minimal example! We used it to debug. |
def gc_layout(size_bytes, block_size=16, bits_per_byte=8):
    pool_blocks = size_bytes * bits_per_byte // (block_size * bits_per_byte + 3)
    print("max # blocks", pool_blocks)
    bits_per_block = block_size * bits_per_byte + 3
    atb_bytes = pool_blocks // 4 + 1
    ftb_bytes = (pool_blocks + 7) // 8
    allocation = atb_bytes + ftb_bytes + pool_blocks * block_size
    print("total allocation", allocation)
    if allocation > size_bytes:
        pool_blocks -= 1
    return (pool_blocks, atb_bytes, ftb_bytes)

def round_up(a, n):
    return (a + n - 1) // n * n

def describe_gc_layout(base_addr, pool_blocks, atb_bytes, ftb_bytes, block_size=16):
    atb_base = base_addr
    atb_end = ftb_base = base_addr + atb_bytes
    ftb_end = ftb_base + ftb_bytes
    pool_base = round_up(ftb_end, 16)
    pool_end = pool_base + block_size * pool_blocks
    assert (4 * atb_bytes) >= (pool_blocks + 1)
    assert (8 * ftb_bytes) >= pool_blocks
    print("GC layout:")
    print(f" alloc table at {hex(atb_base)}, length {atb_bytes:9} bytes, {atb_bytes*4:6} blocks")
    print(f" finaliser table at {hex(ftb_base)}, length {ftb_bytes:9} bytes, {ftb_bytes*8:6} blocks")
    print(f" pool at {hex(pool_base)}, length {block_size * pool_blocks:9} bytes, {pool_blocks:6} blocks")
    print()
    print(f"alloc table {hex(atb_base)} .. {hex(atb_end)}")
    print(f"finaliser table {hex(ftb_base)} .. {hex(ftb_end)}")
    print(f"pool table {hex(pool_base)} .. {hex(pool_end)}")

layout = gc_layout(2096088)
describe_gc_layout(0x3fd80428, *layout)
I wrote a program in Python to try to lay out the gc memory. Weirdly, ... it gets one more block than micropython, even though I think it is correct and should fix the bug!
compared to the current outcome
notice how allocating one more byte to "alloc_table" means that we can call |
Oh it also turns out the number of blocks has to be a multiple of BLOCKS_PER_ATB. |
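A minimal sketch of a layout calculation that respects that constraint, assuming 16-byte blocks, 4 blocks per alloc-table byte, 8 blocks per finaliser-table byte, and one spare zero byte after the alloc table. The function name `aligned_layout` and the constant `GAP_BYTES` are made up for illustration; this is not the code from the actual patch, and it ignores the 16-byte alignment of the pool start.

```python
BLOCKS_PER_ATB = 4    # 2 alloc-table bits per block, 8 bits per byte
BLOCKS_PER_FTB = 8    # 1 finaliser bit per block
BYTES_PER_BLOCK = 16
GAP_BYTES = 1         # hypothetical: one always-zero byte after the alloc table


def aligned_layout(size_bytes):
    """Return (pool_blocks, atb_bytes, gap_bytes, ftb_bytes) for a heap of size_bytes."""
    # Each block costs its payload plus 2 alloc-table bits and 1 finaliser bit.
    bits_per_block = BYTES_PER_BLOCK * 8 + 2 + 1
    pool_blocks = (size_bytes - GAP_BYTES) * 8 // bits_per_block
    # Round down so the pool fills a whole number of alloc-table bytes.
    pool_blocks -= pool_blocks % BLOCKS_PER_ATB
    while True:
        atb_bytes = pool_blocks // BLOCKS_PER_ATB
        ftb_bytes = (pool_blocks + BLOCKS_PER_FTB - 1) // BLOCKS_PER_FTB
        total = atb_bytes + GAP_BYTES + ftb_bytes + pool_blocks * BYTES_PER_BLOCK
        if total <= size_bytes:
            return pool_blocks, atb_bytes, GAP_BYTES, ftb_bytes
        # Does not fit: give up one alloc-table byte's worth of blocks and retry.
        pool_blocks -= BLOCKS_PER_ATB


print(aligned_layout(2096088))
```

With the heap size used in the script above, this yields 128004 blocks and a 32001-byte alloc table, so a lookup for block number `pool_blocks` (one past the last block) lands in the gap byte rather than in the finaliser table.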
.. or, for !MICROPY_ENABLE_FINALISER, before the first block of the pool. Closes: adafruit#5021 Closes: micropython#7116 Signed-off-by: Jeff Epler <[email protected]>
Prior to this fix the following crash occurred. With a GC layout of:
GC layout:
 alloc table at 0x3fd80428, length 32001 bytes, 128004 blocks
 finaliser table at 0x3fd88129, length 16001 bytes, 128008 blocks
 pool at 0x3fd8bfc0, length 2048064 bytes, 128004 blocks
Block 128003 is an AT_HEAD and eventually is passed to gc_mark_subtree. This causes gc_mark_subtree to call ATB_GET_KIND(128004). When block 1 is created with a finaliser, the first byte of the finaliser table becomes 0x2, but ATB_GET_KIND(128004) reads these bits as AT_TAIL, and then gc_mark_subtree references past the end of the heap, which happened to be past the end of PSRAM on the esp32-s2.
The fix in this commit is to ensure there is a one-byte gap after the ATB filled permanently with AT_FREE.
Fixes issue micropython#7116. See also adafruit#5021
Signed-off-by: Jeff Epler <[email protected]>
Signed-off-by: Damien George <[email protected]>
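The out-of-bounds read described in this commit message can be reproduced in a small Python model. This is only a sketch: the two tables are modelled as one `bytearray` with the finaliser table directly after the alloc table, the helper names (`atb_get_kind`, `set_finaliser`, `kind_after_last_block`) are invented for illustration, and the bit packing (2 bits per block in the alloc table, 1 bit per block in the finaliser table) is assumed to match the collector's macros.

```python
AT_FREE, AT_HEAD, AT_TAIL, AT_MARK = 0, 1, 2, 3
BLOCKS_PER_ATB = 4  # 2 bits per block in the alloc table


def atb_get_kind(mem, block):
    """Model of ATB_GET_KIND: read the 2-bit kind of a block, with no bounds check."""
    byte = mem[block // BLOCKS_PER_ATB]
    shift = 2 * (block % BLOCKS_PER_ATB)
    return (byte >> shift) & 3


def set_finaliser(mem, ftb_start, block):
    """Model of setting a finaliser bit: 1 bit per block, 8 blocks per byte."""
    mem[ftb_start + block // 8] |= 1 << (block % 8)


def kind_after_last_block(gap_bytes, pool_blocks=128004, atb_bytes=32001, ftb_bytes=16001):
    """Kind reported for the block one past the end of the pool."""
    # Alloc table, optional gap, then finaliser table, all initially zero.
    mem = bytearray(atb_bytes + gap_bytes + ftb_bytes)
    ftb_start = atb_bytes + gap_bytes
    set_finaliser(mem, ftb_start, 1)  # first finaliser byte becomes 0x02
    return atb_get_kind(mem, pool_blocks)


print("no gap:  ", kind_after_last_block(gap_bytes=0))
print("with gap:", kind_after_last_block(gap_bytes=1))
```

Without the gap, the lookup for block 128004 reads the first finaliser byte and reports AT_TAIL; with a one-byte gap it reads zero and reports AT_FREE, which is what stops the marker from walking past the end of the pool.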
CircuitPython version
Code/REPL
Demo code can be fetched from here: https://learn.adafruit.com/adafruit-magtag-covid-vaccination-percent-tracker/code-the-vaccination-tracker
Behavior
MagTag Vaccination tracker crash in magtag.network.connect() between 7alpha3 and 7alpha4
I tested the MagTag Covid percentage tracker with 7.0.0-alpha.3 and 7.0.0-alpha.4.
Somewhere in between the two, it started crashing in a loop in magtag.network.connect()
With
Adafruit CircuitPython 7.0.0-alpha.3 on 2021-06-03; Adafruit MagTag with ESP32S2
It kind of works: it does not display properly, but the progress bars are there and the network fetch works:
With
Adafruit CircuitPython 7.0.0-alpha.4 on 2021-07-08; Adafruit MagTag with ESP32S2
It resets/boots in a loop somewhere inside
magtag.network.connect()
:^^^ And here it bootloops, resetting the serial ^^^
So somewhere between 7.0.0-alpha.3 and 7.0.0-alpha.4 the network call of
I guess I need to bisect and find where it started to behave like that.
Demo code can be fetched from here:
https://learn.adafruit.com/adafruit-magtag-covid-vaccination-percent-tracker/code-the-vaccination-tracker
Description
No response
Additional information
This test was done with the Learn guide libraries, except for a manual update to the latest version from Git of:
Using the py (not mpy) version of those libraries (in an attempt to debug this further).
This also means using adafruit/Adafruit_CircuitPython_MagTag#63, which I was trying to validate.