Piotr Kuligowski edited this page Nov 3, 2016 · 7 revisions

Description

This document lists possible faults and analyses their consequences and possible mitigations.

Faults

SEU in flash memory

Possible problems:

  • No sign of error (SEU in unused part of memory).
  • Faulty calculations (SEU in constant part).
  • Not executing part of the code correctly.
  • Total loss of subsystem (e.g. processor gets stuck, not responding to any command).

Mitigation:

  • A bootloader with a small memory footprint (so the probability of an SEU hitting it is much lower) which loads the program from external memory (Flash/FRAM) using voting across redundant copies (e.g. majority out of 5).
  • If an SEU hits the fuse bits, the processor may no longer boot. There is no known mitigation scheme for this, but the probability of this event is very low (the fuse bits are three 8-bit registers).
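
The voting step could look roughly like the sketch below. It is only an illustration: it assumes the five copies have already been read from external memory into buffers, and the names `COPY_COUNT`, `vote_byte` and `vote_image` are hypothetical, not part of any existing bootloader. Voting is done per bit, so a byte is still recovered when different copies are corrupted in different bits.

```c
#include <stddef.h>
#include <stdint.h>

#define COPY_COUNT 5 /* "majority out of 5", as suggested in the list above */

/* Bitwise majority vote over five bytes: each output bit takes the value
 * held by at least three of the five inputs. */
static uint8_t vote_byte(const uint8_t b[COPY_COUNT])
{
    uint8_t result = 0;
    for (uint8_t bit = 0; bit < 8; bit++) {
        uint8_t ones = 0;
        for (uint8_t i = 0; i < COPY_COUNT; i++) {
            ones += (b[i] >> bit) & 1u;
        }
        if (ones > COPY_COUNT / 2) {
            result |= (uint8_t)(1u << bit);
        }
    }
    return result;
}

/* Rebuild an image of `len` bytes from five redundant copies.
 * Returns the number of bytes where at least one copy disagreed with the
 * voted value; this count could be reported in telemetry. */
size_t vote_image(const uint8_t *copies[COPY_COUNT], uint8_t *out, size_t len)
{
    size_t corrected = 0;
    for (size_t pos = 0; pos < len; pos++) {
        uint8_t column[COPY_COUNT];
        for (uint8_t i = 0; i < COPY_COUNT; i++) {
            column[i] = copies[i][pos];
        }
        out[pos] = vote_byte(column);
        for (uint8_t i = 0; i < COPY_COUNT; i++) {
            if (column[i] != out[pos]) {
                corrected++;
                break;
            }
        }
    }
    return corrected;
}
```

A real bootloader would probably vote on small chunks streamed from the external memory rather than hold five full copies in SRAM; the voting logic itself stays the same.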

SEU in SRAM memory

Possible problems:

  • No sign of error (SEU in unused part of memory).
  • Faulty calculations (invalidating calculations).
  • Stack corruption.
  • Processor hangs, gets into infinite loop, etc.

Mitigation:

  • Watchdog should periodically check if processor is alive.
  • In case of "wrong" calculations processor should be reset to recalculate everything.
  • Processor should be rebooted every [TBD] hours to prevent errors from building up.

Notes: in AVR devices user code cannot write to flash memory, so in case of SRAM corruption the flash will be untouched.
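
One way to combine the watchdog and the periodic reboot is sketched below, as an assumption rather than the project's actual design. The hardware watchdog is only kicked when every monitored task has checked in, and it is deliberately starved once a configurable uptime is reached, so the same mechanism performs the planned reboot. `wdt_enable`, `wdt_reset` and `WDTO_2S` come from avr-libc's `<avr/wdt.h>`; everything else (`TASK_COUNT`, `task_checkin`, `supervisor_tick`, the 2 s timeout, and the `reboot_after_ticks` parameter standing in for the still undecided [TBD] interval) is hypothetical.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/wdt.h>
#define WATCHDOG_START() wdt_enable(WDTO_2S) /* timeout chosen arbitrarily */
#define WATCHDOG_KICK()  wdt_reset()
#else
/* Host stand-ins so the logic can be exercised off-target. */
static unsigned watchdog_kicks;
#define WATCHDOG_START() ((void)0)
#define WATCHDOG_KICK()  (watchdog_kicks++)
#endif

#define TASK_COUNT 3u /* hypothetical number of monitored tasks */
#define ALL_TASKS_ALIVE ((uint8_t)((1u << TASK_COUNT) - 1u))

static volatile uint8_t alive_flags; /* one bit per task */
static uint32_t uptime_ticks;        /* supervision periods since boot */

void supervisor_init(void)
{
    alive_flags = 0;
    uptime_ticks = 0;
    WATCHDOG_START();
}

/* Each task calls this once per cycle to show it is still running. */
void task_checkin(uint8_t task_id)
{
    alive_flags |= (uint8_t)(1u << task_id);
}

/* Call once per supervision period. Returns 1 if the watchdog was kicked.
 * When 0 is returned the watchdog is left to expire and reset the MCU. */
uint8_t supervisor_tick(uint32_t reboot_after_ticks)
{
    uptime_ticks++;
    if (uptime_ticks >= reboot_after_ticks) {
        return 0; /* planned reboot: stop kicking */
    }
    if (alive_flags != ALL_TASKS_ALIVE) {
        return 0; /* at least one task did not check in */
    }
    alive_flags = 0;
    WATCHDOG_KICK();
    return 1;
}
```

If tasks check in from interrupt context, the read-modify-write on `alive_flags` would need to be protected, which this sketch does not do.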

SEU in AVR control register

Possible problems:

  • No sign of error (SEU in unused part of memory).
  • Peripherals won't respond/will respond with incorrect results.
  • Processor hangs, gets into infinite loop, etc.

Mitigation:

  • Watchdog should periodically check if processor is alive.
  • In case of "wrong" calculations processor should be reset to recalculate everything.
  • Processor should be rebooted every [TBD] hours to prevent errors from building up.

Notes: every control register lies in the SRAM address space, so a reboot or reinitialisation of the peripheral should be sufficient to repair it.
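
Reinitialisation without a full reboot could be done by periodically rewriting the control registers from a table of expected values, as in the minimal sketch below. The `reg_init` structure and `reinit_registers` function are hypothetical names, and the table contents are left to the caller. Only registers that hold a fixed configuration belong in such a table; registers containing status or interrupt flag bits behave differently on write and would need separate handling.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per control register that should always hold a fixed value. */
struct reg_init {
    volatile uint8_t *reg; /* address of the control register */
    uint8_t value;         /* value it is expected to hold */
};

/* Rewrite every register in the table with its expected value.
 * Returns how many registers had drifted from that value, so the caller
 * can log the event or escalate to a full reset. */
size_t reinit_registers(const struct reg_init *table, size_t count)
{
    size_t repaired = 0;
    for (size_t i = 0; i < count; i++) {
        if (*table[i].reg != table[i].value) {
            repaired++;
        }
        /* Write unconditionally: cheap, and it also covers the case where
         * the read-back did not reveal the upset. */
        *table[i].reg = table[i].value;
    }
    return repaired;
}
```

On the target the table would ideally be kept in flash rather than SRAM, so that it is not exposed to the same upsets as the registers it repairs.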

Latchup

Possible problems:

  • Power line short.
  • Burnout of some parts of processor.

Mitigation:

  • Use latchup current limiters.
  • To decrease latch-up currents, series resistors can be added between components, or between a component's supply pin and the supply rail.

Prevention:

TID

Possible problems:

  • Any peripheral, including the core, could stop working.

Mitigation:

  • Use a processor with known radiation test results.
