FDIR
Piotr Kuligowski edited this page Nov 3, 2016
This document lists possible faults and analyses their consequences and possible solutions.
Possible problems:
- No sign of error (SEU in unused part of memory).
- Faulty calculations (SEU in constant part).
- Not executing part of the code correctly.
- Total loss of subsystem (e.g. processor gets stuck and does not respond to any command).
Mitigation:
- A bootloader with a smaller memory footprint (so its probability of failure is much lower) which loads data from external memory (Flash/FRAM) with some voting (e.g. majority out of 5).
- If the fuse bits are corrupted, the processor may no longer boot. There is no known mitigation scheme for this, but the probability of this event is very low (the fuse bits occupy only three 8-bit registers).
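The "majority out of 5" voting mentioned above could look roughly like the following. This is a minimal sketch, not the project's bootloader: it assumes five redundant copies of the image have already been read from external memory into buffers, and the names `COPIES`, `vote5` and `load_voted` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define COPIES 5 /* assumed number of redundant copies ("majority out of 5") */

/* Bitwise majority of five bytes: an output bit is set when at least
 * three of the five input bytes have that bit set. */
static uint8_t vote5(const uint8_t b[COPIES])
{
    uint8_t out = 0;
    for (uint8_t bit = 0; bit < 8; bit++) {
        uint8_t ones = 0;
        for (size_t i = 0; i < COPIES; i++) {
            ones += (b[i] >> bit) & 1u;
        }
        if (ones >= 3) {
            out |= (uint8_t)(1u << bit);
        }
    }
    return out;
}

/* Rebuild `len` bytes into `dst` from five copies (hypothetical helper). */
void load_voted(const uint8_t *copies[COPIES], uint8_t *dst, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t column[COPIES];
        for (size_t c = 0; c < COPIES; c++) {
            column[c] = copies[c][i];
        }
        dst[i] = vote5(column);
    }
}
```

Voting per bit rather than per byte means that up to two copies may be damaged at any given bit position, even if the damage sits in different copies for different bits.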
Possible problems:
- No sign of error (SEU in unused part of memory).
- Faulty calculations (invalidating calculations).
- Stack corruption.
- Processor hangs, gets into infinite loop, etc.
Mitigation:
- Watchdog should periodically check if processor is alive.
- In case of "wrong" calculations processor should be reset to recalculate everything.
- Processor should be rebooted every [TBD] hours to prevent errors from building up.
Notes: in AVR devices user code cannot write to flash memory, so in case of SRAM corruption flash will be untouched.
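The three mitigations above can be combined in one main-loop check. This is a minimal sketch under stated assumptions: the reset is triggered by no longer kicking a hardware watchdog (on AVR with avr-libc the kick would typically be `wdt_reset()` from `<avr/wdt.h>`), `MAX_UPTIME_TICKS` is only a placeholder because the reboot interval is still [TBD], and `health_t` and `health_tick` are hypothetical names. How a "wrong" calculation is detected is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value only: the real reboot interval is [TBD]. */
#define MAX_UPTIME_TICKS 3600u

typedef struct {
    uint32_t uptime_ticks; /* main-loop iterations since boot */
    bool fault_seen;       /* latched once a bad calculation is reported */
} health_t;

/* Call once per main-loop iteration. Returns true when the watchdog may
 * be kicked. Returning false means "stop kicking", so the hardware
 * watchdog expires and resets the processor. */
bool health_tick(health_t *h, bool calc_ok)
{
    h->uptime_ticks++;
    if (!calc_ok) {
        h->fault_seen = true; /* stays set until the reset happens */
    }
    return !h->fault_seen && h->uptime_ticks < MAX_UPTIME_TICKS;
}
```

Routing both the fault case and the periodic reboot through the watchdog keeps a single reset path, which is also the path exercised when the processor hangs.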
Possible problems:
- No sign of error (SEU in unused part of memory).
- Peripherals won't respond/will respond with incorrect results.
- Processor hangs, gets into infinite loop, etc.
Mitigation:
- Watchdog should periodically check if processor is alive.
- In case of "wrong" calculations processor should be reset to recalculate everything.
- Processor should be rebooted every [TBD] hours to prevent errors from building up.
Notes: every control register lies in SRAM memory, so a reboot or reinitialisation of the peripheral should be sufficient to repair it.
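Reinitialisation of a peripheral can be sketched as rewriting its control registers from a table of intended values. The example below is illustrative only: `reg_init_t` and `refresh_registers` are hypothetical names, the table is assumed to be stored somewhere an SEU in SRAM cannot reach (e.g. flash), and registers whose bits change on their own (status flags, data registers) are assumed to be left out of the table.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    volatile uint8_t *reg; /* address of a control register */
    uint8_t value;         /* value it should hold after initialisation */
} reg_init_t;

/* Compare every listed register with its intended value and rewrite the
 * ones that differ. Returns how many registers had to be corrected,
 * which can be logged as an indication of upsets. */
size_t refresh_registers(const reg_init_t *table, size_t n)
{
    size_t fixed = 0;
    for (size_t i = 0; i < n; i++) {
        if (*table[i].reg != table[i].value) {
            *table[i].reg = table[i].value;
            fixed++;
        }
    }
    return fixed;
}
```

The same function can be called at boot for the initial configuration and then periodically, so a corrupted register is repaired without waiting for the next reboot.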
Possible problems:
- Power line short.
- Burnout of some parts of processor.
Mitigation:
- Use latch-up current limiters.
- To decrease latch-up currents, add series resistors between components, or between a component's supply pin and the supply rail.
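The effect of a series resistor can be estimated with a rough relation. This is a sketch under assumptions, not a design rule from this page: $V_{CC}$ is the supply voltage, $R_S$ the series resistance, $V_H$ and $I_H$ the holding voltage and holding current of the latched parasitic structure, and $I_{op}$ the normal operating current of the part. All of these symbols are introduced here for illustration.

```latex
I_{\text{latch}} \approx \frac{V_{CC} - V_H}{R_S},
\qquad
\Delta V_{\text{normal}} = I_{op} \, R_S
```

A larger $R_S$ lowers the current that can flow during latch-up, and if $I_{\text{latch}}$ stays below $I_H$ the latch-up cannot sustain itself. The price is a larger supply drop $\Delta V_{\text{normal}}$ in normal operation, so the resistor value is a trade-off between the two.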
Prevention:
- Use components with a higher latch-up current level according to JEDEC standard JESD78 (for example 250 mA instead of 100 mA).
- Add series resistors or resistor-diode circuits to prevent SEL -> Fairchild's Process Enhancements Eliminate the CMOS SCR Latch-Up Problem in 74HC Logic.
- Use components where process and layout considerations have eliminated CMOS latch-up -> Fairchild's MM74HC logic family.
Possible problems:
- Any peripheral, including the core, could stop working.
Mitigation:
- Use a processor with known radiation test results.