Description
@mkeeter pointed out to me that it could be quite useful for the MGS metrics subsystem to report a counter metric tracking the number of task crash dumps present on an SP. This would let us easily see which SPs in the rack have crash dumps, and it would also indicate the time[1] at which a new crash dump appeared, which approximates when the task crash occurred.
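To illustrate how the appearance time could be recovered from such a metric, here is a rough sketch in Rust. It assumes MGS records one sample per poll of an SP, holding the poll's wall-clock time and the number of dumps reported. `DumpCountSample` and `new_dump_events` are hypothetical names made up for this example, not existing MGS types or APIs.

```rust
/// One poll of a single SP, as MGS might record it (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DumpCountSample {
    /// Wall-clock time of the poll as understood by MGS,
    /// in seconds since the Unix epoch.
    pub timestamp_secs: u64,
    /// Number of task crash dumps the SP reported at that poll.
    pub dump_count: u32,
}

/// For each sample whose dump count is higher than the previous sample's,
/// returns `(timestamp_secs, number_of_new_dumps)`.
///
/// Samples are assumed to be in chronological order. Decreases in the
/// count (for example, dumps being cleared) are ignored.
pub fn new_dump_events(samples: &[DumpCountSample]) -> Vec<(u64, u32)> {
    samples
        .windows(2)
        .filter(|pair| pair[1].dump_count > pair[0].dump_count)
        .map(|pair| {
            (
                pair[1].timestamp_secs,
                pair[1].dump_count - pair[0].dump_count,
            )
        })
        .collect()
}
```

The reported timestamp is that of the first poll that saw the higher count, so the actual crash happened at some point between that poll and the one before it. Dumps already present at the first sample are not reported, since there is no earlier sample to compare against.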
Footnotes
1. Wall clock time as understood by MGS.