- Speaker: Daniel Lohmann, Associate Professor (Privatdozent) at the Chair of Computer Science IV (Distributed Systems and Operating Systems), Friedrich-Alexander-Universität Erlangen-Nürnberg
- Title: Dependable System Software by Construction
- Abstract: Because of shrinking structure sizes and operating voltages, computing hardware is becoming increasingly susceptible to transient hardware faults: issues previously known only from avionics systems, such as bit flips caused by cosmic radiation, nowadays also affect automotive and other cost-sensitive "ground-level" control systems. For such cost-sensitive systems, many software-based measures have been suggested to harden applications against transient effects. However, all these measures assume that the underlying system software works reliably in all cases. In this talk, I present software-based concepts for constructing an operating system that provides a reliable computing base even on unreliable hardware. Our design is based on two pillars: first, strict fault avoidance through constructive measures in the kernel design and implementation; second, reliable fault detection through fine-grained arithmetic encoding of the complete kernel execution path. Compared to an industry-grade off-the-shelf RTOS, our resulting dOSEK kernel achieves a robustness improvement of four orders of magnitude.
Our results are based on extensive fault-injection campaigns that cover the entire space of single-bit faults in random-access memory and registers.
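
To illustrate the general idea behind arithmetic encoding and single-bit fault injection mentioned in the abstract, the following is a minimal sketch of a plain AN code, one simple form of arithmetic encoding. It is not taken from dOSEK: the constant `A`, the 32-bit word width, and all function names are assumptions chosen for illustration, and the encoding actually used in the kernel may differ.

```python
# Minimal AN-code sketch (illustrative only, not the dOSEK implementation).
# A value v is stored as the code word A * v. Every valid code word is a
# multiple of A, so a corrupted word is detected when word % A != 0.

A = 58659  # assumed constant; any odd A > 1 detects all single-bit flips


def encode(value: int) -> int:
    """Return the AN-encoded code word for a plain integer."""
    return value * A


def is_valid(word: int) -> bool:
    """A code word is valid only if it is a multiple of A."""
    return word % A == 0


def decode(word: int) -> int:
    """Check the code word and return the plain value."""
    if not is_valid(word):
        raise ValueError("fault detected: word is not a multiple of A")
    return word // A


def add_encoded(x: int, y: int) -> int:
    """Add two code words without decoding: A*a + A*b == A*(a + b)."""
    return x + y


def single_bit_flips(word: int, width: int = 32) -> list[int]:
    """Enumerate every single-bit fault of a word of the given width."""
    return [word ^ (1 << bit) for bit in range(width)]


if __name__ == "__main__":
    total = add_encoded(encode(5), encode(7))
    print(decode(total))

    # Exhaustive single-bit fault injection on one code word.
    faulty = single_bit_flips(encode(1234))
    print(all(not is_valid(w) for w in faulty))
```

A single bit flip changes a code word by plus or minus a power of two, which is never a multiple of an odd `A` greater than 1, so every such fault leaves the word invalid. The exhaustive loop over all bit positions mirrors, on a very small scale, the idea of covering the complete single-bit fault space.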