Fatal errors

Sebastian Huber sebastian.huber at embedded-brains.de
Wed Aug 11 13:47:33 UTC 2010


Hi,

I want to discuss some issues with fatal errors in RTEMS.  A fatal error is
something that renders the system unusable.  For example, if you provide a
workspace that is not big enough to contain all configured objects, it makes no
sense to continue execution.  Fatal errors may happen due to:

1. Configuration bugs

2. Classic bugs inside the system that occur even with valid user input

3. Invalid user input (lack of documentation etc.)

There are several entry points to the ultimate fatal error handler (void
_CPU_Fatal_halt(something the_error)):

1. void _Internal_error_Occurred(
     Internal_errors_Source the_source,
     bool is_internal,
     Internal_errors_t the_error
   )

2. void rtems_fatal_error_occurred(
     uint32_t the_error
   )

3. assert()

There are also some error functions that end up in _exit(), which does some
cleanup and initiates a context switch to the boot code:

1. exit()

2. abort()

3. std::terminate() (C++)

4. void rtems_panic(
     const char *printf_format,
     ...
   )

5. int rtems_error(
     rtems_error_code_t error_flag,
     const char *printf_format,
     ...
   )

The actions performed by _exit() are quite complex.  You need a fully
functional system to do this.  The rtems_panic() and rtems_error() functions
are very heavyweight due to their dependency on fprintf().  Also, _exit() may
fail to shut down the system due to I/O driver flaws.

Some notes on the _CPU_Fatal_halt() class of functions:

1. The the_source and the_error values should be enough to identify the exact
error source provided you have the source code and an ELF-file with symbol
information.

2. The is_internal parameter is superfluous.  We can use the_source for this.

3. We lose information if we call _CPU_Fatal_halt() with only the_error as
parameter.  This value is ambiguous without the_source.

4. assert() ends up in rtems_fatal_error_occurred(0).  Here we lose all
information about the error and rely too much on printk().

5. Currently we have these sources:

typedef enum {
  INTERNAL_ERROR_CORE,
  INTERNAL_ERROR_RTEMS_API,
  INTERNAL_ERROR_POSIX_API
} Internal_errors_Source;

This is not enough.  We should add at least one source for the BSP, one for
the application, and one for errors triggered by assert().

6. We should change the type of the_error to be at least 32 bits wide and
capable of holding a pointer value.  This can be used by assert() to store the
failed expression string pointer.

7. The RTEMS source code should be reviewed to make sure it meets note 1.

8. A call to _Internal_error_Occurred() should work at any time.  To achieve
this we have to statically initialize _User_extensions_List as an empty list.
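
To make notes 2, 3, 5, 6 and 8 more concrete, here is a minimal host-side
sketch of how such an interface might look.  It is not the RTEMS
implementation: all names (fatal_source, fatal_code, fatal_extension,
fatal_error_occurred(), FATAL_ASSERT, last_fatal) are hypothetical, the
extension list is a plain singly linked list rather than the real chain type,
and the final halt only records its arguments so the flow can be observed.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source set: the first three mirror the existing enum,
 * the last three are the additions proposed in note 5. */
typedef enum {
  FATAL_SOURCE_CORE,
  FATAL_SOURCE_RTEMS_API,
  FATAL_SOURCE_POSIX_API,
  FATAL_SOURCE_BSP,
  FATAL_SOURCE_APPLICATION,
  FATAL_SOURCE_ASSERT
} fatal_source;

/* Note 6: an error code that can hold a pointer and is at least 32 bits. */
typedef uintptr_t fatal_code;
_Static_assert(sizeof(fatal_code) * CHAR_BIT >= 32, "fatal_code too narrow");

/* Assumed extension node; the real user extension type looks different. */
typedef struct fatal_extension {
  struct fatal_extension *next;
  void (*handler)(fatal_source source, fatal_code code);
} fatal_extension;

/* Note 8: statically empty, so the fatal path works before any
 * initialization code has run. */
static fatal_extension *fatal_extensions = NULL;

/* Hypothetical helper: prepend a user extension to the list. */
static void fatal_extension_add(fatal_extension *ext)
{
  ext->next = fatal_extensions;
  fatal_extensions = ext;
}

/* Stand-in for the final halt.  A real handler would never return;
 * this one only records what it was given. */
typedef struct {
  fatal_source source;
  fatal_code   code;
  int          halted;
} fatal_record;

static fatal_record last_fatal;

static void fatal_halt(fatal_source source, fatal_code code)
{
  last_fatal.source = source;
  last_fatal.code = code;
  last_fatal.halted = 1;
}

/* Single entry point: no is_internal flag (note 2), and the source is
 * always passed on together with the code (note 3). */
static void fatal_error_occurred(fatal_source source, fatal_code code)
{
  for (fatal_extension *e = fatal_extensions; e != NULL; e = e->next) {
    e->handler(source, code);
  }
  fatal_halt(source, code);
}

/* Note 6: an assert variant that passes the address of the failed
 * expression string as the error code instead of printing it. */
#define FATAL_ASSERT(expr) \
  ((expr) ? (void) 0 \
          : fatal_error_occurred(FATAL_SOURCE_ASSERT, (fatal_code) #expr))
```

With something like this, a failed FATAL_ASSERT(x > 0) would reach the halt
with the assert source and a code that is the address of the string "x > 0",
so a debugger with the ELF file could show the expression without any
dependency on printk().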

It would be nice if you could add some comments about this.

Have a nice day!

-- 
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone   : +49 89 18 90 80 79-6
Fax     : +49 89 18 90 80 79-9
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.


