[PATCH v1 3/5] cpukit/aarch64: Add Exception Manager support

Wed Sep 1 15:48:25 UTC 2021

On 24/08/2021 01:50, Kinsey Moore wrote:
> diff --git a/cpukit/score/cpu/aarch64/aarch64-exception-default.c b/cpukit/score/cpu/aarch64/aarch64-exception-default.c
> index 2ebb3dee9f..e51e9453e1 100644
> --- a/cpukit/score/cpu/aarch64/aarch64-exception-default.c
> +++ b/cpukit/score/cpu/aarch64/aarch64-exception-default.c
> @@ -43,8 +43,61 @@
>   
>   #include <rtems/score/cpu.h>
>   #include <rtems/fatal.h>
> +#include <rtems/exception.h>
>   
>   void _AArch64_Exception_default( CPU_Exception_frame *frame )
>   {
> -  rtems_fatal( RTEMS_FATAL_SOURCE_EXCEPTION, (rtems_fatal_code) frame );
> +  if ( rtems_exception_manage( frame ) == false ) {
> +    rtems_fatal( RTEMS_FATAL_SOURCE_EXCEPTION, (rtems_fatal_code) frame );
> +  }
> +}

This is exactly the approach I don't like. I don't think we need a new 
user extension for this. We can also do the exception-to-signal mapping 
in a fatal error extension.

We should avoid changes in the RTEMS API unless they are really 
necessary. Using an existing extension is preferable to adding new ones.

The normal action when a non-interrupt exception occurs is to log the 
context and then restart the system. For this, we only need an exception 
prologue which calls rtems_fatal() as robustly and safely as possible. In 
particular, it should not use the stack of the context which caused the 
exception. Non-interrupt exceptions in the field are usually caused by 
abnormal program behaviour which occurs only rarely, so getting a good 
context of the error is important. Recursive exceptions due to errors in 
the exception handling are very bad in this respect.

Since rtems_fatal() does not return to the caller, we don't need an 
exception epilogue after the rtems_fatal() call. Such an epilogue would 
be dead code in most systems. I have spent a lot of time in the past 
avoiding dead code, so this is one of the reasons why I don't like the 
proposed approach.

If a fatal error extension determines that it is safe to resume 
execution, then it could simply call a new architecture-specific 
function which uses the CPU exception frame to continue execution. This 
function is basically something like the exception epilogue which you 
have on AArch64 right now. The only difference is that it is not 
executed after a return from _AArch64_Exception_default(), but is 
instead invoked explicitly by a function call, for example 
_CPU_Exception_return( frame ).

So, my proposal would be something like this:

1. Processor jumps to exception prologue

2. Exception prologue saves the context to CPU exception frame

3. Exception prologue calls rtems_fatal() which does not return

For the signal mapping you provide a fatal extension:

1. If the source is not RTEMS_FATAL_SOURCE_EXCEPTION, then return 
(system terminates).

2. If the exception type cannot be handled, then return (system terminates).

3. Add a post-switch action to the executing thread.

4. Call _CPU_Exception_return( frame )
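
The control flow of such a fatal extension could be sketched roughly as 
follows. This is only a self-contained model to illustrate the four 
steps, not RTEMS code: every model_* name, the exception classes, and 
the exception-to-signal table are assumptions made up for this example. 
In particular, the real _CPU_Exception_return( frame ) would not return 
to the extension, and the pending signal would be recorded through a 
post-switch action of the executing thread instead of a global variable.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fatal source and the exception type. */
typedef enum { MODEL_SOURCE_EXCEPTION, MODEL_SOURCE_OTHER } model_fatal_source;
typedef enum {
  MODEL_EXC_DATA_ABORT,
  MODEL_EXC_FP,
  MODEL_EXC_UNKNOWN
} model_exception_class;

/* Hypothetical stand-in for the CPU exception frame. */
typedef struct {
  model_exception_class exception_class;
  bool                  resumed; /* set by the model of the exception return */
} model_exception_frame;

/* Stands for the post-switch action added to the executing thread. */
int model_pending_signal;

/* Assumed mapping; returns 0 if the exception type cannot be handled. */
static int model_map_to_signal( const model_exception_frame *frame )
{
  switch ( frame->exception_class ) {
    case MODEL_EXC_DATA_ABORT:
      return SIGSEGV;
    case MODEL_EXC_FP:
      return SIGFPE;
    default:
      return 0;
  }
}

/* Model only: the real function would resume the thread and not return. */
static void model_cpu_exception_return( model_exception_frame *frame )
{
  frame->resumed = true;
}

void model_fatal_extension( model_fatal_source source, uintptr_t code )
{
  model_exception_frame *frame;
  int                    sig;

  /* 1. Not an exception: return, the system terminates. */
  if ( source != MODEL_SOURCE_EXCEPTION ) {
    return;
  }

  frame = (model_exception_frame *) code;
  assert( frame != NULL );

  /* 2. Exception type cannot be handled: return, the system terminates. */
  sig = model_map_to_signal( frame );
  if ( sig == 0 ) {
    return;
  }

  /* 3. Record the work for the post-switch action. */
  model_pending_signal = sig;

  /* 4. Resume execution using the exception frame. */
  model_cpu_exception_return( frame );
}
```

Returning from the extension in steps 1 and 2 corresponds to the normal 
path in which the system terminates, so the resume path is only 
present in systems which actually install such an extension.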

The _CPU_Exception_return( frame ) function should:

1. Save the CPU exception return information to the stack of the 
executing thread.

2. Switch to the stack of the executing thread and to thread context 
with interrupts disabled.

3. Do something similar to the interrupt return.

4. The thread dispatch will call the post-switch extension which could 
raise a signal.
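
To make the order of these four steps concrete, here is a small 
simulation in portable C. It is a sketch under heavy assumptions: a real 
implementation would be architecture-specific assembly, and all model_* 
types and functions, the stack size, and the sample post-switch handler 
are hypothetical names invented for this example. The model also 
assumes that the interrupted thread was running with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_STACK_SIZE 256

/* Hypothetical stand-in for the return information in the exception frame. */
typedef struct {
  uintptr_t return_pc;
  uintptr_t return_sp;
} model_return_info;

typedef void ( *model_post_switch_handler )( void *arg );

typedef struct {
  unsigned char             stack[ MODEL_STACK_SIZE ];
  size_t                    stack_top;   /* grows down from MODEL_STACK_SIZE */
  model_post_switch_handler post_switch; /* NULL if no action is pending */
  void                     *post_switch_arg;
} model_thread;

typedef struct {
  model_thread *executing;
  bool          on_exception_stack;
  bool          interrupts_enabled;
} model_cpu;

/* Sample post-switch action: only records that a signal would be raised. */
void model_raise_signal( void *arg )
{
  int *raised = arg;

  *raised = 1;
}

/* Model of the thread dispatch: run and clear the pending action. */
static void model_thread_dispatch( model_cpu *cpu )
{
  model_thread             *thread = cpu->executing;
  model_post_switch_handler handler = thread->post_switch;

  if ( handler != NULL ) {
    thread->post_switch = NULL;
    ( *handler )( thread->post_switch_arg );
  }
}

model_return_info model_cpu_exception_return(
  model_cpu               *cpu,
  const model_return_info *info
)
{
  model_thread     *thread = cpu->executing;
  model_return_info resumed;

  /* 1. Save the return information to the stack of the executing thread. */
  assert( thread->stack_top >= sizeof( *info ) );
  thread->stack_top -= sizeof( *info );
  memcpy( &thread->stack[ thread->stack_top ], info, sizeof( *info ) );

  /* 2. Switch to the thread stack and context, interrupts disabled. */
  cpu->on_exception_stack = false;
  cpu->interrupts_enabled = false;

  /* 3. and 4. Like an interrupt return: dispatch runs the pending action. */
  model_thread_dispatch( cpu );

  /* Restore the saved return information and resume the thread. */
  memcpy( &resumed, &thread->stack[ thread->stack_top ], sizeof( resumed ) );
  thread->stack_top += sizeof( resumed );
  cpu->interrupts_enabled = true;

  return resumed;
}
```

The point of step 1 is that nothing on the exception stack is needed 
once the switch in step 2 has happened, so the thread can be resumed 
purely from its own stack.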

This approach avoids a new user extension, and it avoids potentially 
dead code in the exception epilogue.

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/