Question regarding ARMV7M FPU context switch and ISRs

Sebastian Huber sebastian.huber at embedded-brains.de
Thu Feb 8 10:55:58 UTC 2024



On 08.02.24 11:43, Cedric Berger wrote:
> Hello Sebastian,
> 
> On 08.02.2024 11:09, Sebastian Huber wrote:
>> Hello Cedric,
>>
>> On 08.02.24 10:53, Cedric Berger wrote:
>>> Hello,
>>>
>>> I have a question: does RTEMS really want to support FPU operations in
>>> ISRs?
>>>
>>> Because if the answer is "no", then I believe that we could simplify 
>>> the RTEMS code (and for me the mental model of the whole thing) by 
>>> running the FPU with both FPCCR.ASPEN and FPCCR.LSPEN = 0.
>>>
>>> This means that during IRQ/exception handling, the simple exception
>>> frame (32 bytes) would always be used instead of a combination of the
>>> simple and extended frame (32 or 116 bytes).
>>>
>>> This would then improve the real-time guarantees of the system, by 
>>> having a shorter and more deterministic IRQ response time.
>>
>> If you don't use the FPU in ISRs, then the only overhead of the lazy
>> FPU save/restore is a space overhead.
> 
> Yes, but it still means more cache lines, right?

Yes, but this should not really matter if we don't write to these lines.

> 
> And functions like _ARMV7M_Pendable_service_call and 
> _ARMV7M_Supervisor_call now have to save/restore 116 bytes instead of 32 
> bytes, right?

These functions are only used when you switch threads. If the ISR returns
immediately to the interrupted thread, then nothing has to be saved or
restored.
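
The distinction can be illustrated with a small host-side model. This is
only a sketch under stated assumptions: the types `volatile_fp_regs` and
`model_thread` and the function `model_isr_exit()` are hypothetical names
made up for illustration and are not the RTEMS implementation. The FPU
registers are copied only when the interrupt exit leads to a different
thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the volatile FPU registers (s0-s15 and FPSCR). */
typedef struct {
  uint32_t s[16];
  uint32_t fpscr;
} volatile_fp_regs;

/* Hypothetical per-thread save area; this is not Context_Control. */
typedef struct {
  volatile_fp_regs saved_fp;
  unsigned save_count;
} model_thread;

/*
 * Model of an interrupt exit. Returns the thread which runs next. The
 * FPU registers are copied only if a different thread (the heir) takes
 * over the processor.
 */
static model_thread *model_isr_exit(
  volatile_fp_regs *fpu,
  model_thread *interrupted,
  model_thread *heir
)
{
  if (heir == interrupted) {
    /* Return to the interrupted thread: nothing to save or restore. */
    return interrupted;
  }

  /* Thread switch: save the volatile FPU state of the interrupted thread */
  memcpy(&interrupted->saved_fp, fpu, sizeof(*fpu));
  ++interrupted->save_count;

  /* and restore the state previously saved for the heir. */
  memcpy(fpu, &heir->saved_fp, sizeof(*fpu));
  return heir;
}
```

Calling `model_isr_exit(&fpu, &a, &a)` leaves `fpu` and `a` untouched,
while `model_isr_exit(&fpu, &a, &b)` stores the current registers in `a`
and loads the ones saved for `b`.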


> 
>> If you switch from an ISR to another thread you have to save/restore 
>> the volatile FPU context anyway.
> 
> Yes, my idea was to simply move d0-d7 out of struct 
> ARMV7M_Exception_frame and into struct Context_Control.
> 
> And killing functions like 
> _ARMV7M_Trigger_lazy_floating_point_context_save()

I am not sure if it is that simple if you implement the deferred FPU 
switching.

> 
>>>
>>> This would also simplify the context switching code, by centralizing
>>> the saving of the FPU context in RTEMS, and enabling optimisations
>>> like only saving/restoring the FPU context when switching between
>>> tasks defined with RTEMS_FLOATING_POINT.
>>>
>>> What do you think? Am I missing something? Would it be a good idea?
>>
>> From experience, working with RTEMS_FLOATING_POINT in applications
>> is quite annoying. Is there really a measurable and significant
>> performance improvement if you enable the deferred FPU switching? Can
>> you guarantee that the compiler will not generate FPU or vector
>> instructions for integer operations? In the current GCC version or in
>> a future release?
> 
> Obviously, since I'm not God, I won't be able to provide any guarantee 
> regarding the future :)
> 
> But I believe that if GCC started to use FPU for integer operations, 
> many people would complain:
> 
> FreeBSD requires fpu_kern_enter/fpu_kern_leave to use FPU in the kernel, 
> and Linux requires kernel_fpu_begin/kernel_fpu_end to use FPU ops in the 
> kernel.
> 
> I'm pretty sure Linus will give GCC developers a hard time if they
> start to use the FPU for integer operations anytime soon...

I am definitely sure that on PowerPC the AltiVec unit is used to 
optimize memory copies and initializations. I agree that it is unlikely 
that GCC will use the FPU for integer operations.

> 
>>
>>>
>>> I would be willing to work on that if there is some kind of agreement
>>> here.
>>
>> If you change the ARMv7-M CPU port to use the deferred FPU switching,
>> then you surely break existing applications which then have to use
>> RTEMS_FLOATING_POINT. I would do this only if there were a clear
>> and measurable performance improvement. For the measurements we would
>> need a benchmark.
> 
> Ok, so no to using deferred FPU switching for the moment, at least
> without a benchmark.

From my point of view, yes.

> 
> But what about just running with FPCCR.ASPEN and FPCCR.LSPEN = 0, and
> always saving the FPU context in _CPU_Context_switch when switching tasks?

You have to consider that if you switch after an ISR to another thread, 
then you have to save the volatile FPU context of the interrupted 
thread. If you switch back to the thread interrupted by the ISR, then 
you have to restore the volatile FPU context.
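
For reference, the FPCCR setting asked about in the quoted question amounts
to clearing two bits. A minimal sketch, assuming the ARMv7-M architectural
layout (ASPEN is bit 31, LSPEN is bit 30, and the register is located at
0xE000EF34). The helper name `fpccr_disable_fp_stacking()` is hypothetical
and not part of RTEMS. It takes a pointer so that it can be tried on a host
with an ordinary variable.

```c
#include <stdint.h>

/* FPCCR bit positions, assuming the ARMv7-M floating-point extension. */
#define FPCCR_ASPEN (UINT32_C(1) << 31) /* automatic FP state preservation */
#define FPCCR_LSPEN (UINT32_C(1) << 30) /* lazy FP state preservation */

/*
 * Hypothetical helper: clears ASPEN and LSPEN and leaves all other bits
 * unchanged. On the target, fpccr would point to 0xE000EF34.
 */
static void fpccr_disable_fp_stacking(volatile uint32_t *fpccr)
{
  *fpccr &= ~(FPCCR_ASPEN | FPCCR_LSPEN);
}
```

On real hardware, barrier instructions after the write may be needed. Check
this against the architecture manual. Even with both bits cleared, the
volatile FPU context still has to be saved and restored by software on a
thread switch, as described above.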

> 
> It would only break existing applications which use the FPU inside ISRs.
> Is that a problem? Are the ISR FPU behaviour/requirements documented
> somewhere in RTEMS?
> 
> To be honest, my main motivation here is trying to simplify the code and,
> more importantly, improve debuggability and my understanding of how the
> system works. My head kind of hurts trying to understand exactly at which
> point in the code a lazy FPU save can occur.

Yes, the ARMv7-M context switching is unusually complicated. I am not 
sure if the deferred FPU context switching will simplify things.

-- 
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

