[RTEMS Project] #2270: SPARC: Optimized floating-point context handling
RTEMS trac
trac at rtems.org
Thu Feb 26 10:50:07 UTC 2015
#2270: SPARC: Optimized floating-point context handling
-----------------------------+------------------------------
Reporter: sebastian.huber | Owner: sebastian.huber
Type: enhancement | Status: accepted
Priority: normal | Milestone: 4.11.1
Component: cpukit | Version: 4.11
Severity: normal | Resolution:
Keywords: SPARC |
-----------------------------+------------------------------
Comment (by daniel):
As I recall, GCC 4.0/4.1, and possibly also 4.2, had a known bug that
generated FP instructions in integer code. In the Gaisler toolchain we
made a fix for it, but mainline no longer has this issue. As you
pointed out, it is not efficient on a SPARC to use FP registers in integer
code anyway.
OAR and Gaisler analysis during the early SMP work came to the same
conclusion. If I remember correctly, Linux has also disabled lazy switching
on SMP.
I agree with you about the ABI: FP registers do not have to be saved in
the normal case, only when a task is interrupted and context switched. I
think you could make some significant improvements here.
Personally I think it is not that big a deal to analyse the ISR to verify
that FP instructions are not used. If they are used, the user shall save
the FP context themselves by calling _CPU_Context_save_fp() and
_CPU_Context_restore_fp(). As I understand it, most LEON OSes have this
design. The RTEMS documentation about this should probably be improved.
This approach is in practice a bit troublesome, since in many cases it
requires an extra function stack frame. If an ISR uses float types but
calls _CPU_Context_save_fp() itself, one cannot know in which order GCC
performs the float variable initialization and the function call. So in
practice one has to jump to the ISR, call FP save, then call the real ISR
implementation that executes the FP instructions. Perhaps this can be
avoided? It would be better if the trap handler saved the FP context
before calling the ISR handler. Could we introduce a
RTEMS_FLOATING_POINT option to rtems_interrupt_handler_install()?
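The two-level ISR pattern described above could look roughly like the
sketch below. It is a host-buildable model, not RTEMS code: fp_context,
fp_context_save(), fp_context_restore() and the simulated fpu variable are
hypothetical stand-ins for the FPU register file and for the port's
_CPU_Context_save_fp()/_CPU_Context_restore_fp(), whose real signatures may
differ. The noinline attribute is a GCC extension and is assumed here to be
what keeps the FP code out of the wrapper.

```c
#include <assert.h>

#define FP_REG_COUNT 32

/* Hypothetical FP context: 32 FP registers plus the FSR. */
typedef struct {
  unsigned int f[FP_REG_COUNT];
  unsigned int fsr;
} fp_context;

/* Simulated live FPU state (on a target this is the hardware itself). */
static fp_context fpu;

/* Hypothetical stand-ins for the port's FP save/restore routines. */
static void fp_context_save(fp_context *dst) { *dst = fpu; }
static void fp_context_restore(const fp_context *src) { fpu = *src; }

/*
 * The real ISR body, free to use float types. It lives in its own
 * non-inlined function so the compiler cannot move FP instructions
 * ahead of the save performed by the wrapper.
 */
__attribute__((noinline))
static void real_isr(void *arg)
{
  (void) arg;
  fpu.f[0] = 0xdeadbeefu; /* simulate clobbering an FP register */
  fpu.fsr = 1u;           /* simulate changing the FSR */
}

/* The handler that gets installed: integer-only code. */
void isr_wrapper(void *arg)
{
  fp_context saved;

  fp_context_save(&saved);    /* 1. save FP context of interrupted code */
  real_isr(arg);              /* 2. run the FP-using implementation */
  fp_context_restore(&saved); /* 3. restore before returning */
}

/* Usage check: the interrupted code sees its FP state unchanged. */
void wrapper_demo(void)
{
  fpu.f[0] = 42u;
  fpu.fsr = 7u;
  isr_wrapper((void *) 0);
  assert(fpu.f[0] == 42u);
  assert(fpu.fsr == 7u);
}
```

The extra stack frame mentioned above is the cost of the real_isr() call;
a save done by the trap handler would make the wrapper unnecessary.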
RTEMS invites the usage of mixed ABIs by having the RTEMS_FLOATING_POINT
option when creating tasks. Of course one has to be very careful when
mixing ABIs; I would recommend at least scanning the binaries for float
instructions. When verifying code you get a trap at the first occurrence,
so normally code coverage is enough to be sure. So the float problem is
not limited to ISRs.
(3)
I think lazy context switching becomes less important when you implement a
proper context switch taking the ABI into account as you describe. The FPU
context is then only one register, the FSR, when the FPU is turned on. So
one might just save that one register, right? The case where lazy
switching would still be beneficial is when interrupting a task?
(2)
Is this costly in the average case? My guess is that 99% of ISRs don't
use floats and most tasks do not use an FP context. For the sake of
average performance, wouldn't it be better to let the ISR handle the FP
context itself or add a RTEMS_FLOATING_POINT option to the interrupt
install routine?
(3)
I think it is problematic to disable PSR.EF on entry. We should keep
in mind that it is CPU implementation specific what happens when PSR.EF is
cleared. Turning off the FPU could trigger a power-down mode or pause FPU
operations. A modern FPU performs operations in parallel with the integer
pipeline, so turning it off could actually have a negative impact on the
interrupted task. Therefore it is important to store the FSR register to
memory, which waits for all FP operations to complete, before turning the
FPU off. This would introduce a performance loss and potentially a
worst-case nightmare? Otherwise I like the idea of saving the FP
context only when an FP task is interrupted and context switched.
Storing all the FPU registers to the stack or storing the FSR register
waits until FPU operations are completed. Storing the FSR last is best,
since you could potentially store an FP register that does not depend on
an ongoing operation.
What about this for single-core and SMP (4):
* FP TASK context switch: only save FSR and disable PSR.EF on normal
context switch. Then we would wait for ongoing operations and the ABI
ensures FP registers need not be saved.
* interrupted FP TASK with PSR.EF=1 context switch: save FP registers,
then FSR and clear PSR.EF.
* ISRs can be marked with RTEMS_FLOATING_POINT on registration.
* interrupts that have one or more ISRs marked FP: save FP context if
PSR.EF=1, but leave PSR.EF=0 on interrupt exit to fall into the
FP_disabled trap. We must take extra care with nested interrupts here, so
as not to overwrite the TCB FP context?
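The decisions in the list above can be modelled as two small pure
functions. This is only a sketch of the proposal, not RTEMS code: every
name below (fp_save_action, fp_action_on_switch,
fp_save_on_interrupt_entry) is hypothetical, and the nested-interrupt
question raised in the last bullet is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* What the context switch has to store for the outgoing task. */
typedef enum {
  FP_SAVE_NOTHING,  /* PSR.EF=0: FPU not in use, nothing to store */
  FP_SAVE_FSR_ONLY, /* normal switch: store FSR, then clear PSR.EF */
  FP_SAVE_FULL      /* interrupted: FP registers, then FSR, clear PSR.EF */
} fp_save_action;

/*
 * psr_ef:      PSR.EF of the outgoing task at the time of the switch.
 * interrupted: true if the switch happens on interrupt exit rather than
 *              through a normal (voluntary) call into the kernel.
 */
fp_save_action fp_action_on_switch(bool psr_ef, bool interrupted)
{
  if (!psr_ef) {
    return FP_SAVE_NOTHING;
  }

  /* On a normal switch the ABI means FP registers need not be saved. */
  return interrupted ? FP_SAVE_FULL : FP_SAVE_FSR_ONLY;
}

/*
 * Interrupt entry: save the FP context only if the interrupted code had
 * the FPU enabled and at least one installed handler is marked FP.
 */
bool fp_save_on_interrupt_entry(bool psr_ef, bool any_handler_marked_fp)
{
  return psr_ef && any_handler_marked_fp;
}

/* Usage check covering the cases named in the list. */
void policy_demo(void)
{
  assert(fp_action_on_switch(true, false) == FP_SAVE_FSR_ONLY);
  assert(fp_action_on_switch(true, true) == FP_SAVE_FULL);
  assert(fp_save_on_interrupt_entry(true, true));
  assert(!fp_save_on_interrupt_entry(true, false));
}
```

In both FP cases the FSR store comes last, so it doubles as the wait for
outstanding FPU operations before PSR.EF is cleared.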
To avoid the problem of clobbering the FP context, you can enable
RTEMS_FLOATING_POINT in the default interrupt handler options. Personally
I still think that the user should be responsible for saving the FP
context. That could perhaps also be possible in this configuration by
marking the ISR as non-FP and letting it handle the FP context itself, as
we do today?
What do you think?
--
Ticket URL: <http://devel.rtems.org/ticket/2270#comment:8>
RTEMS Project <http://www.rtems.org/>