[RTEMS Project] #2270: SPARC: Optimized floating-point context handling

Thu Feb 26 10:50:07 UTC 2015

#2270: SPARC: Optimized floating-point context handling
-----------------------------+------------------------------
 Reporter:  sebastian.huber  |       Owner:  sebastian.huber
     Type:  enhancement      |      Status:  accepted
 Priority:  normal           |   Milestone:  4.11.1
Component:  cpukit           |     Version:  4.11
 Severity:  normal           |  Resolution:
 Keywords:  SPARC            |
-----------------------------+------------------------------

Comment (by daniel):

 As I recall, GCC 4.0/4.1, and possibly also 4.2, had a known bug that
 generated FP instructions in integer code. In the Gaisler toolchain we
 made a fix for it, but mainline no longer has this issue. As you pointed
 out, it is not efficient on SPARC to use FP registers in integer code
 anyway.

 OAR and Gaisler analysis during the early SMP work came to the same
 conclusion. If I remember correctly, Linux has also disabled lazy
 switching on SMP.

 I agree with you about the ABI: FP registers do not have to be saved in
 the normal case, only when a task is interrupted and context switched. I
 think you could make some significant improvements here.

 Personally I think it is not that big a deal to analyse the ISR to verify
 that FP instructions are not used. If they are used, the user shall save
 the FP context itself by calling _CPU_Context_save_fp() and
 _CPU_Context_restore_fp(). As I understand it, most LEON OSes have this
 design. The RTEMS documentation about this should probably be improved.
 In practice this approach is a bit troublesome, since in many cases it
 requires an extra function stack frame. If an ISR uses float types and
 calls _CPU_Context_save_fp(), one cannot know in which order GCC performs
 the float variable initialization and the function call. So in practice
 one has to jump to the ISR, call FP save, and then call the real ISR
 implementation that executes the FP instructions. Perhaps this can be
 avoided? It would be better if the trap handler saved the FP context
 before calling the ISR handler. Could we introduce a RTEMS_FLOATING_POINT
 option to rtems_interrupt_handler_install()?
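
 A minimal sketch of the two-level ISR pattern described above, with the
 FPU simulated by a plain struct so that it runs on a host. The names
 fp_context, fp_save(), fp_restore(), isr_entry() and isr_body() are
 made up for illustration and are not RTEMS APIs; on a real target the
 save/restore calls would be the port's FP context routines, and the
 noinline attribute (GCC-specific) is what keeps the FP code out of the
 entry function.

```c
#include <assert.h>
#include <string.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical stand-in for the FP registers %f0-%f31 and the FSR. */
typedef struct {
  unsigned int f[32];
  unsigned int fsr;
} fp_context;

static fp_context fpu;        /* simulated "live" FPU state */
static int isr_body_calls;    /* counts how often the real ISR ran */

/* Stand-ins for the port's FP context save and restore routines. */
static void fp_save(fp_context *dst)
{
  memcpy(dst, &fpu, sizeof(fpu));
}

static void fp_restore(const fp_context *src)
{
  memcpy(&fpu, src, sizeof(fpu));
}

/*
 * The real ISR implementation. It may use FP freely, so here it clobbers
 * the simulated registers. It lives in its own non-inlined function so
 * that the compiler cannot move FP instructions ahead of the save.
 */
static NOINLINE void isr_body(void *arg)
{
  (void) arg;
  fpu.f[0] = 0xdeadbeefu;
  fpu.fsr = 0x1u;
  ++isr_body_calls;
}

/* The handler that would be installed: integer-only code. */
static void isr_entry(void *arg)
{
  fp_context saved;

  fp_save(&saved);     /* 1. save the interrupted FP context */
  isr_body(arg);       /* 2. run the FP-using implementation */
  fp_restore(&saved);  /* 3. restore before returning from the ISR */

  /* Sanity check for the simulation only. */
  assert(memcmp(&saved, &fpu, sizeof(fpu)) == 0);
}
```

 The extra frame is the cost mentioned above: every FP-using ISR needs
 such an entry/body pair unless the trap handler does the save itself.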

 RTEMS invites the usage of mixed ABIs by having the RTEMS_FLOATING_POINT
 option when creating tasks. Of course one has to be very careful when
 mixing ABIs; I would recommend at least scanning the binaries for float
 instructions. When verifying code you get a trap at the first occurrence,
 so normally code coverage is enough to be sure. So the float problem is
 not limited to the ISRs.
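
 As a rough illustration of such a scan, the following classifies a
 SPARC V8 instruction word as FP or not. The opcode values are as I
 recall them from the V8 manual (FBfcc, FPop1/FPop2, and the FP
 loads/stores) and should be checked against the manual before relying
 on them. The function names are made up, and the words are assumed to
 be already converted from big-endian to host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the instruction word is an FP instruction (SPARC V8). */
static bool sparc_is_fp_insn(uint32_t insn)
{
  uint32_t op  = insn >> 30;           /* bits 31:30, instruction format */
  uint32_t op2 = (insn >> 22) & 0x7;   /* bits 24:22, used when op == 0 */
  uint32_t op3 = (insn >> 19) & 0x3f;  /* bits 24:19, used when op >= 2 */

  switch (op) {
    case 0:
      return op2 == 0x6;                    /* FBfcc */
    case 2:
      return op3 == 0x34 || op3 == 0x35;    /* FPop1, FPop2 */
    case 3:
      switch (op3) {
        case 0x20: case 0x21: case 0x23:             /* LDF, LDFSR, LDDF */
        case 0x24: case 0x25: case 0x26: case 0x27:  /* STF, STFSR, STDFQ, STDF */
          return true;
        default:
          return false;
      }
    default:
      return false;                         /* op == 1 is CALL */
  }
}

/* Counts FP instructions in a block of text-section words. */
static size_t sparc_count_fp_insns(const uint32_t *text, size_t n)
{
  size_t count = 0;
  size_t i;

  assert(text != NULL || n == 0);

  for (i = 0; i < n; ++i) {
    if (sparc_is_fp_insn(text[i])) {
      ++count;
    }
  }

  return count;
}
```

 A non-zero count only says that FP instructions are present in the
 scanned range, not that they are reachable from an ISR or a non-FP
 task, so this complements rather than replaces coverage testing.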

 (3)
 I think lazy context switching becomes less important when you implement a
 proper context switch that takes the ABI into account as you describe. The
 FPU context is then only one register, the FSR, when the FPU is turned on.
 So one might just save that one register, right? The case where lazy
 switching would be beneficial is when interrupting a task?

 (2)
 Is this costly in the average case? My guess is that 99% of ISRs don't
 use floats and most tasks do not use an FP context. For the sake of
 average performance, wouldn't it be better to let the ISR handle the FP
 context itself, or to add a RTEMS_FLOATING_POINT option to the interrupt
 install routine?

 (3)
 I think it is problematic to disable PSR.EF on entry. We should keep in
 mind that what happens when PSR.EF is cleared is CPU implementation
 specific. Turning off the FPU could put it into power-down mode or pause
 ongoing FPU operations. A modern FPU performs operations in parallel with
 the integer pipeline, so turning it off could actually have a negative
 impact on the interrupted task. Therefore it is important to store the
 FSR register to memory, which waits for all FP operations to complete,
 before turning the FPU off. This would introduce a performance loss and
 potentially a worst-case nightmare? Otherwise I like the idea of saving
 the FP context only when an FP task is interrupted and context switched.

 Storing all the FPU registers to the stack or storing the FSR register
 waits until FPU operations are completed. Storing the FSR last is best,
 since you can potentially store FP registers that do not depend on an
 ongoing operation while that operation completes.

 What about this for single-core and SMP (4):
 * FP TASK context switch: only save FSR and disable PSR.EF on a normal
 context switch. Then we would wait for ongoing operations, and the ABI
 ensures that FP registers need not be saved.
 * interrupted FP TASK with PSR.EF=1 context switch: save FP registers,
 then FSR, and clear PSR.EF.
 * ISRs can be marked with RTEMS_FLOATING_POINT on registration.
 * interrupts that have one or more ISRs marked FP: save FP context if
 PSR.EF=1, but leave PSR.EF=0 on interrupt exit to fall into the
 FP_disabled trap. We must take extra care with nested interrupts here, so
 as not to overwrite the TCB FP context?
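
 To make the cases in this list easier to discuss, here is a host-side
 sketch that only encodes the decisions, not the actual register
 accesses. All names (fp_action, fp_switch_actions(),
 fp_interrupt_entry_actions(), fp_policy_examples()) are hypothetical,
 and the nested-interrupt question above is deliberately left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Actions the low-level code would have to perform (bit flags). */
typedef enum {
  FP_NONE      = 0,
  FP_SAVE_FSR  = 1 << 0,  /* store FSR, which also waits for the FPU */
  FP_SAVE_REGS = 1 << 1,  /* store the FP registers */
  FP_CLEAR_EF  = 1 << 2   /* leave PSR.EF=0 afterwards */
} fp_action;

/* Context switch away from a task. */
static unsigned int fp_switch_actions(
  bool is_fp_task,
  bool interrupted,
  bool psr_ef
)
{
  if (!is_fp_task || !psr_ef) {
    /* Non-FP task, or the FP context was already saved elsewhere. */
    return FP_NONE;
  }

  if (interrupted) {
    /* FP registers may be live: save them first, then FSR. */
    return FP_SAVE_REGS | FP_SAVE_FSR | FP_CLEAR_EF;
  }

  /* Voluntary switch: the ABI makes the FP registers dead, FSR suffices. */
  return FP_SAVE_FSR | FP_CLEAR_EF;
}

/* Interrupt entry for a vector with a given set of installed ISRs. */
static unsigned int fp_interrupt_entry_actions(
  bool any_isr_marked_fp,
  bool psr_ef
)
{
  if (any_isr_marked_fp && psr_ef) {
    /* EF stays clear on exit, so the task traps on its next FP use. */
    return FP_SAVE_REGS | FP_SAVE_FSR | FP_CLEAR_EF;
  }

  return FP_NONE;
}

/* Usage examples, one per bullet in the list above. */
static void fp_policy_examples(void)
{
  assert(fp_switch_actions(true, false, true)
    == (FP_SAVE_FSR | FP_CLEAR_EF));
  assert(fp_switch_actions(true, true, true)
    == (FP_SAVE_REGS | FP_SAVE_FSR | FP_CLEAR_EF));
  assert(fp_interrupt_entry_actions(false, true) == FP_NONE);
  assert(fp_interrupt_entry_actions(true, true)
    == (FP_SAVE_REGS | FP_SAVE_FSR | FP_CLEAR_EF));
}
```

 Written this way, the common cases (non-FP task, ISR not marked FP) cost
 nothing beyond the checks, which is the point of the proposal.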

 To avoid the problem with clobbering the FP context, you can have the
 default interrupt handler options include RTEMS_FLOATING_POINT.
 Personally I still think that the user should be responsible for saving
 the FP context. That could perhaps also be possible in this configuration
 by marking the ISR as non-FP and then letting it handle the FP context
 itself, as we do today?

 What do you think?

--
Ticket URL: <http://devel.rtems.org/ticket/2270#comment:8>
RTEMS Project <http://www.rtems.org/>