Adding SPE APU context switching to the PowerPC
dufault at hda.com
Tue Oct 13 08:25:59 UTC 2009
Here's what I'm thinking of doing to add support for the SPE context.
First, I won't initially tie it into the floating point context switch
code, because I think there may be unintended assumptions in there;
I'll get to those issues once things are working. Second, I think the
way I plan to prototype it has some attractions, which I list after
the steps. The plan:
1. Disable the SPE APU by clearing the MSR_SPV bit in the MSR (it's
the same bit as MSR_VE for AltiVec) in start.S so that the CPU
starts up without the SPE available.
2. Add a structure to store the upper halves of the GPRs, the
accumulator, and the SPE status register in
cpukit/score/cpu/powerpc/rtems/score/cpu.h. Put a pointer, initialized
to NULL, at the end of the current Context_Control structure; it will
point to this structure for tasks that want to use the SPE.
3. Add a function to enable access to the SPE. This function will
allocate the additional context structure and set the new pointer at
the end of the Context_Control structure to point at it, and will set
MSR_SPV in the MSR.
4. During the context switch:
4A: In _CPU_Context_switch(), at the end of the current context save,
test the MSR_SPV bit for the task being switched out. If it is set,
check the pointer to the new context space. If that is NULL, panic.
If it is not NULL, save the SPE state into that context space.
4B: A little further on, during the context restore of the switched-in
task, the new value of the MSR is loaded. After that load, test the
MSR_SPV bit of the new thread. If it is set, check the pointer: panic
if it is NULL, otherwise restore the SPE context.
5. In _CPU_Context_restore(): Repeat the process described in 4B.
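To make steps 2 through 5 concrete, here is a rough host-side sketch in C.
It is only a model of the logic, not RTEMS code: the real save and restore
would be assembly in the context switch path, and every name here
(SPE_Context, Task_Context, spe_hw, spe_enable, spe_save, spe_restore,
context_switch) is made up for illustration. The MSR_SPV value is my
understanding of where the bit sits on e500 cores, so treat it as an
assumption. A "panic" is modelled as a status code so the sketch can run
anywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed bit position of MSR[SPE]; check it against the core manual. */
#define MSR_SPV 0x02000000u

/* Hypothetical SPE save area (step 2); field names are illustrative. */
typedef struct {
  uint32_t gpr_upper[32]; /* upper halves of the GPRs */
  uint64_t acc;           /* accumulator */
  uint32_t spefscr;       /* SPE status register */
} SPE_Context;

/* Reduced stand-in for Context_Control: only the fields used here. */
typedef struct {
  uint32_t     msr;
  SPE_Context *spe; /* NULL unless the task asked for the SPE */
} Task_Context;

/* Stand-in for the live SPE registers so the sketch runs on a host. */
static SPE_Context spe_hw;

typedef enum { SPE_OK, SPE_NO_MEMORY, SPE_PANIC } SPE_Status;

/* Step 3: allocate the save area and set MSR_SPV for this task. */
SPE_Status spe_enable(Task_Context *ctx)
{
  assert(ctx != NULL);
  if (ctx->spe == NULL) {
    ctx->spe = calloc(1, sizeof(*ctx->spe));
    if (ctx->spe == NULL) {
      return SPE_NO_MEMORY;
    }
  }
  ctx->msr |= MSR_SPV;
  return SPE_OK;
}

/* Step 4A: save SPE state of the outgoing task if MSR_SPV is set. */
SPE_Status spe_save(const Task_Context *outgoing)
{
  if ((outgoing->msr & MSR_SPV) == 0) {
    return SPE_OK; /* task never enabled the SPE: nothing to save */
  }
  if (outgoing->spe == NULL) {
    return SPE_PANIC; /* MSR says SPE in use but no storage exists */
  }
  *outgoing->spe = spe_hw;
  return SPE_OK;
}

/* Steps 4B and 5: restore SPE state of the incoming task. */
SPE_Status spe_restore(const Task_Context *incoming)
{
  if ((incoming->msr & MSR_SPV) == 0) {
    return SPE_OK;
  }
  if (incoming->spe == NULL) {
    return SPE_PANIC;
  }
  spe_hw = *incoming->spe;
  return SPE_OK;
}

/* Step 4 as a whole: save the outgoing task, then restore the incoming. */
SPE_Status context_switch(const Task_Context *from, const Task_Context *to)
{
  SPE_Status status = spe_save(from);
  if (status != SPE_OK) {
    return status;
  }
  return spe_restore(to);
}
```

A task that wants the SPE would call spe_enable() once on its own context
before touching any SPE register; tasks that never call it keep a NULL
pointer and a clear MSR_SPV bit, so both checks fall through at no cost
beyond the bit test.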
What I like about this approach:
1. Whether a thread is using the SPE is recorded in the MSR and
nowhere else. The context storage pointer being non-NULL serves as a
consistency check.
2. You declare that you're going to use the SPE; you don't just use it
and take an exception to allocate it. I like this better: the context
storage is easily allocated, not every thread incurs the extra 136
bytes of context storage, and the point where the additional setup
time is incurred is clearly defined.
3. The indirection on the context storage is not cache friendly but
does avoid the extra space overhead. Am I prematurely optimizing over
a few K of RAM? Keep in mind that the on-chip RAM of these target
processors can be very small.
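For a feel of what the indirection buys, here is a small back-of-the-envelope
calculation in C. It uses the 136-byte figure from above and assumes a 4-byte
pointer (32-bit target); the function names and the task counts in the note
below are made up for illustration, and allocator overhead is ignored.

```c
#include <stddef.h>

#define SPE_CONTEXT_BYTES 136u /* per-task SPE save area, figure from above */
#define POINTER_BYTES     4u   /* assumed pointer size on a 32-bit target */

/* Every task carries the SPE save area inside its context. */
size_t embedded_cost(size_t tasks)
{
  return tasks * SPE_CONTEXT_BYTES;
}

/* Every task carries a pointer; only SPE tasks allocate a save area. */
size_t indirect_cost(size_t tasks, size_t spe_tasks)
{
  return tasks * POINTER_BYTES + spe_tasks * SPE_CONTEXT_BYTES;
}
```

With a hypothetical 32 tasks of which 2 use the SPE, embedding costs
32 * 136 = 4352 bytes while the pointer scheme costs 32 * 4 + 2 * 136 = 400
bytes. If every task used the SPE the pointer scheme would be slightly
worse (4480 bytes), so the saving depends entirely on how few tasks opt in.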
Once this is working I can fold it into floating-pointness of the task.
Open questions:
1. I don't think I need to do anything in the interrupt / exception
handling area, other than being sure the MSR_SPV bit is cleared, and I
assume there is a place to modify the MSR when entering an exception
handler.
2. Are there additional advantages to lazy resource switching that I
don't see? I know that just because a thread might use the SPE
doesn't mean it will, but I'd rather have the fixed overhead of
switching in the resource at thread activation than a variable
overhead based on where the thread happens to be. Any comments on this?
Questions, comments, criticisms?