ppc multlibs and BSP removal was Re: powerpc altivec support
Till Straumann
strauman at slac.stanford.edu
Thu Feb 10 03:40:09 UTC 2005
Ralf Corsepius wrote:
> On Wed, 2005-02-09 at 16:25 -0600, Joel Sherrill wrote:
>
>>Ralf Corsepius wrote:
>
>
>>>>GCC thinks a lot of -mXXX are -mppc internally anyway and RTEMS PROBABLY
>>>>doesn't care until you get to libcpu and libbsp.
>>>
>>>Unfortunately nothing could be further from the truth than this, as far
>>>as the powerpc is concerned - It is the design issue/flaw I have been
>>>repeatedly referring to.
>>>
>>>cpukit/score/cpu/powerpc/rtems/score/cpu.h is ca. 700 lines in size,
>>>scattered with ca 30 different defines, which all are candidates for
>>>conditionally compiling something somewhere. The problem there is to
>>>identify which are actually important and which are not.
>>>
>>>Try to track how *ALIGNMENT defines are set up and you'll probably
>>>experience what I am referring to.
>>
>>Turns out the more you live, the more you learn. :)
Absolutely - and life is still too short...
>>
>>PPC_CACHE_ALIGNMENT appears to be the same for almost all
>>configurations. It can be condensed to 1 define of 16 as best I can
>>tell. It is only used to properly align the bitmap structure used
>>for thread scheduling. If a multilib can distinguish the core in the
>>7455 and 8260, they use 32. The 74xx has an Altivec so that would be a
>>good candidate to multilib on and use 32.
Some PPCs have 16-byte cache lines [860]; most have 32-byte lines.
In any case, before considering a multilib:
a) check the implications of always using 32-byte alignment by default
[a user with special demands, such as squeezing RAM, might need to
rebuild with a new configuration].
b) if that's not practical, consider a run-time check. Cache line size
can easily be determined at startup and read from a variable.
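
To make option b) concrete, here is a minimal sketch of one way such a
startup probe might look. It is not taken from RTEMS: all names
(probe_cache_line_size, zero_line_fn, fake_dcbz_32, align_up,
PROBE_MAX_LINE) are hypothetical, and the 128-byte upper bound is an
assumption. The idea is to fill a buffer with 0xff, zero exactly one
cache line (dcbz on PowerPC), and count how many bytes came back as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROBE_MAX_LINE 128u /* assumption: no cache line is larger than this */

/* Hypothetical hook that zeroes the cache line containing p. */
typedef void (*zero_line_fn)(void *p);

#if defined(__PPC__) || defined(__powerpc__)
/* On PowerPC, dcbz zeroes one data cache line. It needs the data cache
 * enabled and p in cacheable memory, otherwise it may trap. */
static void ppc_dcbz(void *p)
{
    __asm__ volatile ("dcbz 0,%0" : : "r" (p) : "memory");
}
#endif

/* Host stand-in that behaves like dcbz with a 32-byte line. */
static void fake_dcbz_32(void *p)
{
    memset((void *)((uintptr_t)p & ~(uintptr_t)31), 0, 32);
}

/* Fill a buffer with 0xff, zero one line, count the zeroed bytes. */
static size_t probe_cache_line_size(zero_line_fn zero_line)
{
    static unsigned char buf[3 * PROBE_MAX_LINE];
    /* Pick an address inside buf aligned to the largest line we expect,
     * so the zeroed line starts exactly at p. */
    uintptr_t a = ((uintptr_t)buf + PROBE_MAX_LINE - 1)
                  & ~(uintptr_t)(PROBE_MAX_LINE - 1);
    unsigned char *p = (unsigned char *)a;
    size_t n = 0;

    memset(buf, 0xff, sizeof buf);
    zero_line(p);
    while (n < PROBE_MAX_LINE && p[n] == 0)
        n++;
    return n;
}

/* Round size up to a multiple of line (which must be a power of two). */
static size_t align_up(size_t size, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    return (size + line - 1) & ~(line - 1);
}
```

On a target, the result of probe_cache_line_size(ppc_dcbz) would be
stored in a variable once at startup and passed to align_up() wherever
a compile-time PPC_CACHE_ALIGNMENT is used today.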
>>
>>PPC_ALIGNMENT is basically what the heap has to align to. Does a double
>>have to be 8 or 4-byte aligned? A quick guess is that if you have
>>hardware FPU, then make it 8, else make it 4.
Might be an ABI issue anyway. AFAIK, malloc() must return memory
aligned properly for any data type (except vectors; AltiVec has
a special allocator).
SYSV demands that long double variables shall be 16-byte aligned;
EABI relaxes this to 8-byte alignment.
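
If the heap alignment is really an ABI property, it could be derived
from the compiler instead of a per-CPU define. A rough sketch, with
hypothetical names (basic_types, align_probe, HEAP_ALIGNMENT,
heap_align_up); whether the value the compiler reports matches what the
ABI document demands depends on the toolchain, so treat this as an
illustration only:

```c
#include <stddef.h>

/* Union of the basic types malloc() has to satisfy (vectors excluded). */
union basic_types {
    long long   ll;
    double      d;
    long double ld;
    void       *p;
    void      (*fp)(void);
};

/* The offset of u is the strictest alignment among the members above. */
struct align_probe {
    char              c;
    union basic_types u;
};

#define HEAP_ALIGNMENT offsetof(struct align_probe, u)

/* Round an allocation size up to the heap alignment. */
static size_t heap_align_up(size_t n)
{
    return (n + HEAP_ALIGNMENT - 1) & ~(HEAP_ALIGNMENT - 1);
}
```

With this, a SYSV toolchain and an EABI toolchain would each yield
their own value without any CPU model entering the picture.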
>
> Can anybody confirm these assumptions? If they hold, this was a
> breakthrough, causing powerpc.h to substantially collapse.
So far, we really only have SYSV vs. EABI.
>
>
>>Also as far as I know, there was NEVER an RTEMS user on the 601 or 602.
>>Those still say "Submitted with original port -- book checked only." so
>>that makes them high priority kill targets if they present any issues.
>>But all I see are alignment constants for them which are easy to get out
>>of the score.
>>
>>Can we deduce PPC_HAS_FPU directly from a cpp predefine?
>
> Conversely, I think we must.
>
> Adding a _SOFT_FLOAT != PPC_HAS_FPU preprocessor check reveals
> PPC_HAS_FPU to be inconsistent in comparison to _SOFT_FLOAT, i.e.
> broken.
I agree.
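
A sketch of what deducing it could look like, assuming only that the
compiler predefines _SOFT_FLOAT for soft-float builds (as discussed
above). DEDUCED_HAS_FPU and ppc_has_fpu() are hypothetical names used
for illustration; the #error branch is the kind of consistency check
Ralf describes:

```c
/* _SOFT_FLOAT is predefined by the compiler for soft-float builds. */
#if defined(_SOFT_FLOAT)
#define DEDUCED_HAS_FPU 0
#else
#define DEDUCED_HAS_FPU 1
#endif

#if defined(PPC_HAS_FPU)
/* A hand-maintained value exists: make sure it agrees. */
#if PPC_HAS_FPU != DEDUCED_HAS_FPU
#error "PPC_HAS_FPU disagrees with _SOFT_FLOAT"
#endif
#else
/* No hand-maintained value: take the compiler's word for it. */
#define PPC_HAS_FPU DEDUCED_HAS_FPU
#endif

/* Small accessor so the value can also be inspected at run time. */
static int ppc_has_fpu(void)
{
    return PPC_HAS_FPU;
}
```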
>
>
>>PPC_HAS_DOUBLE follows directly from PPC_HAS_FPU so I don't see any hint
>>of a CPU really having only 32-bit floating point registers. Doing a
>>quick search of gcc, I don't see such an animal either.
>
> Neither do I.
>
Dunno.
>
>>PPC_USE_MULTIPLE only appears in the bsp. I don't know on this but it
>>could move to libcpu.
This is a (probably unused) performance/space tuning parameter which
probably doesn't justify building multilibs.
PPC features a single instruction for saving/restoring multiple registers,
but the book warns that it may take "longer, perhaps much longer" on a
particular implementation than individual load/store instructions.
Again, rather than creating a CPU multilib variant, we could implement
a run-time test that ends up patching/selecting the faster option if
we really worry [we're talking about ~30 instructions vs. the 'multiple
word instruction']. I recommend simply setting this to 0. The space
savings from the multiple-word instruction are at most a few hundred
bytes.
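
For completeness, the "select the faster option at run time" idea might
be sketched as below. This is host-runnable stand-in code, not the real
context-switch path: the two routines only imitate individual stores
and the multiple-word store, NUM_SAVED_REGS is an illustrative count,
and all names (save_regs_fn, save_individual, save_multiple,
select_save_regs) are hypothetical. How the two costs get measured at
startup is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_SAVED_REGS 19 /* illustrative count, not taken from RTEMS */

typedef void (*save_regs_fn)(uint32_t *dst, const uint32_t *src);

/* Stand-in for a sequence of individual store instructions. */
static void save_individual(uint32_t *dst, const uint32_t *src)
{
    size_t i;
    for (i = 0; i < NUM_SAVED_REGS; i++)
        dst[i] = src[i];
}

/* Stand-in for the single 'multiple word' store instruction. */
static void save_multiple(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, NUM_SAVED_REGS * sizeof *dst);
}

/* Routine used from then on; defaults to individual stores. */
static save_regs_fn save_regs = save_individual;

/* Pick the cheaper variant from costs measured once at startup
 * (for example in timer ticks; the measurement is not shown). */
static void select_save_regs(unsigned long cost_individual,
                             unsigned long cost_multiple)
{
    save_regs = (cost_multiple < cost_individual) ? save_multiple
                                                  : save_individual;
}
```

The indirect call has a cost of its own, which is one more reason why
simply setting the parameter to 0 looks like the pragmatic choice.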
Till
>
>
>>Ralf.. do you want to take a stab at moving those and let's see what is
>>next?
>
> I am already working on this - The limiting factor is the "sheer amount"
> of multilibs and BSPs. One iteration takes 24hours+ ;).
>
> Ralf
>
>
>