RTEMS on MIPS

Fri Feb 15 16:27:37 UTC 2002

Hi everyone,

We've been finishing/debugging/tweaking an RTEMS bsp for a MIPS R3000
part, the "Mongoose 5" processor, and have been working on improving
support for per-task FPU control and interrupt levels, plus
development of a fancier link model.  Some of our updates involve code
shared with other MIPS bsps, including R4000 devices.  Generally, the
result is that all the MIPS bsps will now support a reasonable
interpretation of interrupt levels and will have proper support of
per-task FPU settings.  In the course of this work, we may have
discovered a bug in the link scripts used by all the MIPS bsps.

Specifically, the gp value is set 16 bytes too high, so every .sdata
and .sbss address is shifted up by that much.  It appears that only
Newlib employs any gp-relative addresses: a scratch pointer and a
couple of pointers related to POSIX environment variable support.  So
unless the 16 byte shift overlaps something in .bss, there shouldn't
be an apparent problem.  There seems to be a general lack of detailed
understanding about what MIPS gp-relative addressing is, other than
that it's considered "bad".  The following is what I've learned about
it so far:

GP-relative addressing is a GNU thing; there is no hardware support in
MIPS for it.  An arbitrary register is allocated by the ABI? for use
as a base to allow more rapid addressing of data up to a particular
size.  This register is called 'gp' and its value is constant, though
computed relative to the program's size.  The gp section allows a
variable within it to be loaded or stored with a single instruction
without involving a temporary base register.  The speedup implications
of this are appealing.  The -G parameter selects the maximum size of
data items to be included in the gp "section"; any variables in a C
file (C++ too probably) of that size or smaller are automatically made
gp-relative.  If gcc includes a data item in the gp section, all
references to it are via offsets relative to the value in the "gp"
register.
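
To make the -G rule concrete, here is a small model of it in Python.
It only restates the size test as I described it above; it is not
taken from the gcc sources, and the function name and the example
variables are made up for illustration.

```python
# Hypothetical model of the -G size rule described above; this is not
# gcc's actual implementation.
def is_gp_relative(size_bytes, g_threshold):
    """True if an item of this size would be placed in .sdata/.sbss."""
    # -G 0 disables gp-relative data: nothing has a size <= 0.
    return 0 < size_bytes <= g_threshold


# Made-up variables and sizes, classified as if compiled with -G 8.
for name, size in [("int counter", 4),
                   ("double scale", 8),
                   ("char buf[64]", 64)]:
    kind = "gp-relative" if is_gp_relative(size, 8) else "normal"
    print(name, "->", kind)
```

With a threshold of 8 the two small items qualify and the 64 byte
buffer does not.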

The gp section itself is an abstract concept; the physical locations
of gp-relative data are the .sdata and .sbss sections, which are
located after .data and before .bss, in that order.  One might think
that gp should be set to the first address of the first .sdata item,
but that isn't the case.  Instead, the first address of .sdata should
be aligned to 16 bytes and gp set to that address + 0x7ff0.  Code
using gp-relative addresses is generated by gcc to use negative
offsets relative to this somewhat arbitrary address.  It also means
the gp base value points somewhere in .bss, the stack, or somewhere
else.  But that's OK, because the negative offsets begin at
(gp - 0x7ff0) for the first item in .sdata.  There should be no extra
alignment imposed between .sdata and .sbss.  Even if all 0x7ff0
possible bytes aren't used, things are still OK because no gp-relative
code will refer to data before or after the variables actually IN the
gp sections.  If you use a different offset for computing the value of
gp, all gp-relative addressing will end up somewhere other than .sdata
& .sbss, or at least will be shifted within those regions.  I built up
this theory by looking at compiled code, backtracking to the
.sdata/.sbss segments and contrasting with results obtained from the
canonical binutils linkscripts.
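
The arithmetic can be sketched in a few lines of Python.  The .sdata
start address below is hypothetical and the helper names are mine; the
0x7ff0 offset and the 16 byte alignment are the values described
above.  The sketch also shows what a gp value that is 16 bytes too
high does to the first .sdata item.

```python
GP_OFFSET = 0x7ff0  # offset described above for the link scripts


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of 2)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def gp_value(sdata_start, offset=GP_OFFSET):
    """gp = 16-byte-aligned start of .sdata plus the fixed offset."""
    return align_up(sdata_start, 16) + offset


def resolve(gp, gp_rel_offset):
    """Address that a gp-relative load or store actually touches."""
    return gp + gp_rel_offset


sdata_start = 0x80012344       # hypothetical, deliberately unaligned
good_gp = gp_value(sdata_start)
bad_gp = good_gp + 16          # the suspected link script error

first_item = -GP_OFFSET        # offset used for the first .sdata item
print(hex(resolve(good_gp, first_item)))  # aligned start of .sdata
print(hex(resolve(bad_gp, first_item)))   # 16 bytes past it
```

Every gp-relative access moves by the same 16 bytes, which matches a
uniform shift of .sdata and .sbss rather than random corruption.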

Presumably gcc could switch to a different gp offset at some point, so
it's something to pay attention to on MIPS.  I would like to know why
it's set up like this, because the MIPS load/store ops have a full 16
bit offset available.  I think positive offsets from 0 with gp set to
the beginning of .sdata would make lots more sense.
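
One thing worth checking here: the 16 bit offset field in MIPS loads
and stores is sign-extended, so it covers -0x8000 through +0x7fff
rather than 0 through 0xffff.  Whether that is the actual reason for
the 0x7ff0 convention is a guess on my part, but the sketch below
works out how much data is reachable above the start of .sdata in each
case.  The address and helper names are hypothetical.

```python
def reachable_window(gp):
    """Lowest and highest byte reachable with a signed 16 bit offset."""
    return gp - 0x8000, gp + 0x7fff


def usable_bytes(sdata_start, gp):
    """Reachable bytes from sdata_start upward (assumes the window
    includes sdata_start)."""
    _, high = reachable_window(gp)
    return high - sdata_start + 1


sdata = 0x80012350  # hypothetical, already 16 byte aligned

# gp in the middle of the region, as the link scripts do it
print(hex(usable_bytes(sdata, sdata + 0x7ff0)))
# gp at the very start of .sdata, positive offsets only
print(hex(usable_bytes(sdata, sdata)))
```

Under that assumption, putting gp at the start of .sdata would make
only about half as much small data addressable.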

I was wondering if there's some reason why RTEMS doesn't use the gp
section more heavily; it could get a performance gain by putting high
demand variables in there.  One problem might be that it's difficult
to control what becomes a gp-relative variable.  -G offers only very
coarse control, and gcc would have to know a particular variable is
gp-relative in all locations it is referenced, hence the oft-heard
requirement that a program should use the same -G setting for all its
files.  I suspect referencing gp-relative variables from assembly
could also be difficult.  The evil thing about the gp section is that
if a particular variable is perceived as being in gp while compiling
one file and not when compiling another, then the 2 chunks of object
code will work against 2 different locations in memory.  Link problems
due to the duplicate symbol definition might catch it, I suppose.  But
if one of the duplicate symbols is discarded or the error suppressed,
then the executable is broken, and it can only be identified as such
by comparing the addresses of variables as perceived wherever they're
used.  A nasty problem...  If a #pragma or type attribute or somesuch
controlled membership in the gp section, it would be much easier to
employ.
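
As a rough illustration of the mixed -G hazard, the Python sketch
below takes a table of variable sizes and the -G value each source
file was built with, and reports variables that would be seen as
gp-relative in some files but not in others.  The file names, variable
names and sizes are invented, and the size test is just the -G rule as
I understand it, not a real tool.

```python
def find_gp_mismatches(var_sizes, g_by_file):
    """Return {variable: {file: is_gp_relative}} for variables that
    the files would disagree on."""
    mismatched = {}
    for var, size in var_sizes.items():
        views = {f: 0 < size <= g for f, g in g_by_file.items()}
        if len(set(views.values())) > 1:
            mismatched[var] = views
    return mismatched


# Hypothetical build: one file compiled with -G 8, one with -G 0.
var_sizes = {"tick_count": 4, "task_table": 256}
g_by_file = {"clock.c": 8, "sched.c": 0}

for var, views in find_gp_mismatches(var_sizes, g_by_file).items():
    print(var, views)
```

Only the small variable is flagged; the large one is ordinary data
under both settings, so the two files agree on it.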

We're continuing to develop our understanding of this issue.  If our
work affects you or we're misunderstanding something, please get in
touch with us so we can do the right things.

Thanks,

Greg Menke