anybody using RTEMS on SH4?

Fri Sep 14 18:01:22 UTC 2007

Thank you for your suggestions.

On 9/14/07, Joel Sherrill <joel.sherrill at oarcorp.com> wrote:
>
>
> I don't have an SH to try anything on so am only going to offer some
> general ideas:
>
> + Is 4.x using double precision and 3.4.6 using single precision?

No. AFAIK, single precision must be enabled explicitly in both compilers
with -m4-single.

> + Cache settings change somehow? Maybe gcc 4.x is optimizing
>    some critical setting out of the BSP initialization.

I'm currently investigating this. It is probably not the cause, though,
because most of the cache initialization sits inside "asm volatile" statements.

> + Is there a change in the array indexing code?  There are options
>     to control multiply and division for the SH so I am curious.

I may be wrong, but those options mostly apply to FPU-less SH4 models.

> + Does it get better or worse when -Os is used?  Or -O2 with no
>    particular options?

Worse in both cases. I can post numbers if you are interested.

> + Is the BSP compiled with the old compiler or new?  I am curious
>     if it is possible to compile the benchmarks with the new compiler
>     and leave the rest of the system alone.  This would eliminate
>     something weird happening to the RTEMS code in the new compiler.

I tried different combinations: RTEMS 4.6 compiled with 3.4.6 and the
application compiled and linked with 4.3.0; RTEMS 4.7 compiled with 4.3.0 and
the application compiled and linked with 3.4.6. Results vary, but the
application compiled with 3.4.6 always shows better performance. Currently I
can't explain why the application compiled with 3.4.6 runs slowly under
RTEMS 4.7 (we really need some profiling utilities for RTEMS).

> Nickolay Kolchin wrote:
> > Hi,
> >
> > We have a performance problem on SH4 with gcc4.x.
> >
> > SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark
> > ================================================================
> >            GCC: 3.4.6   4.2.1   4.3.0 (20070907)
> >      Composite:  6.05    5.01    4.82
> >            FFT:  4.90    4.15    4.21
> >            SOR: 10.10    8.36    7.64
> >     MonteCarlo:  3.68    3.06    3.04
> > Sparse matmult:  5.45    4.45    4.03
> >             LU:  6.10    5.03    5.18
> > ================================================================
> >
> > BYTEmark* Native Mode Benchmark ver. 2 (10/95)
> > ================================================================
> >              GCC:      3.4.6      4.2.1  4.3.0 (20070907)
> >     NUMERIC SORT:     35.459       32.2      29.327
> >      STRING SORT:     0.5943    0.57604      0.8603
> >         BITFIELD: 1.0585e+07  9.269e+06  9.4138e+06
> >     FP EMULATION:     4.4944     4.6012       5.364
> >          FOURIER:     272.28     241.34      259.12
> >       ASSIGNMENT:    0.35997    0.38373     0.39683
> >             IDEA:     124.11     95.057      100.07
> >          HUFFMAN:     45.593     52.083      56.391
> >       NEURAL NET:    0.36153    0.30922     0.31348
> > LU DECOMPOSITION:     11.331     9.4938       8.255
> > ================================================================
> >
> > The "real world application" shows a 20%-200% performance regression
> > with GCC 4.x.
> >
> > This effectively prevents us from moving to RTEMS 4.7 from 4.6.
> >
> > I've reported this issue to gcc bugzilla:
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431
> >
> > But the SH4 backend maintainer, Kazumoto Kojima, was unable to
> > reproduce it under linux-sh:
> > ================================================================
> >                         gcc-3.4.6    gcc-4.2.1    gcc-4.3.0(20070910)
> > Composite Score:            16.76        16.86        16.99
> > FFT              Mflops:    12.92        13.36        13.36
> > SOR              Mflops:    27.88        26.76        28.01
> > MonteCarlo       Mflops:     9.96         9.73         9.67
> > Sparse matmult   Mflops:    14.95        16.06        14.84
> > LU               Mflops:    18.08        18.39        19.05
> > ================================================================
> >
> > Maybe somebody else is using RTEMS on SH4 and can confirm either my
> > results or Kojima's?
> >
>
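
To make the two SciMark2 tables above easier to compare, here is a small
Python sketch that expresses each gcc 4.x score as a percentage change
relative to gcc 3.4.6. It is only an illustration: the names RTEMS, LINUX_SH,
percent_change and changes_vs_baseline are made up for this example, and the
scores are copied by hand from the tables (higher is better).

```python
# SciMark2 scores copied from the tables in this message (higher is better).
# Tuple order: gcc 3.4.6, gcc 4.2.1, gcc 4.3.0 snapshot.
RTEMS = {
    "Composite":      (6.05, 5.01, 4.82),
    "FFT":            (4.90, 4.15, 4.21),
    "SOR":            (10.10, 8.36, 7.64),
    "MonteCarlo":     (3.68, 3.06, 3.04),
    "Sparse matmult": (5.45, 4.45, 4.03),
    "LU":             (6.10, 5.03, 5.18),
}
LINUX_SH = {
    "Composite":      (16.76, 16.86, 16.99),
    "FFT":            (12.92, 13.36, 13.36),
    "SOR":            (27.88, 26.76, 28.01),
    "MonteCarlo":     (9.96, 9.73, 9.67),
    "Sparse matmult": (14.95, 16.06, 14.84),
    "LU":             (18.08, 18.39, 19.05),
}


def percent_change(baseline, value):
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def changes_vs_baseline(table):
    """Map each kernel to (4.2.1 change, 4.3.0 change) against 3.4.6."""
    return {
        name: tuple(percent_change(scores[0], s) for s in scores[1:])
        for name, scores in table.items()
    }


for label, table in (("RTEMS", RTEMS), ("linux-sh", LINUX_SH)):
    print(label)
    for name, (c421, c430) in changes_vs_baseline(table).items():
        print(f"  {name:<15} 4.2.1: {c421:+6.1f}%   4.3.0: {c430:+6.1f}%")
```

A negative value means the newer compiler scored lower than 3.4.6. On the
RTEMS numbers every kernel comes out negative for both 4.x compilers, while
the linux-sh numbers are mixed and small.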

---
Nickolay