anybody using RTEMS on SH4?

Joel Sherrill joel.sherrill at oarcorp.com
Fri Sep 14 16:19:08 UTC 2007


I don't have an SH to try anything on so am only going to offer some
general ideas:

+ Is 4.x using double precision and 3.4.6 using single precision?
+ Did the cache settings change somehow? Maybe gcc 4.x is optimizing
  some critical setting out of the BSP initialization.
+ Is there a change in the array indexing code? There are options
  to control multiply and division for the SH so I am curious.
+ Does it get better or worse when -Os is used? Or -O2 with no
  particular options?
+ Is the BSP compiled with the old compiler or the new one? I am curious
  if it is possible to compile the benchmarks with the new compiler
  and leave the rest of the system alone. This would rule out
  something weird happening to the RTEMS code in the new compiler.
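
One quick way to check the precision and optimization questions is to
have each build report how it was compiled. This is a minimal sketch,
assuming GCC's SH port predefines __SH4__, __SH4_SINGLE__,
__SH4_SINGLE_ONLY__ and __SH4_NOFPU__ according to the -m4* option in
effect (worth confirming with "gcc -dM -E" on each toolchain). The
function names are made up for illustration:

```c
#include <limits.h>
#include <stdio.h>

/* Label for the SH4 FPU mode, based on macros the SH port is assumed
 * to predefine. Falls through to a generic label on other targets. */
const char *sh_fpu_mode(void)
{
#if defined(__SH4_SINGLE_ONLY__)
    return "SH4 single-only";
#elif defined(__SH4_SINGLE__)
    return "SH4 single-precision default";
#elif defined(__SH4_NOFPU__)
    return "SH4 without FPU";
#elif defined(__SH4__)
    return "SH4 double-precision default";
#else
    return "not an SH4 target";
#endif
}

/* Compiler version string, if the compiler provides one. */
const char *compiler_version(void)
{
#ifdef __VERSION__
    return __VERSION__;
#else
    return "unknown";
#endif
}

/* GCC defines __OPTIMIZE_SIZE__ for -Os and __OPTIMIZE__ for any
 * optimizing build, so the size check has to come first. */
const char *optimize_level(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size (-Os)";
#elif defined(__OPTIMIZE__)
    return "speed (-O1 or higher)";
#else
    return "none";
#endif
}

/* Print the build configuration so two builds can be compared. */
void report_build_config(FILE *out)
{
    fprintf(out, "compiler: %s\n", compiler_version());
    fprintf(out, "fpu mode: %s\n", sh_fpu_mode());
    fprintf(out, "double:   %u bits\n",
            (unsigned)(sizeof(double) * CHAR_BIT));
    fprintf(out, "optimize: %s\n", optimize_level());
}
```

Calling report_build_config(stdout) at the start of the benchmark under
each compiler would show whether the builds differ in FPU mode, width
of double, or optimization setting before any timings are compared.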

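For the array indexing question, a small kernel that can be compiled
with -S under both compilers makes the generated address arithmetic
easy to compare side by side. This is only an illustrative SOR-style
sweep, not the SciMark2 source; GRID_N and sweep() are names picked for
the example:

```c
#include <stddef.h>

#define GRID_N 8

/* One relaxation sweep over the interior of a GRID_N x GRID_N grid,
 * updating in place. Returns the sum of the updated interior cells so
 * the compiler cannot discard the loop. */
double sweep(double g[GRID_N][GRID_N], double omega)
{
    double sum = 0.0;
    for (size_t i = 1; i < GRID_N - 1; i++) {
        for (size_t j = 1; j < GRID_N - 1; j++) {
            /* Four-neighbour average: the indexing pattern of interest. */
            double avg = 0.25 * (g[i - 1][j] + g[i + 1][j]
                               + g[i][j - 1] + g[i][j + 1]);
            g[i][j] = omega * avg + (1.0 - omega) * g[i][j];
            sum += g[i][j];
        }
    }
    return sum;
}
```

If the inner loop in the 4.x assembly shows extra multiplies or library
calls for the index computation that 3.4.6 does not emit, that would
point at the indexing code rather than at the BSP or cache setup.
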
--joel

Nickolay Kolchin wrote:
> Hi,
>
> We have a performance problem on SH4 with gcc4.x.
>
> SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark
> ================================================================
>            GCC: 3.4.6   4.2.1   4.3.0 (20070907)
>      Composite:  6.05    5.01    4.82
>            FFT:  4.90    4.15    4.21
>            SOR: 10.10    8.36    7.64
>     MonteCarlo:  3.68    3.06    3.04
> Sparse matmult:  5.45    4.45    4.03
>             LU:  6.10    5.03    5.18
> ================================================================
>
> BYTEmark* Native Mode Benchmark ver. 2 (10/95)
> ================================================================
>              GCC:      3.4.6      4.2.1  4.3.0 (20070907)
>     NUMERIC SORT:     35.459       32.2      29.327
>      STRING SORT:     0.5943    0.57604      0.8603
>         BITFIELD: 1.0585e+07  9.269e+06  9.4138e+06
>     FP EMULATION:     4.4944     4.6012       5.364
>          FOURIER:     272.28     241.34      259.12
>       ASSIGNMENT:    0.35997    0.38373     0.39683
>             IDEA:     124.11     95.057      100.07
>          HUFFMAN:     45.593     52.083      56.391
>       NEURAL NET:    0.36153    0.30922     0.31348
> LU DECOMPOSITION:     11.331     9.4938       8.255
> ================================================================
>
> The "real world application" shows a 20%-200% performance regression
> with GCC 4.x.
>
> This effectively prevents us from moving to RTEMS 4.7 from 4.6.
>
> I've reported this issue to gcc bugzilla: 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431 
>
> But the SH4 backend maintainer, Kazumoto Kojima, was unable to
> reproduce it under linux-sh:
> ================================================================
>                         gcc-3.4.6    gcc-4.2.1    gcc-4.3.0(20070910)
> Composite Score:            16.76        16.86        16.99
> FFT              Mflops:    12.92        13.36        13.36
> SOR              Mflops:    27.88        26.76        28.01
> MonteCarlo:      Mflops:     9.96         9.73         9.67
> Sparse matmult   Mflops:    14.95        16.06        14.84
> LU               Mflops:    18.08        18.39        19.05
> ================================================================
>
> Maybe somebody else is also using RTEMS on SH4 and can confirm my
> results or Kojima's?
>
> ----
> Nickolay



