[PATCH 1/2] RTEMS Source Builder changes for lpc32xx VFP support

Kirspel, Kevin Kevin-Kirspel at idexx.com
Mon Aug 22 14:42:43 UTC 2016


Here are the updated metrics with -fno-builtin.  You are right that the code generated without it did not call the underlying math functions.  I will generate a patch for the hard float ABI.

softfp - 4493 ms
hard - 4263 ms
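
When comparing numbers like these, it may help to have the test image report which float ABI it was actually built for. The sketch below relies on GCC's predefined ARM macros __ARM_PCS_VFP (hard-float calling convention) and __SOFTFP__ (pure software floating point). Treating every other ARM build as softfp is an assumption, and float_abi_name and format_result are hypothetical helpers, not part of the test code in this thread:

```c
#include <assert.h>
#include <stdio.h>

/* Name the float ABI this translation unit was compiled for.
   Assumption: an ARM build that is neither hard nor soft is softfp. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not-arm";   /* e.g. when compiled on a host PC */
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* FP arguments passed in VFP registers */
#elif defined(__SOFTFP__)
    return "soft";      /* no FP instructions at all */
#else
    return "softfp";    /* VFP instructions, integer-register calling convention */
#endif
}

/* Format a result line in the "abi - N ms" style used above.
   Returns the snprintf result. */
int format_result(char *buf, size_t len, unsigned long elapsed_ms)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s - %lu ms", float_abi_name(), elapsed_ms);
}
```

Printing the ABI next to each timing makes it harder to attribute a result to the wrong multilib.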

Kevin Kirspel
Electrical Engineer - Sr. Staff
Opti Medical
235 Hembree Park Drive
Roswell GA 30076
Tel: (770)-510-4444 ext. 81642
Direct: (770)-688-1642
Fax: (770)-510-4445

-----Original Message-----
From: Sebastian Huber [mailto:sebastian.huber at embedded-brains.de] 
Sent: Monday, August 22, 2016 7:34 AM
To: Kirspel, Kevin <Kevin-Kirspel at idexx.com>; devel at rtems.org
Subject: Re: [PATCH 1/2] RTEMS Source Builder changes for lpc32xx VFP support



On 22/08/16 13:23, Kirspel, Kevin wrote:
> I used -O2.

Did you look at the generated code? I guess you will find no calls to the standard math library functions. Which timing numbers do you get if you use "-O2 -fno-builtin" instead?

> Is there some standard floating point test that would be a better gauge?
>
> -----Original Message-----
> From: Sebastian Huber [mailto:sebastian.huber at embedded-brains.de]
> Sent: Monday, August 22, 2016 1:41 AM
> To: Kirspel, Kevin <Kevin-Kirspel at idexx.com>; devel at rtems.org
> Subject: Re: [PATCH 1/2] RTEMS Source Builder changes for lpc32xx VFP support
>
> On 19/08/16 17:18, Kirspel, Kevin wrote:
>> I built GCC with the hard ABI.  The software ran fine, but I see a performance hit with the hard ABI.
>>
>> ABI      Test Execution Time (ms)
>> soft     1358
>> softfp   96
>> hard     109
>>
>> I don't know why hard is slower than softfp.  Here is my test code.
> Which compiler options did you use for this test? I am pretty sure that the compiler will reduce your test code to constants with -O2, since you only use constant expressions.
>
>>           int ii;
>>           uint32_t end_time, start_time;
>>           double dvalue[8], dresult[20];
>>           
>>           start_time = rtems_clock_get_ticks_since_boot();
>>           for( ii = 0; ii < 20; ii++ )
>>           {
>>             dresult[ii] = 0.0;
>>           }
>>           dvalue[0] = 12.12;
>>           dvalue[1] = 23.23;
>>           dvalue[2] = 34.34;
>>           dvalue[3] = 45.45;
>>           dvalue[4] = 56.56;
>>           dvalue[5] = 67.67;
>>           dvalue[6] = 78.78;
>>           dvalue[7] = 89.89;
>>           for( ii = 0; ii < 10000; ii++ )
>>           {
>>             dresult[0] += (( dvalue[0] * dvalue[0] ) + ( dvalue[1] * dvalue[1] ) + ( dvalue[2] * dvalue[2] ) + ( dvalue[3] * dvalue[3] ) +
>>               ( dvalue[4] * dvalue[4] ) + ( dvalue[5] * dvalue[5] ) + ( dvalue[6] * dvalue[6] ) + ( dvalue[7] * dvalue[7] )) /
>>               (( dvalue[0] / dvalue[1] ) - ( dvalue[1] / dvalue[2] ) - ( dvalue[2] / dvalue[3] ) - ( dvalue[3] / dvalue[4] ) -
>>               ( dvalue[4] / dvalue[5] ) - ( dvalue[5] / dvalue[6] ) - ( dvalue[6] / dvalue[7] ));
>>             dresult[1] += cos( dvalue[0] * PI / 180.0 );
>>             dresult[2] += sin( dvalue[1] * PI / 180.0 );
>>             dresult[3] += tan( dvalue[2] * PI / 180.0 );
>>             dresult[4] += acos( 1 / dvalue[3] ) * 180.0 / PI;
>>             dresult[5] += asin( 1 / dvalue[4] ) * 180.0 / PI;
>>             dresult[6] += atan( dvalue[5] ) * 180.0 / PI;
>>             dresult[7] += atan2( dvalue[6], dvalue[7] ) * 180.0 / PI;
>>             dresult[8] += cosh( 1 / dvalue[7] );
>>             dresult[9] += sinh( dvalue[0] );
>>             dresult[10] += tanh( dvalue[1] );
>>             dresult[11] += acosh( dvalue[2] );
>>             dresult[12] += asinh( dvalue[3] );
>>             dresult[13] += atanh( 1 / dvalue[4] );
>>             dresult[14] += exp(   1 / dvalue[5] );
>>             dresult[15] += pow( dvalue[6], 1 / dvalue[7] );
>>             dresult[16] += sqrt( dvalue[7] );
>>             dresult[17] += log( dvalue[0] );
>>             dresult[18] += log10( dvalue[1] );
>>             dresult[19] += log2( dvalue[2] );
>>           }
>>           end_time = rtems_clock_get_ticks_since_boot();
>>           for( ii = 0; ii < 20; ii++ )
>>           {
>>             printf( "dresult[%d] = %f\n", ii, dresult[ii] );
>>           }
>>           printf( "VFP Execution Time: %lu\n", end_time - start_time );
>>
>> -----Original Message-----
>> From: Sebastian Huber [mailto:sebastian.huber at embedded-brains.de]
>> Sent: Friday, August 19, 2016 8:56 AM
>> To: Kirspel, Kevin <Kevin-Kirspel at idexx.com>; devel at rtems.org
>> Subject: Re: [PATCH 1/2] RTEMS Source Builder changes for lpc32xx VFP support
>>
>> On 19/08/16 14:39, Kirspel, Kevin wrote:
>>> This goes back five or more years, so my recollection might be off, but I had issues with GCC 4.5.2 using -mfloat-abi=hard.  NXP recommends -mfloat-abi=softfp.  I can't find the article or forum post now, but I think the LPC3250 VFP didn't support all the instructions required for -mfloat-abi=hard (something about needing an Undefined Instruction exception handler to process unsupported VFP instructions).  Using -mfloat-abi=softfp eliminates the need for that handler.
>> The VFP instructions used should not depend on the hard vs. softfp ABI, since this should only affect the calling conventions. In addition, we now use GCC 6. Could you please test with -mfloat-abi=hard? This is what all the other VFP multilibs use.
>>
>> --
>> Sebastian Huber, embedded brains GmbH
>>
>> Address : Dornierstr. 4, D-82178 Puchheim, Germany
>> Phone   : +49 89 189 47 41-16
>> Fax     : +49 89 189 47 41-09
>> E-Mail  : sebastian.huber at embedded-brains.de
>> PGP     : Public key available on request.
>>
>> Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.


