[PATCH 2/3] bsps/i386: use Pentimum instructions for pc586 and pc686 builds.

Sebastian Huber sebastian.huber at embedded-brains.de
Wed Oct 12 14:02:14 UTC 2016


This looks like a compiler or libatomic configuration bug. I currently 
have no time to investigate this further.

On 12/10/16 15:36, Pavel Pisa wrote:
> Hello Sebastian,
>
> On Wednesday 12 of October 2016 10:35:55 Sebastian Huber wrote:
>> On 12/10/16 10:26, pisa at cmp.felk.cvut.cz wrote:
>>> SMP build is broken with i386 set because libatomic and GCC
>>> generate infinite loop for __atomic_fetch_add_4 used
>>> in rtems_interrupt_lock_acquire
>>>
>>> __atomic_fetch_add_4:
>>>       push   %ebp
>>>       mov    %esp,%ebp
>>>       movl   $0x5,0x10(%ebp)
>>>       pop    %ebp
>>>       jmp    __atomic_fetch_add_4
>> Do you have a test case for this compiler/RTEMS bug?  The use of
>> libatomic is inefficient, but it should work.
> may be it is problem of my i386 toolchain build, I have not
> updated it from April.
>
> The next is a simple test
>
> ------------------------------------------------
> #include <stdatomic.h>
>
> atomic_uint atvar1;
>
> volatile unsigned int res1;
> volatile unsigned int  res2;
>
> int main(void)
> {
>
>    res1 = atomic_fetch_or(&atvar1, 0x55);
>
>    res2 = atomic_fetch_add(&atvar1, 0xaa);
>
>    return 0;
> }
> ------------------------------------------------
>
> The next build commands are used
>
> i386-rtems4.12-gcc --pipe -B/opt/rtems4.12/i386-rtems4.12/pc686/lib/ -specs
> bsp_specs -qrtems    -I /opt/rtems4.12/i386-rtems4.12/pc686/lib/include -march=i386 -Wall  -O2 -g -ffunction-sections -fdata-sections -o
> libatomic-add-test.o -c libatomic-add-test.c
>
> i386-rtems4.12-gcc --pipe -B/opt/rtems4.12/i386-rtems4.12/pc686/lib/ -specs
> bsp_specs -qrtems -mtune=pentiumpro -march=pentium -Wall -O2 -g -ffunction-sections -fdata-sections   -Wl,--gc-sections -Wl,-Ttext,0x00100000
> libatomic-add-test.o -o libatomic-test-add
>
> problem appears with and without -march=i386, when -march is something
> newer (pentium) then all is OK.
>
>
> Disassembly looks like
>
> ------------------------------------------------
>
> 00120a76 <__atomic_fetch_add_4>:
>    120a76:       55                      push   %ebp
>    120a77:       89 e5                   mov    %esp,%ebp
>    120a79:       c7 45 10 05 00 00 00    movl   $0x5,0x10(%ebp)
>    120a80:       5d                      pop    %ebp
>    120a81:       eb f3                   jmp    120a76 <__atomic_fetch_add_4>
>
> 00120a83 <__atomic_add_fetch_4>:
>    120a83:       55                      push   %ebp
>    120a84:       89 e5                   mov    %esp,%ebp
>    120a86:       c7 45 10 05 00 00 00    movl   $0x5,0x10(%ebp)
>    120a8d:       5d                      pop    %ebp
>    120a8e:       eb e6                   jmp    120a76 <__atomic_fetch_add_4>
>
> 00120a90 <__atomic_fetch_or_4>:
>    120a90:       55                      push   %ebp
>    120a91:       89 e5                   mov    %esp,%ebp
>    120a93:       56                      push   %esi
>    120a94:       53                      push   %ebx
>    120a95:       83 ec 0c                sub    $0xc,%esp
>    120a98:       8b 5d 08                mov    0x8(%ebp),%ebx
>    120a9b:       53                      push   %ebx
>    120a9c:       e8 df 66 00 00          call   127180 <_Libatomic_Protect_start>
>    120aa1:       8b 33                   mov    (%ebx),%esi
>    120aa3:       8b 55 0c                mov    0xc(%ebp),%edx
>    120aa6:       09 f2                   or     %esi,%edx
>    120aa8:       89 13                   mov    %edx,(%ebx)
>    120aaa:       5a                      pop    %edx
>    120aab:       59                      pop    %ecx
>    120aac:       50                      push   %eax
>    120aad:       53                      push   %ebx
>    120aae:       e8 ed 66 00 00          call   1271a0 <_Libatomic_Protect_end>
>    120ab3:       8d 65 f8                lea    -0x8(%ebp),%esp
>    120ab6:       89 f0                   mov    %esi,%eax
>    120ab8:       5b                      pop    %ebx
>    120ab9:       5e                      pop    %esi
>    120aba:       5d                      pop    %ebp
>    120abb:       c3                      ret
>
> ------------------------------------------------
>
>
> _Libatomic_Protect_start is provided by RTEMS.
>
> ------------------------------------------------
> 00127180 <_Libatomic_Protect_start>:
> __uint32_t _Libatomic_Protect_start( void *ptr )
> {
>    ISR_Level isr_level;
>
>    (void) ptr;
>    _ISR_Local_disable( isr_level );
>    127180:       9c                      pushf
>    127181:       fa                      cli
>    127182:       58                      pop    %eax
> static inline bool _CPU_atomic_Flag_test_and_set( CPU_atomic_Flag *obj, CPU_atomic_Order order )
> {
> #if defined(_RTEMS_SCORE_CPUSTDATOMIC_USE_ATOMIC)
>    return obj->test_and_set( order );
> #elif defined(_RTEMS_SCORE_CPUSTDATOMIC_USE_STDATOMIC)
>    return atomic_flag_test_and_set_explicit( obj, order );
>    127183:       b1 01                   mov    $0x1,%cl
>    127185:       8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
>    127189:       8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
>    127190:       88 ca                   mov    %cl,%dl
>    127192:       86 15 8c 85 13 00       xchg   %dl,0x13858c
>
> #if defined(RTEMS_SMP)
>    while (
>    127198:       84 d2                   test   %dl,%dl
>    12719a:       75 f4                   jne    127190 <_Libatomic_Protect_start+0x10>
>
> ------------------------------------------------
>
> When I check actual GCC repository code and compare
>
> https://gcc.gnu.org/viewcvs/gcc/trunk/libatomic/fadd_n.c?revision=232055&view=markup
>
> https://gcc.gnu.org/viewcvs/gcc/trunk/libatomic/fior_n.c?revision=187018&view=markup
>
> then I would interpret atomic add as enforcing use of GCC generated code
> through forcing HAVE_ATOMIC_FETCH_OP_4 which I expect is solved by gcc
> as call for helper __atomic_fetch_add_4 because i386 has no guaranteed
> atomic add opcode ??? lock does not work there ???. Tail recursion optimization
> changes call to the jump.
>
> So it seems to be strange from original 2012 code version.
>
> Have you any idea?
>
> try it with official RSB build toolchain.
> I would update tools when find more time.
>
> Best wishes,
>
> Pavel

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



More information about the devel mailing list