[PATCH 2/3] bsps/i386: use Pentium instructions for pc586 and pc686 builds.

Pavel Pisa pisa at cmp.felk.cvut.cz
Wed Oct 12 13:36:00 UTC 2016


Hello Sebastian,

On Wednesday 12 of October 2016 10:35:55 Sebastian Huber wrote:
> On 12/10/16 10:26, pisa at cmp.felk.cvut.cz wrote:
> > SMP build is broken with i386 set because libatomic and GCC
> > generate infinite loop for __atomic_fetch_add_4 used
> > in rtems_interrupt_lock_acquire
> >
> > __atomic_fetch_add_4:
> >      push   %ebp
> >      mov    %esp,%ebp
> >      movl   $0x5,0x10(%ebp)
> >      pop    %ebp
> >      jmp    __atomic_fetch_add_4
>
> Do you have a test case for this compiler/RTEMS bug?  The use of
> libatomic is inefficient, but it should work.

Maybe it is a problem with my i386 toolchain build; I have not
updated it since April.

Here is a simple test:

------------------------------------------------
#include <stdatomic.h>

atomic_uint atvar1;

volatile unsigned int res1;
volatile unsigned int res2;

int main(void)
{

  res1 = atomic_fetch_or(&atvar1, 0x55);

  res2 = atomic_fetch_add(&atvar1, 0xaa);

  return 0;
}
------------------------------------------------

The following build commands are used:

i386-rtems4.12-gcc --pipe -B/opt/rtems4.12/i386-rtems4.12/pc686/lib/ \
  -specs bsp_specs -qrtems \
  -I /opt/rtems4.12/i386-rtems4.12/pc686/lib/include \
  -march=i386 -Wall -O2 -g -ffunction-sections -fdata-sections \
  -o libatomic-add-test.o -c libatomic-add-test.c

i386-rtems4.12-gcc --pipe -B/opt/rtems4.12/i386-rtems4.12/pc686/lib/ \
  -specs bsp_specs -qrtems -mtune=pentiumpro -march=pentium \
  -Wall -O2 -g -ffunction-sections -fdata-sections \
  -Wl,--gc-sections -Wl,-Ttext,0x00100000 \
  libatomic-add-test.o -o libatomic-test-add

The problem appears both with and without -march=i386. When -march is
set to something newer (pentium), all is OK.
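
One way to see why the -march value matters is to ask the compiler whether it can expand 4-byte atomic operations inline for the selected target. The following is a minimal sketch, assuming GCC: the predefined macro __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 and the builtin __atomic_always_lock_free are GCC features, while the two function names are made up for illustration and are not part of RTEMS or libatomic.

```c
/* Illustrative helper (hypothetical name): returns 1 when the compiler
 * reports a native 4-byte compare-and-swap for the selected -march,
 * and 0 when it does not. */
int has_inline_atomic_4(void)
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
  return 1;
#else
  return 0;
#endif
}

/* The same question asked through the GCC builtin. A null pointer as
 * the second argument means "assume typical alignment for this size". */
int always_lock_free_4(void)
{
  return __atomic_always_lock_free(4, 0) ? 1 : 0;
}
```

Compiling this with the same -march options as the test above and printing both results should show whether the compiler has to fall back to libatomic calls for that target. The expectation that -march=i386 reports 0 is an assumption here and has not been checked against this toolchain.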


The disassembly looks like this:

------------------------------------------------

00120a76 <__atomic_fetch_add_4>:
  120a76:       55                      push   %ebp
  120a77:       89 e5                   mov    %esp,%ebp
  120a79:       c7 45 10 05 00 00 00    movl   $0x5,0x10(%ebp)
  120a80:       5d                      pop    %ebp
  120a81:       eb f3                   jmp    120a76 <__atomic_fetch_add_4>

00120a83 <__atomic_add_fetch_4>:
  120a83:       55                      push   %ebp
  120a84:       89 e5                   mov    %esp,%ebp
  120a86:       c7 45 10 05 00 00 00    movl   $0x5,0x10(%ebp)
  120a8d:       5d                      pop    %ebp
  120a8e:       eb e6                   jmp    120a76 <__atomic_fetch_add_4>

00120a90 <__atomic_fetch_or_4>:
  120a90:       55                      push   %ebp
  120a91:       89 e5                   mov    %esp,%ebp
  120a93:       56                      push   %esi
  120a94:       53                      push   %ebx
  120a95:       83 ec 0c                sub    $0xc,%esp
  120a98:       8b 5d 08                mov    0x8(%ebp),%ebx
  120a9b:       53                      push   %ebx
  120a9c:       e8 df 66 00 00          call   127180 <_Libatomic_Protect_start>
  120aa1:       8b 33                   mov    (%ebx),%esi
  120aa3:       8b 55 0c                mov    0xc(%ebp),%edx
  120aa6:       09 f2                   or     %esi,%edx
  120aa8:       89 13                   mov    %edx,(%ebx)
  120aaa:       5a                      pop    %edx
  120aab:       59                      pop    %ecx
  120aac:       50                      push   %eax
  120aad:       53                      push   %ebx
  120aae:       e8 ed 66 00 00          call   1271a0 <_Libatomic_Protect_end>
  120ab3:       8d 65 f8                lea    -0x8(%ebp),%esp
  120ab6:       89 f0                   mov    %esi,%eax
  120ab8:       5b                      pop    %ebx
  120ab9:       5e                      pop    %esi
  120aba:       5d                      pop    %ebp
  120abb:       c3                      ret

------------------------------------------------


_Libatomic_Protect_start is provided by RTEMS.

------------------------------------------------
00127180 <_Libatomic_Protect_start>:
__uint32_t _Libatomic_Protect_start( void *ptr )
{
  ISR_Level isr_level;

  (void) ptr;
  _ISR_Local_disable( isr_level );
  127180:       9c                      pushf
  127181:       fa                      cli
  127182:       58                      pop    %eax
static inline bool _CPU_atomic_Flag_test_and_set( CPU_atomic_Flag *obj, CPU_atomic_Order order )
{
#if defined(_RTEMS_SCORE_CPUSTDATOMIC_USE_ATOMIC)
  return obj->test_and_set( order );
#elif defined(_RTEMS_SCORE_CPUSTDATOMIC_USE_STDATOMIC)
  return atomic_flag_test_and_set_explicit( obj, order );
  127183:       b1 01                   mov    $0x1,%cl
  127185:       8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
  127189:       8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
  127190:       88 ca                   mov    %cl,%dl
  127192:       86 15 8c 85 13 00       xchg   %dl,0x13858c

#if defined(RTEMS_SMP)
  while (
  127198:       84 d2                   test   %dl,%dl
  12719a:       75 f4                   jne    127190 <_Libatomic_Protect_start+0x10>

------------------------------------------------
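
For comparison, the working __atomic_fetch_or_4 above follows a lock-based fallback pattern: take a global guard, do a plain read-modify-write, release the guard. A minimal sketch of that pattern, assuming a single test-and-set flag as the guard. The names protect_start, protect_end and emulated_fetch_or_4 are hypothetical stand-ins, and the interrupt disabling done by the real _Libatomic_Protect_start is left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the guard used by the RTEMS protect
 * routines: one global flag shared by all emulated atomic operations. */
static atomic_flag guard = ATOMIC_FLAG_INIT;

static void protect_start(void)
{
  /* Spin until the flag was previously clear, like the xchg loop in
   * the disassembly of _Libatomic_Protect_start. */
  while (atomic_flag_test_and_set_explicit(&guard, memory_order_acquire)) {
    /* busy wait */
  }
}

static void protect_end(void)
{
  atomic_flag_clear_explicit(&guard, memory_order_release);
}

/* Emulated fetch-or: stores old | val and returns the old value. */
uint32_t emulated_fetch_or_4(uint32_t *ptr, uint32_t val)
{
  uint32_t old;

  protect_start();
  old = *ptr;
  *ptr = old | val;
  protect_end();
  return old;
}
```

A fetch-add built the same way would not need any native atomic add instruction, which is what makes the self-jumping __atomic_fetch_add_4 stand out.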

When I check the current GCC repository code and compare

https://gcc.gnu.org/viewcvs/gcc/trunk/libatomic/fadd_n.c?revision=232055&view=markup

https://gcc.gnu.org/viewcvs/gcc/trunk/libatomic/fior_n.c?revision=187018&view=markup

I read the atomic add as forcing the use of GCC-generated code by
forcing HAVE_ATOMIC_FETCH_OP_4. I expect GCC then resolves the operation
as a call to the helper __atomic_fetch_add_4, because i386 has no
guaranteed atomic add opcode (or does the lock prefix not work there?).
Tail-call optimization then turns the call into a jump.
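
The suspected mechanism can be sketched in a few lines of C. This is only an illustration of the reading above, not the libatomic source: the wrapper name is made up, and whether GCC really emits a library call here depends on the target.

```c
#include <stdint.h>

/* Illustrative wrapper with a hypothetical name. libatomic's real entry
 * point is called __atomic_fetch_add_4 and takes the memory order as a
 * third argument, which this sketch accepts and ignores. */
uint32_t wrapper_fetch_add_4(uint32_t *ptr, uint32_t val, int order)
{
  (void) order;

  /* On a target with a native fetch-and-add, this builtin expands
   * inline. On a target without one, GCC emits a call to
   * __atomic_fetch_add_4 instead. If this function were itself named
   * __atomic_fetch_add_4, that call would target the function it sits
   * in, and as a tail call it can be emitted as a plain jmp back to
   * the entry point. */
  return __atomic_fetch_add(ptr, val, __ATOMIC_SEQ_CST);
}
```

In GCC, __ATOMIC_SEQ_CST has the value 5, which would be consistent with the movl $0x5,0x10(%ebp) in the disassembly: the third argument slot is overwritten with the memory order just before the jump.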

So this seems to have been strange since the original 2012 version of the code.

Do you have any idea?

Try it with a toolchain built by the official RSB.
I will update my tools when I find more time.

Best wishes,

Pavel


