Memory Barrier

Till Straumann strauman at
Tue Sep 11 05:12:01 UTC 2007

Good catch, Kate.

Yesterday I patched cpukit/score/cpu/powerpc, m68k and i386 (the
architectures we use here) and introduced the "memory" clobber
to the basic ASM macros (including CPU_MSR_SET) but
Joel is trying to eliminate the use of CPU_ISR_xxable() leaving
the low-level macros alone.

Note that _ISR_Enable()/_ISR_Disable() cannot be used in the
code sections Kate refers to because what's needed here
is the opposite:
  old_mask = get_mask_and_ENABLE_irq()
  set_mask( old_mask );


Feng, Kate wrote:
> How about _CPU_MSR_SET() used in various PPC/irq.c ? 
> It's inline, but not protected by memory
> barrier, either.
>       _CPU_MSR_GET(msr);
>       new_msr = msr | MSR_EE;
>       _CPU_MSR_SET(new_msr);
>        ...........
>       ............
>       _CPU_MSR_SET(msr);
> Regards,
> Kate
> -----Original Message-----
> From: at on behalf of Joel Sherrill
> Sent: Mon 9/10/2007 3:34 PM
> To: Till Straumann
> Cc: rtems-users at
> Subject: Re: Memory Barrier
> Till Straumann wrote:
>> The bad news are that a number of CPUs ( i386, m68k, sparc, sh, ... )
>> define
>> #define _CPU_ISR_XXable(k)  <cpu>_XXable_interrupt(K)
>> and the latter is also used in many places (in- and outside of cpukit).
> I am nearly complete on the test build for the search and replace
> on _CPU_ISRXXable under c/.   There is no reason the BSPs and
> other code shouldn't have used the public API.  So this is an excuse
> at that cleanup.
> I will hunt for the others since unless there is some obvious reason
> to use them, they are just barriers to driver portability.
>> I start to wonder if it wouldn't be easier to introduce the barrier
>> where the relevant lowest-level macros for each CPU are defined.
> It might be prudent but it is also effective to just forbid anyone
> from using the _CPU_ISR form.
>> BTW: IMO, barriers are also needed around _CPU_ISR_Set_level()
>> which could have the effect of enabling or disabling interrupts.
> This is probably prudent although I don't think any use of it
> inside RTEMS is particularly critical.  As far as I recall, it is
> only used during task initialization and mode changes.
> --joel
>> -- T.
>> Joel Sherrill wrote:
>>> Till Straumann wrote:
>>>> I filed PR1257 discouraging the use of
>>>> _CPU_ISR_Disable()/_CPU_ISR_Enable()
>>>> These macros should never be used directly
>>>> (they are defined by every CPU architecture
>>>> to be expanded by the generic macros
>>>> _ISR_Disable()/_ISR_Enable() which add a compiler
>>>> memory barrier).
>>> And are made publicly available
>>> as rtems_interrupt_disable and
>>> rtems_interrupt_enable.  That is the
>>> name that every BSP and device driver
>>> should use.  There should be virtually
>>> no code under rtems-XXX/c which references
>>> _CPU_ISR or _ISR_.
>>> I have made a big edit sweep and will
>>> start to commit once I do a full build and
>>> review the diff.
>>> --joel
>>>> -T
>>>> Pavel Pisa wrote:
>>>>> On Thursday 06 September 2007 11:43, Feng, Kate wrote:
>>>>>> Joel Sherrill wrote :
>>>>>>> The memory barrier patch was PR 866 and was merged in March 
>>>>>>> 2006.  It is
>>>>>>> in all 4.7 versions.  It first appeared in 4.6.6.
>>>>>> It looks like it, but RTEMS4.7.x still needs patches.
>>>>>> This is not even fixed in
>>>>>> The memory barrier definitely should be fixed in RTEMS4.7.x
>>>>>> before jumping to RTEMS4.8.
>>>>>> Suggestions follow, except I hope I do not miss anything
>>>>>> since I came up with this a while ago.
>>>>>> 1) In cpukit/score/include/rtems/system.h:
>>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile(::: "memory")
>>>>>> seems to be wrong and misplaced.
>>>>>> The memory barrier is processor dependent.
>>>>>> For example, the memory barrier for PowerPC is "sync".
>>>>>> Thus, for PPC,  it would seem more functional to place
>>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile("sync"::: 
>>>>>> "memory")
>>>>>> in cpukit/score/cpu/powerpc/system.h
>>>>>> or somewhere in the processor branch.
>>>>> Hello Kate and others,
>>>>> I would like to react there, because I think, that proposed
>>>>> addition of "sync" is move into really bad direction.
>>>>> RTEMS_COMPILER_MEMORY_BARRIER is and should remain what it
>>>>> is, I believe. It is barrier against compiler optimizer
>>>>> caused reordering of instruction over the barrier.
>>>>> This does not try to declare/cause any globally visible
>>>>> ordering guarantee, by name and anything else.
>>>>> Each architecture 'X' conforming CPU has to guarantee,
>>>>> that even after complex CPU level instruction reordering,
>>>>> register renaming and transfers delaying an sequence
>>>>> of instruction would result in same state (all viewed
>>>>> from CPU POV) as if instructions has been processed
>>>>> in sequential order one by one.
>>>>> This does not mean anything about external memory transfers
>>>>> order at all (at least for PPC, there are some special rules
>>>>> for x86 caches for compatibility with old programs).
>>>>> The macro ensures only ordering of memory transfers
>>>>> from actual CPU POV/perspective. But this is enough
>>>>> even for POV of normal mode and consecutively invoked
>>>>> exception handler working with same data.
>>>>> Even if exception handler starts and CPU does not finish
>>>>> transfers caused by previously initiated operations, reads
>>>>> from exception on same!!! CPU would read back data from
>>>>> write buffer if the address corresponds to previously written
>>>>> data. So preemption or CPU IRQ flags manipulation in scope
>>>>> of the actual CPU does not need enforcing ordering of real
>>>>> memory by very expensive "sync" instruction. It only needs to
>>>>> be sure, that CPU accounts/is aware of the value write transfer
>>>>> before at correct point in the instruction sequence.
>>>>> On the other hand, there could be other reasons and situations
>>>>> requesting correct ordering of externally visible transfers.
>>>>> For example, if IRQ controller is mapped as peripheral into
>>>>> external memory/IO space and CPU IRQ is disabled, than some
>>>>> mask is changed in the controller to disable one of external
>>>>> sources and it is expected, that after IRQ enabling on CPU
>>>>> level there cannot arrive event from that source, ordering
>>>>> of reads and writes to the controller has to be synchronized
>>>>> with CPU ("eieio" has to be used in the PPC case). But it
>>>>> is not task for CPU level IRQ state manipulation. The ordering
>>>>> should and in RTEMS case is ensured by IO access routines
>>>>> which include "eieio" instruction. On the other hand, if
>>>>> some external device is accessed through overlay structures
>>>>> (even volatile), then ordering could be broken without
>>>>> explicitly inserted "eieio".
>>>>> Other legitimate requirement for strict ordering/barrier for
>>>>> external accesses are the cases, where external device/DMA/coprocessor
>>>>> accesses/shares data in system/main memory with CPU.
>>>>>  The "sync"/cache range invalidation/flushing is required before
>>>>> and after external memory accesses (the exact details depend on
>>>>> transfers directions and other parameters).
>>>>>> 3) Among PPC shared/irq/irq.c and other PPCs,
>>>>>> _ISR_Disable( _level ), and _ISR_Disable( _level )
>>>>>> should be used instead of _CPU_ISR_Disable(level) and
>>>>>> _CPU_ISR_Enable( _level )
>>>>> But I fully agree with you, that sequences like following
>>>>> one are fatally broken
>>>>>      _CPU_ISR_Disable(level);
>>>>>      *irq = rtems_hdl_tbl[irq->name];
>>>>>      _CPU_ISR_Enable(level);
>>>>> There is no guarantee, that operation which has been expected
>>>>> to be protected would not be moved outside of protection sequence.
>>>>> Explicit or implicit RTEMS_COMPILER_MEMORY_BARRIER is missing there.
>>>>>> Actually, I think a better one should be 
>>>>>> rtems_interrupt_disable(level)
>>>>>> and rtems_interrupt_enable(level).
>>>>> The code should be changed according to one of your proposals.
>>>>>> 2) In order for the inline to work, the
>>>>>> CPU_INLINE_ENABLE_DISPATCH should be defined to be TRUE.
>>>>>> Thus,
>>>>>> in cpukit/score/cpu/powerpc/rtems/score/cpu.h:
>>>>> As for the non-inlined version of _Thread_Enable_dispatch,
>>>>> there should not be problem. Calling function without static
>>>>> or inline attributes is considered as full compiler memory ordering
>>>>> barrier point. So no explicit compiler barrier should be needed
>>>>> there.
>>>>> All this is based upon my understanding of code and computer
>>>>> systems principles. There is no doubt, that there could be
>>>>> many other problems and errors. But if there are problems
>>>>> with IRQs behavior on PPC, then the checking, that sequences
>>>>> like above one do not exist. The _ISR_Disable()/_ISR_Disable()
>>>>> or higher level rtems_ variants should be used in noticed source
>>>>> file. Else bad things could happen.
>>>>> Excuse me for long answer, but I wanted to clarify things
>>>>> as good way as I could.
>>>>> Best wishes
>>>>>             Pavel
>>>>> _______________________________________________
>>>>> rtems-users mailing list
>>>>> rtems-users at
>>>> _______________________________________________
>>>> rtems-users mailing list
>>>> rtems-users at
> _______________________________________________
> rtems-users mailing list
> rtems-users at

More information about the users mailing list