Memory Barrier

Joel Sherrill joel.sherrill at
Mon Sep 10 19:34:33 UTC 2007

Till Straumann wrote:
> The bad news are that a number of CPUs ( i386, m68k, sparc, sh, ... )
> define
> #define _CPU_ISR_XXable(k)  <cpu>_XXable_interrupt(K)
> and the latter is also used in many places (in- and outside of cpukit).
I am nearly complete on the test build for the search and replace
on _CPU_ISRXXable under c/.   There is no reason the BSPs and
other code shouldn't have used the public API.  So this is an excuse
at that cleanup.

I will hunt for the others since unless there is some obvious reason
to use them, they are just barriers to driver portability.
> I start to wonder if it wouldn't be easier to introduce the barrier
> where the relevant lowest-level macros for each CPU are defined.
It might be prudent but it is also effective to just forbid anyone
from using the _CPU_ISR form.
> BTW: IMO, barriers are also needed around _CPU_ISR_Set_level()
> which could have the effect of enabling or disabling interrupts.
This is probably prudent although I don't think any use of it
inside RTEMS is particularly critical.  As far as I recall, it is
only used during task initialization and mode changes.

> -- T.
> Joel Sherrill wrote:
>> Till Straumann wrote:
>>> I filed PR1257 discouraging the use of
>>> _CPU_ISR_Disable()/_CPU_ISR_Enable()
>>> These macros should never be used directly
>>> (they are defined by every CPU architecture
>>> to be expanded by the generic macros
>>> _ISR_Disable()/_ISR_Enable() which add a compiler
>>> memory barrier).
>> And are made publicly available
>> as rtems_interrupt_disable and
>> rtems_interrupt_enable.  That is the
>> name that every BSP and device driver
>> should use.  There should be virtually
>> no code under rtems-XXX/c which references
>> _CPU_ISR or _ISR_.
>> I have made a big edit sweep and will
>> start to commit once I do a full build and
>> review the diff.
>> --joel
>>> -T
>>> Pavel Pisa wrote:
>>>> On Thursday 06 September 2007 11:43, Feng, Kate wrote:
>>>>> Joel Sherrill wrote :
>>>>>> The memory barrier patch was PR 866 and was merged in March 
>>>>>> 2006.  It is
>>>>>> in all 4.7 versions.  It first appeared in 4.6.6.
>>>>> It looks like it, but RTEMS4.7.x still needs patches.
>>>>> This is not even fixed in
>>>>> The memory barrier definitely should be fixed in RTEMS4.7.x
>>>>> before jumping to RTEMS4.8.
>>>>> Suggestions follow, except I hope I do not miss anything
>>>>> since I came up with this a while ago.
>>>>> 1) In cpukit/score/include/rtems/system.h:
>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile(::: "memory")
>>>>> seems to be wrong and misplaced.
>>>>> The memory barrier is processor dependent.
>>>>> For example, the memory barrier for PowerPC is "sync".
>>>>> Thus, for PPC,  it would seem more functional to place
>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile("sync"::: 
>>>>> "memory")
>>>>> in cpukit/score/cpu/powerpc/system.h
>>>>> or somewhere in the processor branch.
>>>> Hello Kate and others,
>>>> I would like to react there, because I think, that proposed
>>>> addition of "sync" is move into really bad direction.
>>>> RTEMS_COMPILER_MEMORY_BARRIER is and should remain what it
>>>> is, I believe. It is barrier against compiler optimizer
>>>> caused reordering of instruction over the barrier.
>>>> This does not try to declare/cause any globally visible
>>>> ordering guarantee, by name and anything else.
>>>> Each architecture 'X' conforming CPU has to guarantee,
>>>> that even after complex CPU level instruction reordering,
>>>> register renaming and transfers delaying an sequence
>>>> of instruction would result in same state (all viewed
>>>> from CPU POV) as if instructions has been processed
>>>> in sequential order one by one.
>>>> This does not mean anything about external memory transfers
>>>> order at all (at least for PPC, there are some special rules
>>>> for x86 caches for compatibility with old programs).
>>>> The macro ensures only ordering of memory transfers
>>>> from actual CPU POV/perspective. But this is enough
>>>> even for POV of normal mode and consecutively invoked
>>>> exception handler working with same data.
>>>> Even if exception handler starts and CPU does not finish
>>>> transfers caused by previously initiated operations, reads
>>>> from exception on same!!! CPU would read back data from
>>>> write buffer if the address corresponds to previously written
>>>> data. So preemption or CPU IRQ flags manipulation in scope
>>>> of the actual CPU does not need enforcing ordering of real
>>>> memory by very expensive "sync" instruction. It only needs to
>>>> be sure, that CPU accounts/is aware of the value write transfer
>>>> before at correct point in the instruction sequence.
>>>> On the other hand, there could be other reasons and situations
>>>> requesting correct ordering of externally visible transfers.
>>>> For example, if IRQ controller is mapped as peripheral into
>>>> external memory/IO space and CPU IRQ is disabled, than some
>>>> mask is changed in the controller to disable one of external
>>>> sources and it is expected, that after IRQ enabling on CPU
>>>> level there cannot arrive event from that source, ordering
>>>> of reads and writes to the controller has to be synchronized
>>>> with CPU ("eieio" has to be used in the PPC case). But it
>>>> is not task for CPU level IRQ state manipulation. The ordering
>>>> should and in RTEMS case is ensured by IO access routines
>>>> which include "eieio" instruction. On the other hand, if
>>>> some external device is accessed through overlay structures
>>>> (even volatile), then ordering could be broken without
>>>> explicitly inserted "eieio".
>>>> Other legitimate requirement for strict ordering/barrier for
>>>> external accesses are the cases, where external device/DMA/coprocessor
>>>> accesses/shares data in system/main memory with CPU.
>>>>  The "sync"/cache range invalidation/flushing is required before
>>>> and after external memory accesses (the exact details depend on
>>>> transfers directions and other parameters).
>>>>> 3) Among PPC shared/irq/irq.c and other PPCs,
>>>>> _ISR_Disable( _level ), and _ISR_Disable( _level )
>>>>> should be used instead of _CPU_ISR_Disable(level) and
>>>>> _CPU_ISR_Enable( _level )
>>>> But I fully agree with you, that sequences like following
>>>> one are fatally broken
>>>>      _CPU_ISR_Disable(level);
>>>>      *irq = rtems_hdl_tbl[irq->name];
>>>>      _CPU_ISR_Enable(level);
>>>> There is no guarantee, that operation which has been expected
>>>> to be protected would not be moved outside of protection sequence.
>>>> Explicit or implicit RTEMS_COMPILER_MEMORY_BARRIER is missing there.
>>>>> Actually, I think a better one should be 
>>>>> rtems_interrupt_disable(level)
>>>>> and rtems_interrupt_enable(level).
>>>> The code should be changed according to one of your proposals.
>>>>> 2) In order for the inline to work, the
>>>>> CPU_INLINE_ENABLE_DISPATCH should be defined to be TRUE.
>>>>> Thus,
>>>>> in cpukit/score/cpu/powerpc/rtems/score/cpu.h:
>>>> As for the non-inlined version of _Thread_Enable_dispatch,
>>>> there should not be problem. Calling function without static
>>>> or inline attributes is considered as full compiler memory ordering
>>>> barrier point. So no explicit compiler barrier should be needed
>>>> there.
>>>> All this is based upon my understanding of code and computer
>>>> systems principles. There is no doubt, that there could be
>>>> many other problems and errors. But if there are problems
>>>> with IRQs behavior on PPC, then the checking, that sequences
>>>> like above one do not exist. The _ISR_Disable()/_ISR_Disable()
>>>> or higher level rtems_ variants should be used in noticed source
>>>> file. Else bad things could happen.
>>>> Excuse me for long answer, but I wanted to clarify things
>>>> as good way as I could.
>>>> Best wishes
>>>>             Pavel
>>>> _______________________________________________
>>>> rtems-users mailing list
>>>> rtems-users at
>>> _______________________________________________
>>> rtems-users mailing list
>>> rtems-users at

More information about the users mailing list