Memory Barrier

Feng, Kate feng at bnl.gov
Tue Sep 11 02:38:41 UTC 2007


How about _CPU_MSR_SET() used in various PPC/irq.c ? 
It's inline, but not protected by memory
barrier, either.

      _CPU_MSR_GET(msr);
      new_msr = msr | MSR_EE;
      _CPU_MSR_SET(new_msr);
       ...........
      ............
      _CPU_MSR_SET(msr);

Regards,
Kate

-----Original Message-----
From: rtems-users-bounces+feng1=bnl.gov at rtems.org on behalf of Joel Sherrill
Sent: Mon 9/10/2007 3:34 PM
To: Till Straumann
Cc: rtems-users at rtems.org
Subject: Re: Memory Barrier
 
Till Straumann wrote:
> The bad news are that a number of CPUs ( i386, m68k, sparc, sh, ... )
> define
>
> #define _CPU_ISR_XXable(k)  <cpu>_XXable_interrupt(K)
>
> and the latter is also used in many places (in- and outside of cpukit).
>
I am nearly complete on the test build for the search and replace
on _CPU_ISRXXable under c/.   There is no reason the BSPs and
other code shouldn't have used the public API.  So this is an excuse
at that cleanup.

I will hunt for the others since unless there is some obvious reason
to use them, they are just barriers to driver portability.
> I start to wonder if it wouldn't be easier to introduce the barrier
> where the relevant lowest-level macros for each CPU are defined.
>
It might be prudent but it is also effective to just forbid anyone
from using the _CPU_ISR form.
> BTW: IMO, barriers are also needed around _CPU_ISR_Set_level()
> which could have the effect of enabling or disabling interrupts.
>
This is probably prudent although I don't think any use of it
inside RTEMS is particularly critical.  As far as I recall, it is
only used during task initialization and mode changes.

--joel
> -- T.
>
> Joel Sherrill wrote:
>> Till Straumann wrote:
>>> I filed PR1257 discouraging the use of
>>> _CPU_ISR_Disable()/_CPU_ISR_Enable()
>>>
>>> These macros should never be used directly
>>> (they are defined by every CPU architecture
>>> to be expanded by the generic macros
>>> _ISR_Disable()/_ISR_Enable() which add a compiler
>>> memory barrier).
>>>
>>>   
>> And are made publicly available
>> as rtems_interrupt_disable and
>> rtems_interrupt_enable.  That is the
>> name that every BSP and device driver
>> should use.  There should be virtually
>> no code under rtems-XXX/c which references
>> _CPU_ISR or _ISR_.
>>
>> I have made a big edit sweep and will
>> start to commit once I do a full build and
>> review the diff.
>>
>> --joel
>>> -T
>>>
>>> Pavel Pisa wrote:
>>>  
>>>> On Thursday 06 September 2007 11:43, Feng, Kate wrote:
>>>>     
>>>>> Joel Sherrill wrote :
>>>>>         
>>>>>> The memory barrier patch was PR 866 and was merged in March 
>>>>>> 2006.  It is
>>>>>> in all 4.7 versions.  It first appeared in 4.6.6.
>>>>>>               
>>>>> It looks like it, but RTEMS4.7.x still needs patches.
>>>>> This is not even fixed in 4.77.99.2.
>>>>> The memory barrier definitely should be fixed in RTEMS4.7.x
>>>>> before jumping to RTEMS4.8.
>>>>>
>>>>> Suggestions follow, except I hope I do not miss anything
>>>>> since I came up with this a while ago.
>>>>>
>>>>> 1) In cpukit/score/include/rtems/system.h:
>>>>>
>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile(::: "memory")
>>>>>
>>>>> seems to be wrong and misplaced.
>>>>>
>>>>> The memory barrier is processor dependent.
>>>>> For example, the memory barrier for PowerPC is "sync".
>>>>>
>>>>> Thus, for PPC,  it would seem more functional to place
>>>>> #define RTEMS_COMPILER_MEMORY_BARRIER() asm volatile("sync"::: 
>>>>> "memory")
>>>>>
>>>>> in cpukit/score/cpu/powerpc/system.h
>>>>> or somewhere in the processor branch.
>>>>>           
>>>> Hello Kate and others,
>>>>
>>>> I would like to react there, because I think, that proposed
>>>> addition of "sync" is move into really bad direction.
>>>>
>>>> RTEMS_COMPILER_MEMORY_BARRIER is and should remain what it
>>>> is, I believe. It is barrier against compiler optimizer
>>>> caused reordering of instruction over the barrier.
>>>> This does not try to declare/cause any globally visible
>>>> ordering guarantee, by name and anything else.
>>>>
>>>> Each architecture 'X' conforming CPU has to guarantee,
>>>> that even after complex CPU level instruction reordering,
>>>> register renaming and transfers delaying an sequence
>>>> of instruction would result in same state (all viewed
>>>> from CPU POV) as if instructions has been processed
>>>> in sequential order one by one.
>>>> This does not mean anything about external memory transfers
>>>> order at all (at least for PPC, there are some special rules
>>>> for x86 caches for compatibility with old programs).
>>>>
>>>> The macro ensures only ordering of memory transfers
>>>> from actual CPU POV/perspective. But this is enough
>>>> even for POV of normal mode and consecutively invoked
>>>> exception handler working with same data.
>>>> Even if exception handler starts and CPU does not finish
>>>> transfers caused by previously initiated operations, reads
>>>> from exception on same!!! CPU would read back data from
>>>> write buffer if the address corresponds to previously written
>>>> data. So preemption or CPU IRQ flags manipulation in scope
>>>> of the actual CPU does not need enforcing ordering of real
>>>> memory by very expensive "sync" instruction. It only needs to
>>>> be sure, that CPU accounts/is aware of the value write transfer
>>>> before at correct point in the instruction sequence.
>>>>
>>>> On the other hand, there could be other reasons and situations
>>>> requesting correct ordering of externally visible transfers.
>>>> For example, if IRQ controller is mapped as peripheral into
>>>> external memory/IO space and CPU IRQ is disabled, than some
>>>> mask is changed in the controller to disable one of external
>>>> sources and it is expected, that after IRQ enabling on CPU
>>>> level there cannot arrive event from that source, ordering
>>>> of reads and writes to the controller has to be synchronized
>>>> with CPU ("eieio" has to be used in the PPC case). But it
>>>> is not task for CPU level IRQ state manipulation. The ordering
>>>> should and in RTEMS case is ensured by IO access routines
>>>> which include "eieio" instruction. On the other hand, if
>>>> some external device is accessed through overlay structures
>>>> (even volatile), then ordering could be broken without
>>>> explicitly inserted "eieio".
>>>> Other legitimate requirement for strict ordering/barrier for
>>>> external accesses are the cases, where external device/DMA/coprocessor
>>>> accesses/shares data in system/main memory with CPU.
>>>>  The "sync"/cache range invalidation/flushing is required before
>>>> and after external memory accesses (the exact details depend on
>>>> transfers directions and other parameters).
>>>>
>>>>     
>>>>> 3) Among PPC shared/irq/irq.c and other PPCs,
>>>>> _ISR_Disable( _level ), and _ISR_Disable( _level )
>>>>> should be used instead of _CPU_ISR_Disable(level) and
>>>>> _CPU_ISR_Enable( _level )
>>>>>           
>>>> But I fully agree with you, that sequences like following
>>>> one are fatally broken
>>>>
>>>>      _CPU_ISR_Disable(level);
>>>>      *irq = rtems_hdl_tbl[irq->name];
>>>>      _CPU_ISR_Enable(level);
>>>>
>>>> There is no guarantee, that operation which has been expected
>>>> to be protected would not be moved outside of protection sequence.
>>>> Explicit or implicit RTEMS_COMPILER_MEMORY_BARRIER is missing there.
>>>>
>>>>     
>>>>> Actually, I think a better one should be 
>>>>> rtems_interrupt_disable(level)
>>>>> and rtems_interrupt_enable(level).
>>>>>           
>>>> The code should be changed according to one of your proposals.
>>>>
>>>>     
>>>>> 2) In order for the inline to work, the
>>>>> CPU_INLINE_ENABLE_DISPATCH should be defined to be TRUE.
>>>>>
>>>>> Thus,
>>>>> in cpukit/score/cpu/powerpc/rtems/score/cpu.h:
>>>>>
>>>>> -#define CPU_INLINE_ENABLE_DISPATCH       FALSE
>>>>> +#define CPU_INLINE_ENABLE_DISPATCH       TRUE
>>>>>           
>>>> As for the non-inlined version of _Thread_Enable_dispatch,
>>>> there should not be problem. Calling function without static
>>>> or inline attributes is considered as full compiler memory ordering
>>>> barrier point. So no explicit compiler barrier should be needed
>>>> there.
>>>>
>>>> All this is based upon my understanding of code and computer
>>>> systems principles. There is no doubt, that there could be
>>>> many other problems and errors. But if there are problems
>>>> with IRQs behavior on PPC, then the checking, that sequences
>>>> like above one do not exist. The _ISR_Disable()/_ISR_Disable()
>>>> or higher level rtems_ variants should be used in noticed source
>>>> file. Else bad things could happen.
>>>>
>>>> Excuse me for long answer, but I wanted to clarify things
>>>> as good way as I could.
>>>>
>>>> Best wishes
>>>>
>>>>             Pavel
>>>> _______________________________________________
>>>> rtems-users mailing list
>>>> rtems-users at rtems.com
>>>> http://rtems.rtems.org/mailman/listinfo/rtems-users
>>>>       
>>>
>>>
>>> _______________________________________________
>>> rtems-users mailing list
>>> rtems-users at rtems.com
>>> http://rtems.rtems.org/mailman/listinfo/rtems-users
>>>   
>>
>

_______________________________________________
rtems-users mailing list
rtems-users at rtems.com
http://rtems.rtems.org/mailman/listinfo/rtems-users





More information about the users mailing list