Problem report: Struct aliasing problem causes Thread_Ready_Chain corruption in 4.6.99.3
Joel Sherrill
joel.sherrill at oarcorp.com
Thu Dec 7 22:41:10 UTC 2006
Till Straumann wrote:
> Ralf Corsepius wrote:
>
>> On Thu, 2006-12-07 at 09:47 +0100, Wolfram Wadepohl wrote:
>>
>>
>>> Till Straumann schrieb:
>>>
>>>
>>>
>>>> I agree with Linus' sentiment, too. The problem, however,
>>>> is (repeating mantra) that this is not just some weirdo
>>>> gcc optimization that can be switched off. It is the C99 *standard*.
>>>> Even if you can switch this off for gcc today, there is no
>>>> guarantee that you will be able to in the future or on other
>>>> compilers. If we want to produce C99 compliant code then
>>>> we must comply with the alias rule. period.
>>>>
>>>> Steven Johnson wrote:
>>>>
>>>>
>>>>> I have been quietly following this thread, but I find the whole
>>>>> -fstrict-aliasing/-fno-strict-aliasing issue to be very disturbing.
>>>>> Luckily my program isn't built with -O2 or I probably would have been
>>>>> tracking untold numbers of strange bugs in known working code. For
>>>>> the C language to change so that a pointer (regardless of the pointer
>>>>> type used to reference that memory) no longer points to a known piece
>>>>> of memory, in a predictable way is whacked.
>>>>>
>>>>> I for one do not look forward to adding __attribute__ ((may_alias)) to
>>>>> the hundreds of places where I change the way I address memory using
>>>>> pointers. It is a monumental waste of time, prone to error and in my
>>>>> opinion putting in declarations to fix a broken compiler
>>>>> optimisation. When a compiler optimisation breaks a fundamental
>>>>> aspect of C that has existed since the beginnings of the language,
>>>>> then I consider that optimisation to be broken, and not the code
>>>>> itself. I will be adding -fno-strict-aliasing to all of my builds in
>>>>> future, and I will be making sure RTEMS (and all of the other Open
>>>>> Source libraries I use) builds with -fno-strict-aliasing, regardless
>>>>> of what is ultimately decided here; I just don't want the headache.
>>>>> In my opinion you wouldn't be fixing RTEMS by adding these
>>>>> declarations or changing the code, you would be working around a
>>>>> broken compiler. The other OS's that use -fno-strict-aliasing are (in
>>>>> my opinion) doing the right thing. I also fail to see how the option
>>>>> could yield any tangible benefits on performance that would warrant
>>>>> the pain and difficulty it causes.
>>>>>
>>>>> But that is my 2c.
>>>>>
>>>>>
>>> Hi all,
>>>
>>> I've followed the discussion on the list. For a user of RTEMS building
>>> commercial *embedded* applications with high availability, the only
>>> short-term solution is to use -fno-strict-aliasing for the whole program,
>>> including all RTEMS parts.
>>>
>>> Till is right in telling us the gcc optimization (weird or not) *is* C99
>>> standard.
>>>
>>>
>> I agree with Till and you.
>>
>>
>>
>>> Following this argument, expecting that -fno-strict-aliasing will be
>>> dropped eventually, and considering that we write embedded code dealing
>>> with real hardware, I ask whether C is the right language to implement
>>> these applications in the future. A programming language forcing me to
>>> consider what machine code the compiler will eventually produce is not
>>> worth using for *embedded* programming.
>>>
>>>
>> Well, let me put it this way: There are people having tried to (ab-)use C
>> as a macro assembler. C99 (probably under the influence of C++) has voided
>> this aspect to a large extent and shifted to a different level of
>> programming languages (more into Pascal's direction).
>>
>> Most high-level applications don't use such "assembler-like" features,
>> so they aren't really affected. And those high-level applications which
>> do (esp. some GUI toolkits) are facing similar issues as we are.
>>
>>
>>
>>> In general and from an academic point of view Ralf is totally right; the
>>> standard is clear and the code should be fixed. A proper data model, well
>>> designed from the ground up, will hopefully not produce aliasing problems.
>>> But is this always possible and adequate in real work? It should be for
>>> basic technology like an RTOS or general libraries like newlib!
>>>
>>>
>> I think so, but ... as we currently all are experiencing, the "C as
>> macro assembler times" seem to be over.
>>
>>
>>
>>> In fact the current RTEMS code is not in a shape where aliasing can be
>>> ruled out as a problem. It has grown over more than a decade.
>>> Is this the time for a complete rewrite?
>>>
>>>
>> Frankly speaking, I think at least some very basic parts/types of RTEMS
>> are in need of a redesign/rewrite. IMO, introducing "type strictness"
>> and, related to it, "properly-typed" APIs is direly needed.
>>
>> RTEMS definitely has weaknesses related to these areas. Therefore, I
>> would expect a large amount of the issues related to "strict aliasing"
>> and "strict alignment" to disappear once these weaknesses are addressed.
>>
>>
>>
>>> Can we fix it?
>>>
>>>
>> I hope so, but do not expect this to happen any time soon.
>>
>>
>>
>>> What piece of work
>>> can I do, as a user with limited knowledge of kernel functionality?
>>>
>>>
>> Good question.
>>
>> ATM, from my point of view, people familiar with certain flavors of asm
>> who could identify where aliasing actually affects RTEMS code would be
>> helpful. I have been trying to identify files affected by strict-aliasing
>> and meanwhile have a list of ca. 20000 object files (note: *.o, not *.c!)
>> out of ca. 70000 from RTEMS-4.8 which are affected by aliasing.
>>
>> Now, identifying those which really are broken by aliasing would be
>> necessary. So far, apart from Peer's/Thomas's case [1], I haven't found
>> any :)
>>
>>
> The problem with your current approach is that you don't really
> find alias rule violations but only the subset of them that
> cause problems with current gcc's optimization implementation.
>
Agreed. That's why I think it is important to make sure that we find a
procedure that is good enough to run for future gcc versions.
Do you think counting load/store instructions as a second-level check for
differences in strict aliasing will reduce the false positive cases?
--joel
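
For anyone joining the thread late, here is a minimal sketch of the kind of
access the C99 alias rule forbids, next to three ways of getting the same
bits that should stay valid with -fstrict-aliasing. None of this is RTEMS
code: the function names and the float/uint32_t example are made up for
illustration, and the sketch assumes float is 32 bits wide. The union form
relies on behavior GCC documents as supported, and the may_alias form is
the GCC-specific attribute Steven mentioned.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size.
 * The array size becomes negative (a compile error) if they do not. */
typedef char float_is_32_bits[(sizeof(float) == sizeof(uint32_t)) ? 1 : -1];

/*
 * Non-compliant pattern, shown only as a comment:
 *
 *     uint32_t bits = *(uint32_t *)&f;
 *
 * The float object is read through an lvalue of type uint32_t, which the
 * alias rule does not allow, so the optimizer may reorder or drop accesses.
 */

/* Compliant: copy the object representation byte by byte with memcpy. */
static uint32_t float_bits_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Type punning through a union, which GCC documents as supported even
 * when -fstrict-aliasing is in effect. */
static uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

#if defined(__GNUC__)
/* GCC extension: a type marked may_alias is treated like a character type
 * for aliasing purposes, so the cast below is not an aliasing violation. */
typedef uint32_t __attribute__((__may_alias__)) aliasing_u32;

static uint32_t float_bits_may_alias(float f)
{
    return *(const aliasing_u32 *)&f;
}
#endif
```

On a target with IEEE 754 single-precision floats, all three helpers should
return the same value for the same input. The memcpy form is the only one
of the three that does not depend on a compiler-specific guarantee.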
> T.
>
>> Ralf
>>
>> [1] Which meanwhile is supposed to be worked-around.
>>
>>
>>
>>
>