Problem report: Struct aliasing problem causes Thread_Ready_Chain corruption in 4.6.99.3

Ralf Corsepius ralf.corsepius at rtems.org
Wed Nov 29 06:49:22 UTC 2006


On Tue, 2006-11-28 at 11:14 -0600, Joel Sherrill wrote:
> Eric Norum wrote:
> > In the interests of not delaying 4.7 for another year I suggest that  
> > we simply add -fno-strict-aliasing to all gcc invocations.  I don't  
> > see anything wrong with this approach in the near term.  As has been  
> > pointed out by others, many other kernel development projects have  
> > resorted to this technique.
> >
> > I know that Ralf is opposed to this, but I have not heard a reason to  
> > convince me.
> >   
> I thought about this over the holiday weekend and
> a few thoughts kept coming back. 

> + It is an OPTIMIZATION and an optional one at that.
> This isn't a test of manhood.  There isn't any shame in
> disabling it. 
Then you might be able to explain why

* HUGE projects such as Fedora and OpenSuSE are able to compile thousands
of source tarballs and millions of lines of code with it enabled, and
see only very few packages break?

* GCC and newlib can be compiled with it enabled for RTEMS?

> + I don't know how much benefit turning it on would have
> anyway.  In general, RTEMS proper is written to avoid unnecessary
> memory references so this would probably not have big impact.
> So how much performance gain could turning this on win anyway?
> 
> + As Eric points out, other OSes with larger user and maintainer
> bases have not found a solution to using this optimization
> safely.
Their problem is the size of their projects and their attitude.

Our problem is lack of testing (primary cause: way too long release
cycles). 

Instead, the RTEMS community seems to prefer to "blindly shoot into the
crowd" on hearsay and to play with symptoms rather than fix causes.

> + More importantly, we are trying to get a release out.  The
> most expedient solution is also the one with the least technical
> impact on code stability.  I often get faulted for not pushing
> for releases and this is one case where I see no end to the amount
> of work in question to address every place that is broken by
> strict aliasing.
I suspect only very few, but central, points in RTEMS are broken and
need to be fundamentally redesigned.

> Bottom line is that if we want strict-aliasing on for 4.7, we
> will be delaying the release.  This is a very bad thing.  I
> am torn between Thomas' suggestions 2 and 3
> 
> > 2.) We set "-fno-strict-aliasing" now and forever

With all due respect, to me this would be "plain stupid".

> > 3.) we use "-fno-strict-aliasing" for RTEMS 4.7 and, ASAP we build a
> > strategy on how to get ALL code aliasing clean.

This would be a _temporary_ compromise I could live with.
Nevertheless, we need to identify the broken pieces, and neither gloss
over these breakdowns nor play them down.

Peer's report (which I seem to have missed initially) and Thomas'
followup to it are good starting points.
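
To make "aliasing clean" concrete, here is a minimal sketch in C of the
kind of pattern that would have to be found and fixed. It is not taken
from the RTEMS sources: demo_header and demo_header_length are
hypothetical names used only for illustration. The unclean variant
(shown as a comment) reads a byte buffer through a pointer to an
unrelated struct type, which GCC is allowed to mis-optimize when
-fstrict-aliasing is in effect; the clean variant copies the bytes
into a real object of that type first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, for illustration only. */
typedef struct {
    uint16_t id;
    uint16_t length;
} demo_header;

/*
 * Aliasing-unclean variant (undefined behaviour, so only a comment):
 *
 *     const demo_header *h = (const demo_header *) buf;
 *     return h->length;
 *
 * buf holds unsigned char objects, but they are read through a
 * demo_header lvalue. The cast may also violate alignment.
 */

/* Aliasing-clean variant: copy the bytes into a demo_header object. */
uint16_t demo_header_length(const unsigned char *buf)
{
    demo_header h;

    assert(buf != NULL);
    memcpy(&h, buf, sizeof h);  /* well defined for any object type */
    return h.length;
}
```

With optimization enabled, GCC typically turns such a fixed-size
memcpy into a plain load, so the clean variant should not cost
anything measurable. Whether the broken places in RTEMS are best
fixed this way or by a redesign has to be decided case by case.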


Embarrassed,

	Ralf




