Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
Nick Withers
nick.withers at anu.edu.au
Wed Feb 18 23:29:53 UTC 2015
On Wed, 2015-02-18 at 14:25 -0600, Joel Sherrill wrote:
>
> On 2/18/2015 2:05 PM, Gedare Bloom wrote:
>
> >
> >
> > On Wed, Feb 18, 2015 at 2:38 PM, Joel Sherrill
> > <joel.sherrill at oarcorp.com> wrote:
> > Hi
> >
> > I am trying to wrap my head around this discussion and its
> > impact on RTEMS. Should we compile parts of RTEMS with
> > this option? All of it?
> >
> >
> > A bit more context would have helped!
>
> Sorry. This was the first I had seen of this option and I really didn't
> have much context besides "this looks like it could break code".
> > So basically, gcc can now optimize out NULL pointer accesses and turn
> > them into traps directly? And this is a problem for targets that
> > have a valid address at 0x0. One solution is to turn on the flag
> > "-fno-delete-null-pointer-checks"?
> >
> >
> Yep. But if all we have is writeable vector tables at 0x0, then it MIGHT
> be ok. GCC may not be able to detect that. But on the m68k's without a
> VBR register the table is always at 0x0.
> > I guess we should identify which BSPs this would affect, that is,
> > which ones are allowed to make valid memory accesses at 0x0, and
> > then turn off the optimization for those BSPs?
> >
> >
> It might not just be BSPs, but architectures. Running code at 0x0 should
> be OK since that would likely be the start code. You would never
> indirectly jump through it.
>
> Reading/writing data at 0 is the issue.
>
> I really have no idea if/how this impacts anything but wanted us all to
> think on it.
Seems like that might affect the PowerPC mini-loader that moves an image
down to 0x0 at startup? See
c/src/lib/libbsp/powerpc/shared/start/preload.S
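
As a rough illustration of the kind of C code that could be bitten (a hypothetical C analogue only, not taken from the RTEMS tree; the name copy_to_address is made up for this sketch), consider a helper that copies an image to an absolute address. On an affected target the caller would pass 0 as the destination, and the assumption here is that the file is built with -fno-delete-null-pointer-checks, as suggested further down in the quoted thread:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy `len` bytes from `src` to the absolute
 * address `dest_addr`. On an affected target the caller would pass
 * dest_addr == 0.
 *
 * Assumption: this translation unit is compiled with
 * -fno-delete-null-pointer-checks so that GCC does not assume
 * address 0 can never hold a valid object.
 */
static void copy_to_address(uintptr_t dest_addr, const void *src, size_t len)
{
    volatile uint8_t *dst = (volatile uint8_t *)dest_addr;
    const uint8_t *s = (const uint8_t *)src;

    /* Byte-wise copy; volatile keeps every store in the generated code. */
    for (size_t i = 0; i < len; ++i) {
        dst[i] = s[i];
    }
}
```

Taking the destination as a plain integer address keeps the sketch testable on a host by pointing it at an ordinary buffer. Whether the flag alone is sufficient on every target is something I have not verified.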
> > Gedare
> >
> >
> >
> >
> >
> > --joel
> >
> >
> > -------- Forwarded Message --------
> > Subject: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
> > Date: Wed, 18 Feb 2015 13:30:24 -0600
> > From: Andrew Pinski <pinskia at gmail.com>
> > To: Jeff Prothero <jprother at altera.com>
> > CC: GCC Mailing List <gcc at gcc.gnu.org>
> >
> >
> > On Wed, Feb 18, 2015 at 11:21 AM, Jeff Prothero <jprother at altera.com> wrote:
> > >
> > > Starting with gcc 4.9, -O2 implicitly invokes
> > >
> > > -fisolate-erroneous-paths-dereference:
> > >
> > > which
> > >
> > > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> > >
> > > documents as
> > >
> > > Detect paths that trigger erroneous or undefined behavior due to
> > > dereferencing a null pointer. Isolate those paths from the main control
> > > flow and turn the statement with erroneous or undefined behavior into a
> > > trap. This flag is enabled by default at -O2 and higher.
> > >
> > > This results in a sizable number of previously working embedded programs
> > > mysteriously crashing when recompiled under gcc 4.9. The problem is that
> > > embedded programs will often have RAM starting at address zero (think
> > > hardware-defined interrupt vectors, say) which gets initialized by code
> > > which the -fisolate-erroneous-paths-dereference logic can recognize as
> > > reading and/or writing address zero.
> >
> > You should have used -fno-delete-null-pointer-checks. The optimization
> > it disables has been done for a long time now; it just got better with
> > the -fisolate-erroneous-paths-dereference pass.
> >
> > Thanks,
> > Andrew Pinski
> >
> >
> >
> > >
> > > What happens then is that the previously running program compiles without
> > > any warnings, but then typically locks up mysteriously (often disabling the
> > > remote debug link) due to the trap not being gracefully handled by the
> > > embedded runtime.
> > >
> > > Granted, such code is out-of-spec wrt C standards.
> > >
> > > Nonetheless, the problem is quite painful to track down and
> > > unexpected.
> > >
> > > Is there any good reason the
> > >
> > > -fisolate-erroneous-paths-dereference
> > >
> > > logic could not issue a compile-time warning or error, instead of just
> > > silently generating code virtually certain to crash at runtime?
> > >
> > > Such a warning/error would save a lot of engineers significant amounts
> > > of time, energy and frustration tracking down this problem.
> > >
> > > I would like to think that the spirit of gcc is about helping engineers
> > > efficiently correct nonstandard code, rather than inflicting maximal
> > > pain upon engineers violating C standards. :-)
> > >
> > > -Jeff
> > >
> > > BTW, I'd also be curious to know what is regarded as engineering best
> > > practice for writing a value to address zero when this is architecturally
> > > required by the hardware platform at hand. Obviously one can do various
> > > things to obscure the process sufficiently that the current gcc implementation
> > > won't detect it and complain, but as gcc gets smarter about optimization
> > > those are at risk of failing in a future release. It would be nice to have
> > > a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
> > > of retreating to assembly code?
> >
> >
> >
> >
> >
> > _______________________________________________
> > devel mailing list
> > devel at rtems.org
> > http://lists.rtems.org/mailman/listinfo/devel
> >
> >
>
> --
> Joel Sherrill, Ph.D. Director of Research & Development
> joel.sherrill at OARcorp.com On-Line Applications Research
> Ask me about RTEMS: a free RTOS Huntsville AL 35805
> Support Available (256) 722-9985
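
To picture what the GCC documentation quoted above means by isolating the erroneous path, here is a hand-written approximation in C. It is not actual compiler output, and both function names are made up for this sketch. The first function is what a programmer might write; the second is roughly how the result behaves once the NULL path has been split off and its dereference replaced by a trap (using GCC's __builtin_trap):

```c
#include <stddef.h>

/* As written: p is NULL on one path, and that path dereferences it. */
static int read_entry(const int *table, int use_table)
{
    const int *p = use_table ? table : NULL;
    return *p; /* undefined behavior when use_table == 0 */
}

/*
 * Hand-written approximation of the isolated form: the path on which
 * the pointer is known to be NULL ends in a trap instead of a load.
 */
static int read_entry_isolated(const int *table, int use_table)
{
    if (!use_table) {
        __builtin_trap(); /* stands in for the removed dereference */
    }
    return *table;
}
```

On a target where address 0 is real memory, the first form looks as if it should simply read from 0, which matches the "compiles without any warnings, then locks up" experience described in the quoted message.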