Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

Wed Feb 18 23:29:53 UTC 2015

On Wed, 2015-02-18 at 14:25 -0600, Joel Sherrill wrote:
> 
> On 2/18/2015 2:05 PM, Gedare Bloom wrote:
> 
> > 
> > 
> > On Wed, Feb 18, 2015 at 2:38 PM, Joel Sherrill
> > <joel.sherrill at oarcorp.com> wrote:
> >         Hi
> >         
> >         I am trying to wrap my head around this discussion and its 
> >         impact on RTEMS. Should we compile parts of RTEMS with
> >         this option? All of it?
> >         
> >         
> > A bit more context would have helped!
> 
> Sorry. This was the first I had seen of this option and I really
> didn't have much context besides "this looks like it could break code".
> 
> > So basically, gcc can now optimize out NULL pointer accesses and turn
> > them into traps directly? And this is a problem for targets that
> > have a valid address at 0x0. One solution is to turn on the flag
> > "-fno-delete-null-pointer-checks"?
> > 
> > 
> Yep. But if all we have is writeable vector tables at 0x0, then it
> MIGHT be OK. GCC may not be able to detect the access. But on m68k
> parts without a VBR register, the table is always at 0x0.
> > I guess we should identify which BSPs this would affect, that is,
> > which ones are allowed to make valid memory accesses at 0x0, and
> > then turn off the optimization for those BSPs?
> > 
> > 
> It might not just be BSPs, but architectures. Running code at 0x0
> should be OK since that would likely be the start code. You would
> never indirectly jump through it.
> 
> Reading/writing data at 0 is the issue.
> 
> I really have no idea if/how this impacts anything but wanted us all
> to think on it.

Seems like that might affect the PowerPC mini-loader that moves an image
down to 0x0 at startup? See
c/src/lib/libbsp/powerpc/shared/start/preload.S

> > Gedare
> > 
> > 
> > 
> > 
> >  
> >         --joel
> >         
> >         
> >         -------- Forwarded Message --------
> >         Subject: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
> >         Date: Wed, 18 Feb 2015 13:30:24 -0600
> >         From: Andrew Pinski <pinskia at gmail.com>
> >         To: Jeff Prothero <jprother at altera.com>
> >         CC: GCC Mailing List <gcc at gcc.gnu.org>
> >         
> >         
> >         On Wed, Feb 18, 2015 at 11:21 AM, Jeff Prothero <jprother at altera.com> wrote:
> >         >
> >         > Starting with gcc 4.9, -O2 implicitly invokes
> >         >
> >         >     -fisolate-erroneous-paths-dereference:
> >         >
> >         > which
> >         >
> >         >     https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> >         >
> >         > documents as
> >         >
> >         >     Detect paths that trigger erroneous or undefined behavior due to
> >         >     dereferencing a null pointer. Isolate those paths from the main control
> >         >     flow and turn the statement with erroneous or undefined behavior into a
> >         >     trap. This flag is enabled by default at -O2 and higher.
> >         >
> >         > This results in a sizable number of previously working embedded programs mysteriously
> >         > crashing when recompiled under gcc 4.9.  The problem is that embedded
> >         > programs will often have ram starting at address zero (think hardware-defined
> >         > interrupt vectors, say) which gets initialized by code which the
> >         > -fisolate-erroneous-paths-dereference logic can recognize as reading and/or
> >         > writing address zero.
> >         
> >         You should have used -fno-delete-null-pointer-checks. The
> >         -fdelete-null-pointer-checks optimization has been doing this for a
> >         long time now; it just got better with the
> >         -fisolate-erroneous-paths-dereference pass.
> >         
> >         Thanks,
> >         Andrew Pinski
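
For readers who have not run into it, the assumption behind -fdelete-null-pointer-checks can be sketched as below. The function name `first_or_default` is made up for illustration, and the comments describe what GCC is permitted to do under the documented meaning of the option, not what any particular release will actually emit.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only. */
int first_or_default(const int *p, int fallback)
{
    int v = *p;            /* p is dereferenced here ...                   */

    if (p == NULL)         /* ... so with -fdelete-null-pointer-checks GCC */
        return fallback;   /* may assume p != NULL and drop this test      */

    return v;
}
```

With a valid pointer the function simply returns the pointed-to value. The interesting case is a target where address 0 holds real data: there the check (or the access itself) may not behave as the source suggests unless the optimization is disabled.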
> >         
> >         
> >         
> >         >
> >         > What happens then is that the previously running program compiles without
> >         > any warnings, but then typically locks up mysteriously (often disabling the
> >         > remote debug link) due to the trap not being gracefully handled by the
> >         > embedded runtime.
> >         >
> >         > Granted, such code is out-of-spec wrt C standards.
> >         >
> >         > Nonetheless, the problem is quite painful to track down and
> >         > unexpected.
> >         >
> >         > Is there any good reason the
> >         >
> >         >     -fisolate-erroneous-paths-dereference
> >         >
> >         > logic could not issue a compiletime warning or error, instead of just
> >         > silently generating code virtually certain to crash at runtime?
> >         >
> >         > Such a warning/error would save a lot of engineers significant amounts
> >         > of time, energy and frustration tracking down this problem.
> >         >
> >         > I would like to think that the spirit of gcc is about helping engineers
> >         > efficiently correct nonstandard pain, rather than inflicting maximal
> >         > pain upon engineers violating C standards.  :-)
> >         >
> >         > -Jeff
> >         >
> >         > BTW, I'd also be curious to know what is regarded as engineering best
> >         > practice for writing a value to address zero when this is architecturally
> >         > required by the hardware platform at hand.  Obviously one can do various
> >         > things to obscure the process sufficiently that the current gcc implementation
> >         > won't detect it and complain, but as gcc gets smarter about optimization
> >         > those are at risk of failing in a future release.  It would be nice to have
> >         > a guaranteed-to-work future-proof idiom for doing this. Do we have one, short
> >         > of retreating to assembly code?
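
One way to at least keep the literal address out of the copy routine is to pass the destination in as a parameter, and to build the file that supplies address zero with -fno-delete-null-pointer-checks, as suggested earlier in the thread. The sketch below is an illustration under assumptions, not an established idiom: `copy_vectors`, `VECTOR_COUNT` and the table layout are hypothetical, and whether code like this survives future optimizers without the flag is exactly the open question.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_COUNT 8u   /* hypothetical table size */

/*
 * Copy a vector table to `base`. The volatile destination keeps the compiler
 * from eliding or merging the individual stores. On a target whose table
 * lives at address 0, the caller would pass (uintptr_t)0 and this file would
 * be built with -fno-delete-null-pointer-checks, so that GCC does not assume
 * that no data can reside at address 0.
 */
void copy_vectors(uintptr_t base, const uint32_t *src)
{
    volatile uint32_t *dst = (volatile uint32_t *)base;

    for (size_t i = 0; i < VECTOR_COUNT; ++i)
        dst[i] = src[i];
}
```

On a host the routine can be exercised by passing the address of an ordinary buffer, which checks the copy logic without touching address 0.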
> >         
> >         
> >         
> >         
> >         
> >         _______________________________________________
> >         devel mailing list
> >         devel at rtems.org
> >         http://lists.rtems.org/mailman/listinfo/devel
> > 
> > 
> 
> -- 
> Joel Sherrill, Ph.D.             Director of Research & Development
> joel.sherrill at OARcorp.com        On-Line Applications Research
> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
> Support Available                (256) 722-9985