Any known compiler issues for CPU32? (stack corruption problem)

Joel Sherrill joel.sherrill at OARcorp.com
Thu Feb 28 14:14:21 UTC 2002



Mike Panetta wrote:
> 
> On Wed, 2002-02-27 at 15:52, Matthew J Fletcher wrote:
> 
> >
> > I might be stupid here, but you're not trying to run 68332 (i.e. CPU32) code
> > on the not-totally-CPU32 68030, are you?  I doubt it, but what the hell.
> 
> Nope.  Unless the RTEMS build process is flawed in that kind of way
> (doubtful).
> 
> >
> > OK, when I was developing on 68331s we never had these problems, but
> > we did patch gcc 2.95.3 with a special interrupt patch that fixed problems
> > with restoring registers (like stack pointers) after a context switch.
> 
> That sounds interesting.  I'll have a look.  I am not sure if RTEMS
> already has patches for this or not in their distributed compiler.

RTEMS would not need this since it manages saving and restoring 
registers for interrupt handlers.  This is required to be able to
switch to a new task on ISR exit.

> >
> > Here is the link:
> >
> > http://sources.redhat.com/ml/crossgcc/2000-q1/msg00304.html
> >
> > > So far, all I know, is somewhere before the code gets to Init() (in
> > > hello sample) the stack gets messed so bad that the stack pointer is
> > > pointing to an address over 5000 bytes above the top of the stack!
> >
> > OK, so that's before interrupts have even been enabled (I think), so it's
> > probably not the above.
> 
> Well, maybe; I don't know yet.  I do know that I was confused about where the
> stack should have been after the process starts.  It's interesting what
> you can learn after stepping through a context switch in a debugger :).
> Apparently the stack for processes is allocated in an area at the end of
> memory called the workspace.  So the SP being above the stack top is
> normal if you're in a process.

I don't know what has changed with your local copy of the BSP but
the linkcmds/memory map looked to be OK.  If you have reworked the
start code, there are a handful of common things that can go wrong:

  + not disabling interrupts 
  + not initializing the memory controller correctly.
  + bad memory map (code, data, bss, heap, starting stack, and 
    RTEMS workspace should not overlap)
  + initializing the stack pointer to the wrong end of the
    RAM reserved for the starting stack.  The stack grows down on m68k
    so it would have to be at the high end of the memory reserved.
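
As a rough illustration of the last two items, here is a small host-side C
sketch (not RTEMS code) that checks a memory map for overlapping regions and
computes the initial stack pointer for a downward-growing stack.  The
region_t type and the function names are hypothetical; a real BSP would take
the base and size values from its linkcmds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one region: bytes [base, base + size). */
typedef struct {
    const char *name;
    uint32_t    base;
    uint32_t    size;
} region_t;

/* True when the two half-open ranges share at least one byte.
 * 64-bit arithmetic avoids wraparound for regions ending at 4 GiB. */
static bool regions_overlap(const region_t *a, const region_t *b)
{
    uint64_t a_end = (uint64_t)a->base + a->size;
    uint64_t b_end = (uint64_t)b->base + b->size;
    return a->base < b_end && b->base < a_end;
}

/* Index of the first region that overlaps a later one, or -1 if none. */
static int first_overlap(const region_t *map, size_t count)
{
    for (size_t i = 0; i < count; i++)
        for (size_t j = i + 1; j < count; j++)
            if (regions_overlap(&map[i], &map[j]))
                return (int)i;
    return -1;
}

/* The m68k stack grows down and pushes with pre-decrement, so the initial
 * SP is the address just past the high end of the reserved stack area. */
static uint32_t initial_stack_pointer(const region_t *stack)
{
    assert(stack->size > 0);
    return stack->base + stack->size;
}
```

Fed the code, data, bss, heap, starting stack, and workspace regions,
first_overlap() should return -1, and initial_stack_pointer() gives the value
the start code should load into the SP.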

If you are tracing through the first context switch, and going 
off in the weeds, then it is almost 100% certain that you are getting
a spurious interrupt.  If everything in the BSP's init is working well
enough to get to the first context switch, then that is where interrupts
are enabled for the first time.  It is not uncommon for things to
blow up there when an interrupt source has not been cleared/masked
properly by the init code.
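
For reference, on the m68k family the CPU-level interrupt mask is the
three-bit priority field in bits 10..8 of the status register; level 7 masks
levels 1 through 6, while level 7 itself is non-maskable.  The sketch below
only does the bit arithmetic on an SR value, and its helper names are
hypothetical.  Actually loading SR is a privileged instruction that is not
shown here, and masking at the CPU does not clear a pending source in the
peripheral itself.

```c
#include <assert.h>
#include <stdint.h>

/* On the m68k family the interrupt priority mask is SR bits 10..8. */
#define SR_IPL_SHIFT 8u
#define SR_IPL_MASK  (0x7u << SR_IPL_SHIFT)

/* Return sr with its interrupt priority mask replaced by level (0..7). */
static uint16_t sr_with_ipl(uint16_t sr, unsigned level)
{
    assert(level <= 7u);
    return (uint16_t)((sr & ~SR_IPL_MASK) | (level << SR_IPL_SHIFT));
}

/* Extract the current interrupt priority mask from an SR value. */
static unsigned sr_ipl(uint16_t sr)
{
    return (sr & SR_IPL_MASK) >> SR_IPL_SHIFT;
}
```

Start code would normally run with the mask at 7 until every interrupt
source has been put in a known state.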

> The strange thing is, after disabling interrupt mode IO, I got the hello
> sample to work, but after that I tried to run base_sp, and it failed.
> Something really wacky is going on here.  

When initially debugging a BSP, always do a hard reset between tests.
Many BSPs do not clean up completely upon program exit and this clouds
things.

Are you using the old console driver?  It violated the RTEMS interrupt
model and could cause all sorts of weird problems.

> I stepped all the way through
> the code  in the Init part of base_sp, from task creation, to starting
> the second task, to the destruction of the init task, it all seemed
> fine, until it tried to do a context switch to the new task, then it
> died.  The context switch from the OS to the Init task went smoothly
> though.  Also, running in GDB does not seem to act the same as just
> running on the board with a breakpoint set in CPU32BUG.  The code will
> not get as far in CPU32BUG.  Say if I put a breakpoint right at Init in
> the code running in GDB, it may get to it.  If I put a breakpoint in the
> exact same place in CPU32BUG, it dies before it gets to it with an
> illegal instruction exception at 0xFFFE (or something like that).

Is the BSP init code preserving the debug vector number used by
CPU32bug?
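
One way to do that, sketched in host-side C under stated assumptions: when
the BSP installs its own vector table, skip the entries the monitor depends
on.  Which vector numbers CPU32bug actually uses is not given in this thread,
so the keep[] list is a placeholder to be filled in from the monitor's
documentation, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define M68K_VECTOR_COUNT 256u  /* m68k exception table: 256 four-byte entries */

/* Copy new_table into table, but leave the entries listed in keep[]
 * untouched so a resident monitor's handlers stay installed. */
static void install_vectors_preserving(uint32_t *table,
                                       const uint32_t *new_table,
                                       const unsigned *keep, size_t keep_count)
{
    for (unsigned v = 0; v < M68K_VECTOR_COUNT; v++) {
        int preserved = 0;
        for (size_t k = 0; k < keep_count; k++) {
            assert(keep[k] < M68K_VECTOR_COUNT);
            if (keep[k] == v)
                preserved = 1;
        }
        if (!preserved)
            table[v] = new_table[v];
    }
}
```

If the BSP relocates the table by changing the VBR instead of overwriting it
in place, the same idea applies: copy the monitor's entries into the new
table before switching over.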

> >
> > >In
> > > addition the frame pointer has been set to 0 (is that OK?). It's fine at
> > > boot_card(), and its fine at console_initialize() (a bsp specific
> > > routine in this case), but between console_initialize() and Init() it
> > > goes south.  I have stepped through all the code in console_initialize,
> > > and the stack seems fine in there too. Maybe this is just me not quite
> > > understanding how RTEMS initializes the stack for a process...  I do
> > > know, that if I let the program run without any breakpoints, it goes
> > > wild and overwrites itself...  Isn't that generally an indication of
> > > stack corruption?
> >
> > Is something just blowing the stack?  Do you have enough configured?  It
> > is tricky to really work out how much you need, but doubling it is
> > a good idea.
> 
> I have doubled, and even quadrupled, the stack; it does not help.

It won't if it is an interrupt problem.

> > I also seem to remember (but can't be sure) that RTEMS is running in
> > privileged mode at this point.  Is it possible that some code is stomping
> > the stack pointer?  I've seen stranger.
> 
> I don't know.  I can't figure it out in GDB, because the code acts differently
> in the debugger :(

Not unusual.  I really suspect a basic interrupt problem.  Something is
not initialized correctly and/or there is something wrong with the 
console driver.

If RTEMS and CPU32bug are sharing the same serial port, you will have to
be careful that the RTEMS handling of it does not conflict.  I suspect
that the console driver will have to remain polled to cooperate with
CPU32bug.

> Mike
> >
> > regards
> >
> > ---
> > Matthew J Fletcher            amimjf at connectfree.co.uk
> > Software Engineering          Matthew.Fletcher at student.shu.ac.uk
> > www.amimjf.org                        ICQ amimjf 44193496
> >

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel at OARcorp.com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985


