Fw: SEVERE Bug in mc68360 _ISR_Handler???
bobwis at asczone.com
Mon Jul 23 06:31:33 UTC 2001
The target board successfully ran all weekend without a lockup with Chris'
patch.
Well done Thomas and Chris, thanks!
Bob Wisdom
bobwis at asczone.com
----- Original Message -----
From: "Joel Sherrill" <joel.sherrill at OARcorp.com>
To: <bobwis at asczone.com>; <ccj at acm.org>; "Eric Norum" <eric at cls.usask.ca>
Sent: Thursday, July 19, 2001 1:18 PM
Subject: Re: SEVERE Bug in mc68360 _ISR_Handler???
>
>
> bobwis at asczone.com wrote:
> >
> > Hi Joel
> > I can confirm that the original 4.5 system still locks up with the timer
> > tick interrupt set to level 6 and the SCC interrupt set to level 4; in our
> > case it took more than 8 hours to lock up.
> > We have now applied Chris' patch and it has been running for about three
> > hours.
> > Would you like me to post the confirmation to the mail-list when we have
> > run overnight?
>
> Please. I have cc'ed Eric and Chris since I am taking a long weekend.
>
> The longer you can run it, the better we all will feel.
>
> Thank you very much. This type of testing is really valuable.
>
> > Regards,
> > Bob Wisdom
> >
> > ----- Original Message -----
> > From: "Joel Sherrill" <joel.sherrill at OARcorp.com>
> > To: <bobwis at asczone.com>
> > Sent: Wednesday, July 18, 2001 12:53 PM
> > Subject: Re: SEVERE Bug in mc68360 _ISR_Handler???
> >
> > >
> > >
> > > bobwis at asczone.com wrote:
> > > >
> > > > Hi Joel,
> > > > I will have a go - it might take a day or two as I have to rebuild my
> > > > hardware test system.
> > > > I will let you know the results.
> > >
> > > Thanks. It sure sounds like the same problem to me though.
> > >
> > > Just let at least myself and Eric Norum know the results. It would be
> > > better to post them to the users list.
> > >
> > > > Bob
> > > >
> > > > ----- Original Message -----
> > > > From: "Joel Sherrill" <joel.sherrill at OARcorp.com>
> > > > To: "Thomas Doerfler" <Thomas.Doerfler at imd-systems.de>
> > > > Cc: <bobwis at asczone.com>; <rtems-users at OARcorp.com>
> > > > Sent: Tuesday, July 17, 2001 11:48 PM
> > > > Subject: Re: SEVERE Bug in mc68360 _ISR_Handler???
> > > >
> > > > > Thomas Doerfler wrote:
> > > > > >
> > > > > > Hi Bob,
> > > > > >
> > > > > > > Greetings from the UK.
> > > > > > > I remember there was an interrupt problem with the '360 that we
> > > > > > > came across when we were trying to test the TCP stack about 9
> > > > > > > months ago. All I can remember is that to fix it, the "tick"
> > > > > > > timer had to be on the same (or lower?) interrupt priority as
> > > > > > > the network to prevent the very occasional SCC lockup. At the
> > > > > > > time Joel and Eric couldn't see what was going wrong - and as
> > > > > > > far as I know it was never tracked down - it was well beyond my
> > > > > > > capability! I believe there is a comment in a "readme" file
> > > > > > > enclosed with the current 68360 distribution explaining the
> > > > > > > workaround.
> > > > > > > Am I off track here or could this be the same problem
> > > > > > > resurfacing?
> > > > > >
> > > > > > No, you are right, putting the PIT at a lower level also works OK. I
> > > > > > am really glad to hear that I am not the only one who saw that
> > > > > > problem.
> > > > > >
> > > > > > I think it is very '360 specific, because all CPM interrupts get
> > > > > > masked as soon as the CPU has fetched the interrupt vector (as
> > > > > > opposed to many other interrupt sources, which remove the interrupt
> > > > > > request only when the actual peripheral registers are accessed).
> > > > >
> > > > > No. I think it is not 68360 specific, just made worse and more
> > > > > frequent on the 360 since it is typically a heavy IRQ handler.
> > > > >
> > > > > I think this is Bob Wisdom's problem resurfacing, just this time
> > > > > in a form that is more obvious to handle.
> > > > >
> > > > > And Chris' comment about needing to check for "ISR_Handler" being
> > > > > the return address on all m68ks without separate stacks except
> > > > > 68060 and Coldfire is more than likely on the money.
> > > > >
> > > > > Summary analysis:
> > > > >
> > > > > Some m68k models do not guarantee that the first instruction in
> > > > > an ISR will be executed if a higher priority interrupt occurs.
> > > > >
> > > > > m68ks with separate stacks have a dedicated hardware interrupt stack
> > > > > and push a special F/VO stack frame on the outermost interrupt to
> > > > > indicate that you are at the edge of the interrupt stack and
> > > > > that the rte popping this F/VO must switch to the other
> > > > > (always master for RTEMS) stack.
> > > > >
> > > > > "m68ks without separate stacks" is all m68k's based on
> > > > > 68000, cpu32, and coldfire cores. Quote from Chris Johns:
> > > > >
> > > > > Also, Motorola fixed the '060 and Coldfire's by allowing the first
> > > > > instruction after the interrupt to execute before allowing the core
> > > > > to service another interrupt
> > > > >
> > > > > The 68020/30/40 have a throwaway F/VO stack frame so we always know
> > > > > we are nested in a hardware defined manner. No problem on those
> > > > > models. Even if the interrupt occurs before the 1st instruction, the
> > > > > throwaway for the outermost interrupt is still there so all is cool.
> > > > >
> > > > > The other CPU models do not have a HW interrupt stack, thus do not
> > > > > switch stacks, and thus do not generate the throwaway stack frame.
> > > > > They can nest to the 2nd interrupt before processing ANYTHING of the
> > > > > FIRST.
> > > > >
> > > > > Chris's patch inserts the equivalent of the F/VO logic for these
> > > > > CPU models.
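
[Editorial sketch, not part of the original message.] The check Joel attributes to Chris (looking for "ISR_Handler" as the return address) can be illustrated with a small model. This is not the actual patch and not RTEMS source: the type, the function names, and both addresses below are hypothetical, and the model assumes only that the handler would otherwise rely on a software nest counter that has not yet been incremented when the second interrupt arrives.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model for illustration; none of these names come from RTEMS. */
#define PC_IN_TASK   0x00001000u  /* assumed address inside ordinary task code */
#define PC_ISR_ENTRY 0x00002000u  /* assumed address of the common ISR entry   */

typedef struct {
    uint32_t return_pc;  /* PC the CPU saved in the exception stack frame */
} exception_frame;

/* Counter-only test: wrong inside the window, because the interrupted
 * handler has not yet run the instruction that bumps the counter. */
static bool outermost_by_counter(int nest_level)
{
    return nest_level == 0;
}

/* Counter plus return-address test: if the saved PC is the ISR entry
 * itself, another handler was interrupted before its first instruction,
 * so this interrupt is nested even though the counter still reads 0. */
static bool outermost_with_pc_check(int nest_level, const exception_frame *f)
{
    return nest_level == 0 && f->return_pc != PC_ISR_ENTRY;
}
```

In this model, a frame whose saved PC equals `PC_ISR_ENTRY` plays the role that the throwaway frame plays on the 68020/30/40: it marks the interrupt as nested even when no handler code has run yet.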
> > > > >
> > > > > Size of window:
> > > > >
> > > > > + 1st interrupt occurs
> > > > > + CPU pushes interrupt stack frame
> > > > > !!! NO CODE FROM 1ST INTERRUPT EXECUTES !!!
> > > > > + 2nd interrupt, of higher priority, occurs
> > > > > + 2nd (higher priority) interrupt executes
> > > > >
> > > > > The size of the window is the time required to push an interrupt
> > > > > stack frame on some m68k CPU models. This should be 680x0 and
> > > > > 683xx primarily since they use CPU32, CPU32+ and 68000 cores.
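
[Editorial sketch, not part of the original message.] The window described above can be restated as a tiny state model. It is a simplification under stated assumptions: the struct and function are hypothetical, and the single flag stands in for the behaviour Chris Johns describes for the '060 and Coldfire (first instruction runs before another interrupt is serviced).

```c
#include <stdbool.h>

/* Hypothetical model of the sequence above; not taken from RTEMS or a CPU manual. */
typedef struct {
    int frames_pushed;        /* exception frames on the stack            */
    int first_handler_insns;  /* instructions the 1st handler has executed */
} window_state;

static window_state simulate_window(bool first_insn_guaranteed)
{
    window_state s = { 0, 0 };

    s.frames_pushed++;            /* 1st interrupt: CPU pushes its stack frame */

    if (first_insn_guaranteed) {
        s.first_handler_insns++;  /* '060/Coldfire case per the quote above */
    }

    s.frames_pushed++;            /* 2nd, higher priority interrupt is accepted */
    return s;
}
```

With `first_insn_guaranteed` false, the model ends with two frames stacked and zero instructions of the first handler executed, which is the state the thread identifies as the problem.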
> > > > >
> > > > > Let's give Thomas a while to test this.
> > > > >
> > > > > Bob .. if you changed your interrupt priorities back, could you
> > > > > reproduce the same failure you used to get?
> > > > >
> > > > > > Bye,
> > > > > > Thomas.
> > > > > >
> > > > > > > Bob Wisdom
> > > > > > >
> > > > > >
> > > > > > --------------------------------------------
> > > > > > IMD Ingenieurbuero fuer Microcomputertechnik
> > > > > > Thomas Doerfler Herbststrasse 8
> > > > > > D-82178 Puchheim Germany
> > > > > > email: Thomas.Doerfler at imd-systems.de
> > > > > > PGP public key available at: http://www.imd-systems.de/pgp_key.htm
> > > > >
> > > > > --
> > > > > Joel Sherrill, Ph.D. Director of Research & Development
> > > > > joel at OARcorp.com On-Line Applications Research
> > > > > Ask me about RTEMS: a free RTOS Huntsville AL 35805
> > > > > Support Available (256) 722-9985
> > > > >
> > >
>