<offlist> or1k printf causes crash

Joel Sherrill joel.sherrill at oarcorp.com
Thu Aug 21 20:56:15 UTC 2014


On 8/21/2014 2:44 PM, Hesham Moustafa wrote:
> Hi,
>
> I have been debugging the or1k code for a while, hoping to find
> what's wrong. Here's what I got.
First I am moving this to devel@ so others can chime in.
> First, I asked about this problem on the #openrisc IRC channel. They told
> me the problem might be that I have to take the red zone into account. I
> asked what the red zone is, and Stefan said the following:
> "the first 128 bytes of the stack has to be stepped over, leaf
> functions might use that without modifying the stack pointer, and gcc
> takes advantage of the fact that there is a red zone in non-leaf
> functions prologues too. i.e. it stores things on the stack and *then*
> update the stack pointer"
This is a bug in gcc. We have seen it on the ARM and there was a recent
dust up from the Linux kernel community because it happened on x86-64.
My understanding is that there was rework/improvement which triggered
bugs in backends. But this needs to be fixed.

The sp must be updated before the memory can be used. This is just
a bug otherwise.
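
To make the hazard concrete, here is a minimal host-side simulation, not or1k code. It models a downward-growing stack as an array, stores a value below the stack pointer without moving it, and then lets a simulated interrupt save registers. All names (stack_mem, sp_index, simulated_interrupt, value_survives_interrupt) are hypothetical and exist only for this sketch; the 128-byte figure is the one quoted from IRC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS    64u  /* size of the simulated stack, in 32-bit words */
#define RED_ZONE_WORDS 32u  /* 128 bytes, the figure quoted from IRC */

static uint32_t stack_mem[STACK_WORDS];
static size_t   sp_index;   /* simulated stack pointer; stack grows downward */

/* Simulated interrupt entry: optionally step over skip_words below the
 * interrupted stack pointer, then save save_words words of context. */
static void simulated_interrupt(size_t skip_words, size_t save_words)
{
    size_t isr_sp = sp_index - skip_words;

    for (size_t i = 0; i < save_words; i++) {
        stack_mem[--isr_sp] = 0xDEADBEEFu;
    }
}

/* Store a value just below sp WITHOUT updating sp first, take an
 * interrupt, and report whether the stored value is still intact. */
static int value_survives_interrupt(size_t skip_words)
{
    memset(stack_mem, 0, sizeof stack_mem);
    sp_index = STACK_WORDS - 8u;

    stack_mem[sp_index - 1u] = 0x12345678u;  /* below sp, sp not moved */
    simulated_interrupt(skip_words, 4u);

    return stack_mem[sp_index - 1u] == 0x12345678u;
}
```

In this model, value_survives_interrupt(0) reports that the value was overwritten, while value_survives_interrupt(RED_ZONE_WORDS) reports that it survived. Either the compiler moves sp before storing, or every interrupt entry path has to step over the region.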
> He suggested that I add 128 bytes to the stack pointer before I jump to
> _ISR_Handler (from start.S). I tried this solution without success.
> You may have some idea where/when this red zone causes problems.
You probably need to
> Second, I discovered that an unusual (unaligned) exception occurs
> when using printf (it does not occur with printk). From the stack, I
> found that the problem happens in rtems_semaphore_obtain(), when it tries
> to access the_semaphore data through the (invalid) pointer returned
> from a call to _Objects_Get_isr_disable(). This exception
> only happens after DISPATCH_NEEDED is true and _ISR_Handler jumps to
> _Thread_Dispatch, which makes a successful context switch and runs the
> first task. The following is a snapshot of the output when
> encountering this problem.
What's the alignment of the task stack in the port? The stack may not be
properly aligned for the widest access of the or1k. 
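
A quick way to check this is to test the initial task stack pointer against the required alignment. The sketch below is only an illustration: the helper names and EXAMPLE_STACK_ALIGNMENT are hypothetical, and the value 8 is an assumption. The real requirement has to come from the or1k ABI and the port's CPU headers.

```c
#include <stdint.h>

/* ASSUMPTION: illustrative alignment only; take the real value from the
 * or1k ABI / port headers. Must be a power of two for the masks below. */
#define EXAMPLE_STACK_ALIGNMENT ((uintptr_t)8u)

/* Round a stack pointer down to the given power-of-two alignment
 * (downward-growing stack, so rounding down stays inside the area). */
static uintptr_t align_stack_down(uintptr_t sp, uintptr_t alignment)
{
    return sp & ~(alignment - 1u);
}

/* Return nonzero if sp is a multiple of the power-of-two alignment. */
static int is_stack_aligned(uintptr_t sp, uintptr_t alignment)
{
    return (sp & (alignment - 1u)) == 0u;
}
```

For example, an address ending in 0x...b4 is 4-byte aligned but not 8-byte aligned, so a double-word access through it would fault on a CPU that traps unaligned accesses.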
> "*** BEGIN OF TEST CLOCK TICK ***
> TA1  - rtems_clock_get_tod - 09:00:00   12/31/1988
> TA2  - rtems_clock_get_tod - 09:00:00   12/31/1988
> TA3  - rtems_clock_get_tod - 09:00:00   12/31/1988
> Fatal Error 263572 Halted"
Can you tell what the instruction is, and the address it is trying to
access?
> I set a breakpoint at a call to _Objects_Get_isr_disable() and
> continued until the call that returns the invalid Object pointer, and
> typed bt to get the following stack:
Another possibility is that the register/memory constraints on the
interrupt enable/disable code aren't right and that is confusing gcc. You
could be randomly clobbering registers anytime ISRs are disabled/enabled.
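
As a rough sketch of what to look for, the pattern below shows where a "memory" clobber goes in GCC-style inline assembly. It is host-side and does not touch real or1k registers: fake_status_reg, FAKE_IRQ_ENABLE_BIT, example_irq_disable() and example_irq_restore() are hypothetical stand-ins. In a real port the status-register access would itself sit inside the asm statement, with every register it reads or writes declared as an output, input, or clobber.

```c
#include <stdint.h>

/* Hypothetical stand-in for the CPU status register (host-side only). */
#define FAKE_IRQ_ENABLE_BIT 0x4u
static volatile uint32_t fake_status_reg = FAKE_IRQ_ENABLE_BIT;

/* Disable "interrupts" and return the previous status so that it can be
 * restored later. The empty asm with a "memory" clobber tells the
 * compiler not to cache memory values in registers across this point
 * and not to move memory accesses out of the critical section. */
static inline uint32_t example_irq_disable(void)
{
    uint32_t old = fake_status_reg;

    fake_status_reg = old & ~FAKE_IRQ_ENABLE_BIT;
    __asm__ volatile ("" ::: "memory");
    return old;
}

/* Restore the status saved by example_irq_disable(). The barrier comes
 * first so that accesses made inside the critical section are completed
 * before "interrupts" are turned back on. */
static inline void example_irq_restore(uint32_t level)
{
    __asm__ volatile ("" ::: "memory");
    fake_status_reg = level;
}
```

If the real enable/disable macros leave out an output operand, a clobbered register, or the "memory" clobber, gcc is free to keep live values in registers that the asm silently overwrites. That would match a failure that only shows up in more complex callers.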

Christian.. can you review that code?
> "
> #0  _Objects_Get_isr_disable (
>     information=0x3ba54 <_Semaphore_Information>,
>     id=436273156, location=0x406b4, level_p=0x406b0)
>     at ../../../../../../rtems/c/src/../../cpukit/score/src/objectgetisr.c:34
> #1  0x00014294 in _Semaphore_Get_interrupt_disable (
>     id=436273156, location=0x406b4, level=0x406b0)
>     at ../../cpukit/../../../or1k_or1ksim/lib/include/rtems/rtems/semimpl.h:196
> #2  0x000142e0 in rtems_semaphore_obtain (id=436273156,
>     option_set=0, timeout=0)
>     at ../../../../../../rtems/c/src/../../cpukit/rtems/src/semobtain.c:47
> #3  0x0000d648 in rtems_termios_write (arg=0x40730)
>     at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/termios.c:1099
> #4  0x00004380 in console_write (major=0, minor=0,
>     arg=0x40730)
>     at ../../../../../../../../rtems/c/src/lib/libbsp/or1k/or1ksim/../../shared/console_write.c:42
> #5  0x00031cc4 in rtems_io_write (major=0, minor=0,
>     argument=0x40730)
>     at ../../../../../../rtems/c/src/../../cpukit/sapi/src/iowrite.c:37
> #6  0x000305f0 in rtems_deviceio_write (iop=0x46a30,
>     buf=0x4088c, nbyte=1, major=0, minor=0)
>     at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/sup_fs_deviceio.c:109
> #7  0x0002fc70 in device_write (iop=0x46a30,
>     buffer=0x4088c, count=1)
>     at ../../../../../../rtems/c/src/../../cpukit/libfs/src/imfs/deviceio.c:90
> #8  0x00038f14 in write (fd=2, buffer=0x4088c, count=1)
>     at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/write.c:48
> #9  0x00038d54 in _write_r (ptr=0x3db40, fd=2,
>     buf=0x4088c, nbytes=1)
>     at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/write_r.c:41
> #10 0x00033198 in __swrite (ptr=0x3db40, cookie=0x3dd68,
>     buf=0x4088c "T\004\b\220", n=1)
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/stdio.c:97
> #11 0x000357c0 in __sfvwrite_r (ptr=0x3db40, fp=0x3dd68,
>     uio=0x40840)
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/fvwrite.c:99
> #12 0x000338a0 in __sprint_r (ptr=ptr@entry=0x3db40,
>     fp=fp@entry=0x3dd68, uio=uio@entry=0x40840)
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/vfprintf.c:437
> #13 0x000345e0 in __sprint_r (uio=0x40840, fp=0x3dd68,
>     ptr=0x3db40)
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/vfprintf.c:1776
> #14 _vfiprintf_r (data=0x3db40, fp=fp@entry=0x3dd68,
>     fmt0=fmt0@entry=0x392d1 "%c", ap=0x40930,
>     ap@entry=0x4092c)
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/vfprintf.c:1776
> #15 0x00032aec in fiprintf (fp=0x3dd68, fmt=0x392d1 "%c")
>     at ../../../../../gcc-4.8.2/newlib/libc/stdio/fiprintf.c:50
> #16 0x00002f28 in Test_task (unused=1)
>     at ../../../../../../../rtems/c/src/../../testsuites/samples/ticker/tasks.c:43
> #17 0x00031ddc in _Thread_Handler ()
>     at ../../../../../../rtems/c/src/../../cpukit/score/src/threadhandler.c:192
> #18 0x00031d64 in _User_extensions_Thread_exitted (
>     executing=0x3d92c)
>     at ../../cpukit/../../../or1k_or1ksim/lib/include/rtems/score/userextimpl.h:243
> Backtrace stopped: frame did not save the PC
> "
>
> This problem does not happen with printk, because none of this newlib
> code is called and consequently rtems_semaphore_obtain() is not
> called after context switches and/or _ISR_Handler.
printk is simple and may not be accessing memory in the same way. It also
may be simple enough that an issue with incorrect register constraints on
inline assembly isn't blowing it up.
>
>
> On Tue, Aug 19, 2014 at 7:52 PM, Gedare Bloom <gedare at rtems.org> wrote:
>> Submit the revised patch.
>>
>> -Gedare
>>
>> On Tue, Aug 19, 2014 at 1:49 PM, Hesham Moustafa
>> <heshamelmatary at gmail.com> wrote:
>>> Hi Gedare,
>>> Thanks for providing this solution. I will try to imitate these two files
>>> and run the test. The fixed patch for or1ksim is ready. Should I submit it,
>>> or wait until I check this solution and hopefully figure out what is
>>> wrong?
>>>
>>> On Aug 19, 2014 7:08 PM, "Gedare Bloom" <gedare at rtems.org> wrote:
>>>> Hi Hesham,
>>>>
>>>> I found this advice from Sebastian in our bugzilla related to another
>>>> arch (bfin) that has some context-switch problems:
>>>> "In order to test the exception code I would add the functions
>>>>
>>>> _CPU_Context_validate()
>>>> _CPU_Context_volatile_clobber()
>>>>
>>>> used in this test
>>>>
>>>> http://git.rtems.org/rtems/tree/testsuites/sptests/spcontext01/init.c
>>>>
>>>> For examples please have a look at the ARM, Nios 2 or PowerPC."
>>>>
>>>> You may like to try this out to debug your problem.
>>>> Gedare

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985


