Random lwIP Crashes in _POSIX_Mutex_Lock_support()
Sebastian Huber
sebastian.huber at embedded-brains.de
Wed Oct 21 13:00:36 UTC 2015
On 21/10/15 14:56, Isaac Gutekunst wrote:
> On 10/21/2015 08:24 AM, Sebastian Huber wrote:
>>
>>
>> On 21/10/15 14:13, Isaac Gutekunst wrote:
>>> Thanks for the reply.
>>>
>>> On 10/21/2015 01:50 AM, Sebastian Huber wrote:
>>>>
>>>>
>>>> On 20/10/15 16:02, Isaac Gutekunst wrote:
>> [...]
>>>>
>>>>>
>>>>> As far as I can tell this would only occur if the caller of
>>>>> pthread_mutex_lock was in a "bad"
>>>>> state. I don't believe it is in an interrupt context, and don't
>>>>> know what other bad states
>>>>> could exist.
>>>>
>>>> We have
>>>>
>>>> #define _CORE_mutex_Check_dispatch_for_seize(_wait) \
>>>> (!_Thread_Dispatch_is_enabled() \
>>>> && (_wait) \
>>>> && (_System_state_Get() >= SYSTEM_STATE_UP))
>>>>
>>>> What is the thread dispatch disable level and the system state at
>>>> this point?
>>>>
>>>> In case the thread dispatch disable level is not zero, then
>>>> something is probably broken in the
>>>> operating system code which is difficult to find. Could be a
>>>> general memory corruption problem
>>>> too. Which RTEMS version do you use?
>>>>
>>>
>>> The thread dispatch disable level is usually -1 or -2.
>>> (0xFFFFFFFF or 0xFFFFFFFE).
>>
>> A negative value is very bad, but easy to detect via manual
>> instrumentation (only a handful
>> of spots touch this variable) or hardware breakpoints/watchpoints.
>> Does the rest of
>> _Per_CPU_Information look all right?
>>
> It looks like it's only the thread_dispatch_disable_level that's broken.
>
> We'll go and grep for all the places it's touched, and
> look for something suspicious.
>
> The problem with watchpoints is they fire exceptionally often, and
> putting in a conditional watchpoint slows the code to a crawl, but
> that may be worth it.
>
> Here are some printouts of the relevant structs right after a crash:
>
> $4 = {
> cpu_per_cpu = {<No data fields>},
> isr_nest_level = 0,
> thread_dispatch_disable_level = 4294967295,
> executing = 0xc01585c8,
> heir = 0xc0154038,
> dispatch_necessary = true,
> time_of_last_context_switch = {
> sec = 2992,
> frac = 10737447432380511034
> },
> Stats = {<No data fields>}
> }
No, this doesn't look good. According to the stack trace you are in
thread context. However, we have executing != heir and
dispatch_necessary == true. This is itself a broken state. I guess
something is wrong with the interrupt level, so that the context switch
is blocked. On ARMv7-M the context switch is done via the system call
exception.
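
A minimal sketch of the kind of manual instrumentation discussed above,
in case it helps to narrow things down. It assumes a simplified stand-in
for the per-CPU structure, and every demo_* name is hypothetical; none of
this is the actual RTEMS code. It shows three things: 4294967295 is -1
read as an unsigned 32-bit value, a decrement can be guarded so that an
unbalanced enable reports its call site instead of wrapping around, and
the printed state can be checked for consistency.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the fields shown in the printout above;
 * this is not the RTEMS Per_CPU_Control definition. */
typedef struct {
  uint32_t    isr_nest_level;
  uint32_t    thread_dispatch_disable_level;
  const void *executing;
  const void *heir;
  bool        dispatch_necessary;
} demo_per_cpu;

/* An unsigned level above INT32_MAX means it was decremented below zero. */
static bool demo_level_underflowed(uint32_t level)
{
  return level > (uint32_t) INT32_MAX;
}

/* Instrumented decrement: report the call site instead of wrapping around. */
static bool demo_dispatch_enable(demo_per_cpu *cpu, const char *where)
{
  if (cpu->thread_dispatch_disable_level == 0) {
    fprintf(stderr, "unbalanced dispatch enable at %s\n", where);
    return false;
  }
  --cpu->thread_dispatch_disable_level;
  return true;
}

/* Suspicious if the level underflowed, or if we are in thread context
 * with dispatching enabled and a switch to a different heir is still
 * pending. */
static bool demo_state_suspicious(const demo_per_cpu *cpu)
{
  if (demo_level_underflowed(cpu->thread_dispatch_disable_level)) {
    return true;
  }
  return cpu->isr_nest_level == 0
    && cpu->thread_dispatch_disable_level == 0
    && cpu->dispatch_necessary
    && cpu->executing != cpu->heir;
}
```

In a real build the guarded decrement would be called from each spot
that touches the level, passing something like __func__ as the second
argument, so the first unbalanced enable is reported rather than the
later crash in the mutex code.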
--
Sebastian Huber, embedded brains GmbH
Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail : sebastian.huber at embedded-brains.de
PGP : Public key available on request.
This message is not a business communication within the meaning of the EHUG.