BUG? stack corruption in posix thread.

Joel Sherrill joel.sherrill at OARcorp.com
Tue Jun 12 21:15:47 UTC 2012


On 06/12/2012 04:06 PM, Till Straumann wrote:
> On 06/12/2012 03:47 PM, Joel Sherrill wrote:
>> I agree with your analysis. Dispatching should be disabled
>> once all resources are allocated and before _Thread_Start
>> is called.
>>
>> But is it the capture thread or stack overflow checker
>> that is writing the pattern?
> It is the rtems_capture_start_task() extension callback
> which fills the stack with 0xdeaddead when a task is started.
OK.
>> I assume you have a patch for this and it is helping?
> I haven't tried yet. Mostly because I was unsure
> if the critical section in pthreadcreate.c should include
>    _Watchdog_Insert_ticks() or if that is not necessary.
I think it should be.  If it were a SCHED_SPORADIC server
then I don't think we would want a tick ISR to operate on
its replenishment information until it was really running.
> - T.
>> On 06/12/2012 03:39 PM, Till Straumann wrote:
>>> I observe pretty reproducible crashes when using pthreads under rtems-4.9.
>>>
>>> It seems that a function running in a new pthread context finds '0xdeaddead'
>>> on the stack upon returning to its caller and jumps into the wild.
>>>
>>> Apparently, the capture engine overwrites the task's stack with 0xdeaddead
>>> *after* the task is already running.
>>>
>>> I believe that 'pthread_create()' is the culprit. It creates a SCORE thread
>>> and then calls
>>>
>>> _Thread_Start( )
>>>
>>> *without* disabling thread-dispatching.
>>>
>>> However, _Thread_Start() marks the thread as 'runnable' *before* calling
>>> user extensions (_Thread_Start() body):
>>>
>>> {
>>>       if ( _States_Is_dormant( the_thread->current_state ) )
>>>       {
>>>
>>>         the_thread->Start.entry_point      = (Thread_Entry) entry_point;
>>>
>>>         the_thread->Start.prototype        = the_prototype;
>>>         the_thread->Start.pointer_argument = pointer_argument;
>>>         the_thread->Start.numeric_argument = numeric_argument;
>>>
>>>         _Thread_Load_environment( the_thread );
>>>
>>>         _Thread_Ready( the_thread );
>>>
>>>         _User_extensions_Thread_start( the_thread );
>>>
>>>         return true;
>>>       }
>>>
>>>       return false;
>>> }
>>>
>>> Therefore, could it not be that the thread is already scheduled *before*
>>> user extensions are executed? In this scenario, the following race condition
>>> could occur:
>>>
>>> 1. thread X calls pthread_create
>>> 2. _Thread_Start() marks new thread Y 'ready'
>>> 3. 'Y' is scheduled, calls stuff and blocks
>>> 4. 'X' runs again and executes user extensions for 'Y'
>>> 5. capture engine's 'thread_start' extension fills 'Y's stack with
>>> 0xdeaddead
>>> 6. 'Y' is scheduled again, when popping a return address from the stack
>>>        it jumps to 0xdeaddead and crashes the system.
>>>
>>> NOTES:
>>>      - other APIs (rtems, itron) *have* thread-dispatching disabled around
>>> _Thread_Start()
>>>      - the current 'master' branch seems to still suffer from this
>>>      - I consider this a serious bug.
>>>
>>> -- Till

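For anyone following the thread, the ordering problem described in steps 1-6 can be modelled with a small stand-alone sketch. This is a toy model only, not RTEMS code: every name in it (toy_thread, toy_dispatch, toy_thread_start, toy_create, the RETURN_ADDR value) is hypothetical, and "dispatching" is reduced to a single counter checked at one preemption point. It only shows why marking the thread ready before running the start extensions is harmless when dispatching is disabled and destructive when it is not.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_WORDS  8
#define FILL_PATTERN 0xdeaddeadu   /* pattern written by the start extension */
#define RETURN_ADDR  0x00401000u   /* arbitrary stand-in for a saved return address */

/* Toy stand-in for a thread: a stack plus two state flags (hypothetical). */
typedef struct {
  uint32_t stack[STACK_WORDS];
  bool     ready;
  bool     has_run;
} toy_thread;

/* Non-zero means "thread dispatching disabled" in this model. */
static int toy_dispatch_disable_level = 0;

/* Models new thread Y getting the CPU: it saves a return address, then blocks. */
static void toy_run_new_thread(toy_thread *y) {
  y->stack[0] = RETURN_ADDR;
  y->has_run  = true;
}

/* Models a preemption point: Y runs only if dispatching is enabled. */
static void toy_dispatch(toy_thread *y) {
  if (toy_dispatch_disable_level == 0 && y->ready && !y->has_run) {
    toy_run_new_thread(y);
  }
}

/* Models the capture engine's thread-start extension: fill the whole stack. */
static void toy_start_extension(toy_thread *y) {
  for (int i = 0; i < STACK_WORDS; i++) {
    y->stack[i] = FILL_PATTERN;
  }
}

/* Same ordering as the quoted _Thread_Start() body: ready first, extensions second. */
static void toy_thread_start(toy_thread *y) {
  y->ready = true;
  toy_dispatch(y);           /* Y may preempt X right after becoming ready */
  toy_start_extension(y);
}

/*
 * Models thread X creating Y. With protect == true, the start is wrapped in a
 * dispatch-disabled section. Returns the word Y would later pop as its
 * return address.
 */
static uint32_t toy_create(bool protect) {
  toy_thread y = { {0}, false, false };

  if (protect) {
    toy_dispatch_disable_level++;
  }
  toy_thread_start(&y);
  if (protect) {
    toy_dispatch_disable_level--;
  }
  toy_dispatch(&y);          /* Y runs here for the first time if it was held off */

  return y.stack[0];
}
```

In this model toy_create(false) returns the fill pattern, because Y ran before the extension overwrote its stack, while toy_create(true) returns the saved return address, because the extension finished before Y was allowed to run.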

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
     Support Available             (256) 722-9985




