BUG? stack corruption in posix thread.
Till Straumann
strauman at slac.stanford.edu
Tue Jun 12 20:39:22 UTC 2012
I observe pretty reproducible crashes when using pthreads under rtems-4.9.
It seems that a function running in a new pthread context finds '0xdeaddead'
on the stack upon returning to its caller and jumps into the wild.
Apparently, the capture engine overwrites the task's stack with 0xdeaddead
*after* the task is already running.
I believe that 'pthread_create()' is the culprit. It creates a SCORE thread
and then calls
_Thread_Start( )
*without* disabling thread-dispatching.
However, _Thread_Start() marks the thread as 'runnable' *before* calling
user extensions (_Thread_Start() body):
{
  if ( _States_Is_dormant( the_thread->current_state ) )
  {
    the_thread->Start.entry_point = (Thread_Entry) entry_point;
    the_thread->Start.prototype = the_prototype;
    the_thread->Start.pointer_argument = pointer_argument;
    the_thread->Start.numeric_argument = numeric_argument;
    _Thread_Load_environment( the_thread );
    _Thread_Ready( the_thread );
    _User_extensions_Thread_start( the_thread );
    return true;
  }
  return false;
}
Therefore, could it not be that the thread is already scheduled *before*
user extensions are executed? In this scenario, the following race condition
could occur:
1. thread X calls pthread_create
2. _Thread_Start() marks new thread Y 'ready'
3. 'Y' is scheduled, calls some functions (pushing return addresses
onto its stack) and blocks
4. 'X' runs again and executes user extensions for 'Y'
5. capture engine's 'thread_start' extension fills 'Y's stack with
0xdeaddead
6. 'Y' is scheduled again, when popping a return address from the stack
it jumps to 0xdeaddead and crashes the system.
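
The interleaving in steps 1-6 can be modelled without RTEMS at all. The sketch below is only an illustration: `model_thread`, `model_start_extension()` and `run_scenario()` are hypothetical names, the "stack" is a small array, and the return address is an arbitrary constant. It shows that if the fill happens after 'Y' has pushed a return address, 'Y' later pops the fill pattern instead.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS  8
#define FILL_PATTERN 0xdeaddeadu

/* Hypothetical stand-in for a thread and its stack; not an RTEMS type. */
typedef struct {
    uint32_t stack[STACK_WORDS];
    int      sp;                 /* number of words currently pushed */
} model_thread;

static void push(model_thread *t, uint32_t word)
{
    assert(t->sp < STACK_WORDS);
    t->stack[t->sp++] = word;
}

static uint32_t pop(model_thread *t)
{
    assert(t->sp > 0);
    return t->stack[--t->sp];
}

/* Models a 'thread_start' extension that fills the whole stack.
 * It does not touch sp: the thread's saved context is unchanged. */
static void model_start_extension(model_thread *t)
{
    for (int i = 0; i < STACK_WORDS; i++)
        t->stack[i] = FILL_PATTERN;
}

/* Returns the address 'Y' ends up returning to.
 * extension_runs_first != 0 models the safe ordering,
 * extension_runs_first == 0 models the suspected race. */
uint32_t run_scenario(int extension_runs_first)
{
    model_thread   y = { {0}, 0 };
    const uint32_t return_address = 0x00401000u;  /* arbitrary value */

    if (extension_runs_first)
        model_start_extension(&y);  /* fill before 'Y' ever runs */

    /* Step 3: 'Y' runs, calls a function (pushes a return address), blocks. */
    push(&y, return_address);

    if (!extension_runs_first)
        model_start_extension(&y);  /* steps 4-5: 'X' fills Y's live stack */

    /* Step 6: 'Y' resumes and returns to whatever is on its stack. */
    return pop(&y);
}
```

In this model, `run_scenario(0)` yields the fill pattern and `run_scenario(1)` yields the original return address, which matches the crash described above.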
NOTES:
- other APIs (rtems, itron) *have* thread-dispatching disabled around
_Thread_Start()
- the current 'master' branch seems to still suffer from this
- I consider this a serious bug.
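
To illustrate why disabling thread-dispatching around _Thread_Start() would close the window, here is a toy model of a dispatch-disable counter. It uses no real RTEMS calls; every name in it (`dispatch_disable_level`, `model_thread_start()`, `start_is_racy()`, ...) is a hypothetical stand-in, and the real scheduler is of course more involved. The assumption is simply that a thread made ready while dispatching is disabled cannot run until dispatching is enabled again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these are RTEMS symbols. */
static int  dispatch_disable_level;
static bool dispatch_pending;        /* new thread is ready to run */
static bool extension_done;          /* start extension has been called */
static bool ran_before_extension;    /* the race we want to detect */

/* The new thread gets the CPU. */
static void model_run_new_thread(void)
{
    if (!extension_done)
        ran_before_extension = true;
}

/* Context switch to the new thread, unless dispatching is disabled. */
static void model_dispatch(void)
{
    if (dispatch_disable_level > 0)
        return;                      /* deferred until re-enabled */
    if (dispatch_pending) {
        dispatch_pending = false;
        model_run_new_thread();
    }
}

/* Same ordering as the _Thread_Start() body quoted above:
 * mark ready first, call the start extension afterwards. */
static void model_thread_start(void)
{
    dispatch_pending = true;         /* thread marked ready */
    model_dispatch();                /* possible preemption point */
    extension_done = true;           /* start extension runs */
}

/* Returns true if the new thread ran before its start extension. */
bool start_is_racy(bool protect)
{
    dispatch_disable_level = 0;
    dispatch_pending       = false;
    extension_done         = false;
    ran_before_extension   = false;

    if (protect)
        dispatch_disable_level++;    /* disable dispatching */

    model_thread_start();

    if (protect) {
        dispatch_disable_level--;    /* enable dispatching again */
        assert(dispatch_disable_level == 0);
        model_dispatch();            /* the deferred switch happens here */
    }
    return ran_before_extension;
}
```

In the model, `start_is_racy(false)` reports the race and `start_is_racy(true)` does not, because the switch to the new thread is deferred until after the extension has run.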
-- Till