Uniprocessor implementation of nested mutex problem of priority inversion.

Fri Sep 4 13:40:56 UTC 2015

Hello Saurabh,

I created a ticket to document some requirements for a mutex 
implementation. Its not complete and just a starting point:

https://devel.rtems.org/ticket/2412

On 17/08/15 07:45, Saurabh Gadia wrote:
> For every mutex held by a thread it is not feasible to have reference 
> to CORE_mutex_order_list of all mutex that it holds.*I feel that 
> current solution with CORE_mutex_order_list as argument is the best 
> choice to have as it is O(1) operation and needs no extra bookkeeping 
> for the thread.
> *
> In JPF model we have it as follows:
> public List<Mutex> mutexOrderList; //it is a linkedList which stores 
> acquired mutex objects in LIFO order.
>
> and to extract the index of mutex we call :
>
> public int getMutexIndex(Mutex obj)
> {
> return mutexOrderList.indexOf(obj);
> }
>
> Lets see it through example:
>
> Holder thread: H
> Mutex being acquired by executing thread which is acquired by holder 
> thread(H): M
> Executing thread: T
>
> From rtems we have following information:
>
> 1. In TCB:
> Every thread maintains a list of mutex acquired by it through out its 
> course in doubly circular linked list fashion :
>
> #ifdef __RTEMS_STRICT_ORDER_MUTEX__
>   /** This field is the head of queue of priority inheritance mutex
>    *  held by the thread.
>    */
>   Chain_Control            lock_mutex;
> #endif
>
> thread->lock_mutex
>
> 2. In Mutex Control Block:
>
> typedef struct{
>     /** This field is a chian of locked mutex by a thread,new mutex will
>      *  be added to the head of queue, and the mutex which will be 
> released
>      *  must be the head of queue.
>      */
>     Chain_Node                lock_queue;
>     /** This field is the priority of thread before locking this mutex
>      *
>      */
>     Priority_Control          priority_before;
>   }  CORE_mutex_order_list;
>
> Mutex->queue <CORE_mutex_order_list>
>
> When a thread acquires a mutex the mutex->queue.lock_queue node is 
> prepended to the holder's H->lock_mutex.
> This way we can traverse through all mutex's acquired by holder thread.
>
> Now when executing thread tries to acquire the mutex M which is 
> acquired by the holder thread(H) it wants to raise the recorded 
> priority before of all mutex's acquired by the holder after acquiring 
> mutex M. So the best way to do it in O(1) operation and without 
> storing any reference we do it in following manner in new solution:
>
> while(check!=head)
>   {
>     queue = RTEMS_CONTAINER_OF(check, CORE_mutex_order_list, lock_queue);
>     if(!(queue->priority_before > new_priority))
>     {
>       return PRIORITY_STATUS_NOT_CHANGED;
>     }
>     queue->priority_before = new_priority;
>     check = check->previous;
>   }
>
> This is the most efficient way we  can do that. We don't need to 
> change any design for this nor need to do any bookkeeping thing.
>
> Thanks,
>
> Saurabh Gadia
>
> On Sun, Aug 16, 2015 at 6:48 PM, Gedare Bloom <gedare at rtems.org 
> <mailto:gedare at rtems.org>> wrote:
>
>     We will rely on the linker should not pull the function in if unused.
>     So if it is only referenced by test code, this solution may be fine.
>
>     Gedare
>
>     On Sun, Aug 16, 2015 at 7:50 PM, Cyrille Artho
>     <cyrille.artho at gmail.com <mailto:cyrille.artho at gmail.com>> wrote:
>     > Hi all,
>     > As I wrote earlier, I think it makes sense to include validate but
>     > compile it conditionally. We want to use it for regression
>     testing but
>     > not in deployed code.
>     > Which macros do you use for this purpose?
>     >
>     > On Sat, Aug 15, 2015 at 2:50 PM, Cyrille Artho
>     <cyrille.artho at gmail.com <mailto:cyrille.artho at gmail.com>> wrote:
>     >> I will look at the code in detail on Monday, but we should keep in
>     >> mind that validate is only needed for testing. It does not have
>     to be
>     >> compiled into the final code otherwise.
>     >> So we do not have to be overly concerned with its performance but
>     >> instead we have to make sure that validate itself is not subject to
>     >> data races.
>     >>
>     >> On Sat, Aug 15, 2015 at 5:05 AM, Saurabh Gadia <gadia at usc.edu
>     <mailto:gadia at usc.edu>> wrote:
>     >>> Hi,
>     >>>
>     >>> I have implemented the validate method. following are the
>     links for it:
>     >>> github: https://github.com/saurabhgadia4/rtems/tree/nested-mutex
>     >>> commit:
>     >>>
>     https://github.com/saurabhgadia4/rtems/commit/e7f0f169c056076c46ef5ea17b0c38efe33fe576
>     >>>
>     >>> I am waiting on the decision of how to integrate call to this
>     >>> _Thread_Validate_Priority.
>     >>>
>     >>> Thanks,
>     >>>
>     >>> Saurabh Gadia
>     >>>
>     >>> On Fri, Aug 14, 2015 at 10:06 AM, Saurabh Gadia <gadia at usc.edu
>     <mailto:gadia at usc.edu>> wrote:
>     >>>>
>     >>>> And in respect of efficiency, we have to traverse through all
>     the mutex
>     >>>> held my the thread and do the checking. There is no other way
>     to confirm
>     >>>> that priority inversion has occurred or not. One way we can
>     do is have
>     >>>> default argument variable for core_mutex_surrender but then
>     there is change
>     >>>> in API call semantics. So kind of stuck between which way to
>     take.
>     >>>>
>     >>>> One way is that we can do lazy evaluation. following should
>     be the calling
>     >>>> pair:
>     >>>> task1()
>     >>>> {
>     >>>> ...
>     >>>> ...
>     >>>> rtems_semaphore_release()
>     >>>> validate()
>     >>>> }
>     >>>> these pairs should be intact so even if the task1 thread gets
>     preempted
>     >>>> after calling rtems_semaphore_release(), then other thread
>     will get control
>     >>>> of processor. When this task1 thread get the control back
>     then next call it
>     >>>> will do is validate() which is no harm to us as only task 1
>     thread can
>     >>>> release rest of its resources of we have owner release
>     binding. But there
>     >>>> can be one problem that is Till the task 1 thread get the
>     control back its
>     >>>> priority may be promoted by other task thread. So it won't be
>     100% accurate
>     >>>> validate test.
>     >>>>
>     >>>> Main problem still exists is: For uniprocessor(for this
>     project scope)
>     >>>> implementation how can we make sure that only after validate
>     method task1
>     >>>> can be preempted ( To acheive this behavior I guess we will
>     need to make
>     >>>> change to core_mutex_surrender).
>     >>>>
>     >>>> Thanks,
>     >>>>
>     >>>> Saurabh Gadia
>     >>>>
>     >>>> On Fri, Aug 14, 2015 at 9:43 AM, Saurabh Gadia <gadia at usc.edu
>     <mailto:gadia at usc.edu>> wrote:
>     >>>>>
>     >>>>> When a thread releases a semaphore/mutex we call this
>     validate method to
>     >>>>> make sure that there does not exists any priority inversion.
>     Following is
>     >>>>> the course of action that needs to be performed:
>     >>>>>
>     >>>>> 1. Validate method needs to be called within mutex_surrender
>     method
>     >>>>> because after releasing a mutex a new holder thread can get
>     scheduled and
>     >>>>> then we can't call validate method. We need to do call
>     validate before we
>     >>>>> enable interrupts in uniprocessor or dispatching of threads.
>     >>>>>
>     >>>>> 2. Functioning of validate method: input param -  executing
>     thread
>     >>>>> (thread which releases the mutex)
>     >>>>>
>     >>>>> 2.a) Go through the list of Mutex Chain i.e( Chain_Control
>     lock_mutex;)
>     >>>>> in TCB.
>     >>>>> 2.b) Extract mutex from the chain node linked list. This
>     gives us
>     >>>>> (the_mutex->queue.lock_queue) from which we will extract
>     mutex by using
>     >>>>> CONTAINER_OF() method.
>     >>>>>
>     >>>>> 2.c) Now access the thread wait queue of this mutex
>     >>>>> i.e(the_mutex->Wait_queue) to get the information about the
>     first waiting
>     >>>>> thread in this waitqueue by calling
>     >>>>> _Thread_queue_First_locked(&the_mutex->Wait_queue)
>     >>>>>
>     >>>>> 2.d) Now check whether
>     >>>>>
>     assert(releasingThread.current_priority<=first_locked_thread.currentPriority).
>     >>>>> If assertion fails then there is priority inversion problem
>     and we can break
>     >>>>> out of loop and return error status.
>     >>>>>
>     >>>>> 2.e) Repeat from 2.a till all acquired resources are visited
>     present in
>     >>>>> Chain_Control of releasing thread.
>     >>>>>
>     >>>>> So I am sceptical about how can I include this validate
>     method in the
>     >>>>> _CORE_mutex_Surrender(). We can have it as a separate call
>     from user level
>     >>>>> but then we need to make sure that the thread dispatch
>     mechanism is
>     >>>>> disabled. If not then whether including this validate method in
>     >>>>> _CORE_mutex_Surrender for only strict_order_mutex and
>     Priority inheritance
>     >>>>> attribute is feasible or not.
>     >>>>>
>     >>>>> Please guide me on this.
>     >>>>>
>     >>>>>
>     >>>>> Thanks,
>     >>>>>
>     >>>>> Saurabh Gadia
>     >>>>>
>     >>>>> On Thu, Aug 13, 2015 at 9:10 AM, Saurabh Gadia
>     <gadia at usc.edu <mailto:gadia at usc.edu>> wrote:
>     >>>>>>
>     >>>>>> That is how we were doing in JPF. The validate method was
>     triggered
>     >>>>>> after every release of mutex by any thread and we would
>     check for all the
>     >>>>>> waiting threads on mutex's held by the holder. And if it
>     finds a thread with
>     >>>>>> higher priority blocked then it would panic or give
>     assertion failure.
>     >>>>>>
>     >>>>>> Thanks,
>     >>>>>>
>     >>>>>> Saurabh Gadia
>     >>>>>>
>     >>>>>> On Thu, Aug 13, 2015 at 9:08 AM, Gedare Bloom
>     <gedare at rtems.org <mailto:gedare at rtems.org>> wrote:
>     >>>>>>>
>     >>>>>>> Thanks. Would it be possible for you to turn the failure
>     cases into
>     >>>>>>> real test failures? In other words, add some logic to
>     detect the
>     >>>>>>> priority inversion and abort the test?
>     >>>>>>>
>     >>>>>>> Gedare
>     >>>>>>>
>     >>>>>>> On Thu, Aug 13, 2015 at 12:05 PM, Saurabh Gadia
>     <gadia at usc.edu <mailto:gadia at usc.edu>> wrote:
>     >>>>>>> > Hi,
>     >>>>>>> >
>     >>>>>>> > Following is the result of spsem04 test without the patch:
>     >>>>>>> >
>     >>>>>>> > *** BEGIN OF TEST SPSEM 4 ***
>     >>>>>>> >
>     >>>>>>> > init: S0 created
>     >>>>>>> >
>     >>>>>>> > init: S1 created
>     >>>>>>> >
>     >>>>>>> > init: TA01 created with priority 36
>     >>>>>>> >
>     >>>>>>> > init: TA02 created with priority 34
>     >>>>>>> >
>     >>>>>>> > init: TA03 created with priority 32
>     >>>>>>> >
>     >>>>>>> > TA01: started with priority 36
>     >>>>>>> >
>     >>>>>>> > TA01: priority 36, holding S0
>     >>>>>>> >
>     >>>>>>> > TA02: started with priority 34
>     >>>>>>> >
>     >>>>>>> > TA02: priority 34, holding S1
>     >>>>>>> >
>     >>>>>>> > TA01: priority 34, holding S0 and now creating TA03
>     >>>>>>> >
>     >>>>>>> > TA03: started with priority 32
>     >>>>>>> >
>     >>>>>>> > TA01: priority 34 Releasing s0 (This is priority
>     inheritance problem
>     >>>>>>> > as TA02
>     >>>>>>> > waits on mutex held by TA01 and has higher priority than
>     TA01. TA03
>     >>>>>>> > just
>     >>>>>>> > promotes the priority TA02.)
>     >>>>>>> >
>     >>>>>>> > TA02: priority 32, holding S1,S0
>     >>>>>>> >
>     >>>>>>> > TA02: priority 32 before releasing S0
>     >>>>>>> >
>     >>>>>>> > TA02: priority 32 after releasing S0
>     >>>>>>> >
>     >>>>>>> > TA02: priority 32 before releasing S1
>     >>>>>>> >
>     >>>>>>> > TA03: priority 32, holding S1
>     >>>>>>> >
>     >>>>>>> > TA03: priority 32
>     >>>>>>> >
>     >>>>>>> > TA03: suspending
>     >>>>>>> >
>     >>>>>>> > TA02: priority 34 after releasing S1
>     >>>>>>> >
>     >>>>>>> > TA02: suspending
>     >>>>>>> >
>     >>>>>>> > TA01: priority 36
>     >>>>>>> >
>     >>>>>>> > TA01: exiting
>     >>>>>>> >
>     >>>>>>> > *** END OF TEST SPSEM 4 ***
>     >>>>>>> >
>     >>>>>>> > You can see the difference in highlighted portions of
>     both outputs.
>     >>>>>>> >
>     >>>>>>> > Thanks,
>     >>>>>>> >
>     >>>>>>> > Saurabh Gadia
>     >>>>>>> >
>     >>>>>>> > On Thu, Aug 13, 2015 at 8:17 AM, Saurabh Gadia
>     <gadia at usc.edu <mailto:gadia at usc.edu>> wrote:
>     >>>>>>> >>
>     >>>>>>> >> Ok. I will mail you back soon.
>     >>>>>>> >>
>     >>>>>>> >>
>     >>>>>>> >> On Thursday, August 13, 2015, Gedare Bloom
>     <gedare at rtems.org <mailto:gedare at rtems.org>> wrote:
>     >>>>>>> >>>
>     >>>>>>> >>> Saurabh,
>     >>>>>>> >>>
>     >>>>>>> >>> Please pull from rtems.git again, create a new branch from
>     >>>>>>> >>> 'master',
>     >>>>>>> >>> and apply your changes to the branch. It's too hard to
>     review code
>     >>>>>>> >>> that is not all by itself on a branch.
>     >>>>>>> >>>
>     >>>>>>> >>> Gedare
>     >>>>>>> >>>
>     >>>>>>> >>> On Thu, Aug 13, 2015 at 5:28 AM, Saurabh Gadia
>     <gadia at usc.edu <mailto:gadia at usc.edu>>
>     >>>>>>> >>> wrote:
>     >>>>>>> >>> > Hi,
>     >>>>>>> >>> >
>     >>>>>>> >>> > I have implemented uniprocessor model of nested
>     mutex problem in
>     >>>>>>> >>> > rtems.
>     >>>>>>> >>> > its
>     >>>>>>> >>> > still in basic form. I tried to multiplex it with
>     the existing
>     >>>>>>> >>> > solution
>     >>>>>>> >>> > but
>     >>>>>>> >>> > was finding hard time. To push ahead, I decided to
>     have separate
>     >>>>>>> >>> > functions
>     >>>>>>> >>> > for uniprocessor and SMP(kept default behavior) and
>     with your
>     >>>>>>> >>> > review
>     >>>>>>> >>> > comments will know what to do. Following is the link
>     for the git
>     >>>>>>> >>> > repo:
>     >>>>>>> >>> >
>     https://github.com/saurabhgadia4/rtems/commits/master and its JPF
>     >>>>>>> >>> > branch:
>     >>>>>>> >>> >
>     >>>>>>> >>> >
>     >>>>>>> >>> >
>     https://github.com/saurabhgadia4/lock-model/blob/uniproc-new1/rtems/Mutex.java
>     >>>>>>> >>> >
>     >>>>>>> >>> > I have also tested spsem01, 02, 03 test cases.
>     Following are the
>     >>>>>>> >>> > links
>     >>>>>>> >>> > for
>     >>>>>>> >>> > the test case results which states output before
>     solution and
>     >>>>>>> >>> > after
>     >>>>>>> >>> > applying
>     >>>>>>> >>> > the solution. I am still not getting whether my code
>     is passing
>     >>>>>>> >>> > spsem03
>     >>>>>>> >>> > test
>     >>>>>>> >>> > or not. How can I verify that?
>     >>>>>>> >>> >
>     >>>>>>> >>> > Test Case Link:
>     >>>>>>> >>> >
>     >>>>>>> >>> >
>     >>>>>>> >>> >
>     https://drive.google.com/folderview?id=0B44HRKVuGCkFfnFDVmxqQzZZUzljNUg4YmVPZmEybEp2Q0NNclpvS2FvemZ4Tm5Xa19nemM&usp=sharing
>     >>>>>>> >>> >
>     >>>>>>> >>> > Thanks,
>     >>>>>>> >>> >
>     >>>>>>> >>> > Saurabh Gadia
>     >>>>>>> >>
>     >>>>>>> >>
>     >>>>>>> >>
>     >>>>>>> >> --
>     >>>>>>> >> Thanks,
>     >>>>>>> >>
>     >>>>>>> >> Saurabh Gadia
>     >>>>>>> >>
>     >>>>>>> >
>     >>>>>>
>     >>>>>>
>     >>>>>
>     >>>>
>     >>>
>     >>
>     >>
>     >>
>     >> --
>     >> Regards,
>     >> Cyrille Artho
>     >
>     >
>     >
>     > --
>     > Regards,
>     > Cyrille Artho
>
>

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.