a bug

Victor V. Vengerov Victor.Vengerov at oktetlabs.ru
Thu Mar 2 14:48:59 UTC 2006


OK, I dealt with MPCI in RTEMS a long time ago, so it is possible that I am 
wrong in the details.

In my understanding, the following sequence of events happens:
1. Task 1 creates the semaphore.
  - An MPCI SEMAPHORE_MP_ANNOUNCE_CREATE message is sent to node 2 to 
announce the semaphore creation.
2. The MPCI task at node 2 processes this message.
3. Task 2 obtains the semaphore.
  - An MPCI SEMAPHORE_MP_OBTAIN_REQUEST message is sent to node 1 to get the 
semaphore.
  - Task 2 blocks waiting for the answer.
4. The MPCI task at node 1 receives the obtain message and processes it. As 
a result, it gets the semaphore and sends a SEMAPHORE_MP_OBTAIN_RESPONSE 
message to node 2.
5. The MPCI task at node 2 receives the response message and unblocks task 2. 
The semaphore is now owned by task 2.
6. Task 3 obtains the semaphore.
  - An MPCI SEMAPHORE_MP_OBTAIN_REQUEST message is sent to node 1 to get the 
semaphore.
  - Task 3 blocks waiting for the answer.
7. Task 2 releases the semaphore.
  - An MPCI SEMAPHORE_MP_RELEASE_REQUEST message is sent to node 1.
  - Task 2 continues its execution.
8. The MPCI task at node 1 processes SEMAPHORE_MP_RELEASE_REQUEST.
  - Because task 3 is waiting for the semaphore, it sends 
SEMAPHORE_MP_OBTAIN_RESPONSE to node 2 to resume task 3.
9. The MPCI task at node 2 processes the SEMAPHORE_MP_OBTAIN_RESPONSE message.
  - Task 3 is resumed and owns the semaphore.

No deadlock should happen.
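
To make the sequence above concrete, here is a small toy model in C of how 
the MPCI task at node 1 could handle the three requests. This is only a 
sketch and not the RTEMS implementation: toy_semaphore, message, 
server_process and replay_scenario are names invented for this example, and 
it assumes that an obtain request for a held semaphore is queued instead of 
blocking the server.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical message kinds, named after the MPCI messages in the steps. */
typedef enum { OBTAIN_REQUEST, RELEASE_REQUEST } msg_kind;

typedef struct {
    msg_kind kind;
    int task;                   /* id of the remote task that sent it */
} message;

#define MAX_WAITERS 8

/* Toy model of the semaphore as seen on node 1 (not the RTEMS structure). */
typedef struct {
    int holder;                 /* task that owns the semaphore, 0 if free */
    int waiters[MAX_WAITERS];   /* FIFO of tasks waiting for the semaphore */
    int head, tail;
} toy_semaphore;

/* Process one message without ever blocking the server itself.
 * Returns the task that gets an OBTAIN_RESPONSE, or 0 if none is sent. */
static int server_process(toy_semaphore *s, message m)
{
    if (m.kind == OBTAIN_REQUEST) {
        if (s->holder == 0) {           /* free: grant immediately */
            s->holder = m.task;
            return m.task;
        }
        assert(s->tail < MAX_WAITERS);  /* assumption: small fixed queue */
        s->waiters[s->tail++] = m.task; /* held: remember the requester */
        return 0;
    }
    /* RELEASE_REQUEST: ignore it unless it comes from the current holder. */
    if (s->holder != m.task)
        return 0;
    s->holder = 0;
    if (s->head < s->tail) {            /* hand over to the oldest waiter */
        s->holder = s->waiters[s->head++];
        return s->holder;
    }
    return 0;
}

/* Replay steps 3, 6 and 7 in arrival order; returns the final holder. */
int replay_scenario(void)
{
    toy_semaphore sem = { 0 };
    const message inbox[] = {
        { OBTAIN_REQUEST, 2 },   /* step 3: task 2 obtains */
        { OBTAIN_REQUEST, 3 },   /* step 6: task 3 obtains */
        { RELEASE_REQUEST, 2 },  /* step 7: task 2 releases */
    };
    for (size_t i = 0; i < sizeof inbox / sizeof inbox[0]; i++) {
        int resumed = server_process(&sem, inbox[i]);
        if (resumed != 0)
            printf("message %zu: OBTAIN_RESPONSE sent, task %d resumed\n",
                   i + 1, resumed);
        else
            printf("message %zu: no response sent yet\n", i + 1);
    }
    return sem.holder;
}
```

Calling replay_scenario() from a main function replays the three messages. 
In this model the obtain request of task 3 is only queued, so the server 
stays free to process the later release request of task 2, and task 3 ends 
up owning the semaphore.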


FRANK wrote:

>I think there may be something wrong in the function
>_MPCI_receive_server (in mpci.c). I have tested the following program. I
>made two nodes. Node1 creates one task that creates a semaphore, and
>Node2 creates two tasks, task2 and task3. Task2 obtains the semaphore
>and then releases it. Before task2 releases the semaphore, task3 tries to
>obtain the semaphore. As a result, a deadlock occurs. The reason, I
>think, is that _MPCI_receive_server never receives a new request until
>it has finished the latest one. In this test, the obtain
>request of task2 can be satisfied immediately, and the obtain request
>of task3 can be satisfied only after the release request of task2
>has been satisfied. But the obtain request of task3 arrives earlier than
>the release request of task2, so until _MPCI_receive_server satisfies
>the obtain request of task3, it will never respond to the release
>request of task2, and this causes a deadlock.
>Am I right? I hope you can give me a prompt reply. Thanks a lot.

Victor Vengerov
OKTET Labs, St.-Petersburg, Russia   Web: www.oktetlabs.ru
Phone +7 812 4286709(office) +7 812 9389372(mobile) +7 812 4281653(home)
