a bug in multi-processors

FRANK frank1997 at gmail.com
Thu Mar 2 01:38:52 UTC 2006


Hi,
I think there may be something wrong in the function
_MPCI_receive_server (in mpci.c). I tested the following program: there
are two nodes. Node1 creates one task (task1) and Node2 creates two
tasks (task2 and task3). task1 creates a semaphore, and task2 and task3
compete for it.
Suppose task2's obtain request arrives first. Node1 fulfills it and
gives task2 the semaphore, and then Node1 handles task3's obtain
request. However, since the semaphore has already been given to task2,
task3's request cannot be fulfilled, and a deadlock results.
Because _MPCI_receive_server itself performs the remote request, it is
actually the _MPCI_receive_server of Node1 that is blocked.
So when task2's release request arrives, nobody is there to respond to
it, which means task2's release request can't be fulfilled, and so
task3's obtain request can never be fulfilled either.
Am I right? I hope you can give me a prompt reply. Thanks a lot.

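The test I ran has Node1 create one task, task1, and Node2 create two
tasks, task2 and task3. task1 creates a semaphore, and task2 and task3
compete for it. Suppose task2's obtain request arrives first: Node1
satisfies it and gives task2 the semaphore, then goes on to process
task3's obtain request. Because the semaphore has been given to task2,
task3's request cannot complete and the task blocks. But since it is
_MPCI_receive_server itself that executes remote requests, what is
actually blocked now is Node1's _MPCI_receive_server. So when task2's
release request arrives, nobody is there to receive the packet and parse
the request. In other words, task2's release request cannot be
satisfied, so task3's obtain request can never be satisfied either.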
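
To make the scenario above concrete, here is a minimal sketch in C of a
receive server that executes each remote request inline. It is a toy
model only, not the RTEMS source: the names packet, node_state and
run_receive_server are hypothetical, and it assumes (as suspected above)
that the server itself waits when an obtain request cannot be satisfied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request kinds; NOT the real RTEMS MPCI packet layout. */
typedef enum { REQ_OBTAIN, REQ_RELEASE } request_kind;

typedef struct {
    request_kind kind;
    int          task;      /* id of the requesting remote task (non-zero) */
} packet;

typedef struct {
    int holder;             /* task id holding the semaphore, 0 if free */
    int server_blocked;     /* 1 once the receive server itself is waiting */
    int blocked_on_task;    /* task whose obtain request stalled the server */
} node_state;

/*
 * Toy model of a receive server that performs every remote request in
 * its own context. Returns the number of packets fully handled before
 * the server blocked (or the whole queue length if it never blocked).
 */
size_t run_receive_server(node_state *node, const packet *queue, size_t count)
{
    size_t handled = 0;

    assert(node != NULL && (queue != NULL || count == 0));

    while (handled < count && !node->server_blocked) {
        const packet *p = &queue[handled];

        if (p->kind == REQ_OBTAIN) {
            if (node->holder == 0) {
                node->holder = p->task;          /* semaphore granted */
            } else {
                /* Assumption: the server waits on behalf of the remote
                 * task, so no later packet is ever read. */
                node->server_blocked  = 1;
                node->blocked_on_task = p->task;
                break;
            }
        } else if (node->holder == p->task) {
            node->holder = 0;                    /* semaphore released */
        }
        handled++;
    }
    return handled;
}
```

With the queue { obtain by task 2, obtain by task 3, release by task 2 },
run_receive_server returns 1: task 2 holds the semaphore, the server is
blocked on task 3's request, and the release packet is never processed.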

Frank

