Multiprocessor problems

Daniel Hellstrom daniel at gaisler.com
Thu Jun 18 12:14:02 UTC 2009


Hi,

On a similar MP topic, all Init tasks have the same name "UI1"
regardless of CPU node. I have seen in the mptests that
CONFIGURE_INIT_TASK_ATTRIBUTES is set to RTEMS_GLOBAL, but since every
node's Init task has the same name, rtems_task_ident() cannot be used
to look up the ID of a remote node's Init task. Perhaps the Init task
name could be {'U','I','0'+nodeid,'\0'} instead?
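
For illustration, with per-node names the remote Init task could then be
looked up along these lines (just a sketch; lookup_remote_init is a
made-up helper, not an RTEMS call):

  #include <rtems.h>

  /* Return the ID of the Init task on the given node, or 0 on failure. */
  rtems_id lookup_remote_init( uint32_t remote_node )
  {
    rtems_id          remote_init = 0;
    rtems_status_code sc;

    sc = rtems_task_ident(
      rtems_build_name( 'U', 'I', '0' + remote_node, '\0' ),
      RTEMS_SEARCH_ALL_NODES,
      &remote_init
    );
    return sc == RTEMS_SUCCESSFUL ? remote_init : 0;
  }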


GRMON thread info output from the two LEON3 CPUs, CPU0
[0x40000000-0x43FFFFFF] and CPU1 [0x44000000-0x47FFFFFF]:

grlib> sym rtems-mp1
read 1456 symbols
entry point: 0x40000000
grlib> thread info

  Name | Type     | Id         | Prio | Time (h:m:s)  | Entry point             | PC                                           | State
---------------------------------------------------------------------------------------------------------------------------------------
  Int. | internal | 0x09010001 |  255 |      0.000000 | _BSP_Thread_Idle_body   | 0x400030a4 _BSP_Thread_Idle_body + 0x0       | READY
---------------------------------------------------------------------------------------------------------------------------------------
  Int. | classic  | 0x09010002 |    0 |      0.005648 | _MPCI_Receive_server    | 0x4000c66c _Thread_Dispatch + 0xd8           | Wsem
---------------------------------------------------------------------------------------------------------------------------------------
* UI1  | classic  | 0x0a010001 |    1 |      0.000000 | Init                    | 0x40001368 Init + 0x4                        | READY
---------------------------------------------------------------------------------------------------------------------------------------

grlib> sym rtems-mp2
read 1456 symbols
entry point: 0x44000000
grlib> thread info

  Name | Type     | Id         | Prio | Time (h:m:s)  | Entry point             | PC                                           | State
---------------------------------------------------------------------------------------------------------------------------------------
  Int. | internal | 0x09020001 |  255 |      0.000000 | _BSP_Thread_Idle_body   | 0x440030a4 _BSP_Thread_Idle_body + 0x0       | READY
---------------------------------------------------------------------------------------------------------------------------------------
  Int. | classic  | 0x09020002 |    0 |      0.005661 | _MPCI_Receive_server    | 0x4400c66c _Thread_Dispatch + 0xd8           | Wsem
---------------------------------------------------------------------------------------------------------------------------------------
* UI1  | classic  | 0x0a020001 |    1 |      0.000000 | Init                    | 0x40001368 _RAM_SIZE + 0x3c00136c            | READY
---------------------------------------------------------------------------------------------------------------------------------------


Daniel



Daniel Hellstrom wrote:

>Hi,
>
>The problem seems to be the initialization of _Objects_Local_node in
>multiprocessor-enabled kernels. Because _MPCI_Initialization() sets
>_Objects_Local_node only after the first semaphores and tasks have been
>created, the IDs assigned to those objects carry the wrong node number.
>
>In single-processor systems _Objects_Local_node is a constant set to 1,
>but in multiprocessor systems it is initially zero and is only later
>initialized by _MPCI_Initialization().
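>
>Roughly, the two declarations differ along these lines (a simplified
>paraphrase from memory, not an exact copy of the score object header):
>
>#if defined(RTEMS_MULTIPROCESSING)
>  /* MP: zero-initialized, assigned later by _MPCI_Initialization() */
>  SCORE_EXTERN uint16_t _Objects_Local_node;
>#else
>  /* Single processor: fixed at node 1 */
>  #define _Objects_Local_node 1
>#endif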
>
>The problem you experience is probably the same problem I ran into this
>week when running on a dual-core SPARC/LEON3 system. Two tasks are
>created before the node number is set up correctly. See the GRMON
>printout below, taken after breaking at Init():
>
>grmon> thread info
>
>  Name | Type     | Id         | Prio | Time (h:m:s)  | Entry point             | PC                                           | State
>---------------------------------------------------------------------------------------------------------------------------------------
>  Int. | internal | 0x09000001 |  255 |      0.000000 | ??                      | 0x0                                          | READY
>---------------------------------------------------------------------------------------------------------------------------------------
>  Int. | classic  | 0x09000002 |    0 |      0.000000 | ??                      | 0x0                                          | Wsem
>---------------------------------------------------------------------------------------------------------------------------------------
>* UI1  | classic  | 0x0a010001 |    1 |      0.000000 | RAM_END                 | 0x40001368 Init + 0x4                        | READY
>---------------------------------------------------------------------------------------------------------------------------------------
>
>As you can see, the node number is 0 rather than 1 or 2 in the ID field
>of the first two (internal) threads.
>
>The bug appears when the first MPCI packet is received on the target
>node: the ISR calls _MPCI_Announce, which tries to release a semaphore,
>the blocked thread is mistakenly treated as global, and the system
>crashes. The function that decides whether an object is global or local
>simply checks whether the node numbers match; it does not treat a node
>number of zero specially.
>
>RTEMS_INLINE_ROUTINE bool _Objects_Is_local_node(
>  uint32_t   node
>)
>{
>  return ( node == _Objects_Local_node );
>}
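>
>To make the failure mode concrete, here is a small standalone sketch of
>that check (illustrative only, not RTEMS source):
>
>#include <stdbool.h>
>#include <stdint.h>
>#include <stdio.h>
>
>static uint16_t _Objects_Local_node = 0;   /* value during early init */
>
>static bool _Objects_Is_local_node( uint32_t node )
>{
>  return node == _Objects_Local_node;
>}
>
>int main( void )
>{
>  /* Objects created early get node 0 baked into their IDs. */
>  uint32_t node_in_early_id = 0;
>
>  /* Later, _MPCI_Initialization() sets the real node number. */
>  _Objects_Local_node = 1;
>
>  /* Prints "remote": the early object is now misclassified as global. */
>  printf( "%s\n",
>    _Objects_Is_local_node( node_in_early_id ) ? "local" : "remote" );
>  return 0;
>}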
>
>To test that this theory holds, I changed the declaration of
>_Objects_Local_node from SCORE_EXTERN to extern and defined it in my
>project, statically initialized to the node number. The LEON3 dual-core
>system now works, and I have successfully had semaphores and tasks
>interacting between the two nodes.
>
>uint16_t _Objects_Local_node = CONFIGURE_MP_NODE_NUMBER;
>
>
>
>I suggest that _Objects_Local_node be initialized earlier in the system
>initialization sequence.
>
>Regards,
>Daniel Hellstrom
>
>
>
>Joel Sherrill wrote:
>
>>Roger Dahlkvist wrote:
>>
>>>Hi,
>>>
>>>I'm using a timer ISR polling method to check for new messages from other nodes. Unfortunately the system crashes as soon as rtems_multiprocessing_announce is called.
>>
>>There are no interrupts enabled until the initialization task is switched
>>in.
>>
>>I have wondered if it wouldn't make sense to have the MP initialization
>>synchronization done either explicitly by the application (like the
>>initialization of TCP/IP) or implicitly by the Init thread, like C++
>>global constructors.
>>
>>You can try moving this code from exinit.c to threadhandler.c and
>>protecting it somehow from being executed more than once.
>>
>> #if defined(RTEMS_MULTIPROCESSING)
>>   if ( _System_state_Is_multiprocessing ) {
>>     _MPCI_Initialization();
>>     _MPCI_Internal_packets_Send_process_packet(
>>       MPCI_PACKETS_SYSTEM_VERIFY
>>     );
>>   }
>> #endif
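>>
>>For example, a guard along these lines would keep it from running more
>>than once (just a sketch; the flag name is made up):
>>
>> #if defined(RTEMS_MULTIPROCESSING)
>>   static bool _MP_init_done = false;
>>
>>   if ( !_MP_init_done && _System_state_Is_multiprocessing ) {
>>     _MP_init_done = true;
>>     _MPCI_Initialization();
>>     _MPCI_Internal_packets_Send_process_packet(
>>       MPCI_PACKETS_SYSTEM_VERIFY
>>     );
>>   }
>> #endif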
>>
>>Then you will at least be able to get your interrupts and call MP announce
>>to complete system initialization.
>>
>>>However, rtems_multiprocessing_announce works just fine if it's called right after the initialization phase, before the initialization task is started. That's really strange.
>>>
>>>For example, if I make one node initialize and start faster than the other node (using fewer drivers etc.), I'll be able to create global objects. As long as the other node has not yet started its initialization task, the message is received and the global objects table is updated, so the objects can be identified later on. But I can't use them, since further calls to rtems_multiprocessing_announce will fail.
>>>
>>>At this point I feel like I have tested just about everything, with no luck. It's urgent that I get MP to work properly. 
>>>I'm using Nios II processors and I have defined my own MPCI routines. I'm confident that they work properly and I have verified that the system crashes before they are even invoked.
>>>
>>>Is there anyone with MP experience who might have a clue of what's causing my problems? Any help is MUCH appreciated.
>>>
>>>//Roger
>>>
>



