[PATCH 2/4] score: SMP initialization and shutdown changes

Fri Feb 21 07:48:20 UTC 2014

On 2014-02-21 01:31, Chris Johns wrote:
> On 21/02/2014 3:31 am, Sebastian Huber wrote:
>> Hello Chris,
>>
>> On 2014-02-20 03:50, Chris Johns wrote:
>>> On 20/02/2014 12:42 am, Sebastian Huber wrote:
>>>> +/**
>>>> + * @brief State of a processor.
>>>> + *
>>>> + * @dot
>>>> + * digraph states {
>>>> + *   bi [label="PER_CPU_STATE_BEFORE_INITIALIZATION"];
>>>> + *   rsm [label="PER_CPU_STATE_READY_TO_START_MULTITASKING"];
>>>> + *   sm [label="PER_CPU_STATE_START_MULTITASKING"];
>>>> + *   ds [label="PER_CPU_STATE_DO_SHUTDOWN"];
>>>> + *   u [label="PER_CPU_STATE_UP"];
>>>> + *   s [label="PER_CPU_STATE_SHUTDOWN"];
>>>> + *   bi -> rsm [label="secondary processor\ncompleted initialization"];
>>>> + *   bi -> u [label="main processor\nstarts multitasking"];
>>>> + *   rsm -> sm [label="main processor\ncompleted initialization"];
>>>> + *   rsm -> ds [label="a fatal error occurred"];
>>>> + *   ds -> s [label="do shutdown\nstate observed"];
>>>> + *   sm -> u [label="secondary processor\nstarts multitasking"];
>>>> + *   u -> s [label="shutdown initiated"];
>>>> + * }
>>>> + * @enddot
>>>
>>> I do not see sm to ds if the main core goes to s after going from bi
>>> to u ?
>>>
>>> I also see u to s and not to ds; why as the state is called "do
>>> shutdown" ?
>>> Should this state be PER_CPU_STATE_SHUTTING_DOWN ?
>>
>> I had to change the procedure considerably since I noticed unsolvable
>> problems with the previous approach based only on atomic read/write
>> operations. Attached is the new state diagram.
>>
>
> I do not follow this picture. I assume there is one state machine per CPU
> and so I do not see where the state "starting multitasking" is. I see a state
> called "request start multitasking" then "up" and I am wondering what the state
> is after the request has happened and before up.

The application defines a maximum count of processors (NP).  The configuration 
will then define the _Per_CPU_Information[NP] table.  Each table entry has one 
per-CPU state entry.  Since this table is in the BSS section, all states start 
with PER_CPU_STATE_INITIAL.
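
As a minimal sketch of why the BSS placement matters: if we assume that
PER_CPU_STATE_INITIAL is the first enumerator (value 0), then a table with
static storage duration and no initializer starts with every entry in this
state.  The names sketch_per_cpu, sketch_per_cpu_table and SKETCH_NP below are
simplified stand-ins for illustration and not the real RTEMS definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the initial state has the value 0; further states omitted. */
typedef enum {
  PER_CPU_STATE_INITIAL = 0
} sketch_per_cpu_state;

/* Hypothetical stand-in for the application-defined maximum (NP). */
#define SKETCH_NP 4

typedef struct {
  sketch_per_cpu_state state;
} sketch_per_cpu;

/* No initializer: static storage is zero-initialized (BSS section). */
static sketch_per_cpu sketch_per_cpu_table[ SKETCH_NP ];

/* Returns the state of one table entry. */
static sketch_per_cpu_state sketch_get_state( uint32_t cpu_index )
{
  assert( cpu_index < SKETCH_NP );
  return sketch_per_cpu_table[ cpu_index ].state;
}

/* Returns true if every entry still has the initial state. */
static bool sketch_all_states_initial( void )
{
  for ( uint32_t i = 0; i < SKETCH_NP; ++i ) {
    if ( sketch_get_state( i ) != PER_CPU_STATE_INITIAL ) {
      return false;
    }
  }

  return true;
}
```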

The boot processor calls boot_card() and all other processors do what they 
want, but they must wait until the boot processor gives the go (explained later).

The boot processor eventually calls:

   /**
    * @brief Performs CPU specific SMP initialization in the context of the boot
    * processor.
    *
    * This function is invoked on the boot processor by RTEMS during
    * initialization.  All interrupt stacks are allocated at this point in case
    * the CPU port allocates the interrupt stacks.
    *
    * The CPU port should start secondary processors now.
    *
    * @param[in] configured_cpu_count The count of processors requested by the
    * application configuration.
    *
    * @return The count of processors available for the application in the system.
    * This value is less than or equal to the configured count of processors.
    */
   uint32_t _CPU_SMP_Initialize( uint32_t configured_cpu_count );

So here the CPU port (or the BSP in most cases) has the chance to reduce the
configured count of processors to the count actually available (AP).  We have
AP <= NP.  In case this function returns "you have three processors", then
these three processors MUST start properly or terminate the system.  If some
processors are not always available or are otherwise unreliable, this must be
dealt with in _CPU_SMP_Initialize().
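
To illustrate the AP <= NP contract, here is a rough sketch of how a CPU port
or BSP might reduce the configured count.  Everything in it is hypothetical:
sketch_hw_processor_count and sketch_start_processor() stand in for whatever
the hardware offers, and the function is deliberately not named
_CPU_SMP_Initialize().  It also assumes, only for brevity, that the secondary
processors have the indices 1 and up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: count of processors reported by the hardware. */
static uint32_t sketch_hw_processor_count = 2;

/* Hypothetical: try to start a secondary processor, true on success. */
static bool sketch_start_processor( uint32_t cpu_index )
{
  return cpu_index < sketch_hw_processor_count;
}

/* Returns the available count (AP), never more than the configured count. */
static uint32_t sketch_cpu_smp_initialize( uint32_t configured_cpu_count )
{
  uint32_t available;

  if ( configured_cpu_count == 0 ) {
    return 0;
  }

  /* The calling (boot) processor is always available. */
  available = 1;

  for ( uint32_t cpu = 1; cpu < configured_cpu_count; ++cpu ) {
    if ( !sketch_start_processor( cpu ) ) {
      /* Unreliable processors are dealt with here, not later. */
      break;
    }

    ++available;
  }

  return available;
}
```

Every processor counted in the return value is one that must then reach the
ready state or terminate the system.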

System termination can happen at any time on any processor.  So you can change
from every state into PER_CPU_STATE_SHUTDOWN.  This ability is the core of this 
change set.  In the previous implementation this was not guaranteed.
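
The rule "shutdown is reachable from every state" can be written down as a
small transition check.  This is only a sketch: it assumes the normal start-up
order of a secondary processor described in this message (INITIAL, then
READY_TO_START_MULTITASKING, then REQUEST_START_MULTITASKING, then UP).  The
function name is made up, and the path of the boot processor is not covered.

```c
#include <stdbool.h>

typedef enum {
  PER_CPU_STATE_INITIAL,
  PER_CPU_STATE_READY_TO_START_MULTITASKING,
  PER_CPU_STATE_REQUEST_START_MULTITASKING,
  PER_CPU_STATE_UP,
  PER_CPU_STATE_SHUTDOWN
} sketch_state;

/* Hypothetical helper: is the state change from -> to allowed? */
static bool sketch_transition_allowed( sketch_state from, sketch_state to )
{
  /* Shutdown can be entered from every state. */
  if ( to == PER_CPU_STATE_SHUTDOWN ) {
    return true;
  }

  /* Otherwise only the next step of the normal start-up is allowed. */
  switch ( from ) {
    case PER_CPU_STATE_INITIAL:
      return to == PER_CPU_STATE_READY_TO_START_MULTITASKING;
    case PER_CPU_STATE_READY_TO_START_MULTITASKING:
      return to == PER_CPU_STATE_REQUEST_START_MULTITASKING;
    case PER_CPU_STATE_REQUEST_START_MULTITASKING:
      return to == PER_CPU_STATE_UP;
    default:
      /* No way out of UP or SHUTDOWN except into SHUTDOWN. */
      return false;
  }
}
```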

Now let's look at the normal start-up (no shutdown).  The next state after 
PER_CPU_STATE_INITIAL is PER_CPU_STATE_READY_TO_START_MULTITASKING.

   /**
    * @brief Processor is ready to start multitasking.
    *
    * The secondary processor performed its basic initialization and is ready to
    * receive inter-processor interrupts.  Interrupt delivery must be disabled
    * in this state, but requested inter-processor interrupts must be recorded
    * and must be delivered once the secondary processor enables interrupts for
    * the first time.  The boot processor will wait for all secondary processors
    * to change into this state.  In case a secondary processor does not reach
    * this state the system will not start.  The secondary processors wait now
    * for a change into the PER_CPU_STATE_REQUEST_START_MULTITASKING state set
    * by the boot processor once all secondary processors reached the
    * PER_CPU_STATE_READY_TO_START_MULTITASKING state.
    */
   PER_CPU_STATE_READY_TO_START_MULTITASKING,

The key point for a per-CPU state and not a global state is that every
processor must perform some initialization steps to set up the "I can receive
inter-processor interrupts (IPI)" state.  Before IPIs are possible, the only
way to communicate between processors is through spin variables (e.g. these
per-CPU state variables).

The PER_CPU_STATE_READY_TO_START_MULTITASKING state is a synchronization barrier.

The boot processor will then set the state to 
PER_CPU_STATE_REQUEST_START_MULTITASKING on all processors configured and 
available (AP).  Once the secondary processor observes this state change (it 
spins on its state variable), it will go into the PER_CPU_STATE_UP state and 
perform a context switch to the first thread.
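
The handshake described above can be sketched with C11 atomics.  This is not
the code of the patch: the SKETCH_ names, the table size and the step-wise
functions (one poll per call instead of a real spin loop) are assumptions made
so that the sequence can be followed and tested on a single thread.  How the
boot processor tracks its own state is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum {
  SKETCH_INITIAL, /* value 0, matches the zero-initialized table */
  SKETCH_READY,
  SKETCH_REQUEST_START,
  SKETCH_UP
};

#define SKETCH_CPU_MAX 4

/* One spin variable per processor, zero-initialized. */
static atomic_int sketch_cpu_state[ SKETCH_CPU_MAX ];

/* Secondary processor: basic initialization done, announce readiness. */
static void sketch_secondary_ready( uint32_t cpu )
{
  atomic_store( &sketch_cpu_state[ cpu ], SKETCH_READY );
}

/*
 * Secondary processor: one poll of its own state variable.  Real code would
 * spin here.  Returns true once the change into the up state happened.
 */
static bool sketch_secondary_poll( uint32_t cpu )
{
  int expected = SKETCH_REQUEST_START;

  /* Only REQUEST_START -> UP; a concurrent state change is not overwritten. */
  return atomic_compare_exchange_strong(
    &sketch_cpu_state[ cpu ],
    &expected,
    SKETCH_UP
  );
}

/*
 * Boot processor: one pass of the barrier.  If all other available
 * processors are ready, request the start of multitasking on each of them.
 */
static bool sketch_boot_try_release( uint32_t boot_cpu, uint32_t available )
{
  for ( uint32_t cpu = 0; cpu < available; ++cpu ) {
    if (
      cpu != boot_cpu
        && atomic_load( &sketch_cpu_state[ cpu ] ) != SKETCH_READY
    ) {
      return false;
    }
  }

  for ( uint32_t cpu = 0; cpu < available; ++cpu ) {
    if ( cpu != boot_cpu ) {
      atomic_store( &sketch_cpu_state[ cpu ], SKETCH_REQUEST_START );
    }
  }

  return true;
}
```

In a real system the boot processor would call the release check in a loop and
each secondary processor would poll in a loop before it performs the context
switch to its first thread.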

>
>>>
>>> I do not see why we have main and secondary processors ?
>>
>> I changed "main processor" into "boot processor" to highlight that this
>> is only the case during system boot.
>>
>
> Is there any code checking the processor number ? I see it in the
>
>>> This is symmetric
>>> multiprocessing which means each core is the same therefore capable of
>>> completing any required task. I understand there are paths which need to
>>> complete so if we have states for these phases as gates then any
>>> processor that
>>> arrives should be able to enter the gate (spinning lock for those that
>>> need to
>>> wait) and complete the work. This means a degraded state can exist and
>>> things
>>> at least start. The application would need to detect and manage the
>>> degraded
>>> state and so RTEMS should not be concerned with this condition other
>>> than doing
>>> its best to run where possible.
>>
>> In case the boot procedure of the system is unstable then it makes no
>> sense to run an application.
>>
>
> This depends on the application and its system requirements and how the
> hardware is constructed. A boot monitor can see cores are not present and
> decide not to start RTEMS or it can decide to start a single core. This relates
> to the system's requirement and can vary and has little to do with RTEMS. If a
> boot monitor passes control to RTEMS then it should run and only fail if its
> integrity is not correct. RTEMS should not move into the area of system
> requirements or constraints. Adding constraints like this in the operating
> system does not seem a good idea to me.

I see no problem here.  See _CPU_SMP_Initialize() above.

>
>>>
>>> I assume if we have n cores where n is 1..cpus available we enter the
>>> static
>>> constructors once and 'main' [1] once and this independent of the
>>> number of
>>> defined and/or operating cores. If an application's static constructor
>>> starts
>>> further threads it needs to manage the concurrency issues and main is
>>> only
>>> entered once.
>>
>> This change is about the low-level boot procedure.  High level concepts
>> like threads are not an issue here.
>>
>
> They are if you remove the all cores needing to sync plus using cpu numbers to
> control which cores go through to the static constructors and main and which
> cores go direct to the score thread code.

If you remove this synchronization barrier via 
PER_CPU_STATE_READY_TO_START_MULTITASKING, then you mandate that there is a boot
loader that does these steps.  With the current set-up this boot loader is
optional.

> It is this sort of code that concerns
> me ...
>
> http://git.rtems.org/rtems/tree/c/src/lib/libbsp/arm/shared/include/arm-a9mpcore-start.h#n88

Yes, this is a dirty hack.  It works at the moment due to our immature SMP 
state, but this must definitely change.

>
>
> http://git.rtems.org/rtems/tree/c/src/lib/libbsp/sparc/shared/start/start.S#n226

Yes, exactly the same issue.  This processor index 0 == boot processor is a bad 
assumption.

>
> I see no need for this code if the per cpu and the system level state are
> handled symmetrically. That is all cores enter the same entry point and the
> state manages where and what they need to do.

Yes, I can imagine that all processors may call boot_card() and then do
something like this:

SMP_lock boot_lock

boot_card:
	boot_lock.acquire()
	make first processor reaching this the boot processor
	boot_lock.release()

	bsp_start(is_boot_processor)

	if is_boot_processor:
		start sequential boot code
	else:
		wait for start multitasking request
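
The election step of the pseudocode above could look roughly like this in C11.
It is a sketch only: the lock is a plain atomic_flag spin lock and not the
RTEMS SMP lock, all sketch_ names are made up, and the two boot paths are
reduced to a return value so that the election can be checked in isolation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the SMP lock of the pseudocode. */
static atomic_flag sketch_boot_lock = ATOMIC_FLAG_INIT;

/* Protected by sketch_boot_lock. */
static bool sketch_boot_processor_elected;

static void sketch_lock_acquire( atomic_flag *lock )
{
  while ( atomic_flag_test_and_set_explicit( lock, memory_order_acquire ) ) {
    /* Spin until the current owner releases the lock */
  }
}

static void sketch_lock_release( atomic_flag *lock )
{
  atomic_flag_clear_explicit( lock, memory_order_release );
}

typedef enum {
  SKETCH_PATH_SEQUENTIAL_BOOT,
  SKETCH_PATH_WAIT_FOR_START_REQUEST
} sketch_boot_path;

/* Called by every processor, the first one to arrive becomes the boot one. */
static sketch_boot_path sketch_boot_card( void )
{
  bool is_boot_processor;

  sketch_lock_acquire( &sketch_boot_lock );
  is_boot_processor = !sketch_boot_processor_elected;
  sketch_boot_processor_elected = true;
  sketch_lock_release( &sketch_boot_lock );

  /* A call like bsp_start( is_boot_processor ) would go here. */

  return is_boot_processor ?
    SKETCH_PATH_SEQUENTIAL_BOOT : SKETCH_PATH_WAIT_FOR_START_REQUEST;
}
```

With this approach no processor index is special; which processor runs the
sequential boot code depends only on the arrival order.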

>
> By the way is there a LEON4 or LEON5 in the works so is this dependence on
> LEON3 because it is the first to support SMP or because it is specific to this
> variant of the cpu ?

I didn't notice any differences between the LEON3 and LEON4 so far.

>
>>>
>>> [1] I am not concerned with the Classic API init task tables with more
>>> than one
>>> init task and what happens with static constructors as this is not
>>> defined by
>>> any standard. I will never directly use init task tables because it is
>>> not
>>> portable. All we can do is make sure static constructors and
>>> destructors are
>>> only called once.
>>
>> Yes, the constructor calls are an open issue.
>>
>
> Well, let's get the issue on the table and discuss it. I see it is very much a
> part of this discussion and the overall design.

Ok, but please open another thread for this.  This is not related to this 
change set.

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.