SMP Initialization
Gedare Bloom
gedare at rtems.org
Thu Mar 6 17:09:43 UTC 2014
On Thu, Mar 6, 2014 at 5:04 AM, Sebastian Huber
<sebastian.huber at embedded-brains.de> wrote:
> Hello,
>
> there is a potential problem in the SMP initialization procedure.
>
> One processor in the system has a special role, the so-called boot
> processor. Currently this is the processor with index zero. The way to
> select the boot processor may change in the future, but what will not change
> is that we have a boot processor.
>
> The boot processor initializes the data and BSS sections. It also performs
> the sequential part of the RTEMS initialization.
>
> During the sequential initialization the function
>
> /**
>  * @brief Performs CPU specific SMP initialization in the context of the boot
>  * processor.
>  *
>  * This function is invoked on the boot processor by RTEMS during
>  * initialization. All interrupt stacks are allocated at this point in case
>  * the CPU port allocates the interrupt stacks.
>  *
>  * The CPU port should start secondary processors now.
>  *
>  * @param[in] configured_cpu_count The count of processors requested by the
>  * application configuration.
>  *
>  * @return The count of processors available for the application in the system.
>  * This value is less than or equal to the configured count of processors.
>  */
> uint32_t _CPU_SMP_Initialize( uint32_t configured_cpu_count );
>
> is called. This function is currently implemented by the BSPs. An example
> which starts the processor on its own:
>
> http://git.rtems.org/rtems/tree/c/src/lib/libbsp/sparc/leon3/smp/smp_leon3.c#n38
>
> An example which uses U-Boot to start the second processor:
>
> http://git.rtems.org/rtems/tree/c/src/lib/libbsp/powerpc/qoriq/startup/smp.c#n144
>
> The return value of _CPU_SMP_Initialize() will tell the RTEMS system how
> many processors are present.
>
> void _SMP_Handler_initialize( void )
> {
>   uint32_t max_cpus = rtems_configuration_get_maximum_processors();
>   uint32_t cpu;
>
>   [...]
>
>   /*
>    * Discover and initialize the secondary cores in an SMP system.
>    */
>   max_cpus = _CPU_SMP_Initialize( max_cpus );
>
This assignment to max_cpus is also a problem: it silently discards
the application's configured number of processors. At least, the
current BSP implementations ignore the case that the application
requests more cores than the BSP provides. Where should this problem
be handled, here or in the implementation of _CPU_SMP_Initialize()?
>   _SMP_Processor_count = max_cpus;
> }
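
To make the mismatch visible instead of silently overwriting max_cpus,
the comparison could be factored out roughly as below. This is only a
minimal sketch: the enum and the helper smp_check_processor_count() are
hypothetical names, not RTEMS API, and whether the caller should raise
a fatal error or just continue with fewer processors is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of comparing configured and available counts. */
typedef enum {
  SMP_COUNT_OK,
  SMP_COUNT_FEWER_THAN_CONFIGURED
} smp_count_status;

/*
 * Hypothetical helper, not part of RTEMS: picks the processor count the
 * system should use and reports whether the application configured more
 * processors than the BSP says are available.
 */
static smp_count_status smp_check_processor_count(
  uint32_t configured_cpu_count,
  uint32_t available_cpu_count,
  uint32_t *effective_cpu_count
)
{
  assert( effective_cpu_count != NULL );

  if ( available_cpu_count < configured_cpu_count ) {
    /* The application asked for more than the BSP can provide. */
    *effective_cpu_count = available_cpu_count;
    return SMP_COUNT_FEWER_THAN_CONFIGURED;
  }

  /* Never use more processors than the application configured. */
  *effective_cpu_count = configured_cpu_count;
  return SMP_COUNT_OK;
}
```

The caller would pass the configured maximum and the return value of
_CPU_SMP_Initialize(), then decide what to do when the status is
SMP_COUNT_FEWER_THAN_CONFIGURED.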
>
> If the BSP says "you have three processors", and one of them is actually not
> available, then we have a problem later.
>
> Before the system starts multitasking there is a synchronization barrier.
> This synchronization barrier is necessary to have a defined starting point
> for the scheduler.
>
> void _SMP_Request_start_multitasking( void )
> {
>   Per_CPU_Control *self_cpu = _Per_CPU_Get();
>   uint32_t ncpus = _SMP_Get_processor_count();
>   uint32_t cpu;
>
>   _Per_CPU_State_change( self_cpu, PER_CPU_STATE_READY_TO_START_MULTITASKING );
>
>   for ( cpu = 0 ; cpu < ncpus ; ++cpu ) {
>     Per_CPU_Control *per_cpu = _Per_CPU_Get_by_index( cpu );
>
>     _Per_CPU_State_change( per_cpu, PER_CPU_STATE_REQUEST_START_MULTITASKING );
>   }
> }
>
> So before this function returns ALL (!) processors must have changed into
> the PER_CPU_STATE_REQUEST_START_MULTITASKING (or into PER_CPU_STATE_SHUTDOWN
> which will terminate the system right now).
>
> In case one of the processors doesn't start, then we will wait here FOREVER
> (unless a watchdog kills us).
>
> There are now several ways to deal with this.
>
> 1. You can consider this a BSP bug. The BSP told the system via
> _CPU_SMP_Initialize() that so many processors are available. If this is not
> the case then the BSP lied and you should fix the BSP.
>
> 2. You can consider this a feature of the BSP that it tells you wrong
> numbers. So now what to do?
>
> 2.1. You can install a watchdog driver that kills you no matter what corrupt
> system state you have. If you analyze the per-CPU states in this case you
> will notice that some of the processors didn't start.
>
> 2.2. You can limit the time spent waiting. If a timeout occurs then we can
> issue a fatal error that indicates exactly the problem area.
>
> 2.2.1 Now we need a facility to measure time (e.g. the CPU counter
> introduced recently).
>
> 2.2.2 Now we need a timeout.
>
> 2.2.2.1 The RTEMS kernel cannot know a proper timeout value.
>
> 2.2.2.2 The CPU/BSP may know the timeout value. How can the CPU/BSP tell
> the RTEMS kernel the timeout value?
>
> 2.2.2.3 We can add an application configuration item that specifies the
> timeout value and move the responsibility to the application developer.
>
> I am in favor of 1. in combination with 2.1 and 2.2.2.2. For BSPs with
> an unreliable start of secondary processors we should add a support function,
> e.g.
>
> /**
>  * @brief Waits for all other processors to enter the ready to start
>  * multitasking state with a timeout in microseconds.
>  *
>  * In case one processor enters the shutdown state, this function does not
>  * return.
>  *
>  * This function should be called only in _CPU_SMP_Initialize() if required by
>  * the CPU port or BSP.
>  *
>  * @param[in] processor_count The processor count which will later be returned
>  * by _CPU_SMP_Initialize().
>  * @param[in] timeout_in_us The timeout in microseconds.
>  *
>  * @retval true All other processors entered the ready to start multitasking
>  * state.
>  * @retval false Not all the other processors entered the ready to start
>  * multitasking state and the timeout expired.
>  */
> bool _Per_CPU_State_wait_for_ready_to_start_multitasking(
>   uint32_t processor_count,
>   uint32_t timeout_in_us
> );
>
> This avoids the burden for the application developer to know about the
> timeout configuration option and to select a proper value. It moves the
> responsibility to deal with this issue to the BSP, which knows best what to do.
> In case false is returned it can either issue a fatal error or reduce the
> processor count.
>
This seems reasonable to me.
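
For illustration, the polling loop behind such a wait could look roughly
like the host-side sketch below. Everything in it is an assumption made
for the example: the cpu_state enum stands in for the real per-CPU state,
the now_in_us callback stands in for a time source such as the CPU
counter, and fake_now_in_us() exists only so the sketch can be tried
without hardware. The shutdown case and the memory barriers that real
SMP code needs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the per-CPU state; the real RTEMS type has more states. */
typedef enum {
  CPU_STATE_INITIAL,
  CPU_STATE_READY_TO_START_MULTITASKING
} cpu_state;

/* Hypothetical time source returning a free-running microsecond count. */
typedef uint32_t ( *now_in_us_fn )( void );

/* Fake time source for host experiments: advances by 1 us per call. */
static uint32_t fake_now_in_us( void )
{
  static uint32_t now;

  return now++;
}

/*
 * Polls the states of all processors except self_index until they are
 * all ready or the timeout expires. Returns true if all were ready.
 */
static bool wait_for_ready_to_start_multitasking(
  const volatile cpu_state *states,
  uint32_t processor_count,
  uint32_t self_index,
  uint32_t timeout_in_us,
  now_in_us_fn now_in_us
)
{
  uint32_t start;

  assert( states != NULL );
  assert( now_in_us != NULL );

  start = now_in_us();

  for ( ;; ) {
    bool all_ready = true;
    uint32_t cpu;

    for ( cpu = 0; cpu < processor_count; ++cpu ) {
      if (
        cpu != self_index
          && states[ cpu ] != CPU_STATE_READY_TO_START_MULTITASKING
      ) {
        all_ready = false;
        break;
      }
    }

    if ( all_ready ) {
      return true;
    }

    /* Unsigned subtraction stays correct if the counter wraps around. */
    if ( (uint32_t) ( now_in_us() - start ) >= timeout_in_us ) {
      return false;
    }
  }
}
```

A BSP calling something like this from _CPU_SMP_Initialize() could then
either issue a fatal error or return a reduced processor count when the
result is false.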
Gedare
> --
> Sebastian Huber, embedded brains GmbH