Proposed Affinity Changes

Joel Sherrill joel.sherrill at
Mon May 19 16:31:55 UTC 2014

On 5/15/2014 6:30 AM, Sebastian Huber wrote:
> Hello Joel,
> On 2014-05-15 01:55, Joel Sherrill wrote:
>> Hi
>> Although I think there are only a few code paths to
>> address for affinity added to the Priority SMP Scheduler,
> the Deterministic Priority SMP Scheduler already has support for thread 
> processor affinity.  It is _Scheduler_default_Set_affinity().  A thread is 
> allowed to execute on any processor of a scheduler instance.
> Please add support for arbitrary processor affinity sets to the Deterministic 
> Priority Affinity SMP Scheduler (schedulerpriorityaffinitysmp.c).
>> the modifications appear to be very subtle and I want
>> to get feedback given the potential impact on other
>> schedulers.
>> + set_affinity
>> + _Scheduler_SMP_Enqueue_ordered()
>> + _Scheduler_SMP_Extract()
>> + _Scheduler_SMP_Schedule()
>> == set_affinity
>> set_affinity scheduler support is being discussed in Sebastian's
>> change priority patch thread. But basically use the pattern
>> he proposed for _Scheduler_Change_priority() but for affinity.
> Yes, it should be local to the set affinity operation 
> (_Scheduler_priority_affinity_SMP_Set_affinity()).
>> == _Scheduler_SMP_Extract()
>> _Scheduler_SMP_Extract() doesn't appear to need anything
>> done.
> I deleted this operation.
>> == _Scheduler_SMP_Enqueue_ordered()
>> For the enqueue ordered path, this is what I think needs to be
>> done. The changes appear to be minor but the interweaving of
>> indirect calls and rapid changes means I need some feedback.
>> When a node is in the air and we look for highest ready in
>> _Scheduler_SMP_Enqueue_ordered(), the highest ready should
>> have an affinity for this core.  If it does, then it gets returned
>> by (*get_highest_ready)(). If not, NULL is returned, since we
>> couldn't impact the thread allocated to this node.
>>     How do you think this filter should be inserted into the framework
>>     of indirect calls leading here?
> I am not sure if you can use the existing _Scheduler_SMP_Enqueue_ordered().  My 
> feeling is that you need something completely new.  The current SMP scheduler 
> uses a simple chain to manage the scheduled threads.
> Which algorithm do you use to calculate the subset of ready threads allowed to 
> execute taking all the affinity sets into account?  I think you have to solve a 
> matching problem in a bipartite graph:
I was actually planning to implement something more like Linux Push/Pull.
It is very easy to create scenarios with inversions with arbitrary affinity
sets.  But does a real-time embedded system which is supposed to be
highly analyzed really have to deal with arbitrary overlapping affinities?
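
For comparison, the matching formulation can be sketched roughly as below.
This is only an illustration under assumptions: threads are indexed in
priority order (index 0 is highest), affinity is a plain 32-bit mask, and
try_assign() and match_threads_to_cpus() are hypothetical names, not RTEMS
functions. It uses the standard augmenting-path approach and considers
threads from highest to lowest priority.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 32

/*
 * Hypothetical model: affinity[ t ] is a bitmask of the CPUs thread t may
 * run on, and cpu_owner[ c ] is the thread assigned to CPU c or -1.
 * Try to give `thread` a CPU, relocating already placed threads if needed.
 */
static bool try_assign(
  const uint32_t *affinity,
  int             thread,
  int             cpu_count,
  int            *cpu_owner,
  bool           *visited
)
{
  for ( int cpu = 0; cpu < cpu_count; ++cpu ) {
    bool allowed = ( affinity[ thread ] & ( UINT32_C( 1 ) << cpu ) ) != 0;

    if ( !allowed || visited[ cpu ] ) {
      continue;
    }

    visited[ cpu ] = true;

    /* Take a free CPU, or move the current owner to another allowed CPU. */
    if (
      cpu_owner[ cpu ] < 0
        || try_assign( affinity, cpu_owner[ cpu ], cpu_count, cpu_owner, visited )
    ) {
      cpu_owner[ cpu ] = thread;
      return true;
    }
  }

  return false;
}

/* Returns the number of threads that received a CPU (cpu_count <= MAX_CPUS). */
static int match_threads_to_cpus(
  const uint32_t *affinity,
  int             thread_count,
  int             cpu_count,
  int            *cpu_owner
)
{
  int scheduled = 0;

  for ( int cpu = 0; cpu < cpu_count; ++cpu ) {
    cpu_owner[ cpu ] = -1;
  }

  /* Threads are visited in priority order, highest first. */
  for ( int thread = 0; thread < thread_count; ++thread ) {
    bool visited[ MAX_CPUS ] = { false };

    if ( try_assign( affinity, thread, cpu_count, cpu_owner, visited ) ) {
      ++scheduled;
    }
  }

  return scheduled;
}
```

With two processors, a thread allowed on CPUs 0 and 1 and a second thread
allowed on CPU 0 only, the second thread displaces the first from CPU 0 to
CPU 1, so both run. A one-pass assignment would leave the second thread
ready.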

Anyway, I was following the guidance in Brandenburg's dissertation and
ECRTS 2013 paper.

Brandenburg's PhD dissertation notes that a one-pass pull that picks an arbitrary
higher priority process suffers from possible migration. I was planning
on selecting the highest with matching affinity since we do have global
knowledge. This should avoid the problem in Example 3.1.
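
A minimal sketch of that pull selection, assuming a flat array of ready
threads and a 32-bit affinity mask. The sketch_thread type and
sketch_pull_highest_ready() are hypothetical simplifications, not the RTEMS
data structures or API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ready thread; not an RTEMS type. */
typedef struct {
  int      priority; /* lower value means higher priority, as in RTEMS */
  uint32_t affinity; /* bit n set means the thread may run on CPU n */
} sketch_thread;

/*
 * Pull for `cpu`: scan all ready threads and return the highest priority
 * one with affinity for that CPU, or NULL if there is none.
 */
static const sketch_thread *sketch_pull_highest_ready(
  const sketch_thread *ready,
  size_t               ready_count,
  unsigned             cpu
)
{
  const sketch_thread *best = NULL;

  for ( size_t i = 0; i < ready_count; ++i ) {
    if ( ( ready[ i ].affinity & ( UINT32_C( 1 ) << cpu ) ) == 0 ) {
      continue; /* no affinity for this CPU */
    }

    if ( best == NULL || ready[ i ].priority < best->priority ) {
      best = &ready[ i ];
    }
  }

  return best;
}
```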

Similarly on a push, you want to select the lowest priority process
with a matching affinity. This is Example 3.2
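
The push side could look roughly like this, again only as a sketch. The
sketch_scheduled type and sketch_push_find_victim() are hypothetical names,
and a linear scan over an array stands in for the real scheduled chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduled thread; not an RTEMS type. */
typedef struct {
  int      priority; /* lower value means higher priority, as in RTEMS */
  unsigned cpu;      /* CPU this thread currently occupies */
} sketch_scheduled;

/*
 * Push a thread with `new_priority` and `new_affinity`: return the lowest
 * priority scheduled thread that runs on a CPU in the affinity set and has
 * lower priority than the pushed thread, or NULL if there is no such victim.
 */
static const sketch_scheduled *sketch_push_find_victim(
  const sketch_scheduled *scheduled,
  size_t                  scheduled_count,
  int                     new_priority,
  uint32_t                new_affinity
)
{
  const sketch_scheduled *victim = NULL;

  for ( size_t i = 0; i < scheduled_count; ++i ) {
    if ( ( new_affinity & ( UINT32_C( 1 ) << scheduled[ i ].cpu ) ) == 0 ) {
      continue; /* pushed thread may not run on this CPU */
    }

    if ( scheduled[ i ].priority <= new_priority ) {
      continue; /* equal or higher priority, must not be displaced */
    }

    if ( victim == NULL || scheduled[ i ].priority > victim->priority ) {
      victim = &scheduled[ i ];
    }
  }

  return victim;
}
```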

The ECRTS 2013 paper is concerned with arbitrary affinity sets
and the Linux Push/Pull.  It specifically uses the word highest
in association with a Pull. It says Lower on a push but I think
lowest is probably better and easy to implement given the
sorting of the executing queue.

In the last paragraph of section 3.2, he notes that they used
globally shared state with coarse-grained locking. This is more
desirable for analysis purposes. This is a point we need to seriously
consider as we move forward, but there is no need to discuss it now. We
just need something working. :)

I suppose that at some point in the future, we could evaluate the
entire set space but focusing on lowest and highest seems to be

Either way, once you select the target thread/CPU to
impact, it should naturally work like it does now.
> You probably have to take all threads in the ready state into account to 
> determine the new scheduled threads.  You can then use something like this to 
> allocate an exact processor for them:
OK.  Thanks for this suggestion.
> static inline void _Scheduler_SMP_Allocate_processor_exact(
>    Scheduler_SMP_Context *self,
>    Thread_Control *scheduled,
>    Thread_Control *victim
> )
> {
>    Scheduler_SMP_Node *scheduled_node = _Scheduler_SMP_Node_get( scheduled );
>    Per_CPU_Control *cpu_of_scheduled = _Thread_Get_CPU( scheduled );
>    Per_CPU_Control *cpu_of_victim = _Thread_Get_CPU( victim );
>    Per_CPU_Control *cpu_self = _Per_CPU_Get();
>    _Scheduler_SMP_Node_change_state(
>      scheduled_node,
>      SCHEDULER_SMP_NODE_SCHEDULED
>    );
>    _Thread_Set_CPU( scheduled, cpu_of_victim );
>    _Scheduler_SMP_Update_heir( cpu_self, cpu_of_victim, scheduled );
> }
> You can even use this function to do things like this:
> _Scheduler_SMP_Allocate_processor_exact(self, executing, other);
> _Scheduler_SMP_Allocate_processor_exact(self, other, executing);
>> Similarly, if a node is not in the air, we look for lowest scheduled.
>> I am thinking that the hard-coded call to
>> _Scheduler_SMP_Get_lowest_scheduled() needs to be an indirect
>> call so an affinity-aware version can scan the scheduled set to
>> find the lowest with affinity for this node.
>> And is it safe to assume that node == current processor ID?
>> So there don't need to be arguments added to the calls?
>> Just check affinity against current processor number.
>> == _Scheduler_SMP_Schedule()
>> This is only called as a side-effect of _Thread_Change_priority().
>> Sebastian.. is this entry point needed after your changes?
> The SMP scheduler now uses _Scheduler_default_Schedule(), which does nothing.
Great. So everything is a side-effect of enqueue/dequeue operations that
are unblocking and blocking.
>> If needed, then...
>> _Scheduler_SMP_Schedule_highest_ready() already is passed
>> get_highest_ready().  If we go with what was discussed for
>> _Scheduler_SMP_Enqueue_ordered() and use the same
>> get_highest_ready() implementation, then (*get_highest_ready)()
>> will return a NULL if the highest priority thread does not have
>> affinity for the core we are scheduling for.  Returning NULL
>> from get_highest_ready() appears to be a bad idea for this
>> case.
>> Plus we need a thread with affinity for the core/node that
>> the priority was changed on so we can't assume that
>> the processor ID is easily available.
>> So there are two hard things I don't see an obvious answer to.
>> I think this path has a decision.

Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
