[PATCH 1/2] score: Implement scheduler helping protocol

Tue Jul 8 18:50:57 UTC 2014

On 7/8/2014 1:28 PM, Gedare Bloom wrote:
> On Tue, Jul 8, 2014 at 2:20 PM, Sebastian Huber
> <sebastian.huber at embedded-brains.de> wrote:
>>>>> diff --git a/cpukit/score/include/rtems/score/threadimpl.h
>>>>> b/cpukit/score/include/rtems/score/threadimpl.h
>>>>> index 4971e9d..cb7d5fe 100644
>>>>> --- a/cpukit/score/include/rtems/score/threadimpl.h
>>>>> +++ b/cpukit/score/include/rtems/score/threadimpl.h
>>>>> @@ -828,6 +828,16 @@ RTEMS_INLINE_ROUTINE bool _Thread_Owns_resources(
>>>>>    return owns_resources;
>>>>>  }
>>>>>
>>>>> +#if defined(RTEMS_SMP)
>>>>> +RTEMS_INLINE_ROUTINE Thread_Control *_Thread_Resource_node_to_thread(
>>>>> +  Resource_Node *node
>>>>> +)
>>>>> +{
>>>>> +  return (Thread_Control *)
>>>>> +    ( (char *) node - offsetof( Thread_Control, Resource_node ) );
>>>>> +}
>>> We should include some generic container_of function in rtems instead
>>> of reproducing it multiple places.
>>
>> In <sys/cdefs.h> we have:
>>
>> /*
>>  * Given the pointer x to the member m of the struct s, return
>>  * a pointer to the containing structure.  When using GCC, we first
>>  * assign pointer x to a local variable, to check that its type is
>>  * compatible with member m.
>>  */
>> #if __GNUC_PREREQ__(3, 1)
>> #define    __containerof(x, s, m) ({                    \
>>     const volatile __typeof__(((s *)0)->m) *__x = (x);        \
>>     __DEQUALIFY(s *, (const volatile char *)__x - __offsetof(s, m));\
>> })
>> #else
>> #define    __containerof(x, s, m)                        \
>>     __DEQUALIFY(s *, (const volatile char *)(x) - __offsetof(s, m))
>> #endif
>>
>> What about adding a similar
>>
>>  _Container_of()
>>
>> or
>>
>> rtems_container_of()
>>
>> to <rtems/score/basedefs.h>?
>>
> Probably it should be _Container_of() for supercore visible code.
>
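For reference, a minimal sketch of what such a generic helper could look
like, written in portable C without the GCC type check that
__containerof() performs. The macro name CONTAINER_OF_SKETCH and the Demo_*
types are hypothetical stand-ins, not the definition that would end up in
<rtems/score/basedefs.h>.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical generic container-of macro.  Given a pointer to a member,
 * subtract the member offset to get back to the enclosing structure.
 */
#define CONTAINER_OF_SKETCH( ptr, type, member ) \
  ( (type *) ( (char *) ( ptr ) - offsetof( type, member ) ) )

/* Stand-ins for Resource_Node and Thread_Control, for illustration only. */
typedef struct {
  int dummy;
} Demo_Resource_Node;

typedef struct {
  int                id;
  Demo_Resource_Node Resource_node;
} Demo_Thread_Control;

/* Same job as _Thread_Resource_node_to_thread() in the patch above. */
static Demo_Thread_Control *demo_node_to_thread( Demo_Resource_Node *node )
{
  Demo_Thread_Control *thread =
    CONTAINER_OF_SKETCH( node, Demo_Thread_Control, Resource_node );

  /* The recovered structure must contain the node we started from. */
  assert( &thread->Resource_node == node );

  return thread;
}
```

With a helper like this, each per-type conversion function shrinks to a
one-line wrapper around the macro instead of repeating the pointer
arithmetic.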
>>>>> +
>>>>> +  _Resource_Node_set_root( resource_node, &root->Resource_node );
>>>>> +
>>>>> +  needs_also_help = ( *scheduler->Operations.ask_for_help )(
>>>>> +    scheduler,
>>>>> +    offers_help,
>>>>> +    needs_help
>>>>> +  );
>>>>> +
>>>>> +  if ( needs_also_help != needs_help && needs_also_help != NULL ) {
>>>>> +    _Assert( ctx->needs_help == NULL );
>>>>> +    ctx->needs_help = needs_also_help;
>>>>> +  }
>>>>> +
>>>>> +  return false;
>>>>> +}
>>>>> +
>>>>> +void _Scheduler_Thread_change_resource_root(
>>>>> +  Thread_Control *top,
>>>>> +  Thread_Control *root
>>>>> +)
>>>>> +{
>>>>> +  Scheduler_Set_root_context ctx = { root, NULL };
>>>>> +  Thread_Control *offers_help = top;
>>>>> +  Scheduler_Node *offers_help_node;
>>>>> +  Thread_Control *offers_also_help;
>>>>> +  ISR_Level level;
>>>>> +
>>>>> +  _ISR_Disable( level );
>>>>> +
>>>>> +  offers_help_node = _Scheduler_Thread_get_node( offers_help );
>>>>> +  offers_also_help = _Scheduler_Node_get_owner( offers_help_node );
>>>>> +
>>>>> +  if ( offers_help != offers_also_help ) {
>>>>> +    _Scheduler_Set_root_visitor( &offers_also_help->Resource_node, &ctx );
>>>>> +    _Assert( ctx.needs_help == offers_help );
>>>>> +    ctx.needs_help = NULL;
>>>>> +  }
>>>>> +
>>>>> +  _Scheduler_Set_root_visitor( &top->Resource_node, &ctx );
>>>>> +  _Resource_Iterate( &top->Resource_node, _Scheduler_Set_root_visitor, &ctx );
>>>>> +
>>> Does this iterate() with disabled interrupts have bad implications for
>>> schedulability / worst-case latency?
>>>
>> Yes, the worst-case latency now depends on the resource tree size. I don't
>> think it's easy to avoid this.  You have at least the following options.
>>
>> 1. Use one lock or a hierarchy of locks to freeze the tree state and thus
>> enable a safe iteration.
>>
>> 2. Partially lock the tree and somehow provide safe iteration. How?  Is this
>> possible with fine grained locking at all?
>>
>> 3. Organize the tree so that the interesting elements are the min/max nodes.
>> I don't know how this can be done.  Each scheduler state change may result
>> in updates of all resource trees in the system.
>>
>> This implementation was done with fine-grained locking in mind, so I chose
>> option 1.  We can use the tree to get a partial order of per-resource locks
>> necessary to avoid deadlocks.
>>
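To make the latency concern concrete, here is a small sketch of a
visitor-based tree update whose work grows with the number of nodes. It
assumes a first-child/next-sibling layout and an iterate function that
visits descendants only. All Demo_* names are hypothetical and this is not
the RTEMS Resource_Node implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree node; each node caches a pointer to its tree root. */
typedef struct Demo_Node {
  struct Demo_Node *first_child;
  struct Demo_Node *next_sibling;
  struct Demo_Node *root;
} Demo_Node;

/* A visitor returns true to stop the iteration early. */
typedef bool ( *Demo_Visitor )( Demo_Node *node, void *arg );

typedef struct {
  Demo_Node *new_root;
  size_t     visited;
} Demo_Set_root_context;

static bool demo_set_root_visitor( Demo_Node *node, void *arg )
{
  Demo_Set_root_context *ctx = arg;

  node->root = ctx->new_root;
  ++ctx->visited;

  return false; /* never stop early: every node needs the new root */
}

/* Visits all descendants of top, but not top itself. */
static bool demo_iterate( Demo_Node *top, Demo_Visitor visitor, void *arg )
{
  Demo_Node *child;

  for ( child = top->first_child; child != NULL; child = child->next_sibling ) {
    if ( ( *visitor )( child, arg ) || demo_iterate( child, visitor, arg ) ) {
      return true;
    }
  }

  return false;
}

/* Returns the number of nodes touched, that is, the work done in one go. */
static size_t demo_change_root( Demo_Node *top, Demo_Node *new_root )
{
  Demo_Set_root_context ctx = { new_root, 0 };

  /* In the patch, this whole region runs with interrupts disabled. */
  demo_set_root_visitor( top, &ctx );
  demo_iterate( top, demo_set_root_visitor, &ctx );

  return ctx.visited;
}
```

The count returned by demo_change_root() equals the size of the subtree
below and including top, which is the quantity that the time spent with
interrupts disabled scales with. The recursion is only for brevity; a
kernel implementation would more likely walk the tree iteratively.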
> Ok, this is a tricky problem, and it should definitely be documented.
> I don't have a good idea right now about how the resource tree grows.
> Perhaps the size of the tree is bounded such that the cost isn't too
> bad.
It should be documented for sure, but "I don't have a good idea right
now..." is also a good thing for us to remember.

I have thought a lot in the background about how much effort we
should put into optimizing for the theoretical cases in SMP.  We
know on paper that a lot of things are potentially bad, like
coarse-grained locking, O(XXX) where XXX is non-constant, etc.

But in the back of my head, a little voice keeps telling me to avoid
optimizing too early. Let's not make a stupid decision when a good one is
possible, but optimizing too early is an easy trap to step into.

I wrote a lot more but decided to delete it. The short version is that
the use of SMP is going to be new to users of RTEMS and of other
traditional single address space RTOSes. We don't have any real-world
feedback to know what potential performance issues users will face.

The only thing I am sure about is that we have a lot of education to
do about all the new features and their impact. And we need to be
on the lookout for good application patterns and best practices we
can pass along.

> -Gedare

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985