RBTree inline hooks Was: Re: SMP: ISR disable/enable vs. mutual exclustion

Mon Sep 16 13:06:21 UTC 2013

Hey Pavel,

I spent a couple hours trying to hack together a version of Daniel's
approach in RTEMS. I think there might be a fundamental problem that
RTEMS compiles the rbtree code into score library (libscore.a). I
think this means that the compiler cannot inline comparison hooks
within the score insert/search rbtree routines. Perhaps I
misunderstood the approach and should try again later, but I don't
have time right now.

-Gedare

On Wed, Aug 28, 2013 at 1:02 PM, Gedare Bloom <gedare at rtems.org> wrote:
> Thanks Pavel. More below.
>
> On Wed, Aug 28, 2013 at 10:31 AM, Pavel Pisa <ppisa4lists at pikron.com> wrote:
>> Hello All,
>>
>> I want to express my two cents even that I cannot find time
>> to do more than chin-wag, sorry too many other project
>> required for university an company stay in a operation.
>>
>> The containers basic API should be without locking.
>> You usually need something like atomic extract from one list
>> and insert to the other list. And you waste additional
>> locking in list remove/insert when you have to provide
>> lock to make operations sequence atomic.
>> There is usually good/natural place where lock belongs
>> in such cases. Or you do not want to use IRQ+spin-lock
>> there but whole sequence is protected by mutex (should be
>> with priority inheritance for each case).
>>
> I tend to agree. Let the "user" of the data structure handle the
> locking around it at least for RTEMS core code. But this means right
> now going through all of RTEMS core and replacing chain_xxx with
> chain_xxx_unprotected.
>
> Conversely, we make all the functions unprotected by default and
> introduce "_protected" variants for the API layer.
>
>> This is how Linux kernel defines containers and data structures
>> too. Problem is that it is really very hard to switch
>> to unlocked containers when code in some places and sequences
>> depends on locking in containers.
>>
> The big problem for RTEMS is the exposure of chains through the user API.
>
>> On the other hand there are mechanism which can belong
>> to some containers variants or can require containers
>> implementation support. Linux is moving now from locking
>> to RCU (read-copy update)[*1] mechanism for heavily accessed
>> data where reads prevails and some short time work
>> with old copy is allowed or can be detected by other mechanism.
>> The interration over list requires use of barriers and single
>> next pointer access ins such case. So there is separate list next
>> function variant for this case.
>>
>> So my suggestion is to define lists unprotected and use them
>> as such in new code. There can be protected variant for legacy
>> users but not for critical/core code.
>>
>> I have same concerns with RTEMS red-black trees from beginning.
>> But I was not able to provide better code and was happy things
>> move forward.
>>
>> As for red-black trees: I really like much Daniel Santos's
>> Generic Red-Black Trees API proposed for Linux kernel.
>> It is more demanding on compiler than my user pace/RTEMS
>> used uLUt AVL, but it has great API and new RTEMS releases
>> has not such problems to be compiled with very old toolchains.
>> RTEMS toolchains moves forward with versions.
>>
>> So I suggest to look at that proposal even that it is not
>> used by Linux kernel yet
>>
>>   http://thread.gmane.org/gmane.linux.kernel/1367138/focus=1367138
>>
>> I think that wrappers code license would not be problem to
>> get from Daniel with RTEMS GPL exception. Base RB insert, balance
>> has to be taken from actual RTEMS or other license compatible
>> source.
>>
> His approach looks really neat, although the performance is tightly
> tied to gcc. I'm not a huge fan of relying on CPP for code generation,
> but if the performance/programmability tradeoff is good I think it is
> worth considering. (Heavy CPP usage is one reason I do not like the
> BSD tree.h approach, which uses macros to generate type-specific
> red-black or splay trees.)
>
> Even if we cannot get Daniel's wrappers code directly, I think the
> idea could be recreated in the RTEMS RBTree code base with some
> effort. RTEMS RBTree uses the compare function as an indirection
> already. I wonder if we can start with upgrading the current RBTree to
> use this new approach of an "initializer struct" and then consider
> enhancements such as locking, and "find_near/insert_near" functions
> which I have implemented but did not commit.
>
> I suppose the new approach could also be used to offer indirection for
> locking? Daniel's method does not, because Linux does not require
> it...
>
>> Preparation of similar API for lists would not take much time
>> as well. It would be straightforward to follow same model.
>> He has done massive testing that proposal overhead is zero
>> for most of/all architectures when recent GCC is used.
>> It results in some instruction level overhead when older GCC
>> is used. Type safety has zero size tree node and root structures
>> overhead. Overhead of actual RTEMS RB tree is higher even
>> for unlocked version due to functions indirection.
>>
>> [*1] RCU is available under GPL and LGPL licenses and IBM
>> pattents are pledged for such use. Use of RCU idea and even more
>> of code with RTEMS exception would require to ask RCU maintainer/
>> author Paul E. McKenney. On the other hand, RCU has usually
>> some memory overhead and its gain is critical for big
>> SMP machines with NUMA architecture. As I consider RTEMS
>> as more targetted to smaller systems and more deterministic
>> behavior, I think that RCU is not on the table for longer time there.
>>
> I agree, we do not need to think too much about RCU. Especially
> because the patent + license issue is a concern.
>
>> Thanks much to Sebastian for his huge amount of work
>> and best wishes to RTEMS and developers,
>>
>>                  Pavel
>>
>> On Wednesday 28 of August 2013 00:44:58 Chris Johns wrote:
>>> Sebastian Huber wrote:
>>> > On 2013-08-27 01:26, Chris Johns wrote:
>>> >> Sebastian Huber wrote:
>>> >>> On 2013-08-24 04:10, Chris Johns wrote:
>>> >>>>> Thus the normal extract operation is not available on SMP. An extract
>>> >>>>> variant which needs also the chain control as a parameter must be
>>> >>>>> used.
>>> >>>>
>>> >>>> I think a node may need a back pointer to the chain control that
>>> >>>> contains a
>>> >>>> lock. I suspect we cannot have a single score chain control structure
>>> >>>> for both
>>> >>>> protected and unprotected operations and support the current extract.
>>> >>>> I have
>>> >>>> not looked at all the uses of extract in the code so I do not know if
>>> >>>> the chain
>>> >>>> control is available and even it is I think the node should handle
>>> >>>> this.
>>> >>>
>>> >>> In order to use the back pointer you have to lock the chain, so this
>>> >>> cannot be used. You have to know on which chain the node is.
>>> >>
>>> >> Yes you are correct. Should the locking be something the user of the
>>> >> chains
>>> >> should manage ? The chains is starting to become more and more complex.
>>> >
>>> > On single processor configuration the locking is done by the chains (ISR
>>> > disable/enable) so I think there is no way out.
>>>
>>> The RTEMS chains has protected, unprotected and now explicit and I am
>>> starting to wonder if the API has become too complicated for users with
>>> too many choices. Some choices work with SMP other do not. We really
>>> should have one version that works in all cases however this means
>>> breaking the existing API. On the other hand anyone not using the
>>> explicit API will break when moving code to an SMP target and only finds
>>> out once they move or try. They may have made design choices without
>>> being aware of the differences and possible issues. The one related to
>>> this change is the need to have the chain control available when
>>> extracting a node. This is a big difference.
>>>
>>> I understand the migration and the need not to break an API however I am
>>> asking the question about the complexity of the API and if we should
>>> review what it is doing. With this change where you need the control
>>> chain to extract the node you have some data following the node and that
>>> could contain the lock and I suspect in a number of cases will contain a
>>> lock. I also wonder if the user of the chain can manage the locking
>>> better in some cases than letting the chain do it. The locking was fast
>>> before and now it is not as fast changing the dynamic.
>>>
>>> Extending the API makes sense now and from a kernel change point of view
>>> however I am not convinced of the long term benefit when all the current
>>> rapid change settles and is forgotten.
>>>
>>> > Please have a look at the patch here:
>>> >
>>> > https://www.rtems.org/pipermail/rtems-devel/2013-August/004370.html
>>> >
>>> > I had to add rtems_chain_explicit_extract() and
>>> > rtems_chain_explicit_insert() routines.
>>>
>>> I understand the reason you have added the extra call. We either need to
>>> break the existing API or extend the current one. Having two APIs is not
>>> helpful to users and we need just one. If there is an agree path to one
>>> API I would be happier. It could be a warning for non-SMP that the API
>>> will change.
>>>
>>> Chris
>>> _______________________________________________
>>> rtems-devel mailing list
>>> rtems-devel at rtems.org
>>> http://www.rtems.org/mailman/listinfo/rtems-devel
>> _______________________________________________
>> rtems-devel mailing list
>> rtems-devel at rtems.org
>> http://www.rtems.org/mailman/listinfo/rtems-devel