POSIX key structure in rbtree approach

Gedare Bloom gedare at rtems.org
Tue May 8 17:41:42 UTC 2012


It occurred to me that you might also be able to put the
_RBTree_Control in the thread control block (TCB) of each thread. Then
you can store the keys per thread, and lookup time will be logarithmic
in the number of keys a particular thread uses.
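
For concreteness, a minimal sketch of what I mean; the names here
(POSIX_Keys_Per_thread_node, Key_tree) are placeholders rather than
existing code:

/* Per-thread node stored in a tree rooted in the TCB.  The tree is
 * ordered only by the key id, so lookup is logarithmic in the number
 * of keys this particular thread actually uses. */
typedef struct {
  rtems_rbtree_node Node;   /* link into the thread's key tree */
  pthread_key_t     key;    /* which key this value belongs to */
  void             *value;  /* the thread-specific value */
} POSIX_Keys_Per_thread_node;

/* The TCB itself would then just carry an rtems_rbtree_control
 * Key_tree rooting these nodes. */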

-Gedare

On Tue, May 8, 2012 at 11:06 AM, Gedare Bloom <gedare at rtems.org> wrote:
> On Tue, May 8, 2012 at 3:17 AM, Ashi <ashi08104 at gmail.com> wrote:
>>
>>
>> 2012/5/7 Gedare Bloom <gedare at rtems.org>
>>>
>>> On Mon, May 7, 2012 at 11:16 AM, Ashi <ashi08104 at gmail.com> wrote:
>>> > I'm terribly sorry for my late reply. I was off for the weekend.
>>> >
>>> Most of us are. :)
>>>
>>> > 2012/5/4 Gedare Bloom <gedare at rtems.org>
>>> >>
>>> >> On Fri, May 4, 2012 at 3:28 AM, Ashi <ashi08104 at gmail.com> wrote:
>>> >> > Hi, all. I'm working on the "Use hash or map in POSIX keys" project.
>>> >> > As Joel said, the first thing to be determined is what data goes with
>>> >> > each thread and what goes with each key. I find it difficult to design
>>> >> > the interface without a specific approach in mind. Since the rbtree
>>> >> > API is already available in RTEMS, I have tried to do the design using
>>> >> > the rbtree approach.
>>> >> >
>>> >> Good. The rbtree should support an efficient map operation.
>>> >>
>>> > Gedare, thanks for your reply!
>>> >
>>> >>
>>> >> > I find two data structures that need to be revised (or added):
>>> >> > POSIX_Keys_Control and key_node.
>>> >> >
>>> >> > typedef struct {
>>> >> >   Objects_Control    Object;
>>> >> >   void             (*destructor)( void * );
>>> >> >   key_node          *key_node_ptr;
>>> >> > } POSIX_Keys_Control;
>>> >> >
>>> >> > typedef struct {
>>> >> >   rtems_rbtree_node  Node;
>>> >> >   Objects_Id         thread_id;
>>> >> >   void             **value;
>>> >> > } key_node;
>>> >> >
>>> >> > POSIX_Keys_Control is used to manage all keys, and key_node is the
>>> >> > structure that contains the key data and the rbtree node.
>>> >> I think you might want an rtems_rbtree_control key_tree field in
>>> >> POSIX_Keys_Control instead of this key_node* dynamic array.  Then your
>>> >> code can add structs containing an rtems_rbtree_node dynamically to
>>> >> the key_tree.
>>> >
>>> > After reading rbtree.h in score, I see that I confused
>>> > rtems_rbtree_node with rtems_rbtree_control. It should be
>>> > rtems_rbtree_control here.
>>> >
>>> >> Also the key_node structure should follow a naming convention of
>>> >> POSIX_Keys_Key_node or something similar.
>>> >> API_Package_name_Struct_type_name is the general format for
>>> >> structures. API_Package_name_Method_name is the general format for
>>> >> functions. Note the mixture of lower and upper case letters.
>>> >
>>> > I see.
>>> >
>>> >>
>>> >> > I'm not quite sure about the role of the Objects_Control member in
>>> >> > the current implementation. I find that key = Object.id, where key
>>> >> > has type pthread_key_t, and _POSIX_Keys_Get(key, &location) is used
>>> >> > to find the key's corresponding the_key, which has type
>>> >> > POSIX_Keys_Control. So is that the only purpose of the Object member?
>>> >> >
>>> >> Objects_Control means the structure embeds an RTEMS Object so that the
>>> >> structure can be managed by the Object Manager, whose responsibilities
>>> >> include pre-allocating enough objects of a certain type to satisfy
>>> >> requests to Create those objects.
>>> >>
>>> >> It seems to me that the Key (key_node) should be the Object since the
>>> >> number of keys is the resource being managed. The present solution
>>> >> uses an array of values (one for each thread) because each thread can
>>> >> have its own version of a given key. The key is "looked up" by the
>>> >> Object Id of the Key and then the requesting thread.  What we should
>>> >> prefer is a more scalable solution that does not require
>>> >> pre-allocating an array of values for all possible threads.  Something
>>> >> like...
>>> >> typedef struct {
>>> >>   Objects_Control    Object;
>>> >>   void             (*destructor)( void * );
>>> >>   rtems_rbtree_node  Node;
>>> >>   Objects_Id         thread_id;
>>> >>   void             **value;
>>> >> } POSIX_Keys_Control;
>>> >>
>>> >> And maybe a global:
>>> >> rtems_rbtree_control POSIX_Keys_Tree;
>>> >
>>> >
>>> > I think this is a great new idea that I hadn't thought of before. Does
>>> > it mean a single POSIX_Keys_Tree contains all the nodes in the system?
>>> yes
>>>
>>> > In my design above, an rbtree contains only the data belonging to one
>>> > specific key, and I was thinking of also managing all the keys with an
>>> > rbtree. Then there would be a hierarchy: all keys are in one rbtree,
>>> > and each key's data is in its own rbtree. But I think this design is a
>>> > little complex, and your idea is more uniform. However, will these two
>>> > differ much in runtime?
>>> >
>>> Not sure, because I do not quite understand your design, but I get the
>>> feeling that your design will have a big problem of managing multiple
>>> rbtrees simultaneously. I have not tried to do that before. Both
>>> approaches seem like they would have similar (log-time) asymptotic
>>> behavior.
>>>
>> I mean all keys are managed in one global rbtree, in which each node has
>> an rtems_rbtree_control member, like:
>>
>> typedef struct {
>>   Objects_Control      Object;
>>   void               (*destructor)( void * );
>>   rtems_rbtree_node    Node;
>>   rtems_rbtree_control key_tree;
>> } POSIX_Keys_Control;
>>
>> and for each key, all of that key's data is managed in the key's own
>> rbtree, like:
>>
>> typedef struct {
>>   rtems_rbtree_node node;
>>   Objects_Id        thread_id;
>>   void             *value;
>> } POSIX_Keys_Key_node;
>>
>> However, this design seems more complex. And I don't know what could
>> lead to the problem of managing multiple rbtrees simultaneously; can you
>> give an example?
>>
> What do you mean "give some example"?
>
> I agree that the design of multiple rbtrees in a hierarchy is more
> complex. It would work, but it seems unnecessary here. A single rbtree
> of all the keys should be OK. Lookup/insert/delete times would all be
> O(lg n) where n is the total number of keys. In a hierarchical design
> the cost is lg k + lg t where k is the number of distinct keys and t
> is the number of tasks. The worst case for the single rbtree is when
> every key is used by every thread and n = k*t. Then
> lg(n) = lg(k*t) = lg(k) + lg(t), which is the worst case for the
> hierarchical approach. Granted there are cases where the hierarchical
> approach can do better, but in worst-case thinking the single rbtree
> approach is the best.
>
> Let's take an example. Consider a case where you have 4 distinct keys
> and 4 threads. If each key is only used by 1 thread then the
> worst-case lookup time for either approach is about lg(4) or 2
> branches traversed. Now suppose that 1 of the keys is used by all 4
> threads, 1 key is used by 2 threads, and the other 2 are used by 1
> thread each. Then the worst-case lookup time for the hierarchical
> approach is approx lg(4) + lg(4) = 4 for the key shared by all
> threads, and for the single rbtree it is lg(8) = 3. The expected cost
> for accessing an unshared key is about lg(4) = 2 for hierarchical and
> lg(8) = 3 for the single rbtree, so as I said sometimes the
> hierarchical can do better, but it does so by sacrificing the
> worst-case performance.
>
> Also, you should consider how the implementation can be done in the
> supercore layer and shared between POSIX Keys and Classic Task vars.
> Examples of sharing implementations between POSIX and Classic APIs are
> plentiful. See for example the corebarrier in score and the corresponding
> barrier interfaces in posix/classic.
>
> -Gedare
>
>> -zw_yao
>>>
>>> >>
>>> >> Then you would implement an rbtree compare function that combines the
>>> >> POSIX_Keys_Control.Object.id with POSIX_Keys_Control.thread_id as a
>>> >> "key" for the rbtree to use for storing each POSIX_Keys_Control.
>>> >>
>>> > I see. I can compare POSIX_Keys_Control.Object.id first and, if they
>>> > are equal, then compare POSIX_Keys_Control.thread_id.
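>>> >
>>> > Something like the following sketch, maybe?  The compare callback
>>> > signature and the container-of helper are assumptions on my part, not
>>> > the exact score rbtree API:
>>> >
>>> > #include <stddef.h>  /* offsetof */
>>> >
>>> > /* Hypothetical helper: recover the POSIX_Keys_Control that holds a
>>> >  * given rtems_rbtree_node. */
>>> > #define KEY_CONTROL_OF( rb ) \
>>> >   ( (const POSIX_Keys_Control *) \
>>> >       ( (const char *) (rb) - offsetof( POSIX_Keys_Control, Node ) ) )
>>> >
>>> > /* Order first by the key's Object id, then by the owning thread. */
>>> > static int _POSIX_Keys_Compare(
>>> >   const rtems_rbtree_node *left,
>>> >   const rtems_rbtree_node *right
>>> > )
>>> > {
>>> >   const POSIX_Keys_Control *l = KEY_CONTROL_OF( left );
>>> >   const POSIX_Keys_Control *r = KEY_CONTROL_OF( right );
>>> >
>>> >   if ( l->Object.id != r->Object.id )
>>> >     return ( l->Object.id < r->Object.id ) ? -1 : 1;
>>> >   if ( l->thread_id != r->thread_id )
>>> >     return ( l->thread_id < r->thread_id ) ? -1 : 1;
>>> >   return 0;
>>> > }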
>>> >
>>> >
>>> >> all key related operations are described as follows:
>>> >> - key create:
>>> >> create a POSIX_Keys_Control instance and initialize an empty rbtree,
>>> >> for which an rtems_rbtree_control instance will be created and an
>>> >> rbtree node key compare function is also needed. Thread_id is compared
>>> >> in the compare function. So this approach doesn't consider allocating
>>> >> the same data for the same thread under the same key, because only
>>> >> different Thread_ids are comparable. But I found something about
>>> >> duplicate nodes in Gedare's rbtree patch; maybe the duplicate feature
>>> >> can also be added into POSIX keys.
>>> >>
>>> >>
>>> >> You still want unique rbtree comparisons so that a thread can
>>> >> distinguish its own keys, unless it is valid for a thread to allocate
>>> >> multiple values with the same key?
>>> >>
>>> >> Key create should not create the red-black tree; the rbtree should be
>>> >> created during initialization of the POSIX_Keys manager---which by the
>>> >> way is improperly named right now as POSIX_Key_Manager_initialization
>>> >> and should be renamed POSIX_Keys_Manager_initialization with an s on
>>> >> Keys (or just POSIX_Keys_Initialization).
>>> >>
>>> > I didn't notice the POSIX_Key_Manager_initialization name before,
>>> > thanks. BTW, I have a question about _Objects_Initialize_information(),
>>> > which _POSIX_Key_Manager_initialization calls. I read the code of
>>> > _Objects_Initialize_information(), but I still don't know what the
>>> > maximum parameter is used for. It's not for determining the memory size
>>> > of all objects in one manager and then allocating that much memory, is
>>> > it?
>>> >
>>> maximum determines the total number of that kind of object. It is
>>> controlled by the corresponding CONFIGURE_MAXIMUM_XXX option, e.g.
>>> CONFIGURE_MAXIMUM_POSIX_KEYS.
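>>>
>>> For example (the value 8 here is just an illustration), the application
>>> configuration might contain:
>>>
>>>   #define CONFIGURE_MAXIMUM_POSIX_KEYS 8
>>>
>>> and the Object manager then reserves room for eight POSIX_Keys_Control
>>> objects when the Keys manager is initialized.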
>>>
>>> -Gedare
>>>
>>> >> > - key delete
>>> >> > delete all the nodes in the rbtree via rtems_rbtree_get_min() or
>>> >> > rtems_rbtree_get_max(), and then delete the RBTree_Control and the
>>> >> > POSIX_Keys_Control instance. Deallocation of the key data itself is
>>> >> > done by the user.
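>>> >> >
>>> >> > A rough sketch of that loop, assuming rtems_rbtree_get_min() removes
>>> >> > and returns the minimum node and yields NULL once the tree is empty:
>>> >> >
>>> >> > static void key_delete_all_nodes( rtems_rbtree_control *key_tree )
>>> >> > {
>>> >> >   rtems_rbtree_node *rb;
>>> >> >
>>> >> >   /* Drain the tree one node at a time.  Freeing the user data that
>>> >> >    * each value points to is left to the application, per the POSIX
>>> >> >    * key destructor semantics. */
>>> >> >   while ( ( rb = rtems_rbtree_get_min( key_tree ) ) != NULL ) {
>>> >> >     /* free the enclosing key_node here, e.g. via a container-of
>>> >> >      * style helper and the key manager's allocator */
>>> >> >   }
>>> >> > }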
>>> >> >
>>> >> > - key get specific
>>> >> > we can use _RBTree_Find() to do the actual work behind
>>> >> > get_specific(); the runtime is O(log(n)).
>>> >> >
>>> >> > - key set specific
>>> >> > after a proper key_node is created, we can use _RBTree_Insert() to do
>>> >> > the set specific. The runtime is the rbtree insert runtime, which is
>>> >> > O(log(n)).
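>>> >> >
>>> >> > A minimal sketch of how get specific could look with these
>>> >> > structures; the _RBTree_Find() signature, the search-node trick and
>>> >> > the container-of helper are assumptions on my part, and value is
>>> >> > treated as holding the thread-specific pointer directly:
>>> >> >
>>> >> > #include <stddef.h>  /* offsetof */
>>> >> >
>>> >> > /* Hypothetical container-of helper for key_node. */
>>> >> > #define KEY_NODE_OF( rb ) \
>>> >> >   ( (key_node *) ( (char *) (rb) - offsetof( key_node, Node ) ) )
>>> >> >
>>> >> > void *key_get_specific(
>>> >> >   rtems_rbtree_control *key_tree,
>>> >> >   Objects_Id            thread_id
>>> >> > )
>>> >> > {
>>> >> >   key_node           search;
>>> >> >   rtems_rbtree_node *found;
>>> >> >
>>> >> >   /* Fill in only the field the compare function examines, then do
>>> >> >    * the O(log n) lookup. */
>>> >> >   search.thread_id = thread_id;
>>> >> >   found = _RBTree_Find( key_tree, &search.Node );
>>> >> >   if ( found == NULL )
>>> >> >     return NULL;
>>> >> >
>>> >> >   /* Recover the enclosing key_node and hand back its value. */
>>> >> >   return (void *) KEY_NODE_OF( found )->value;
>>> >> > }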
>>> >> >
>>> >> > I'm not sure whether this design is appropriate and this design is
>>> >> > very simple, need many more work to do. Hope further discussion to
>>> >> > make it more clear!
>>> >> >
>>> >> Simple is good. I think you're on the right track other than my few
>>> >> comments about organizing the structures. You have the right ideas for
>>> >> how to implement the Keys functions I believe.
>>> >> -Gedare
>>> >
>>> > Yeah, it reminds me of the K.I.S.S. principle!
>>> >
>>> > -best regards!
>>> > -zw_yao
>>> >>
>>> >>
>>> >> > --best regards!
>>> >> > --zw_yao
>>> >> > _______________________________________________
>>> >> > rtems-devel mailing list
>>> >> > rtems-devel at rtems.org
>>> >> > http://www.rtems.org/mailman/listinfo/rtems-devel
>>> >
>>> >
>>
>>



