[GSoC]use hash/map in POSIX key and Classic notepad

Thu Apr 5 18:33:41 UTC 2012

On 04/05/2012 01:06 PM, Gedare Bloom wrote:
> On Thu, Apr 5, 2012 at 12:52 PM, Joel Sherrill
> <joel.sherrill at oarcorp.com>  wrote:
>> On 04/05/2012 11:09 AM, Gedare Bloom wrote:
>>
>> On Thu, Apr 5, 2012 at 10:44 AM, 阿四<ashi08104 at gmail.com>  wrote:
>>
>> And another idea of hash + map, if the map part is a sub rbtree, then
>> the
>> size of hash table can be not too big, the search operation could
>> benifit
>> from the rbtree's O(lg(n)) speed, could it be a candidate of the
>> solution?
>>
>> You mean to use a hash-array with an rbtree to handle collisions?
>>
>> Yeah, I mean it.
>>
>> It
>> could work, and be faster on average than just using an rbtree without
>> losing any worst-case time. The space overhead might be quite a bit
>> more to have multiple rbtree control headers and also the extra
>> overhead of rbtree nodes. I think that any rbtree- or chain-based
>> approach shifts the problem of determining the number of Keys to be
>> the problem of determining the number of nodes, since we cannot assume
>> that the Key data has a node embedded inside of it.
>>
>> Gedare, I didn't quite understand this. Does rbtree approach has the problem
>> of determining the number of nodes? I think when the key_create() function
>> called, only a empty rbtree create. And when key_setspecific() function
>> called, a node is added to rbtree. Or you mean create a 'proper' big rbtree
>> when key_create(), which frontloads the insert node cost to key_create() as
>> you mention above? I understand when using the hash approach, determing the
>> number of slots(I'm confused with the Keys you use here, I guess Keys here
>> is the slots in hash table) of hash table is the problem of determining the
>> number of nodes.
>>
>> It depends on the restrictions on key_create and key_setspecific; if
>> creation is allowed to block/fail then dynamic allocation might be ok.
>> I haven't looked closely enough to tell. In that case you can create
>> an rbtree on-the-fly by allocating it when there is a collision and
>> inserting the colliding nodes to it. You could also just as easily use
>> a linked-list (chain). The colliding nodes will need to be embedded
>> inside some other structure like
>> struct {
>>    RBTree_Node r; // or Chain_Node
>>    void * user_data;
>> }
>>
>> this structure would have to be allocated as well whenever inserting a
>> new key (setspecific).  We need to better define the restrictions on
>> the keys in order to understand what kind of solutions we can use.
>> Restrictions means timing and memory constraints, whether operations
>> can block..maybe others.
>>
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_setspecific.html
>> allows pthread_setspecific to return ENOMEM. Is that a hint?
>>
> OK; then it just matters how 'fast' we think it should be on avg or in
> worst-case.
>
>> I don't see why we don't just put a pointer to the Key in the tcb.
>>
>> Does it mean delete the POSIX Key data structure, and let the Thread
>> Specific Data be managed in TCB?
>>
>> Yeah. We could define generic thread-specific data at the score level.
>> I'm not sure if this is a good idea.
>>
>> It isn't if generic. It could be part of the POSIX API thread extension
>> structure.
>>
>> But it helps with unlimited threads but not with unlimited keys
>> if you use a simple array.
>>
>> But a hash/rbtree for "keys used by this thread" might be an implementation
>> option. That is different than "threads using this key" which puts
>> the management structure in keys.
>>
> Yes that makes sense. I could see having an rbtree of the keys in a
> posix thread extension. This would also avoid problems with collisions
> between threads, especially since threads are allowed to specify the
> same key for different data. Holding the keys in separate per-thread
> structures seems the best solution. If the number of keys is expected
> to be small then a linked list/chain would be fine too.
Key IDs are unique so would be good as hash tags.

Key lookup is assumed to be very fast. They are usually used
by libraries like language run-times which need a "per-thread"
context. Ada's POSIX implementation uses this unless you provide
a more optimized solution. Go could use this but went with
the Classic API per task variables which are lighter for access
but heavier on context switch.

Putting the key values in the POSIX API thread extension
requires you to consider the magnitude of deleting a key
and finding out who used it. The memory per thread would
have to be reclaimed.

Personally I could see the key value tracked on two structures.
>> You also have to consider the case of deleting threads and deleting
>> keys and what is required to clean up outstanding memory.
>> It is simple now. But it is possible that you have a "management
>> structure" in one object (key or thread) and a "clean up" structure
>> on the other.
>>
>> I am sure you don't want to do a scan of all thread/keys and
>> then see if it is impacted by a key/thread delete.
>>
>> I hope that makes sense.
>> structure/list
>>
>>
>> -Gedare
>>
>>
>>
>> --
>> Joel Sherrill, Ph.D.             Director of Research&   Development
>> joel.sherrill at OARcorp.com        On-Line Applications Research
>> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>      Support Available             (256) 722-9985
>>

-- 
Joel Sherrill, Ph.D.             Director of Research&   Development
joel.sherrill at OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
     Support Available             (256) 722-9985