[GSoC]use hash/map in POSIX key and Classic notepad

Gedare Bloom gedare at rtems.org
Fri Apr 6 15:37:28 UTC 2012


On Fri, Apr 6, 2012 at 11:27 AM, 阿四 <ashi08104 at gmail.com> wrote:
>
>
> 2012/4/6 Gedare Bloom <gedare at rtems.org>
>>
>> On Thu, Apr 5, 2012 at 12:52 PM, Joel Sherrill
>> <joel.sherrill at oarcorp.com> wrote:
>> > On 04/05/2012 11:09 AM, Gedare Bloom wrote:
>> >
>> > On Thu, Apr 5, 2012 at 10:44 AM, 阿四 <ashi08104 at gmail.com> wrote:
>> >
>> > And another idea of hash + map, if the map part is a sub rbtree, then
>> > the
>> > size of hash table can be not too big, the search operation could
>> > benifit
>> > from the rbtree's O(lg(n)) speed, could it be a candidate of the
>> > solution?
>> >
>> > You mean to use a hash-array with an rbtree to handle collisions?
>> >
>> > Yeah, I mean it.
>> >
>> > It
>> > could work, and be faster on average than just using an rbtree without
>> > losing any worst-case time. The space overhead might be quite a bit
>> > more to have multiple rbtree control headers and also the extra
>> > overhead of rbtree nodes. I think that any rbtree- or chain-based
>> > approach shifts the problem of determining the number of Keys to be
>> > the problem of determining the number of nodes, since we cannot assume
>> > that the Key data has a node embedded inside of it.
>> >
>> > Gedare, I didn't quite understand this. Does rbtree approach has the
>> > problem
>> > of determining the number of nodes? I think when the key_create()
>> > function
>> > called, only a empty rbtree create. And when key_setspecific() function
>> > called, a node is added to rbtree. Or you mean create a 'proper' big
>> > rbtree
>> > when key_create(), which frontloads the insert node cost to key_create()
>> > as
>> > you mention above? I understand when using the hash approach, determing
>> > the
>> > number of slots(I'm confused with the Keys you use here, I guess Keys
>> > here
>> > is the slots in hash table) of hash table is the problem of determining
>> > the
>> > number of nodes.
>> >
>> > It depends on the restrictions on key_create and key_setspecific; if
>> > creation is allowed to block/fail then dynamic allocation might be ok.
>> > I haven't looked closely enough to tell. In that case you can create
>> > an rbtree on-the-fly by allocating it when there is a collision and
>> > inserting the colliding nodes to it. You could also just as easily use
>> > a linked-list (chain). The colliding nodes will need to be embedded
>> > inside some other structure like
>> > struct {
>> >   RBTree_Node r; // or Chain_Node
>> >   void * user_data;
>> > }
>> >
>> > this structure would have to be allocated as well whenever inserting a
>> > new key (setspecific).  We need to better define the restrictions on
>> > the keys in order to understand what kind of solutions we can use.
>> > Restrictions means timing and memory constraints, whether operations
>> > can block..maybe others.
>> >
>> >
>> > http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_setspecific.html
>> > allows pthread_setspecific to return ENOMEM. Is that a hint?
>> >
>> OK; then it just matters how 'fast' we think it should be on avg or in
>> worst-case.
>>
>> > I don't see why we don't just put a pointer to the Key in the tcb.
>> >
>> > Does it mean delete the POSIX Key data structure, and let the Thread
>> > Specific Data be managed in TCB?
>> >
>> > Yeah. We could define generic thread-specific data at the score level.
>> > I'm not sure if this is a good idea.
>> >
>> > It isn't if generic. It could be part of the POSIX API thread extension
>> > structure.
>> >
>> > But it helps with unlimited threads but not with unlimited keys
>> > if you use a simple array.
>> >
>> > But a hash/rbtree for "keys used by this thread" might be an
>> > implementation
>> > option. That is different than "threads using this key" which puts
>> > the management structure in keys.
>> >
>> Yes that makes sense. I could see having an rbtree of the keys in a
>> posix thread extension. This would also avoid problems with collisions
>> between threads, especially since threads are allowed to specify the
>> same key for different data. Holding the keys in separate per-thread
>> structures seems the best solution. If the number of keys is expected
>> to be small then a linked list/chain would be fine too.
>>
> Gedare, I haven't thought about the problem of collisions about same thread
> specify the same key for different data. In current implementation, if the
> same thread specify different data in the same key by pthread_setspecific(),
> the old data is replaced by new data, isn't it? And is it allowed? since it
> would lead to memory leak. However, I find this project can support this
> kind operation by adding some index.
>
I was unclear: different threads can use the same key for different
data. So you must be able to distinguish them. This would be hard if
all keys are managed in a single structure, but if there is a
structure of keys per-thread it is less difficult.

I suspect you can just use a chain (linked list) of keys for each
thread, with the chain control (linked list head) somewhere the thread
can get to easily. Converting the Key_Control to represent a single
key instead of all keys might be an easy way to go; it already embeds
a Chain_Node and some other metadata, and the Object allocation can
deal with limits on numbers of keys.

>> > You also have to consider the case of deleting threads and deleting
>> > keys and what is required to clean up outstanding memory.
>> > It is simple now. But it is possible that you have a "management
>> > structure" in one object (key or thread) and a "clean up" structure
>> > on the other.
>> >
>> > I am sure you don't want to do a scan of all thread/keys and
>> > then see if it is impacted by a key/thread delete.
>> >
>> > I hope that makes sense.
>> > structure/list
>> >
>> >
>> > -Gedare
>> >
>> >
>> >
>> > --
>> > Joel Sherrill, Ph.D.             Director of Research&  Development
>> > joel.sherrill at OARcorp.com        On-Line Applications Research
>> > Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>> >     Support Available             (256) 722-9985
>> >
>
>




More information about the users mailing list