[GSoC]use hash/map in POSIX key and Classic notepad

Wed Apr 4 15:40:04 UTC 2012

On Wed, Apr 4, 2012 at 10:42 AM, 阿四 <ashi08104 at gmail.com> wrote:
> Sorry for reply late! I spent some time to take a close look at hash and
> rbtree. Thanks for your idea!
>
>> On 2/04/12 4:21 AM, Joel Sherrill wrote:
>>>
>>>  From phone. I don't know how using rbtree would impact memory allocation
>>> in the sense of when allocations occur but since if the tree stays balanced,
>>> that may be a viable implementation.
>>>
>>> Or a hash... I know the problem and don't claim to know the perfect
>>> solution. :)
>>>
>>> But if you use thread ids as keys, you have known properties on the
>>> values being hashed or indexed.
>
>
> Yes, I realise use pthread_key_t as the values being hashed is wrong. Thread
> ids should be appropriate.
>
>
>>
>> Maybe this is something the project could focus on. Looking at both and
>> seeing how they stack up against each other.
>
>
>> Using the rbtree you don't need a hash since you can use thread ids
>> directly (I think?); using a hash you still need some structure to map
>> into for insert/search of key data, such as a tree or an array of
>> linked lists. The time complexity of the different map algorithms is
>> what matters most here. The usual hash+array solution is theoretically
>> quite bad in the worst case; leveraging properties about the threads
>> themselves might make it better. One problem that I see with using an
>> rbtree (or array+chain) is that the Key.Values will need to embed an
>> rbtree_node (or chain_node) along with the user data (void*).
>>
> A question about the time complexity of map algorithms in hash table: can we
> make any assumption about the number of Threads or Tasks that RTEMS can
> create when POSIX threads are configured as "unlimited"? Say the number
> can't be larger than some value. Then a hash talbe with enough slot can be
> created, and the runtime of map algorithm here may be not very important.
> However, if the hash table is too big, the problem of memory waste comes
> again. But I wonder compared to the exatra reserved memory in current
> implement of POSIX key, is memory waste of big hash table a big probem?
I don't think you can make such an assumption since unlimited objects
are not statically determined. As you point out the space wasted would
be quite large.

>
> And another idea of hash + map, if the map part is a sub rbtree, then the
> size of hash table can be not too big, the search operation could benifit
> from the rbtree's O(lg(n)) speed, could it be a candidate of the solution?
You mean to use a hash-array with an rbtree to handle collisions? It
could work, and be faster on average than just using an rbtree without
losing any worst-case time. The space overhead might be quite a bit
more to have multiple rbtree control headers and also the extra
overhead of rbtree nodes. I think that any rbtree- or chain-based
approach shifts the problem of determining the number of Keys to be
the problem of determining the number of nodes, since we cannot assume
that the Key data has a node embedded inside of it.

I don't see why we don't just put a pointer to the Key in the tcb.

-Gedare