[GSoC]use hash/map in POSIX key and Classic notepad
Joel Sherrill
joel.sherrill at OARcorp.com
Fri Apr 6 16:35:17 UTC 2012
On 04/06/2012 11:24 AM, 阿四 wrote:
>
>
> 2012/4/6 Gedare Bloom <gedare at rtems.org <mailto:gedare at rtems.org>>
>
> On Fri, Apr 6, 2012 at 11:27 AM, 阿四 <ashi08104 at gmail.com
> <mailto:ashi08104 at gmail.com>> wrote:
> >
> >
> > 2012/4/6 Gedare Bloom <gedare at rtems.org <mailto:gedare at rtems.org>>
> >>
> >> On Thu, Apr 5, 2012 at 12:52 PM, Joel Sherrill
> >> <joel.sherrill at oarcorp.com <mailto:joel.sherrill at oarcorp.com>>
> wrote:
> >> > On 04/05/2012 11:09 AM, Gedare Bloom wrote:
> >> >
> >> > On Thu, Apr 5, 2012 at 10:44 AM, 阿四 <ashi08104 at gmail.com
> <mailto:ashi08104 at gmail.com>> wrote:
> >> >
> >> > And another idea of hash + map, if the map part is a sub
> rbtree, then
> >> > the
> >> > size of hash table can be not too big, the search operation could
> >> > benifit
> >> > from the rbtree's O(lg(n)) speed, could it be a candidate of the
> >> > solution?
> >> >
> >> > You mean to use a hash-array with an rbtree to handle collisions?
> >> >
> >> > Yeah, I mean it.
> >> >
> >> > It
> >> > could work, and be faster on average than just using an
> rbtree without
> >> > losing any worst-case time. The space overhead might be quite
> a bit
> >> > more to have multiple rbtree control headers and also the extra
> >> > overhead of rbtree nodes. I think that any rbtree- or chain-based
> >> > approach shifts the problem of determining the number of Keys
> to be
> >> > the problem of determining the number of nodes, since we
> cannot assume
> >> > that the Key data has a node embedded inside of it.
> >> >
> >> > Gedare, I didn't quite understand this. Does rbtree approach
> has the
> >> > problem
> >> > of determining the number of nodes? I think when the key_create()
> >> > function
> >> > called, only a empty rbtree create. And when
> key_setspecific() function
> >> > called, a node is added to rbtree. Or you mean create a
> 'proper' big
> >> > rbtree
> >> > when key_create(), which frontloads the insert node cost to
> key_create()
> >> > as
> >> > you mention above? I understand when using the hash approach,
> determing
> >> > the
> >> > number of slots(I'm confused with the Keys you use here, I
> guess Keys
> >> > here
> >> > is the slots in hash table) of hash table is the problem of
> determining
> >> > the
> >> > number of nodes.
> >> >
> >> > It depends on the restrictions on key_create and
> key_setspecific; if
> >> > creation is allowed to block/fail then dynamic allocation
> might be ok.
> >> > I haven't looked closely enough to tell. In that case you can
> create
> >> > an rbtree on-the-fly by allocating it when there is a
> collision and
> >> > inserting the colliding nodes to it. You could also just as
> easily use
> >> > a linked-list (chain). The colliding nodes will need to be
> embedded
> >> > inside some other structure like
> >> > struct {
> >> > RBTree_Node r; // or Chain_Node
> >> > void * user_data;
> >> > }
> >> >
> >> > this structure would have to be allocated as well whenever
> inserting a
> >> > new key (setspecific). We need to better define the
> restrictions on
> >> > the keys in order to understand what kind of solutions we can
> use.
> >> > Restrictions means timing and memory constraints, whether
> operations
> >> > can block..maybe others.
> >> >
> >> >
> >> >
> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_setspecific.html
> >> > allows pthread_setspecific to return ENOMEM. Is that a hint?
> >> >
> >> OK; then it just matters how 'fast' we think it should be on
> avg or in
> >> worst-case.
> >>
> >> > I don't see why we don't just put a pointer to the Key in the
> tcb.
> >> >
> >> > Does it mean delete the POSIX Key data structure, and let the
> Thread
> >> > Specific Data be managed in TCB?
> >> >
> >> > Yeah. We could define generic thread-specific data at the
> score level.
> >> > I'm not sure if this is a good idea.
> >> >
> >> > It isn't if generic. It could be part of the POSIX API thread
> extension
> >> > structure.
> >> >
> >> > But it helps with unlimited threads but not with unlimited keys
> >> > if you use a simple array.
> >> >
> >> > But a hash/rbtree for "keys used by this thread" might be an
> >> > implementation
> >> > option. That is different than "threads using this key" which
> puts
> >> > the management structure in keys.
> >> >
> >> Yes that makes sense. I could see having an rbtree of the keys in a
> >> posix thread extension. This would also avoid problems with
> collisions
> >> between threads, especially since threads are allowed to
> specify the
> >> same key for different data. Holding the keys in separate
> per-thread
> >> structures seems the best solution. If the number of keys is
> expected
> >> to be small then a linked list/chain would be fine too.
> >>
> > Gedare, I haven't thought about the problem of collisions about
> same thread
> > specify the same key for different data. In current
> implementation, if the
> > same thread specify different data in the same key by
> pthread_setspecific(),
> > the old data is replaced by new data, isn't it? And is it
> allowed? since it
> > would lead to memory leak. However, I find this project can
> support this
> > kind operation by adding some index.
> >
> I was unclear: different threads can use the same key for different
> data. So you must be able to distinguish them. This would be hard if
> all keys are managed in a single structure, but if there is a
> structure of keys per-thread it is less difficult.
>
>
> Sorry, Gedare, I feel a structure of keys per-thread would make thing
> easy. However, I'm still confused with the hard of all keys are
> managed in a single structure. I need some help to make myself clear
> about key manager.
>
> First, in current implementation I know the the key is identified by
> the_key->Object.id (I copy it from keycreate.c, line 96), and
> the_key's type is Key_Control, namely, the key is identified by a
> Object id, which is managed by Chain Manager. So keys can be
> distinguished by Object id. And for different thread data in the same
> key, they are distinguished by _Thread_Executing->Object.id (this is
> the current executing thread's TCB's object id). Then all things are
> distinguished in current implementation.
Each key has a unique id and is used for a unique purpose. Think of
it in terms of say a Java run-time where each Java thread is implemented
as a POSIX thread. The run-time would use a key to hold a pointer to
the Java run-time specific data.
So a key holds a pointer to the same type of data for the same purpose.
Each thread can have its own "instance of the library's context"
No thread will have two values for the same key simultaneously.
Each set operation overwrites the previous value:
The scenarios of interest are:
+ key create (what do do)
+ key delete (clean up all references to this key)
+ key get specific - should be very fast
+ key set specific - not so fast because generally only
done once when you initialize a thread to use
a specific library
+ thread delete - if key data is held in the TCB,
then it needs to be freed. Clean up generally.
>
> And in this project, like in the hash approach, I think different
> keys can be distinguished as before, and very key have a member
> pointer to hash table, and different thread's data in the same key is
> distinguish by its thread id (when in collisions, for different
> thread, their data can be distinguished by thread id again.) But for
> the different data of same thread in same key, I think they need extra
> information to be distinguished.
>
> Am I missed something?
> Or in this project, different keys can't be distinguished as before. I
> feel there's may be an unacceptiable cost of lookup with unlimited key
> configuration.
>
> Best Regard
> --zw_yao
>
>
> I suspect you can just use a chain (linked list) of keys for each
> thread, with the chain control (linked list head) somewhere the thread
> can get to easily. Converting the Key_Control to represent a single
> key instead of all keys might be an easy way to go; it already embeds
> a Chain_Node and some other metadata, and the Object allocation can
> deal with limits on numbers of keys.
>
> >> > You also have to consider the case of deleting threads and
> deleting
> >> > keys and what is required to clean up outstanding memory.
> >> > It is simple now. But it is possible that you have a "management
> >> > structure" in one object (key or thread) and a "clean up"
> structure
> >> > on the other.
> >> >
> >> > I am sure you don't want to do a scan of all thread/keys and
> >> > then see if it is impacted by a key/thread delete.
> >> >
> >> > I hope that makes sense.
> >> > structure/list
> >> >
> >> >
> >> > -Gedare
> >> >
> >> >
> >> >
> >> > --
> >> > Joel Sherrill, Ph.D. Director of Research&
> Development
> >> > joel.sherrill at OARcorp.com On-Line Applications Research
> >> > Ask me about RTEMS: a free RTOS Huntsville AL 35805
> >> > Support Available (256) 722-9985 <tel:%28256%29%20722-9985>
> >> >
> >
> >
>
>
--
Joel Sherrill, Ph.D. Director of Research& Development
joel.sherrill at OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20120406/42478cef/attachment-0001.html>
More information about the devel
mailing list