[GSoC]Todo in POSIX Key project

Tue Aug 7 11:30:39 UTC 2012

Hi, all.
Here is a short description of my first approach to POSIX Key project and
an alternative approach. I haven't decided whether I should proceed to
implement the alternative, and need discussion.

- Current approach: one rbtree approach
In this approach, there is only one rbtree which is used to manage all
POSIX Keys' data. If there are m POSIX Keys and t POSIX Threads, and each
Thread has m Key values, then there is n( n = m x t ) nodes in the global
rbtree. And the worst-case runtime of key deletion, thread deletion, key
setspecific and key getspecific are all O(lg(n)), and the key creation
operation has a constant runtime O(1). This approach is implemented now,
which is in my first pull request[0].

- Alternative approach: one rbtree per thread approach
Suppose there are also m POSIX Keys and t POSIX Threads in the system, then
each thread maintains one rbtree, all key data of specific thread is in
that rbtree. For example, say if each thread has m Key values, then there
are m nodes in each thread's rbtree. Gedare suggested if there is no clear
advantage than current approach, it may be not worth trying to implement
this. So I try to figure out the advantage of this approach than current
one. The runtime of this approach:
(here n = m x t, there are m keys and t threads)

Key create: O(1), create one key
Key delete: O(tlg(m)), when key deleted, we must search each thread's
rbtree to delete the specific key's node, there are t rbtree and each has m
nodes.
Key setspecific: O(lg(m)), insert one node to m-k nodes rbtree, in which k
is constant.
Key getspecfic: O(lg(m)), search in m nodes rbtree
Thread delete: O(lg(m)), traverse the m nodes rbtree to release each node's
data

Compared to one rbtree approach, the runtime of Key setspecific,
getspecific and thread delete are better. Though lg(m) = lg(n/t) = lg(n) -
lg(t), not much better. But the key delete's runtime O(tlg(m)) is kinds of
slow than O(lg(n)) when n > n0, and O(tlg(m)) = O(tlg(n/t)), I also have
drawn a comparison[1] between tlg(n/t) and lg(n) to illustrate. So there
are both some advantage and disadvantage in this alternative approach. IMO,
there seems not much advantage than current approach. However, I'm still
not sure whether I should give up this alternative. Maybe I have missed
some advantage in this approach.

- BTW, there is also a hash approach discussed before, however, it's worst
case runtime is lg(n), and may be unacceptable to RTEMS as Sebastian
suggested before[2].

And another problem is how to know or analysis the memory overhead of these
approaches on RTEMS. I've no idea about that yet. Could anyone give me some
advice?

links:
[0]:https://github.com/RTEMS/rtems/pull/3
[1]:
https://docs.google.com/drawings/d/1P4B9ePAN57OGVywxDhsOVwne1JvGEmyPaGb0RvFolKo/edit
[2]:http://www.rtems.org/pipermail/rtems-devel/2012-May/001138.html
-- 
Best wishes!
Zhongwei Yao
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20120807/d39909a9/attachment.html>