[GSoC]Todo in POSIX Key project

Tue Aug 7 14:45:03 UTC 2012

On Tue, Aug 7, 2012 at 10:06 AM, Joel Sherrill <joel.sherrill at oarcorp.com>wrote:

>  On 08/07/2012 06:30 AM, Ashi wrote:
>
> Hi, all.
> Here is a short description of my first approach to POSIX Key project and
> an alternative approach. I haven't decided whether I should proceed to
> implement the alternative, and need discussion.
>
>  Just to remind everyone, both of these approaches are better than the
> current approach
> which is simply broken for unlimited task/threads.
>
> - Current approach: one rbtree approach
> In this approach, there is only one rbtree which is used to manage all
> POSIX Keys' data. If there are m POSIX Keys and t POSIX Threads, and each
> Thread has m Key values, then there is n( n = m x t ) nodes in the global
> rbtree. And the worst-case runtime of key deletion, thread deletion, key
> setspecific and key getspecific are all O(lg(n)), and the key creation
> operation has a constant runtime O(1). This approach is implemented now,
> which is in my first pull request[0].
>
> Is thread deletion actually O( lg(n) + m ) or O( m * lg(n) ) depending on
how you implement iterating through the deleted thread's keys?

I have made some more comments on your pull request.

>  IMO create and delete times are not as critical as set/get.
>
> This approach is simpler and that generally is more desirable. Neither
> approach completely
> resolves the nature that when pushed far enough, the approach gets pushed.
> Can we put
> a rough instruction per m & T cost on set and get?  For example, if this
> were a simple loop
> of a chain, we would be fetching memory, checking for end of chain,
> looking for match, etc.
> We can count the memory accesses and rough instruction count (e.g. cmp,
> branch, etc).
>
> In practical terms, the seemingly growth in execution time per "unit"
> increased may be
> fairly small and not be critical until 100s or 1000s of tasks/keys are in
> the system. Our
> expected "normal worst case" is likely a few (2-3) keys and 100 tasks.
> That would be a
> huge system in my experience. Most systems are likely to be only 1 key and
> a few dozen
> tasks at most.
>
> I agree. Maybe you can get cycle counts for the key operations from some
timing test? Is there one already in the tree? If not you could write one
in psxtiming.

> - Alternative approach: one rbtree per thread approach
> Suppose there are also m POSIX Keys and t POSIX Threads in the system,
> then each thread maintains one rbtree, all key data of specific thread is
> in that rbtree. For example, say if each thread has m Key values, then
> there are m nodes in each thread's rbtree. Gedare suggested if there is no
> clear advantage than current approach, it may be not worth trying to
> implement this. So I try to figure out the advantage of this approach than
> current one. The runtime of this approach:
> (here n = m x t, there are m keys and t threads)
>
> Key create: O(1), create one key
> Key delete: O(tlg(m)), when key deleted, we must search each thread's
> rbtree to delete the specific key's node, there are t rbtree and each has m
> nodes.
> Key setspecific: O(lg(m)), insert one node to m-k nodes rbtree, in which k
> is constant.
> Key getspecfic: O(lg(m)), search in m nodes rbtree
> Thread delete: O(lg(m)), traverse the m nodes rbtree to release each
> node's data
>
> I believe thread delete is O(m). Deleting a thread must require at least
O(m) to deal with all of the keys of that thread when the keys are managed
separately (i.e. no "bulk delete" operation is available).

>
> Compared to one rbtree approach, the runtime of Key setspecific,
> getspecific and thread delete are better. Though lg(m) = lg(n/t) = lg(n) -
> lg(t), not much better. But the key delete's runtime O(tlg(m)) is kinds of
> slow than O(lg(n)) when n > n0, and O(tlg(m)) = O(tlg(n/t)), I also have
> drawn a comparison[1] between tlg(n/t) and lg(n) to illustrate. So there
> are both some advantage and disadvantage in this alternative approach. IMO,
> there seems not much advantage than current approach. However, I'm still
> not sure whether I should give up this alternative. Maybe I have missed
> some advantage in this approach.
>
>  Don't lose this. Please write up the O(x) information on the approaches
> in your project page
> in the RTEMS Wiki. Include any factors in our decision... etc
>
> This write up is very good. Please put a table of the characteristics of
> each algorithm
> in O(), space, etc. Capture the same information for each approach.
>
> This will really help in the future when we all ask the same questions
> again. Trust me,
> we all have shorter memories than the life of this code.
>
>
> - BTW, there is also a hash approach discussed before, however, it's worst
> case runtime is lg(n), and may be unacceptable to RTEMS as Sebastian
> suggested before[2].
>
>  I think it also has the disadvantage that we would have to add hash code
> to RTEMS score
> which is not there now. The other approaches reuse a score object.
>
> And another problem is how to know or analysis the memory overhead of
> these approaches on RTEMS. I've no idea about that yet. Could anyone give
> me some advice?
>
> This last point I think is important. My intuition is that having a
separate rbtree for each thread will substantially bloat the memory
overhead. One way to get a handle on this is to estimate the size of the
structures that will be involved. That is, how many bytes in an rbtree
header, how many in an rbtree node, and how many of each is created for
your two approaches? This is pretty simple arithmetic and is also the way
to determine how much workspace size to reserve for these structures.

-Gedare

>
>    links:
> [0]:https://github.com/RTEMS/rtems/pull/3
> [1]:
> https://docs.google.com/drawings/d/1P4B9ePAN57OGVywxDhsOVwne1JvGEmyPaGb0RvFolKo/edit
> [2]:http://www.rtems.org/pipermail/rtems-devel/2012-May/001138.html
> --
> Best wishes!
> Zhongwei Yao
>
>
>
> --
> Joel Sherrill, Ph.D.             Director of Research&  Developmentjoel.sherrill at OARcorp.com        On-Line Applications Research
> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>     Support Available             (256) 722-9985
>
>
> _______________________________________________
> rtems-devel mailing list
> rtems-devel at rtems.org
> http://www.rtems.org/mailman/listinfo/rtems-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20120807/bb49cddc/attachment-0001.html>