<br><br><div class="gmail_quote">2012/8/7 Gedare Bloom <span dir="ltr"><<a href="mailto:gedare@rtems.org" target="_blank">gedare@rtems.org</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br><br><div class="gmail_quote"><div class="im">On Tue, Aug 7, 2012 at 10:06 AM, Joel Sherrill <span dir="ltr"><<a href="mailto:joel.sherrill@oarcorp.com" target="_blank">joel.sherrill@oarcorp.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><div>
On 08/07/2012 06:30 AM, Ashi wrote:
<blockquote type="cite">Hi, all.<br>
Here is a short description of my first approach to the POSIX Key
project and of an alternative approach. I haven't decided whether I
should proceed to implement the alternative, and need discussion.<br>
<br>
</blockquote></div>
Just to remind everyone, both of these approaches are better than
the current approach,<br>
which is simply broken for unlimited tasks/threads.<div><br>
<blockquote type="cite">- Current approach: one rbtree approach<br>
In this approach, there is only one rbtree, which is used to manage
all POSIX Keys' data. If there are m POSIX Keys and t POSIX
Threads, and each Thread has m Key values, then there are n (n = m
x t) nodes in the global rbtree. The worst-case runtimes of
key deletion, thread deletion, key setspecific and key getspecific
are all O(lg(n)), and the key creation operation has a constant
runtime, O(1). This approach is already implemented, in my
first pull request[0].<br>
<br></blockquote></div></div></blockquote></div><div>Is thread deletion actually O( lg(n) + m ) or O( m * lg(n) ) depending on how you implement iterating through the deleted thread's keys?<br><br></div></div></blockquote>
<div>Yeah, after re-checking the keyrundestructor() function, its runtime should be O( m * lg(n) ).<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div>I have made some more comments on your pull request. <br></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div>
</div><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><blockquote type="cite">
</blockquote></div>
IMO create and delete times are not as critical as set/get. <br>
<br>
This approach is simpler, and that is generally more desirable.
Neither approach completely<br>
avoids the fact that, when pushed far enough, it breaks
down. Can we put<br>
a rough per-m-and-t instruction cost on set and get? For example,
if this were a simple loop<br>
over a chain, we would be fetching memory, checking for end of chain,
looking for a match, etc.<br>
We can count the memory accesses and rough instruction count (e.g.
cmp, branch, etc.).<br>
<br>
In practical terms, the apparent growth in execution time per
added "unit" may be<br>
fairly small and not critical until 100s or 1000s of tasks/keys
are in the system. Our<br>
expected "normal worst case" is likely a few (2-3) keys and 100
tasks. That would be a<br>
huge system in my experience. Most systems are likely to be only 1
key and a few dozen<br>
tasks at most.<div><br></div></div></blockquote></div><div>I agree. Maybe you can get cycle counts for the key operations from some timing test? Is there one already in the tree? If not, you could write one in psxtiming.<br>
</div></div></blockquote><div>There are 2 benchmark tests in psxtmtests:<br>psxtmkey01 tests key create and key delete<br>psxtmkey02 tests key setspecific and key getspecific<br><br>Are these enough for runtime testing, or should I add runtime tests under multiple keys and multiple threads?<br>
</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><div>
<br></div><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div>
<blockquote type="cite">- Alternative approach: one rbtree per thread approach<br>
Suppose there are again m POSIX Keys and t POSIX Threads in the
system; each thread maintains one rbtree, and all key data of a
given thread is in that rbtree. For example, if each thread
has m Key values, then there are m nodes in each thread's rbtree.
Gedare suggested that if there is no clear advantage over the current
approach, it may not be worth implementing this. So I tried
to figure out the advantages of this approach over the current one. The
runtime of this approach:<br>
(here n = m x t, there are m keys and t threads)<br>
<br>
Key create: O(1), create one key<br>
Key delete: O(tlg(m)); when a key is deleted, we must search each
thread's rbtree to delete that key's node, and there are t
rbtrees, each with m nodes.<br>
Key setspecific: O(lg(m)); insert one node into an rbtree of m-k
nodes, where k is a constant.<br>
Key getspecific: O(lg(m)); search an rbtree of m nodes<br>
Thread delete: O(lg(m)); traverse the rbtree of m nodes to release
each node's data<br></blockquote></div></div></blockquote></div><div>I believe thread delete is O(m). Deleting a thread must require at least O(m) to deal with all of the keys of that thread when the keys are managed separately (i.e. no "bulk delete" operation is available).<br>
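A minimal sketch of that walk (hypothetical node type and names, not the actual RTEMS structures): every one of the thread's m key values is visited exactly once, so the work is Theta(m) regardless of the container's lookup cost.

```c
#include <stdlib.h>

/* Hypothetical per-thread key-value node; the real RTEMS code would
 * walk the thread's rbtree nodes rather than a linked list. */
typedef struct KeyValueNode {
  struct KeyValueNode *next;   /* stand-in for in-order tree traversal */
  void (*destructor)(void *);
  void *value;
} KeyValueNode;

/* Visit every node once, running its destructor and freeing it:
 * Theta(m) work for m key values. Returns the number visited. */
size_t run_key_destructors(KeyValueNode *head)
{
  size_t visited = 0;
  while (head != NULL) {
    KeyValueNode *next = head->next;
    if (head->destructor != NULL && head->value != NULL)
      head->destructor(head->value);
    free(head);
    head = next;
    ++visited;
  }
  return visited;
}
```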
</div></div></blockquote><div>Yeah, I made a mistake here; it should be O(m). <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div>
</div><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><blockquote type="cite">
<br>
Compared to the one-rbtree approach, the runtimes of Key setspecific,
getspecific and thread delete are better, though since lg(m) = lg(n/t) =
lg(n) - lg(t), not much better. But the key delete runtime
O(tlg(m)) is slower than O(lg(n)) once n grows beyond some n0, since
O(tlg(m)) = O(tlg(n/t)); I have drawn a comparison[1] between
tlg(n/t) and lg(n) to illustrate this. So this alternative approach has
both advantages and disadvantages. IMO, there seems to be
not much advantage over the current approach. However, I'm still not
sure whether I should give up on this alternative. Maybe I have
missed some advantage of this approach.<br>
<br>
</blockquote></div>
Don't lose this. Please write up the O(x) information on the
approaches in your project page<br>
in the RTEMS Wiki. Include any factors in our decision... etc<br>
<br></div></blockquote></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
This write up is very good. Please put a table of the
characteristics of each algorithm <br>
in O(), space, etc. Capture the same information for each approach.
<br>
<br>
This will really help in the future when we all ask the same
questions again. Trust me,<br>
we all have shorter memories than the life of this code.</div></blockquote></div></div></blockquote><div>I see, and I'd be glad to add this :) <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><br>
<br>
<blockquote type="cite">- BTW, there is also a hash approach discussed before;
however, its worst-case runtime is lg(n), and it may be unacceptable
for RTEMS, as <span>Sebastian suggested before[2].</span><br>
<br></blockquote></div></div></blockquote></div></div></blockquote><div>I also made a typo here, the runtime of hash approach is actually O(n). <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><blockquote type="cite">
</blockquote></div>
I think it also has the disadvantage that we would have to add hash
code to the RTEMS score,<br>
which is not there now. The other approaches reuse a score object.</div></blockquote></div></div></blockquote><div>I see. <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><br>
<blockquote type="cite">And another problem is how to determine or analyze the
memory overhead of these approaches on RTEMS. I've no idea about
that yet. Could anyone give me some advice?<br>
<span></span><br></blockquote></div></div></blockquote></div><div>This last point I think is important. My intuition is that having a separate rbtree for each thread will substantially bloat the memory overhead. One way to get a handle on this is to estimate the size of the structures that will be involved. That is, how many bytes in an rbtree header, how many in an rbtree node, and how many of each is created for your two approaches? This is pretty simple arithmetic and is also the way to determine how much workspace size to reserve for these structures.<br>
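That arithmetic could be sketched as follows (the header and node sizes are hypothetical placeholders; on the real target you would substitute sizeof() of the actual rbtree control and key-node types):

```c
#include <stddef.h>

/* Estimated workspace bytes for the single global rbtree: one tree
 * header plus one node per (key, thread) pair, i.e. m * t nodes. */
size_t one_tree_bytes(size_t m_keys, size_t t_threads,
                      size_t header_size, size_t node_size)
{
  return header_size + m_keys * t_threads * node_size;
}

/* Estimated bytes for one rbtree per thread: t headers plus the same
 * m * t nodes, so the overhead grows by (t - 1) extra headers. */
size_t per_thread_trees_bytes(size_t m_keys, size_t t_threads,
                              size_t header_size, size_t node_size)
{
  return t_threads * header_size + m_keys * t_threads * node_size;
}
```

With the "normal worst case" above (3 keys, 100 tasks) and, say, a 16-byte tree header, the per-thread variant costs 99 extra headers, roughly 1.5 KiB, on top of the node cost common to both approaches.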
<br></div></div></blockquote><div>OK, I'll count all the bytes and post the result later. <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="gmail_quote"><div>-Gedare<br><br> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="im"><div text="#000000" bgcolor="#FFFFFF">
<div><blockquote type="cite">
links:<br>
[0]:<a href="https://github.com/RTEMS/rtems/pull/3" target="_blank">https://github.com/RTEMS/rtems/pull/3</a><br>
[1]:<a href="https://docs.google.com/drawings/d/1P4B9ePAN57OGVywxDhsOVwne1JvGEmyPaGb0RvFolKo/edit" target="_blank">https://docs.google.com/drawings/d/1P4B9ePAN57OGVywxDhsOVwne1JvGEmyPaGb0RvFolKo/edit</a><br>
[2]:<a href="http://www.rtems.org/pipermail/rtems-devel/2012-May/001138.html" target="_blank">http://www.rtems.org/pipermail/rtems-devel/2012-May/001138.html</a><br>
-- <br>
Best wishes!<br>
Zhongwei Yao<br>
<br>
</blockquote>
<br>
<br>
</div><span><font color="#888888"><pre cols="72">--
Joel Sherrill, Ph.D. Director of Research& Development
<a href="mailto:joel.sherrill@OARcorp.com" target="_blank">joel.sherrill@OARcorp.com</a> On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available <a href="tel:%28256%29%20722-9985" value="+12567229985" target="_blank">(256) 722-9985</a>
</pre>
</font></span></div>
<br></div>_______________________________________________<br>
rtems-devel mailing list<br>
<a href="mailto:rtems-devel@rtems.org" target="_blank">rtems-devel@rtems.org</a><br>
<a href="http://www.rtems.org/mailman/listinfo/rtems-devel" target="_blank">http://www.rtems.org/mailman/listinfo/rtems-devel</a><br>
<br></blockquote></div><br>
</blockquote></div><br><br clear="all"><br>-- <br>Best wishes!<br>Zhongwei Yao<br><br>