<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 6, 2017 at 4:14 PM, Cillian O'Donnell <span dir="ltr"><<a href="mailto:cpodonnell8@gmail.com" target="_blank">cpodonnell8@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 6 July 2017 at 19:14, Joel Sherrill <<a href="mailto:joel@rtems.org">joel@rtems.org</a>> wrote:<br>

><br>

><br>

> On Thu, Jul 6, 2017 at 11:11 AM, Cillian O'Donnell <<a href="mailto:cpodonnell8@gmail.com">cpodonnell8@gmail.com</a>><br>

> wrote:<br>

>><br>

>> On 6 July 2017 at 15:34, Joel Sherrill <<a href="mailto:joel@rtems.org">joel@rtems.org</a>> wrote:<br>

>> ><br>

>> ><br>

>> > On Thu, Jul 6, 2017 at 5:53 AM, Cillian O'Donnell<br>

>> > <<a href="mailto:cpodonnell8@gmail.com">cpodonnell8@gmail.com</a>><br>

>> > wrote:<br>

>> >><br>

>> >> On 6 July 2017 at 08:05, Chris Johns <<a href="mailto:chrisj@rtems.org">chrisj@rtems.org</a>> wrote:<br>

>> >> > On 05/07/2017 22:46, Cillian O'Donnell wrote:<br>

>> >> >><br>

>> >> >> Now the thing is, I might have to run RTEMS Tester about 3 times for<br>

>> >> >> the full testsuite before I can get the report, as the covoar runs<br>

>> >> >> will often hang on an arbitrary method and have to be manually<br>

>> >> >> exited<br>

>> >> >> and re-run.<br>

>> >> ><br>

>> >> > How is covoar being run? Is it being run by the 'rtems-test' command?<br>

>> >><br>

>> >> Yes covoar runs after the testsuite if a --coverage flag is included<br>

>> >> with rtems-test command.<br>

>> >> ><br>

>> >> >> The point that this happens is not repeatable and there's<br>

>> >> >> no error message to go on.<br>

>> >> ><br>

>> >> > Is it the covoar process that is stopping? Is it the process idle or<br>

>> >> > using 100%?<br>

>> >><br>

>> >> Covoar locks up and becomes idle. Joel mentioned in the IRC meeting<br>

>> >> he's seen it before and I'll need to track down the exe causing it and<br>

>> >> debug it to get rid of this<br>

>> ><br>

>> ><br>

>> > There must be a misunderstanding. I have never seen covoar lock up<br>

>> > like this.<br>

>><br>

>> Oh sorry, you must of just meant debugging covoar in general, picked<br>

>> it up wrong.<br>

>> ><br>

>> > It will ignore records when it thinks things are inconsistent. This can<br>

>> > occur<br>

>> > when a method appears in two different executables and has different<br>

>> > sizes. The cause of this is usually padding at the end of the method so<br>

>> > the subsequent method in memory starts on a nice cache-line alignment.<br>

>> > The code tries to recognize the nops/padding at the end and ignore them.<br>

>> > When the padding inserted by ld changes or the objdump output being<br>

>> > parsed changes, covoar needs to be adjusted.<br>

>> ><br>

>> > The ignored record message I saw is in the code that reads Couverture<br>

>> > trace records. The info in the record appears to be inconsistent with<br>

>> > the<br>

>> > code to which it has been matched. This could be because the trace<br>

>> > record format has been changed or a simple bug.<br>

>> ><br>

>> > From what I have seen, it is likely that it is primarily bugs in input<br>

>> > processing.<br>

>> ><br>

>> > But the code to do the "big union" of the trace data has to iterate over<br>

>> > all tests and all coverage blocks. It can take a while to run. I guess<br>

>> > there could be an infinite loop in it.  I always debugged it by having<br>

>> > a test case I could run in gdb.<br>

>> ><br>

>> >><br>

>> >> ><br>

>> >> >> The only common theme is the methods will<br>

>> >> >> often be called thread...something. Even more rarely it will stop<br>

>> >> >> with<br>

>> >> >> the message 'resource temporarily unavailable'. I'm wondering is<br>

>> >> >> this<br>

>> >> >> normal behaviour for covoar on a long test-run? This never happens<br>

>> >> >> when running just the samples.<br>

>> >> ><br>

>> >> > It is not normal behavior.<br>

>> ><br>

>> ><br>

>> > +1<br>

>> ><br>

>> > Is Linux or covoar printing the message? It holds all data in memory<br>

>> > while<br>

>> > doing the analysis. Is it consuming a lot of RAM?<br>

>><br>

>> I done two rtems-test coverage runs there while monitoring 'top'<br>

>> output. The first run is successful and coverage  report is generated<br>

>> and the second run produced IO Error: Resource temporarily unavailable<br>

>> on the second. Both times theres about 3.5-4GB RAM free and CPU runs<br>

>> at 70-80% so nothing seems to be strained there.<br>

>> Timecounter_Tick_simple is being checked before it happens. The output<br>

>> is<br>

>><br>

>> 4000c9d8:    0a 80 00 03     bcs  4000c9e4 <_Timecounter_Tick_simple+<wbr>0x78><br>

>> 4000c9dc:    84 01 20 01     add  %g4, 1, %g2<br>

>><br>

>>     bt = th->th_offset;<br>

>> 4000c9e0:    84 10 00 04     mov  %g4, %g2<br>

>>     _bt->sec += _bt2->sec;<br>

>> 4000c9e4:    fa 06 e0 a0     ld  [ %i3 + 0xa0 ], %i5<br>

>> 4000c9e8:    84 00 80 1d     add  %g2, %i5, %g2<br>

>>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >><br>

>> 32;<br>

>> 4000c9ec:    39 00 03 d0     sethi  %hi(0xf4000), %i4<br>

>>     _ts->tv_nsec = ((uint64_t)1000000000 *<br>

>> 4000c9f0:    3b 0e e6 b2     sethi  %hi(0x3b9ac800), %i5<br>

>>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >><br>

>> 32;<br>

>> 4000c9f4:    b8 17 22 40     or  %i4, 0x240, %i4<br>

>>     _ts->tv_nsec = ((uint64_t)1000000000 *<br>

>> 4000c9f8:    ba 17 62 00     or  %i5, 0x200, %i5<br>

>>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32))<br>

>> >> 32;Exception in thread _stderr[c]:<br>

>> Traceback (most recent call last):<br>

>>   File "/usr/lib/python2.7/threading.<wbr>py", line 801, in __bootstrap_inner<br>

>>     self.run()<br>

>>   File "/usr/lib/python2.7/threading.<wbr>py", line 754, in run<br>

>>     self.__target(*self.__args, **self.__kwargs)<br>

>>   File<br>

>> "/home/cpod/development/rtems/<wbr>test/rtems-tools/rtemstoolkit/<wbr>execute.py",<br>

>> line 206, in _readthread<br>

>>     _output_line(line, exe, prefix, out, count)<br>

>>   File<br>

>> "/home/cpod/development/rtems/<wbr>test/rtems-tools/rtemstoolkit/<wbr>execute.py",<br>

>> line 184, in _output_line<br>

>>     out(prefix + line)<br>

>>   File<br>

>> "/home/cpod/development/rtems/<wbr>test/rtems-tools/tester/rt/<wbr>coverage.py",<br>

>> line 387, in output_handler<br>

>>     log.notice(text, stdout_only = False)<br>

>>   File<br>

>> "/home/cpod/development/rtems/<wbr>test/rtems-tools/rtemstoolkit/<wbr>log.py",<br>

>> line 104, in notice<br>

>>     print(l)<br>

>> IOError: [Errno 11] Resource temporarily unavailable<br>

><br>

><br>

> This looks like qemu is running too long with an RTEMS executable. This<br>

> stack trace is RTEMS code. Not covoar code. Please check what executable<br>

> is consuming so much run-time before it is killed. This will change the<br>

> solution. :)<br>

><br>

> covoar spends a lot of time iterating over sets of containers and<br>

> executables.<br>

> There could be an infinite loop in it but it is using STL containers for<br>

> everything and is single threaded so I rather doubt it would go into an<br>

> infinite loop.<br>

<br>

</div></div>The executable running here is sprmsched01.exe I produced it again and<br>

it was sp22.exe nnd then again with psxaio01.exe so it doesn't seem to<br>

pin point a single test repeatably.<br></blockquote><div><br></div><div>The rtems tester tries to run the tests in parallel. I wonder if this trips</div><div>some conflict which we are not aware of.</div><div><br></div><div>Chris.. Are there arguments to make it single-thread the qemu invocations</div><div>as a test?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><div class="h5">><br>

>><br>

>><br>

>> ><br>

>> >><br>

>> >> >> When a test run is successful it will also finish with a lot of<br>

>> >> >> these<br>

>> >> >> messages:<br>

>> >> >><br>

>> >> >> *** Trace block is inconsistent with coverage map<br>

>> >> >> *** Trace block (0x4000c4fc - 0x4000c51f) for 36 bytes<br>

>> >> >> *** Coverage map<br>

>> >> >> /home/cpod/coverage_test/test/<wbr>coverage/unlimited.exe.cov<br>

>> >> >> INFO: DesiredSymbols::<wbr>mergeCoverageMap - Unable to merge coverage<br>

>> >> >> map<br>

>> >> >> for _Thread_queue_Extract_with_<wbr>proxy because the sizes are different<br>

>> >> ><br>

>> >> > I wonder what sizes are different?<br>

>> >><br>

>> >> For this one, Joel mentioned it could be something to do with nops<br>

>> >> being added or space not matching and to take a look at _Thread_<br>

>> >> queue_Extract_with_proxy  in the objdump and see if I can dig up<br>

>> >> anything from that. That message appears for _Thread_<br>

>> >> queue_Extract_with_proxy  on all tests from base_sp.exe to<br>

>> >> unlimited.exe when running the full testsuite and on a few instances<br>

>> >> it also mentions unable to merge coverage map for TOD_TICKS_PER_SECOND<br>

>> >> for the same reason. The mismatch is always 36 bytes.<br>

>> ><br>

>> ><br>

>> > You need to find the two executables it differs on and see why the size<br>

>> > is different. It processes them from first to last on the command line<br>

>> > so<br>

>> > find the first one that has the symbol and compare the dump to that<br>

>> > one it is flagging as different.<br>

>><br>

>> Will the 2 executables be say for example unlimited.exe from the build<br>

>> tree and the temporarily copied covoar executable unlimited.exe.cov?<br>

>> So libscore.a info matches the original unlimited.exe and something<br>

>> has changed in the covoar copies?<br>

><br>

><br>

> No. It will be unlimited and one of the tests before it on the covoar<br>

> command line. In particular, it should be the first one with the symbol<br>

> in question. Compare the objdump's.<br>

<br>

</div></div>Ok I just checked and the first instance is base_sp.exe, which is<br>

actually the first test run so this appears for every single test.<br>

Where we hoping to compare one without the problem to the one it first<br>

appears on?<br></blockquote><div><br></div><div>Yep. covoar builds its "truth" of the method from the first one instance of </div><div>each symbol it sees. So this one symbol is coming from base_sp.exe</div><div>and there is a difference between that and the first one where the error</div><div>is reported.</div><div><br></div><div>The first instance or the error reporting one may be incorrectly processed</div><div>or formatted/padded in an unexpected way.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5">><br>

><br>

>><br>

>> >><br>

>> >><br>

>> >> Also I wanted to mention, there is quite a bit of work from the 2<br>

>> >> previous students on the integration of the coverage work into RTEMS<br>

>> >> Tester, would you be interested in starting the merging process of<br>

>> >> this. It might take a while to get everything reviewed and make any<br>

>> >> necessary changes. At this point it is working. its just fixing bugs,<br>

>> >> which I'm working on at the moment.<br>

>> ><br>

>> ><br>

>> > Chris.. I suggested this. It seems the basic flow and integration of<br>

>> > coverage into the tester is good enough to review and merge.<br>

>> ><br>

>> > --joel<br>

>> ><br>

>> >><br>

>> >><br>

>> >> Thanks,<br>

>> >><br>

>> >> Cillian.<br>

>> >> ><br>

>> >> > Chris<br>

>> >> ______________________________<wbr>_________________<br>

>> >> devel mailing list<br>

>> >> <a href="mailto:devel@rtems.org">devel@rtems.org</a><br>

>> >> <a href="http://lists.rtems.org/mailman/listinfo/devel" rel="noreferrer" target="_blank">http://lists.rtems.org/<wbr>mailman/listinfo/devel</a><br>

>> ><br>

>> ><br>

><br>

><br>

</div></div></blockquote></div><br></div></div>