Coverage analysis update

Thu Jul 6 16:11:56 UTC 2017

On 6 July 2017 at 15:34, Joel Sherrill <joel at rtems.org> wrote:
>
>
> On Thu, Jul 6, 2017 at 5:53 AM, Cillian O'Donnell <cpodonnell8 at gmail.com>
> wrote:
>>
>> On 6 July 2017 at 08:05, Chris Johns <chrisj at rtems.org> wrote:
>> > On 05/07/2017 22:46, Cillian O'Donnell wrote:
>> >>
>> >> Now the thing is, I might have to run RTEMS Tester about 3 times for
>> >> the full testsuite before I can get the report, as the covoar runs
>> >> will often hang on an arbitrary method and have to be manually exited
>> >> and re-run.
>> >
>> > How is covoar being run? Is it being run by the 'rtems-test' command?
>>
>> Yes covoar runs after the testsuite if a --coverage flag is included
>> with rtems-test command.
>> >
>> >> The point that this happens is not repeatable and there's
>> >> no error message to go on.
>> >
>> > Is it the covoar process that is stopping? Is it the process idle or
>> > using 100%?
>>
>> Covoar locks up and becomes idle. Joel mentioned in the IRC meeting
>> he's seen it before and I'll need to track down the exe causing it and
>> debug it to get rid of this
>
>
> There must be a misunderstanding. I have never seen covoar lock up
> like this.

Oh sorry, you must of just meant debugging covoar in general, picked
it up wrong.
>
> It will ignore records when it thinks things are inconsistent. This can
> occur
> when a method appears in two different executables and has different
> sizes. The cause of this is usually padding at the end of the method so
> the subsequent method in memory starts on a nice cache-line alignment.
> The code tries to recognize the nops/padding at the end and ignore them.
> When the padding inserted by ld changes or the objdump output being
> parsed changes, covoar needs to be adjusted.
>
> The ignored record message I saw is in the code that reads Couverture
> trace records. The info in the record appears to be inconsistent with the
> code to which it has been matched. This could be because the trace
> record format has been changed or a simple bug.
>
> From what I have seen, it is likely that it is primarily bugs in input
> processing.
>
> But the code to do the "big union" of the trace data has to iterate over
> all tests and all coverage blocks. It can take a while to run. I guess
> there could be an infinite loop in it.  I always debugged it by having
> a test case I could run in gdb.
>
>>
>> >
>> >> The only common theme is the methods will
>> >> often be called thread...something. Even more rarely it will stop with
>> >> the message 'resource temporarily unavailable'. I'm wondering is this
>> >> normal behaviour for covoar on a long test-run? This never happens
>> >> when running just the samples.
>> >
>> > It is not normal behavior.
>
>
> +1
>
> Is Linux or covoar printing the message? It holds all data in memory while
> doing the analysis. Is it consuming a lot of RAM?

I done two rtems-test coverage runs there while monitoring 'top'
output. The first run is successful and coverage  report is generated
and the second run produced IO Error: Resource temporarily unavailable
on the second. Both times theres about 3.5-4GB RAM free and CPU runs
at 70-80% so nothing seems to be strained there.
Timecounter_Tick_simple is being checked before it happens. The output
is

4000c9d8:    0a 80 00 03     bcs  4000c9e4 <_Timecounter_Tick_simple+0x78>
4000c9dc:    84 01 20 01     add  %g4, 1, %g2

    bt = th->th_offset;
4000c9e0:    84 10 00 04     mov  %g4, %g2
    _bt->sec += _bt2->sec;
4000c9e4:    fa 06 e0 a0     ld  [ %i3 + 0xa0 ], %i5
4000c9e8:    84 00 80 1d     add  %g2, %i5, %g2
    _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >> 32;
4000c9ec:    39 00 03 d0     sethi  %hi(0xf4000), %i4
    _ts->tv_nsec = ((uint64_t)1000000000 *
4000c9f0:    3b 0e e6 b2     sethi  %hi(0x3b9ac800), %i5
    _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >> 32;
4000c9f4:    b8 17 22 40     or  %i4, 0x240, %i4
    _ts->tv_nsec = ((uint64_t)1000000000 *
4000c9f8:    ba 17 62 00     or  %i5, 0x200, %i5
    _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32))
>> 32;Exception in thread _stderr[c]:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/execute.py",
line 206, in _readthread
    _output_line(line, exe, prefix, out, count)
  File "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/execute.py",
line 184, in _output_line
    out(prefix + line)
  File "/home/cpod/development/rtems/test/rtems-tools/tester/rt/coverage.py",
line 387, in output_handler
    log.notice(text, stdout_only = False)
  File "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/log.py",
line 104, in notice
    print(l)
IOError: [Errno 11] Resource temporarily unavailable

>
>>
>> >> When a test run is successful it will also finish with a lot of these
>> >> messages:
>> >>
>> >> *** Trace block is inconsistent with coverage map
>> >> *** Trace block (0x4000c4fc - 0x4000c51f) for 36 bytes
>> >> *** Coverage map
>> >> /home/cpod/coverage_test/test/coverage/unlimited.exe.cov
>> >> INFO: DesiredSymbols::mergeCoverageMap - Unable to merge coverage map
>> >> for _Thread_queue_Extract_with_proxy because the sizes are different
>> >
>> > I wonder what sizes are different?
>>
>> For this one, Joel mentioned it could be something to do with nops
>> being added or space not matching and to take a look at _Thread_
>> queue_Extract_with_proxy  in the objdump and see if I can dig up
>> anything from that. That message appears for _Thread_
>> queue_Extract_with_proxy  on all tests from base_sp.exe to
>> unlimited.exe when running the full testsuite and on a few instances
>> it also mentions unable to merge coverage map for TOD_TICKS_PER_SECOND
>> for the same reason. The mismatch is always 36 bytes.
>
>
> You need to find the two executables it differs on and see why the size
> is different. It processes them from first to last on the command line so
> find the first one that has the symbol and compare the dump to that
> one it is flagging as different.

Will the 2 executables be say for example unlimited.exe from the build
tree and the temporarily copied covoar executable unlimited.exe.cov?
So libscore.a info matches the original unlimited.exe and something
has changed in the covoar copies?
>>
>>
>> Also I wanted to mention, there is quite a bit of work from the 2
>> previous students on the integration of the coverage work into RTEMS
>> Tester, would you be interested in starting the merging process of
>> this. It might take a while to get everything reviewed and make any
>> necessary changes. At this point it is working. its just fixing bugs,
>> which I'm working on at the moment.
>
>
> Chris.. I suggested this. It seems the basic flow and integration of
> coverage into the tester is good enough to review and merge.
>
> --joel
>
>>
>>
>> Thanks,
>>
>> Cillian.
>> >
>> > Chris
>> _______________________________________________
>> devel mailing list
>> devel at rtems.org
>> http://lists.rtems.org/mailman/listinfo/devel
>
>