Coverage analysis update

Joel Sherrill joel at rtems.org
Thu Jul 6 22:11:40 UTC 2017


On Thu, Jul 6, 2017 at 4:14 PM, Cillian O'Donnell <cpodonnell8 at gmail.com>
wrote:

> On 6 July 2017 at 19:14, Joel Sherrill <joel at rtems.org> wrote:
> >
> >
> > On Thu, Jul 6, 2017 at 11:11 AM, Cillian O'Donnell <
> cpodonnell8 at gmail.com>
> > wrote:
> >>
> >> On 6 July 2017 at 15:34, Joel Sherrill <joel at rtems.org> wrote:
> >> >
> >> >
> >> > On Thu, Jul 6, 2017 at 5:53 AM, Cillian O'Donnell
> >> > <cpodonnell8 at gmail.com>
> >> > wrote:
> >> >>
> >> >> On 6 July 2017 at 08:05, Chris Johns <chrisj at rtems.org> wrote:
> >> >> > On 05/07/2017 22:46, Cillian O'Donnell wrote:
> >> >> >>
> >> >> >> Now the thing is, I might have to run RTEMS Tester about 3 times
> for
> >> >> >> the full testsuite before I can get the report, as the covoar runs
> >> >> >> will often hang on an arbitrary method and have to be manually
> >> >> >> exited
> >> >> >> and re-run.
> >> >> >
> >> >> > How is covoar being run? Is it being run by the 'rtems-test'
> command?
> >> >>
> >> >> Yes covoar runs after the testsuite if a --coverage flag is included
> >> >> with rtems-test command.
> >> >> >
> >> >> >> The point that this happens is not repeatable and there's
> >> >> >> no error message to go on.
> >> >> >
> >> >> > Is it the covoar process that is stopping? Is it the process idle
> or
> >> >> > using 100%?
> >> >>
> >> >> Covoar locks up and becomes idle. Joel mentioned in the IRC meeting
> >> >> he's seen it before and I'll need to track down the exe causing it
> and
> >> >> debug it to get rid of this
> >> >
> >> >
> >> > There must be a misunderstanding. I have never seen covoar lock up
> >> > like this.
> >>
> >> Oh sorry, you must of just meant debugging covoar in general, picked
> >> it up wrong.
> >> >
> >> > It will ignore records when it thinks things are inconsistent. This
> can
> >> > occur
> >> > when a method appears in two different executables and has different
> >> > sizes. The cause of this is usually padding at the end of the method
> so
> >> > the subsequent method in memory starts on a nice cache-line alignment.
> >> > The code tries to recognize the nops/padding at the end and ignore
> them.
> >> > When the padding inserted by ld changes or the objdump output being
> >> > parsed changes, covoar needs to be adjusted.
> >> >
> >> > The ignored record message I saw is in the code that reads Couverture
> >> > trace records. The info in the record appears to be inconsistent with
> >> > the
> >> > code to which it has been matched. This could be because the trace
> >> > record format has been changed or a simple bug.
> >> >
> >> > From what I have seen, it is likely that it is primarily bugs in input
> >> > processing.
> >> >
> >> > But the code to do the "big union" of the trace data has to iterate
> over
> >> > all tests and all coverage blocks. It can take a while to run. I guess
> >> > there could be an infinite loop in it.  I always debugged it by having
> >> > a test case I could run in gdb.
> >> >
> >> >>
> >> >> >
> >> >> >> The only common theme is the methods will
> >> >> >> often be called thread...something. Even more rarely it will stop
> >> >> >> with
> >> >> >> the message 'resource temporarily unavailable'. I'm wondering is
> >> >> >> this
> >> >> >> normal behaviour for covoar on a long test-run? This never happens
> >> >> >> when running just the samples.
> >> >> >
> >> >> > It is not normal behavior.
> >> >
> >> >
> >> > +1
> >> >
> >> > Is Linux or covoar printing the message? It holds all data in memory
> >> > while
> >> > doing the analysis. Is it consuming a lot of RAM?
> >>
> >> I done two rtems-test coverage runs there while monitoring 'top'
> >> output. The first run is successful and coverage  report is generated
> >> and the second run produced IO Error: Resource temporarily unavailable
> >> on the second. Both times theres about 3.5-4GB RAM free and CPU runs
> >> at 70-80% so nothing seems to be strained there.
> >> Timecounter_Tick_simple is being checked before it happens. The output
> >> is
> >>
> >> 4000c9d8:    0a 80 00 03     bcs  4000c9e4 <_Timecounter_Tick_simple+
> 0x78>
> >> 4000c9dc:    84 01 20 01     add  %g4, 1, %g2
> >>
> >>     bt = th->th_offset;
> >> 4000c9e0:    84 10 00 04     mov  %g4, %g2
> >>     _bt->sec += _bt2->sec;
> >> 4000c9e4:    fa 06 e0 a0     ld  [ %i3 + 0xa0 ], %i5
> >> 4000c9e8:    84 00 80 1d     add  %g2, %i5, %g2
> >>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >>
> >> 32;
> >> 4000c9ec:    39 00 03 d0     sethi  %hi(0xf4000), %i4
> >>     _ts->tv_nsec = ((uint64_t)1000000000 *
> >> 4000c9f0:    3b 0e e6 b2     sethi  %hi(0x3b9ac800), %i5
> >>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32)) >>
> >> 32;
> >> 4000c9f4:    b8 17 22 40     or  %i4, 0x240, %i4
> >>     _ts->tv_nsec = ((uint64_t)1000000000 *
> >> 4000c9f8:    ba 17 62 00     or  %i5, 0x200, %i5
> >>     _tv->tv_usec = ((uint64_t)1000000 * (uint32_t)(_bt->frac >> 32))
> >> >> 32;Exception in thread _stderr[c]:
> >> Traceback (most recent call last):
> >>   File "/usr/lib/python2.7/threading.py", line 801, in
> __bootstrap_inner
> >>     self.run()
> >>   File "/usr/lib/python2.7/threading.py", line 754, in run
> >>     self.__target(*self.__args, **self.__kwargs)
> >>   File
> >> "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/
> execute.py",
> >> line 206, in _readthread
> >>     _output_line(line, exe, prefix, out, count)
> >>   File
> >> "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/
> execute.py",
> >> line 184, in _output_line
> >>     out(prefix + line)
> >>   File
> >> "/home/cpod/development/rtems/test/rtems-tools/tester/rt/coverage.py",
> >> line 387, in output_handler
> >>     log.notice(text, stdout_only = False)
> >>   File
> >> "/home/cpod/development/rtems/test/rtems-tools/rtemstoolkit/log.py",
> >> line 104, in notice
> >>     print(l)
> >> IOError: [Errno 11] Resource temporarily unavailable
> >
> >
> > This looks like qemu is running too long with an RTEMS executable. This
> > stack trace is RTEMS code. Not covoar code. Please check what executable
> > is consuming so much run-time before it is killed. This will change the
> > solution. :)
> >
> > covoar spends a lot of time iterating over sets of containers and
> > executables.
> > There could be an infinite loop in it but it is using STL containers for
> > everything and is single threaded so I rather doubt it would go into an
> > infinite loop.
>
> The executable running here is sprmsched01.exe I produced it again and
> it was sp22.exe nnd then again with psxaio01.exe so it doesn't seem to
> pin point a single test repeatably.
>

The rtems tester tries to run the tests in parallel. I wonder if this trips
some conflict which we are not aware of.

Chris.. Are there arguments to make it single-thread the qemu invocations
as a test?


> >
> >>
> >>
> >> >
> >> >>
> >> >> >> When a test run is successful it will also finish with a lot of
> >> >> >> these
> >> >> >> messages:
> >> >> >>
> >> >> >> *** Trace block is inconsistent with coverage map
> >> >> >> *** Trace block (0x4000c4fc - 0x4000c51f) for 36 bytes
> >> >> >> *** Coverage map
> >> >> >> /home/cpod/coverage_test/test/coverage/unlimited.exe.cov
> >> >> >> INFO: DesiredSymbols::mergeCoverageMap - Unable to merge coverage
> >> >> >> map
> >> >> >> for _Thread_queue_Extract_with_proxy because the sizes are
> different
> >> >> >
> >> >> > I wonder what sizes are different?
> >> >>
> >> >> For this one, Joel mentioned it could be something to do with nops
> >> >> being added or space not matching and to take a look at _Thread_
> >> >> queue_Extract_with_proxy  in the objdump and see if I can dig up
> >> >> anything from that. That message appears for _Thread_
> >> >> queue_Extract_with_proxy  on all tests from base_sp.exe to
> >> >> unlimited.exe when running the full testsuite and on a few instances
> >> >> it also mentions unable to merge coverage map for
> TOD_TICKS_PER_SECOND
> >> >> for the same reason. The mismatch is always 36 bytes.
> >> >
> >> >
> >> > You need to find the two executables it differs on and see why the
> size
> >> > is different. It processes them from first to last on the command line
> >> > so
> >> > find the first one that has the symbol and compare the dump to that
> >> > one it is flagging as different.
> >>
> >> Will the 2 executables be say for example unlimited.exe from the build
> >> tree and the temporarily copied covoar executable unlimited.exe.cov?
> >> So libscore.a info matches the original unlimited.exe and something
> >> has changed in the covoar copies?
> >
> >
> > No. It will be unlimited and one of the tests before it on the covoar
> > command line. In particular, it should be the first one with the symbol
> > in question. Compare the objdump's.
>
> Ok I just checked and the first instance is base_sp.exe, which is
> actually the first test run so this appears for every single test.
> Where we hoping to compare one without the problem to the one it first
> appears on?
>

Yep. covoar builds its "truth" of the method from the first one instance of
each symbol it sees. So this one symbol is coming from base_sp.exe
and there is a difference between that and the first one where the error
is reported.

The first instance or the error reporting one may be incorrectly processed
or formatted/padded in an unexpected way.


> >
> >
> >>
> >> >>
> >> >>
> >> >> Also I wanted to mention, there is quite a bit of work from the 2
> >> >> previous students on the integration of the coverage work into RTEMS
> >> >> Tester, would you be interested in starting the merging process of
> >> >> this. It might take a while to get everything reviewed and make any
> >> >> necessary changes. At this point it is working. its just fixing bugs,
> >> >> which I'm working on at the moment.
> >> >
> >> >
> >> > Chris.. I suggested this. It seems the basic flow and integration of
> >> > coverage into the tester is good enough to review and merge.
> >> >
> >> > --joel
> >> >
> >> >>
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Cillian.
> >> >> >
> >> >> > Chris
> >> >> _______________________________________________
> >> >> devel mailing list
> >> >> devel at rtems.org
> >> >> http://lists.rtems.org/mailman/listinfo/devel
> >> >
> >> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20170706/1daf7120/attachment-0002.html>


More information about the devel mailing list