covoar SIGKILL Investigation

Vijay Kumar Banerjee vijaykumar9597 at gmail.com
Thu Sep 13 08:38:50 UTC 2018


I have filed a ticket with all the information.

https://devel.rtems.org/ticket/3515

On Sat, 1 Sep 2018 at 00:35, Joel Sherrill <joel at rtems.org> wrote:

>
>
> On Fri, Aug 31, 2018 at 1:42 PM Vijay Kumar Banerjee <
> vijaykumar9597 at gmail.com> wrote:
>
>>
>>
>>
>> On Sat, 1 Sep 2018 at 00:01, Joel Sherrill <joel at rtems.org> wrote:
>>
>>>
>>>
>>> On Fri, Aug 31, 2018, 12:44 PM Vijay Kumar Banerjee <
>>> vijaykumar9597 at gmail.com> wrote:
>>>
>>>> Hello,
>>>> I have listed all the specific exes for which covoar fails.
>>>> covoar fails only for certain exes in certain symbol-sets, and that
>>>> is why it was not completing successfully. If we exclude these exes,
>>>> covoar finishes successfully for the whole testsuite. So the problem
>>>> is more specific now: we only have to investigate these exes.
>>>>
>>>> Here's the list of all the symbol-sets that were failing, along with
>>>> the exe files that made them
>>>> trip.
>>>>
>>>> -----------------------
>>>> posix ----
>>>> sptests/spmutex01.exe
>>>> fstests/fsclose01.exe
>>>>
>>>> sapi ----
>>>> sptests/spversion01.exe
>>>>
>>>> librfs ----
>>>> fstests/mrfs_fserror.exe
>>>> fstests/mrfs_fsrdwr.exe
>>>> (all fstests/mrfs_*.exe fail)
>>>> fstests/fsrofs01.exe
>>>> libtests/flashdisk01.exe
>>>>
>>>> libdevnull ----
>>>> sptests/sp21.exe
>>>> psxtests/psxmmap01.exe
>>>> tmtests/tm20.exe
>>>>
>>>> libblock ----
>>>> fstests/fsbdpart01.exe
>>>> ---------------------
>>>>
>>>
>>> Thanks.
>>>
>>> Have you had a chance to try gdb with catch throw?
>>>
>>  I'm trying it now with posix and fsclose01.exe, here's what I see.
>>
>
> I think we need Chris to wake up and give us some feedback. The throw is
> using rld::error.
>
> My guess is that something in those executables is not making rld happy.
> Frame #4 in the backtrace is the following line of code in covoar.cc:
>
>     // Load the objdump for the symbols in this executable.
>     objdumpProcessor->load( exe, objdumpFile, err );
>   }
>
> That's the loop loading the executables. Something in each of these exes
> is not making the underlying ELF/DWARF code happy.
>
> Hopefully Chris will wake up fresh on a nice Australian Saturday and
> have an answer. :)
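
To make the failing executables easier to pin down, the loading loop described above could catch the exception per executable and keep going. The following is only a minimal sketch under stated assumptions: `demo_error`, `demo_load`, and `load_all` are hypothetical names, `demo_error` is a stand-in and not the real rld::error, and the error message is invented for illustration.

```cpp
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the exception type thrown by the loader.
// This is NOT the real rld::error; it only models "an exception with a message".
struct demo_error : std::runtime_error {
  explicit demo_error(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical loader: throws for one executable to simulate the failure.
// The message text is made up for this sketch.
void demo_load(const std::string& exe) {
  if (exe == "fstests/fsclose01.exe")
    throw demo_error("no coverage map for symbol: wait");
}

// Run the loader over every executable and record the ones that throw,
// instead of letting the first exception end the whole run.
std::vector<std::string> load_all(
    const std::vector<std::string>& exes,
    const std::function<void(const std::string&)>& load) {
  std::vector<std::string> failed;
  for (const auto& exe : exes) {
    try {
      load(exe);
    } catch (const std::exception& e) {
      // Report which executable tripped the loader, then continue.
      std::cerr << "error: " << exe << ": " << e.what() << '\n';
      failed.push_back(exe);
    }
  }
  return failed;
}
```

Calling `load_all(exes, demo_load)` returns the names of the executables whose load threw, which is the same list that was collected by hand earlier in this thread. Whether the real loop can safely continue after a failed load is a separate question that this sketch does not answer.
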
>
>
>
>>
>> ------------------------------------
>> Catchpoint 1 (exception thrown), 0x00007ffff7ad7f51 in
>> __cxxabiv1::__cxa_throw (obj=obj at entry=0x832c80,
>>     tinfo=tinfo at entry=0x488a70 <typeinfo for rld::error>, dest=dest at entry=0x40f980
>> <rld::error::~error()>)
>>     at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:78
>> 78   PROBE2 (throw, obj, tinfo);
>> (gdb) bt
>> #0  0x00007ffff7ad7f51 in __cxxabiv1::__cxa_throw (obj=obj at entry=0x832c80,
>> tinfo=tinfo at entry=0x488a70 <typeinfo for rld::error>,
>>     dest=dest at entry=0x40f980 <rld::error::~error()>) at
>> ../../../../libstdc++-v3/libsupc++/eh_throw.cc:78
>> #1  0x0000000000405926 in Coverage::ExecutableInfo::findCoverageMap
>> (this=<optimized out>, symbolName="wait")
>> #2  0x000000000041bc1d in
>> Coverage::finalizeSymbol(Coverage::ExecutableInfo*,
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> >&,
>> std::__cxx11::list<Coverage::ObjdumpProcessor::objdumpLine_t,
>> std::allocator<Coverage::ObjdumpProcessor::objdumpLine_t> >)
>>     () at ../tester/covoar/ObjdumpProcessor.cc:35
>>
>> #3  0x000000000041c9e6 in
>> Coverage::ObjdumpProcessor::load(Coverage::ExecutableInfo*,
>> rld::process::tempfile&, rld::process::tempfile&) ()
>>     at /usr/include/c++/8/new:169
>> #4  0x000000000040e058 in covoar(int, char**) () at
>> ../tester/covoar/covoar.cc:381
>> #5  0x000000000040bacc in main () at ../tester/covoar/covoar.cc:563
>> #6  0x00007ffff70fd1bb in __libc_start_main (main=0x40b9b0 <main>,
>> argc=11, argv=0x7fffffffd4b8, init=<optimized out>, fini=<optimized out>,
>>     rtld_fini=<optimized out>, stack_end=0x7fffffffd4a8) at
>> ../csu/libc-start.c:308
>> #7  0x000000000040c32a in _start () at ../tester/covoar/covoar.cc:593
>>
>> ------------------------------------
>>
>>>
>>>>
>>>> On Wed, 22 Aug 2018 at 21:11, Vijay Kumar Banerjee <
>>>> vijaykumar9597 at gmail.com> wrote:
>>>>
>>>>> I will send the attachment offlist as it's too big for the list.
>>>>> On Wed, 22 Aug 2018 at 21:07, Vijay Kumar Banerjee <
>>>>> vijaykumar9597 at gmail.com> wrote:
>>>>>
>>>>>> The coverage for the whole testsuite completed successfully.
>>>>>> I have attached the zip file here and also included the js and css
>>>>>> files in it.
>>>>>>
>>>>>> Coverage didn't complete for the following symbol-set libraries;
>>>>>> covoar returned error 10:
>>>>>>
>>>>>> * libsapi.a
>>>>>> * libposix.a
>>>>>> * librfs.a
>>>>>> * libcsupport.a
>>>>>> * libdevnull.a
>>>>>> * libblock.a
>>>>>>
>>>>>>
>>>>>> On Wed, 22 Aug 2018 at 10:22, Chris Johns <chrisj at rtems.org> wrote:
>>>>>>
>>>>>>> On 22/08/2018 14:41, Joel Sherrill wrote:
>>>>>>> > On Tue, Aug 21, 2018, 10:26 PM Chris Johns <chrisj at rtems.org>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> >     On 22/08/2018 09:29, Joel Sherrill wrote:
>>>>>>> >     > On Tue, Aug 21, 2018, 4:05 PM Vijay Kumar Banerjee
>>>>>>> >     > <vijaykumar9597 at gmail.com> wrote:
>>>>>>> >     >     On Wed, 22 Aug 2018 at 01:55, Joel Sherrill
>>>>>>> >     >     <joel at rtems.org> wrote:
>>>>>>> >     >
>>>>>>> >     >         How long is covoar taking for the entire set?
>>>>>>> >     >
>>>>>>> >     >     It works great. this is what `time` says
>>>>>>> >     >     --------
>>>>>>> >     >     real    17m49.887s
>>>>>>> >     >     user    14m25.620s
>>>>>>> >     >     sys     0m37.847s
>>>>>>> >     >     --------
>>>>>>> >     >
>>>>>>> >     > What speed and type of processor do you have?
>>>>>>> >     >
>>>>>>> >
>>>>>>> >     The program is single threaded so the preprocessing of each
>>>>>>> executable is
>>>>>>> >     sequential. Memory usage is reasonable so there is no swapping.
>>>>>>> >
>>>>>>> >     Running covoar from the command line on a box with:
>>>>>>> >
>>>>>>> >      hw.machine: amd64
>>>>>>> >      hw.model: Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz
>>>>>>> >      hw.ncpu: 16
>>>>>>> >      hw.machine_arch: amd64
>>>>>>> >
>>>>>>> >     plus 32G of memory has a time of:
>>>>>>> >
>>>>>>> >           366.32 real       324.97 user        41.33 sys
>>>>>>> >
>>>>>>> >     The approximate time break down is:
>>>>>>> >
>>>>>>> >      ELF/DWARF loading  : 110s (1m50s)
>>>>>>> >      Objdump            : 176s (2m56s)
>>>>>>> >      Processing         :  80s (1m20s)
>>>>>>> >
>>>>>>> >
>>>>>>> > I don't mind this execution time for the near future. It is far
>>>>>>> from obscene
>>>>>>> > after building and running 600 tests.
>>>>>>>
>>>>>>> Yeah, there are other things we need to do first.
>>>>>>>
>>>>>>> >     The DWARF loading is not optimised and I load all source line
>>>>>>> >     to address maps and all functions rather than selectively
>>>>>>> >     scanning for specific names at the DWARF level. It is not clear
>>>>>>> >     to me that scanning would be better or faster.
>>>>>>> >
>>>>>>> > I doubt it is worth the effort. There should be few symbols in an
>>>>>>> exe we don't
>>>>>>> > care about. Especially once we start to worry about libc and libm.
>>>>>>>
>>>>>>> Yeah, this is what I thought at the start.
>>>>>>>
>>>>>>> >     My hope
>>>>>>> >     is moving to Capstone would help lower or remove the objdump
>>>>>>> overhead. Then
>>>>>>> >     there is threading for the loading.
>>>>>>> >
>>>>>>> >     > I don't recall it taking near this long in the past. I used
>>>>>>> to run it as
>>>>>>> >     part of
>>>>>>> >     > development.
>>>>>>> >
>>>>>>> >     The objdump processing is simpler than before so I suspect the
>>>>>>> time would have
>>>>>>> >     been at least 4 minutes.
>>>>>>> >
>>>>>>> >     > But we may have more tests and the code has changed.
>>>>>>> >
>>>>>>> >     I think having more tests is the dominant factor.
>>>>>>> >
>>>>>>> >     > Reading dwarf
>>>>>>> >     > with the file open/closes, etc just may be more expensive
>>>>>>> than parsing the
>>>>>>> >     text
>>>>>>> >     > files.
>>>>>>> >
>>>>>>> >     Reading DWARF is a cost and at the moment it is not optimised,
>>>>>>> >     but it is only a cost because we still parse the objdump data.
>>>>>>> >     I think opening and closing files is not a factor.
>>>>>>> >
>>>>>>> >     Parsing the objdump output is the largest component of time.
>>>>>>> >     Maybe using Capstone with the ELF files will help.
>>>>>>> >
>>>>>>> >     > But it is more accurate and lays the groundwork for more
>>>>>>> >     > types of analysis.
>>>>>>> >
>>>>>>> >     Yes, and I think this is important.
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> >     > Eventually we will have to profile this code. Whatever is
>>>>>>> costly is done for
>>>>>>> >     > each exe so there is a multiplier.
>>>>>>> >     >
>>>>>>> >     > I suspect this code would parallelize reading info from the
>>>>>>> exes fairly well.
>>>>>>> >
>>>>>>> >     Agreed.
>>>>>>> >
>>>>>>> > Might be a good case for C++11 threads if one of the thread
>>>>>>> container classes is
>>>>>>> > a nice pool.
>>>>>>>
>>>>>>> Good idea. I think we need to look at some of the global object
>>>>>>> pointers before
>>>>>>> we head down this path.
>>>>>>>
>>>>>>> > And we might have some locking to account for in core data
>>>>>>> structures. Are STL
>>>>>>> > container instances thread safe?
>>>>>>>
>>>>>>> We need to manage all locking.
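
As a rough illustration of the locking point above: standard containers such as std::map are not safe for concurrent writes, so any shared structure the loader threads fill in needs an explicit lock. The sketch below assumes hypothetical names throughout (`DemoInfo`, `demo_parse`, `load_parallel`) and uses one std::thread per executable for brevity; it is not covoar code and not a thread pool.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-executable result; stands in for loaded ELF/DWARF data.
struct DemoInfo {
  std::size_t symbols;
};

// Hypothetical parser: pretends the symbol count is the length of the name.
DemoInfo demo_parse(const std::string& exe) {
  return DemoInfo{exe.size()};
}

// Parse each executable on its own thread and merge into one shared map.
// std::map is not safe for concurrent writes, so every insert takes the lock.
std::map<std::string, DemoInfo> load_parallel(
    const std::vector<std::string>& exes) {
  std::map<std::string, DemoInfo> results;
  std::mutex results_lock;
  std::vector<std::thread> workers;
  workers.reserve(exes.size());
  for (const auto& exe : exes) {
    workers.emplace_back([&results, &results_lock, &exe] {
      DemoInfo info = demo_parse(exe);  // independent work, no lock held
      std::lock_guard<std::mutex> guard(results_lock);
      results.emplace(exe, info);       // shared state, updated under the lock
    });
  }
  // Wait for every worker before the shared map is read or returned.
  for (auto& worker : workers)
    worker.join();
  return results;
}
```

With roughly 600 test executables, one thread per executable would be wasteful, so a bounded set of workers pulling from a queue would be the more realistic shape. The part that carries over is the split between lock-free per-executable work and a short locked merge step.
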
>>>>>>>
>>>>>>> > But an addition after feature stable relative to old output plus
>>>>>>> Capstone.
>>>>>>>
>>>>>>> Agreed.
>>>>>>>
>>>>>>> >     > Merging the info and generating the reports not well due to
>>>>>>> data contention.
>>>>>>> >
>>>>>>> >     Yes.
>>>>>>> >
>>>>>>> >     > But optimizing too early and the wrong way is not smart.
>>>>>>> >
>>>>>>> >     Yes. We need Capstone to be added before this can happen.
>>>>>>> >
>>>>>>> > +1
>>>>>>> >
>>>>>>> > I would also like to see gcov support but that will not be a
>>>>>>> factor in the
>>>>>>> > performance we have. It will add reading a lot more files (gcno)
>>>>>>> and writing a
>>>>>>> > lot of gcda at the end. Again more important to be right than fast
>>>>>>> at first. And
>>>>>>> > completely an addition.
>>>>>>>
>>>>>>> Agreed.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>