covoar SIGKILL Investigation

Chris Johns chrisj at rtems.org
Tue Aug 14 21:04:26 UTC 2018


On 15/8/18 2:47 am, Joel Sherrill wrote:
> Hi
> 
> There has been a thread offlist as we have been attempting to resolve
> some issues that popped up using covoar at scale with Vijay's and
> Chris' recent work. It appears that the magic number is 400 executables
> which can be processed before covoar dies with a SIGKILL. This 
> should indicate that it has hit a hard limit. 
> 
> We have tried using ulimit to limit things. I have no idea where 
> exceeding limits in a process is logged. Knowing that would be
> very useful if someone can point us to the right place.
> 
> My current suspicion is that we have hit a maximum open
> files limit. I don't recall when ExecutableInfo.cc opened and
> closed files before but if the DWARF support includes an open 
> file descriptor, then there is one left open for the entire duration
> of the program run. The instance of executable information for
> each executable exists for the entire program run. It wouldn't
> hurt if we could close the DWARF helper file descriptor when
> we have gathered all the information.

The number of open files does seem like a possible source of the problem. The
ExecutableInfo constructor opens the executable and the debug info, which holds
a reference to the ELF handle, and that keeps the file open.
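
One way to test the open-files theory is to have the process report how many
descriptors it holds as executables are loaded. The following is a minimal
sketch, not covoar code: the names scanLimit, countOpenDescriptors and kScanCap
are hypothetical, and it assumes a POSIX host. It only uses getrlimit() and
fcntl(), and it caps the scan at an arbitrary 65536 descriptors.

```cpp
#include <sys/resource.h>
#include <fcntl.h>
#include <algorithm>

// Hypothetical cap so a very large or unlimited soft limit does not
// turn the scan into a long loop.
constexpr long kScanCap = 65536;

// Upper bound of descriptor numbers to probe: the RLIMIT_NOFILE soft
// limit, capped at kScanCap. Returns -1 if the limit cannot be read.
long scanLimit()
{
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
    return -1;
  if (rl.rlim_cur == RLIM_INFINITY)
    return kScanCap;
  return static_cast<long>(std::min<rlim_t>(rl.rlim_cur, kScanCap));
}

// Count the descriptors this process currently has open by probing
// each number with fcntl(F_GETFD), which fails on a closed descriptor.
int countOpenDescriptors()
{
  const long limit = scanLimit();
  if (limit < 0)
    return -1;
  int open = 0;
  for (long fd = 0; fd < limit; ++fd) {
    if (fcntl(static_cast<int>(fd), F_GETFD) != -1)
      ++open;
  }
  return open;
}
```

Printing countOpenDescriptors() after each executable is processed would show
whether the count climbs by one or more per executable and how close it gets to
the soft limit before the process dies.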

I wonder if the phases of how we read the ELF and DWARF info can be changed so
the cached data is all we need and the ELF handle does not need to stay open?
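
As an illustration of that idea only, here is a minimal sketch of a "read, cache,
then release" phase. It is not the covoar implementation: CachedExecutableInfo
and loadAndRelease are hypothetical names, and a plain std::ifstream stands in
for the ELF and DWARF handles. The point is that the handle lives in an inner
scope and is closed before the cached object is returned.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-executable data kept for the
// whole program run. It owns copies of the data, not a file handle.
struct CachedExecutableInfo {
  std::string path;
  std::vector<char> data; // stand-in for cached symbol and line info
};

// Read everything needed from the file, then let the handle go out of
// scope so no descriptor stays open for the lifetime of the result.
CachedExecutableInfo loadAndRelease(const std::string& path)
{
  CachedExecutableInfo info;
  info.path = path;
  {
    std::ifstream in(path, std::ios::binary);
    if (!in)
      throw std::runtime_error("cannot open " + path);
    info.data.assign(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
  } // the stream is destroyed here and its descriptor is closed
  return info;
}
```

With this shape the number of open descriptors stays constant no matter how many
executables are loaded, at the cost of holding the cached data in memory.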

Chris

> 
> I am hoping gcov processing is turned off for Vijay's current
> runs since that would add more open file descriptors.
> 
> I could be completely wrong on the limit being hit and am
> open to ideas on how to figure out which one it is.
> 
> Vijay.. running valgrind on a run of covoar with only one
> RTEMS executable would be a good start at figuring
> out if there are resources allocated that could be freed
> along the execution. As Chris has pointed out to me
> before, I didn't really care much about clean up because
> when the program ended, all the memory was freed anyway.
> This is OK until you hit some limit.
> 
> Ideas appreciated on how to debug this enough to find
> the cause.
> 
> Thanks.
> 
> --joel

