[PATCH v2] covoar: Store address-to-line info outside of DWARF
Alex White
alex.white at oarcorp.com
Thu Jun 17 03:01:09 UTC 2021
On Mon, Jun 7, 2021 at 8:14 PM Chris Johns <chrisj at rtems.org> wrote:
>
> On 8/6/21 1:44 am, Joel Sherrill wrote:
> >
> >
> > On Mon, Jun 7, 2021 at 7:00 AM Alex White <alex.white at oarcorp.com
> > <mailto:alex.white at oarcorp.com>> wrote:
> >
> >
> > On Wed, Jun 2, 2021 at 7:37 PM Chris Johns <chrisj at rtems.org
> > <mailto:chrisj at rtems.org>> wrote:
> > >
> > > Looking good with a single comment below ...
> > >
> > > On 3/6/21 6:08 am, Alex White wrote:
> > > > This adds the AddressToLineMapper class and supporting classes to
> > > > assume responsibility of tracking address-to-line information.
> > > >
> > > > This allows the DWARF library to properly cleanup all of its resources
> > > > and leads to significant memory savings.
> > > >
> > > > Closes #4383
> > > > ---
> > > > rtemstoolkit/rld-dwarf.cpp | 5 +
> > > > rtemstoolkit/rld-dwarf.h | 5 +
> > > > tester/covoar/AddressToLineMapper.cc | 104 +++++++++++++
> > > > tester/covoar/AddressToLineMapper.h | 211 +++++++++++++++++++++++++++
> > > > tester/covoar/ExecutableInfo.cc | 73 +++++----
> > > > tester/covoar/ExecutableInfo.h | 11 +-
> > > > tester/covoar/wscript | 1 +
> > > > 7 files changed, 368 insertions(+), 42 deletions(-)
> > > > create mode 100644 tester/covoar/AddressToLineMapper.cc
> > > > create mode 100644 tester/covoar/AddressToLineMapper.h
> > > >
> > > > diff --git a/rtemstoolkit/rld-dwarf.cpp b/rtemstoolkit/rld-dwarf.cpp
> > > > index 2fce0e4..1eae50c 100644
> > > > --- a/rtemstoolkit/rld-dwarf.cpp
> > > > +++ b/rtemstoolkit/rld-dwarf.cpp
> > > > @@ -1893,6 +1893,11 @@ namespace rld
> > > > return false;
> > > > }
> > > >
> > > > + const addresses& compilation_unit::get_addresses () const
> > > > + {
> > > > + return addr_lines_;
> > > > + }
> > > > +
> > > > functions&
> > > > compilation_unit::get_functions ()
> > > > {
> > > > diff --git a/rtemstoolkit/rld-dwarf.h b/rtemstoolkit/rld-dwarf.h
> > > > index 1210813..eadb50d 100644
> > > > --- a/rtemstoolkit/rld-dwarf.h
> > > > +++ b/rtemstoolkit/rld-dwarf.h
> > > > @@ -707,6 +707,11 @@ namespace rld
> > > > */
> > > > unsigned int pc_high () const;
> > > >
> > > > + /**
> > > > + * The addresses associated with this compilation unit.
> > > > + */
> > > > + const addresses& get_addresses () const;
> > > > +
> > > > /**
> > > > * Get the source and line for an address. If the address does
> > not match
> > > > * false is returned the file is set to 'unknown' and the line is
> > set to
> > > > diff --git a/tester/covoar/AddressToLineMapper.cc
> > b/tester/covoar/AddressToLineMapper.cc
> > > > new file mode 100644
> > > > index 0000000..c305e3b
> > > > --- /dev/null
> > > > +++ b/tester/covoar/AddressToLineMapper.cc
> > > > @@ -0,0 +1,104 @@
> > > > +/*! @file AddressToLineMapper.cc
> > > > + * @brief AddressToLineMapper Implementation
> > > > + *
> > > > + * This file contains the implementation of the functionality
> > > > + * of the AddressToLineMapper class.
> > > > + */
> > > > +
> > > > +#include "AddressToLineMapper.h"
> > > > +
> > > > +namespace Coverage {
> > > > +
> > > > + uint64_t SourceLine::location() const
> > > > + {
> > > > + return address;
> > > > + }
> > > > +
> > > > + bool SourceLine::is_an_end_sequence() const
> > > > + {
> > > > + return is_end_sequence;
> > > > + }
> > > > +
> > > > + const std::string& SourceLine::path() const
> > > > + {
> > > > + return path_;
> > > > + }
> > > > +
> > > > + int SourceLine::line() const
> > > > + {
> > > > + return line_num;
> > > > + }
> > > > +
> > > > + void AddressLineRange::addSourceLine(const rld::dwarf::address& address)
> > > > + {
> > > > + auto insertResult = sourcePaths.insert(address.path());
> > > > +
> > > > + sourceLines.emplace_back(
> > > > + SourceLine (
> > > > + address.location(),
> > > > + *insertResult.first,
> > > > + address.line(),
> > > > + address.is_an_end_sequence()
> > > > + )
> > > > + );
> > > > + }
> > > > +
> > > > + const SourceLine& AddressLineRange::getSourceLine(uint32_t address) const
> > > > + {
> > > > + if (address < lowAddress || address > highAddress) {
> > > > + throw SourceNotFoundError(std::to_string(address));
> > > > + }
> > > > +
> > > > + const SourceLine* last_line = nullptr;
> > > > + for (const auto &line : sourceLines) {
> > > > + if (address <= line.location())
> > > > + {
> > > > + if (address == line.location())
> > > > + last_line = &line;
> > > > + break;
> > > > + }
> > > > + last_line = &line;
> > > > + }
> > > > +
> > > > + if (last_line == nullptr) {
> > > > + throw SourceNotFoundError(std::to_string(address));
> > > > + }
> > > > +
> > > > + return *last_line;
> > > > + }
> > >
> > > How often is this function being called? If it is a hot spot are there other
> > > ways to search for the address? I suppose it depends on the number of lines in
> > > the vector.
> >
> > According to a quick gprof run, this function is being called a decent
> > amount, somewhere around 1.5 million times for a full ARM coverage run.
> >
> > But it only takes up a very small amount (less than 0.02%) of cumulative
> > execution time so I don't believe it is an issue.
> >
> >
> > Glad we have profiling data to guide optimizations. I also suspect that as
> > the code is c-plus-plus-ified more, some performance improvements will
> > occur naturally. I know you (Alex) have mentioned some performance
> > improvements just from this alone. But this discussion reminded me of
> > an XKCD and remarks by Dijkstra which are always good to remember.
> >
> > https://m.xkcd.com/1691/ <https://m.xkcd.com/1691/>
> >
> > Dijkstra is famously remembered for a number of variants on the premature
> > optimization theme:
> >
> > "Programmers waste enormous amounts of time thinking about, or worrying about,
> > the speed of noncritical parts of their programs, and these attempts at
> > efficiency actually have a strong negative impact when debugging and maintenance
> > are considered. We should forget about small efficiencies, say about 97% of the
> > time: premature optimization is the root of all evil. Yet we should not pass up
> > our opportunities in that critical 3%."
> >
> > I'd add that I want RTEMS to stay algorithmically optimized instead of
> > tricks. "Optimize" for memory usage, readability, testability, maintainability,
> > etc. If we had focused on tricks 30+ years ago, all of those tricks would
> > likely be inappropriate and burdens now which only served to negatively
> > impacts.
>
> Agreed. I was drawn to the code because it uses a pointer and I was looking for
> a C++ approach that did not use a pointer.
So is this ok to push now that it is known to have an insignificant performance impact or is a different approach still warranted?
Thanks,
Alex
>
> > <old man stops yelling at clouds now>
>
> How did you know I was? ;)
>
> Chris
>
> >
> > --joel
> >
> >
> > Alex
> >
> > >
> > > If the sourcesLines vector was sorted can this loop be replaced with some form
> > > of std::upper_bound? That is ...
> > >
> > > auto last_line =
> > > std::upper_bound(sourceLines.begin(),
> > > sourceLines.end(),
> > > address,
> > > [&address](const SourceLine& sl) {
> > > return address < sl.location();
> > > });
> > >
> > > [ not sure if the code I posted would work and if I got the lamda right :) ]
> > >
> > > Just wondering.
> > >
> > > Chris
> > >
> > > > +
> > > > + void AddressToLineMapper::getSource(
> > > > + uint32_t address,
> > > > + std::string& sourceFile,
> > > > + int& sourceLine
> > > > + ) const {
> > > > + const SourceLine default_sourceline = SourceLine();
> > > > + const SourceLine* match = &default_sourceline;
> > > > +
> > > > + for (const auto &range : addressLineRanges) {
> > > > + try {
> > > > + const SourceLine& potential_match = range.getSourceLine(address);
> > > > +
> > > > + if (match->is_an_end_sequence() ||
> > !potential_match.is_an_end_sequence()) {
> > > > + match = &potential_match;
> > > > + }
> > > > + } catch (const AddressLineRange::SourceNotFoundError&) {}
> > > > + }
> > > > +
> > > > + sourceFile = match->path();
> > > > + sourceLine = match->line();
> > > > + }
> > > > +
> > > > + AddressLineRange& AddressToLineMapper::makeRange(
> > > > + uint32_t low,
> > > > + uint32_t high
> > > > + )
> > > > + {
> > > > + addressLineRanges.emplace_back(
> > > > + AddressLineRange(low, high)
> > > > + );
> > > > +
> > > > + return addressLineRanges.back();
> > > > + }
> > > > +
> > > > +}
> > > > diff --git a/tester/covoar/AddressToLineMapper.h
> > b/tester/covoar/AddressToLineMapper.h
> > > > new file mode 100644
> > > > index 0000000..e6ab4a8
> > > > --- /dev/null
> > > > +++ b/tester/covoar/AddressToLineMapper.h
> > > > @@ -0,0 +1,211 @@
> > > > +/*! @file AddressToLineMapper.h
> > > > + * @brief AddressToLineMapper Specification
> > > > + *
> > > > + * This file contains the specification of the AddressToLineMapper class.
> > > > + */
> > > > +
> > > > +#ifndef __ADDRESS_TO_LINE_MAPPER_H__
> > > > +#define __ADDRESS_TO_LINE_MAPPER_H__
> > > > +
> > > > +#include <cstdint>
> > > > +#include <set>
> > > > +#include <string>
> > > > +#include <vector>
> > > > +
> > > > +#include <rld-dwarf.h>
> > > > +
> > > > +namespace Coverage {
> > > > +
> > > > + /*! @class SourceLine
> > > > + *
> > > > + * This class stores source information for a specific address.
> > > > + */
> > > > + class SourceLine {
> > > > +
> > > > + public:
> > > > +
> > > > + SourceLine()
> > > > + : address(0),
> > > > + path_("unknown"),
> > > > + line_num(-1),
> > > > + is_end_sequence(true)
> > > > + {
> > > > + }
> > > > +
> > > > + SourceLine(
> > > > + uint64_t addr,
> > > > + const std::string& src,
> > > > + int line,
> > > > + bool end_sequence
> > > > + ) : address(addr),
> > > > + path_(src),
> > > > + line_num(line),
> > > > + is_end_sequence(end_sequence)
> > > > + {
> > > > + }
> > > > +
> > > > + /*!
> > > > + * This method gets the address of this source information.
> > > > + *
> > > > + * @return Returns the address of this source information
> > > > + */
> > > > + uint64_t location() const;
> > > > +
> > > > + /*!
> > > > + * This method gets a value indicating whether this address
> > represents an
> > > > + * end sequence.
> > > > + *
> > > > + * @return Returns whether this address represents an end sequence
> > or not
> > > > + */
> > > > + bool is_an_end_sequence() const;
> > > > +
> > > > + /*!
> > > > + * This method gets the source file path of this address.
> > > > + *
> > > > + * @return Returns the source file path of this address
> > > > + */
> > > > + const std::string& path() const;
> > > > +
> > > > + /*!
> > > > + * This method gets the source line number of this address.
> > > > + *
> > > > + * @return Returns the source line number of this address
> > > > + */
> > > > + int line() const;
> > > > +
> > > > + private:
> > > > +
> > > > + /*!
> > > > + * The address of this source information.
> > > > + */
> > > > + uint64_t address;
> > > > +
> > > > + /*!
> > > > + * An iterator pointing to the location in the set that contains the
> > > > + * source file path of the address.
> > > > + */
> > > > + const std::string& path_;
> > > > +
> > > > + /*!
> > > > + * The source line number of the address.
> > > > + */
> > > > + int line_num;
> > > > +
> > > > + /*!
> > > > + * Whether the address represents an end sequence or not.
> > > > + */
> > > > + bool is_end_sequence;
> > > > +
> > > > + };
> > > > +
> > > > + typedef std::vector<SourceLine> SourceLines;
> > > > +
> > > > + typedef std::set<std::string> SourcePaths;
> > > > +
> > > > + /*! @class AddressLineRange
> > > > + *
> > > > + * This class stores source information for an address range.
> > > > + */
> > > > + class AddressLineRange {
> > > > +
> > > > + public:
> > > > +
> > > > + class SourceNotFoundError : public std::runtime_error {
> > > > + /* Use the base class constructors. */
> > > > + using std::runtime_error::runtime_error;
> > > > + };
> > > > +
> > > > + AddressLineRange(
> > > > + uint32_t low,
> > > > + uint32_t high
> > > > + ) : lowAddress(low), highAddress(high)
> > > > + {
> > > > + }
> > > > +
> > > > + /*!
> > > > + * This method adds source and line information for a specified
> > address.
> > > > + *
> > > > + * @param[in] address specifies the DWARF address information
> > > > + */
> > > > + void addSourceLine(const rld::dwarf::address& address);
> > > > +
> > > > + /*!
> > > > + * This method gets the source file name and line number for a given
> > > > + * address.
> > > > + *
> > > > + * @param[in] address specifies the address to look up
> > > > + *
> > > > + * @return Returns the source information for the specified address.
> > > > + */
> > > > + const SourceLine& getSourceLine(uint32_t address) const;
> > > > +
> > > > + private:
> > > > +
> > > > + /*!
> > > > + * The low address for this range.
> > > > + */
> > > > + uint32_t lowAddress;
> > > > +
> > > > + /*!
> > > > + * The high address for this range.
> > > > + */
> > > > + uint32_t highAddress;
> > > > +
> > > > + /*!
> > > > + * The source information for addresses in this range.
> > > > + */
> > > > + SourceLines sourceLines;
> > > > +
> > > > + /*!
> > > > + * The set of source file names for this range.
> > > > + */
> > > > + SourcePaths sourcePaths;
> > > > +
> > > > + };
> > > > +
> > > > + typedef std::vector<AddressLineRange> AddressLineRanges;
> > > > +
> > > > + /*! @class AddressToLineMapper
> > > > + *
> > > > + * This class provides address-to-line capabilities.
> > > > + */
> > > > + class AddressToLineMapper {
> > > > +
> > > > + public:
> > > > +
> > > > + /*!
> > > > + * This method gets the source file name and line number for a given
> > > > + * address.
> > > > + *
> > > > + * @param[in] address specifies the address to look up
> > > > + * @param[out] sourceFile specifies the name of the source file
> > > > + * @param[out] sourceLine specifies the line number in the source file
> > > > + */
> > > > + void getSource(
> > > > + uint32_t address,
> > > > + std::string& sourceFile,
> > > > + int& sourceLine
> > > > + ) const;
> > > > +
> > > > + /*!
> > > > + * This method creates a new range with the specified addresses.
> > > > + *
> > > > + * @param[in] low specifies the low address of the range
> > > > + * @param[in] high specifies the high address of the range
> > > > + *
> > > > + * @return Returns a reference to the newly created range
> > > > + */
> > > > + AddressLineRange& makeRange(uint32_t low, uint32_t high);
> > > > +
> > > > + private:
> > > > +
> > > > + /*!
> > > > + * The address and line information ranges.
> > > > + */
> > > > + AddressLineRanges addressLineRanges;
> > > > +
> > > > + };
> > > > +
> > > > +}
> > > > +
> > > > +#endif
> > > > diff --git a/tester/covoar/ExecutableInfo.cc
> > b/tester/covoar/ExecutableInfo.cc
> > > > index 861e60d..a6184e7 100644
> > > > --- a/tester/covoar/ExecutableInfo.cc
> > > > +++ b/tester/covoar/ExecutableInfo.cc
> > > > @@ -40,61 +40,60 @@ namespace Coverage {
> > > > executable.begin();
> > > > executable.load_symbols(symbols);
> > > >
> > > > + rld::dwarf::file debug;
> > > > +
> > > > debug.begin(executable.elf());
> > > > debug.load_debug();
> > > > debug.load_functions();
> > > >
> > > > - try {
> > > > - for (auto& cu : debug.get_cus()) {
> > > > - for (auto& func : cu.get_functions()) {
> > > > - if (!func.has_machine_code()) {
> > > > - continue;
> > > > - }
> > > > + for (auto& cu : debug.get_cus()) {
> > > > + AddressLineRange& range = mapper.makeRange(cu.pc_low(),
> > cu.pc_high());
> > > > + // Does not filter on desired symbols under the assumption that
> > the test
> > > > + // code and any support code is small relative to what is being
> > tested.
> > > > + for (const auto &address : cu.get_addresses()) {
> > > > + range.addSourceLine(address);
> > > > + }
> > > >
> > > > - if (!SymbolsToAnalyze->isDesired(func.name
> > <http://func.name>())) {
> > > > - continue;
> > > > + for (auto& func : cu.get_functions()) {
> > > > + if (!func.has_machine_code()) {
> > > > + continue;
> > > > + }
> > > > +
> > > > + if (!SymbolsToAnalyze->isDesired(func.name <http://func.name>())) {
> > > > + continue;
> > > > + }
> > > > +
> > > > + if (func.is_inlined()) {
> > > > + if (func.is_external()) {
> > > > + // Flag it
> > > > + std::cerr << "Function is both external and inlined: "
> > > > + << func.name <http://func.name>() << std::endl;
> > > > }
> > > >
> > > > - if (func.is_inlined()) {
> > > > - if (func.is_external()) {
> > > > - // Flag it
> > > > - std::cerr << "Function is both external and inlined: "
> > > > - << func.name <http://func.name>() << std::endl;
> > > > - }
> > > > -
> > > > - if (func.has_entry_pc()) {
> > > > - continue;
> > > > - }
> > > > -
> > > > - // If the low PC address is zero, the symbol does not appear in
> > > > - // this executable.
> > > > - if (func.pc_low() == 0) {
> > > > - continue;
> > > > - }
> > > > + if (func.has_entry_pc()) {
> > > > + continue;
> > > > }
> > > >
> > > > - // We can't process a zero size function.
> > > > - if (func.pc_high() == 0) {
> > > > + // If the low PC address is zero, the symbol does not appear in
> > > > + // this executable.
> > > > + if (func.pc_low() == 0) {
> > > > continue;
> > > > }
> > > > + }
> > > >
> > > > - createCoverageMap (cu.name <http://cu.name>(), func.name
> > <http://func.name>(),
> > > > - func.pc_low(), func.pc_high() - 1);
> > > > + // We can't process a zero size function.
> > > > + if (func.pc_high() == 0) {
> > > > + continue;
> > > > }
> > > > +
> > > > + createCoverageMap (cu.name <http://cu.name>(), func.name
> > <http://func.name>(),
> > > > + func.pc_low(), func.pc_high() - 1);
> > > > }
> > > > - } catch (...) {
> > > > - debug.end();
> > > > - throw;
> > > > }
> > > > -
> > > > - // Can't cleanup handles until the destructor because the
> > information is
> > > > - // referenced elsewhere. NOTE: This could cause problems from too
> > many open
> > > > - // file descriptors.
> > > > }
> > > >
> > > > ExecutableInfo::~ExecutableInfo()
> > > > {
> > > > - debug.end();
> > > > }
> > > >
> > > > void ExecutableInfo::dumpCoverageMaps( void ) {
> > > > @@ -197,7 +196,7 @@ namespace Coverage {
> > > > {
> > > > std::string file;
> > > > int lno;
> > > > - debug.get_source (address, file, lno);
> > > > + mapper.getSource (address, file, lno);
> > > > std::ostringstream ss;
> > > > ss << file << ':' << lno;
> > > > line = ss.str ();
> > > > diff --git a/tester/covoar/ExecutableInfo.h b/tester/covoar/ExecutableInfo.h
> > > > index 5d5a595..dfc71aa 100644
> > > > --- a/tester/covoar/ExecutableInfo.h
> > > > +++ b/tester/covoar/ExecutableInfo.h
> > > > @@ -15,6 +15,7 @@
> > > > #include <rld-files.h>
> > > > #include <rld-symbols.h>
> > > >
> > > > +#include "AddressToLineMapper.h"
> > > > #include "CoverageMapBase.h"
> > > > #include "SymbolTable.h"
> > > >
> > > > @@ -157,11 +158,6 @@ namespace Coverage {
> > > > uint32_t highAddress
> > > > );
> > > >
> > > > - /*!
> > > > - * The DWARF data to the ELF executable.
> > > > - */
> > > > - rld::dwarf::file debug;
> > > > -
> > > > /*!
> > > > * The executable's file name.
> > > > */
> > > > @@ -172,6 +168,11 @@ namespace Coverage {
> > > > */
> > > > rld::symbols::table symbols;
> > > >
> > > > + /*!
> > > > + * The address-to-line mapper for this executable.
> > > > + */
> > > > + AddressToLineMapper mapper;
> > > > +
> > > > /*!
> > > > * This map associates a symbol with its coverage map.
> > > > */
> > > > diff --git a/tester/covoar/wscript b/tester/covoar/wscript
> > > > index 82599b0..a928b1b 100644
> > > > --- a/tester/covoar/wscript
> > > > +++ b/tester/covoar/wscript
> > > > @@ -83,6 +83,7 @@ def build(bld):
> > > >
> > > > bld.stlib(target = 'ccovoar',
> > > > source = ['app_common.cc',
> > > > + 'AddressToLineMapper.cc',
> > > > 'CoverageFactory.cc',
> > > > 'CoverageMap.cc',
> > > > 'CoverageMapBase.cc',
> > > >
> > > _______________________________________________
> > > devel mailing list
> > > devel at rtems.org <mailto:devel at rtems.org>
> > > http://lists.rtems.org/mailman/listinfo/devel
> > <http://lists.rtems.org/mailman/listinfo/devel>
> > _______________________________________________
> > devel mailing list
> > devel at rtems.org <mailto:devel at rtems.org>
> > http://lists.rtems.org/mailman/listinfo/devel
> > <http://lists.rtems.org/mailman/listinfo/devel>
> >
> _______________________________________________
> devel mailing list
> devel at rtems.org
> http://lists.rtems.org/mailman/listinfo/devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20210617/4a146e94/attachment-0001.html>
More information about the devel
mailing list