Proposal for hardware configuration dependent performance limits

Gedare Bloom gedare at rtems.org
Fri Nov 20 16:43:35 UTC 2020


On Thu, Nov 19, 2020 at 4:51 PM Chris Johns <chrisj at rtems.org> wrote:
>
> On 19/11/20 7:26 pm, Sebastian Huber wrote:
> > Hello Chris,
> >
> > On 17/11/2020 22:43, Chris Johns wrote:
> >
> >>
> >> On 17/11/20 6:14 pm, Sebastian Huber wrote:
> >>> On 16/11/2020 23:42, Chris Johns wrote:
> >>>> On 16/11/20 5:40 pm, Sebastian Huber wrote:
> >>>>> On 16/11/2020 00:33, Chris Johns wrote:
> >>>>>
> >>>>>>>>> In the proposal, limits are specified like this:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> limits:
> >>>>>>>>>        sparc/gr712rc:
> >>>>>>>>>          DirtyCache:
> >>>>>>>>>            max-upper-bound: 0.000005
> >>>>>>>>>            mean-upper-bound: 0.000005
> >>>>>>>>>          FullCache:
> >>>>>>>>>            max-upper-bound: 0.000005
> >>>>>>>>>            mean-upper-bound: 0.000005
> >>>>>>>>>          HotCache:
> >>>>>>>>>            max-upper-bound: 0.000005
> >>>>>>>>>            mean-upper-bound: 0.000005
> >>>>>>>>>          Load/1:
> >>>>>>>>>            max-upper-bound: 0.00001
> >>>>>>>>>            mean-upper-bound: 0.00001
> >>>>>>>>>          Load/2:
> >>>>>>>>>            max-upper-bound: 0.00001
> >>>>>>>>>            mean-upper-bound: 0.00001
> >>>>>>>>>          Load/3:
> >>>>>>>>>            max-upper-bound: 0.00001
> >>>>>>>>>            mean-upper-bound: 0.00001
> >>>>>>>>>          Load/4:
> >>>>>>>>>            max-upper-bound: 0.00001
> >>>>>>>>>            mean-upper-bound: 0.00001
> >>>>>>>>>
> >>>>>>>>> This neglects that the limits are subject to a board configuration. One
> >>>>>>>>> approach to cover this is the addition of a new BSP provided function:
> >>>>>>>>>
> >>>>>>>>> const char *rtems_get_hardware_performance_hash(void);
> >>>>>>>>>
> >>>>>>>>> The BSP feeds all performance related data into a hash function and
> >>>>>>>> "data" here means configuration?
> >>>>>>> Yes, hardware configuration.
> >>>>>> Why not make these values part of the BSP configuration? The defaults for the
> >>>>>> BSP can have a set of suitable values. Different boards have different
> >>>>>> configurations to match and a separate kernel build.
> >>>>>>
> >>>>> This doesn't work on BSPs which support configuration via a hardware
> >>>>> enumeration, boot loader settings, or device trees. Also changes in the BSP
> >>>>> options have no influence on the BSP name. Not only does the BSP
> >>>>> configuration influence performance, the CPU options play a role too, for
> >>>>> example RTEMS_SMP. In order to compare performance values over time we have
> >>>>> to obtain the values under the same conditions.
> >>>> Maybe I am not understanding the context.
> >>>>
> >>>> A BSP, whichever one, has a set of options that configure it. An example is
> >>>> the xilinx_zynq_zc702 and the `ZYNQ_RAM_LENGTH = 0x40000000`. If I have 2 Zynq
> >>>> circuits, one with 256M and one with 1G, I need to build and maintain 2 RTEMS
> >>>> builds and from a purist's point of view I need to maintain 2 builds of the
> >>>> exact same application.
> >>>>
> >>>> I asked about the fixed memory and your answer was to use the BSP options, the
> >>>> size is fixed in the linker command files via the BSP option. That is what I
> >>>> have done.
> >>>>
> >>>> I would expect there exists a set of values for the xilinx_zynq_zc702 with no
> >>>> SMP and with SMP as this BSP supports SMP. Those values would match all the
> >>>> other settings for the BSP such as ZYNQ_CLOCK_CPU_1X,
> >>>> BSP_ARM_A9MPCORE_PERIPHCLK etc. If my clock is different (and they are) I
> >>>> would need to supply a suitable set of performance values if I wanted to
> >>>> pass those tests.
> >>>>
> >>>> I am not questioning the need for the values or the tests. I am suggesting the
> >>>> values form part of the BSP settings so a user can adjust them to suit their
> >>>> specific set up in the same way they adjust other BSP settings. I do not think
> >>>> we should attempt to hold or manage an endless set of possible values and I do
> >>>> not see the need for complex encapsulation methods such as base64 hashes. The
> >>>> systems we interact with are too complex and the list is endless.
> >>> I think it will be highly BSP-specific what parameters are relevant to the
> >>> performance limits. This is why I suggested to add a function which can be
> >>> implemented by each BSP.
> >>>
> >>> const char *rtems_get_hardware_performance_something(void);
> >>>
> >>> It should return a string which changes if a performance relevant parameter
> >>> changed. If it is only SMP/no-SMP, ZYNQ_CLOCK_CPU_1X, and
> >>> BSP_ARM_A9MPCORE_PERIPHCLK, then fine, just return "SMP/800MHz/400MHz" or
> >>> whatever.
> >> I suggest you avoid heading down a path of specific strings, ie avoid something
> >> meaningful a human can read. Also performance characteristics are a part of a
> >> wider configuration topic. Maybe considering that would solve the performance
> >> specific parts as well.
> >>
> >> A label for a build of RTEMS is a good idea (see below) that could serve the
> >> human readable part. I would consider computing a hash for the config.ini file,
> >> ie the build, and embedding it. If you wanted to capture the state of the RTEMS
> >> source built optionally compute a hash for the entire source tree and embed that
> >> as well. You can then have calls such as:
> >>
> >> const char* rtems_config_build_hash(void);
> >> const char* rtems_config_source_hash(void);
> >>
> >>   [ the last one could return "NOT-AVAILABLE" if not enabled ]
> >>
> >> The key point is defining markers, with defaults if optional, then wrapping your
> >> configuration management system round them. Strings with a meaning such as
> >> "SMP/800MHz/400MHz" are fragile because cosmetic changes break dependent
> >> configuration management systems. A hash implies nothing specific, that task is
> >> left to your CM systems.
> >>
> >> For a BSP specific case of runtime values what about:
> >>
> >> const char* rtems_config_bsp_hash(void);
> >>
> >> with a default returning "DEFAULT". A BSP could override a weak function to
> >> provide a hash computed in a specific way.
> >>
> >> When I said a build label I was considering ...
> >>
> >> [arm/beagleboneblack]
> >> RTEMS_BUILD_LABEL = "...---..."
> >>
> >> with a function 'rtems_config_build_label' to fetch it. The default could be
> >> "RTEMS" if not set in config.ini. This would be useful when tracking deployed
> >> builds of RTEMS. Consider this as labelling the config.ini file in a human
> >> readable way that suits my CM processes.
> > thanks for broadening the perspective. Maybe just focusing on the performance
> > limits was a bit too specific. However, if we put things into a hash which only
> > weakly influence the performance characteristics, then comparable performance
> > test runs will be hard over time.
>
> A hash provides nothing more than a unique data point. How it is used qualifies
> what it means and so weak or hard is relative. The path I have put forward
> simply says if the hash is not what you expect something has changed. I like
> this because it is simple and clear at the origin. Exposing internal components
> of a board's configuration so you can determine the reason adds complexity to
> RTEMS and it is not clear to me what the advantages are when considering
> something is fit for purpose.
>
> Note, there is nothing stopping additional ad hoc interfaces being added to a
> specific BSP that can be queried in a BSP specific manner to report extra
> detail. This would be outside the formal RTEMS interfaces and could change. An
> example of this is bootloader and boot rom output.
>
> Also I am not sure we need a secure sized hash. Something simple, small and fast
> may be suitable.
>
This is true. The collision resistance of this hash is not too
important, as long as small changes in configuration are unlikely to
produce the same hash. If two completely different configurations
collide, this is unlikely to be a problem for a user, but the tooling
does need to be robust to the possibility.
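
To make "simple, small and fast" a bit more concrete, here is a rough
host-side sketch in Python. It assumes the performance-relevant options
are available as key/value pairs and uses 32-bit FNV-1a purely as an
example of a non-cryptographic hash. The names fnv1a_32, config_hash and
find_collisions are made up for illustration and are not RTEMS
interfaces.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: small and fast, not cryptographically secure."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
    return h


def canonical(options: dict) -> str:
    """Serialize options in sorted order so the hash ignores dict ordering."""
    return "\n".join(f"{key}={options[key]}" for key in sorted(options))


def config_hash(options: dict) -> str:
    """Hash of the performance-relevant options as 8 hex digits."""
    return f"{fnv1a_32(canonical(options).encode('utf-8')):08x}"


def find_collisions(configs: dict) -> dict:
    """Return {hash: [names]} where *different* option sets share a hash.

    `configs` maps a (hypothetical) configuration name to its options.
    """
    by_hash = {}
    for name, options in configs.items():
        by_hash.setdefault(config_hash(options), {})[name] = canonical(options)
    return {
        h: sorted(names)
        for h, names in by_hash.items()
        if len(set(names.values())) > 1
    }
```

A tool could run find_collisions() over all known configurations before
trusting a hash as a lookup key, and fall back to comparing the full
option sets if it ever reports something.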

> >> Can environment variables affect a build of RTEMS? If so you either need to
> >> include them somehow or have waf ignore them.
> >
> > I don't know waf well enough. If some environment variables are set during ./waf
> > configure a warning is printed. I don't know if environment variables are used
> > during ./waf build.
>
> I am the same. I noted it as a matter of being complete while we discuss this
> topic. Would something in the documentation in relation to configuration
> management be suitable?
>
> >>> My point is that we need a key reported by the BSP and then some performance
> >>> limits which can be found by arch/bsp/key to check if there are performance
> >>> regressions.
> >> I am missing the place where the performance limits are held. Do the tests
> >> report timing values and the checks against the limits happen on a host?
> >
> > Yes, this is what I proposed.
>
> Thanks and sorry for not picking up on this before now. It makes sense to do it
> this way.
>
I chimed in on the idea of not using a hash, because of the opaqueness
of the specification and the difficulty of deriving what reasonable
performance should be after small configuration changes from a
standard set. With a hash, we do punt some responsibility to the end
user to start from a configuration with a known hash and performance
bounds before defining their own. Otherwise, the best they can do is
like what we do: run it, record the measurements, and use those as the
bounds moving forward.

When a user sends us a report saying my configuration
a/lwjeVQ:H#TIHFOAH doesn't match the performance of
z./hleg.khEHWIWEHFWHFE then we can have this conversation again. :)

> > An alternative would be to generate tables with
> > performance limits and excessive C preprocessor conditionals and let the tests
> > check the limits. Another option is to let the build system generate the tables.
> > This would require that the performance limits are a part of the build
> > specification.
> >
> > The proposed work flow would be something like this:
> >
> > 1. You select a board to use for long term performance tests.
> >
> > 2. You define a set of configurations you want to test.
> >
> > 3. You do an initial run of the test suite for each configuration. The RTEMS
> > Tester provides you with a machine readable output (test data) of the test run
> > with the raw test output per test executable and some meta information (TODO).
> >
> > 4. A tool reads the test data and the RTEMS specification and updates the
> > specification with the performance limits obtained from the test run (maybe with
> > some simple transformation, for example increase maximum by 10% and round to
> > microseconds).
> >
> > 5. You review the performance limits and then commit them.
> >
> > 6. Later you run the tests with a new RTEMS commit, get the performance values,
> > compare them against the limits stored in the specification, and generate a report.
> >
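
Step 4 above could look roughly like the following Python sketch. It
assumes the measured runtimes are available as a list of seconds per
test variant and that the specification is a nested dict shaped like the
"limits" block in this thread. The 10% margin comes from the step
itself; rounding *up* to whole microseconds is my assumption, since the
rounding mode was not specified. The function names are hypothetical.

```python
from statistics import mean


def upper_bound(seconds, margin=0.10):
    """Add `margin` headroom, then round up to whole microseconds."""
    # Work in integer nanoseconds to keep floating point noise out of
    # the ceiling operation.
    nanoseconds = round(seconds * 1e9 * (1.0 + margin))
    microseconds = -(-nanoseconds // 1000)  # integer ceiling division
    return microseconds / 1_000_000


def derive_limits(samples, margin=0.10):
    """Build the two bounds used in the specification from raw samples."""
    return {
        "max-upper-bound": upper_bound(max(samples), margin),
        "mean-upper-bound": upper_bound(mean(samples), margin),
    }


def update_limits(spec, key, variant, samples):
    """Store derived limits under spec['limits'][key][variant]."""
    limits = spec.setdefault("limits", {}).setdefault(key, {})
    limits[variant] = derive_limits(samples)
    return spec
```

For example, update_limits({}, "sparc/gr712rc", "DirtyCache", samples)
yields a structure that can be reviewed (step 5) before it is committed.
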
> > In the specification items the limits are stored like this:
> >
> > limits:
> >       sparc/gr712rc:
> >         DirtyCache:
> >           max-upper-bound: 0.000005
> >           mean-upper-bound: 0.000005
> >
> > So each BSP has a separate block of lines. This avoids trouble with merge
> > conflicts.
> >
> > As discussed above, using arch/bsp as a key is not enough. We need to include
> > other things, so it should be really:
> >
> > limits:
> >       sparc/gr712rc/something-in-addition:
>           configs:
>             - 1727638abd7188282ef
>             - 19292efab87ade8928e
>             - etc
> >         DirtyCache:
> >           max-upper-bound: 0.000005
> >           mean-upper-bound: 0.000005
> >
>
> Nice. I think a hash still works. I would use it to raise an "alert" if it does
> not match any listed value. By an "alert" I am attempting to avoid error or
> warning because this depends on the context. A qualified system may want this to
> be an error while a warning for me is OK if the timing figures are being achieved.
>
+1
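
A minimal sketch of how the host-side check could treat an unlisted
hash as a configurable "alert", again in Python. The layout of the
limits dict mirrors the YAML above; Severity, check_variant and the
`measured` dict are names I made up for illustration, and the policy
default is an assumption.

```python
from enum import Enum


class Severity(Enum):
    WARNING = "warning"
    ERROR = "error"


def check_variant(limits, key, config_hash, variant, measured,
                  unknown_config=Severity.WARNING):
    """Compare one variant's measurements (seconds) against stored limits.

    Returns a list of (Severity, message) tuples; empty means no findings.
    `unknown_config` selects how an unlisted configuration hash is rated.
    """
    entry = limits[key]
    findings = []
    if config_hash not in entry.get("configs", []):
        findings.append((unknown_config,
                         f"{key}: configuration {config_hash} is not listed"))
    bounds = entry[variant]
    for stat in ("max", "mean"):
        bound = bounds[f"{stat}-upper-bound"]
        if measured[stat] > bound:
            findings.append((Severity.ERROR,
                             f"{key}/{variant}: {stat} {measured[stat]} > {bound}"))
    return findings


# Example data taken from the limits block quoted above.
LIMITS = {
    "sparc/gr712rc/something-in-addition": {
        "configs": ["1727638abd7188282ef", "19292efab87ade8928e"],
        "DirtyCache": {"max-upper-bound": 0.000005,
                       "mean-upper-bound": 0.000005},
    }
}
```

A qualified system would pass unknown_config=Severity.ERROR, while a
developer setup can leave it at a warning as long as the timing figures
are met.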

> Chris

