chrisj at rtems.org
Thu Nov 14 00:09:59 UTC 2019
On 13/11/19 5:22 pm, Sebastian Huber wrote:
> On 13/11/2019 01:39, Chris Johns wrote:
>>> > Ok, it seems a test program has exactly one state for a given BSP.
>>> Yes, this is driven by the ABI, hardware profile and similar constraints.
>>> This is generally the case but we have a few BSPs where the same executable
>>> runs on real hardware and multiple simulators. The SPARC BSPs are at the
>>> top of this list. The pc386 is another example.
>>> We currently have no way to specify that a test could pass on real hardware,
>>> fail on simulator 1 and pass on simulator 2. For the SPARC BSPs, we now
>>> have real HW, sis, qemu and tsim.
>>> The .tcfg files don't begin to address this and I don't have any good ideas
>>> myself. It is a very real case that would be nice to address so we have 100%
>>> expected passes on all execution platforms.
>> I think this is a difficult problem to solve in RTEMS with test states. It is
>> best handled in the ecosystem with rtems-test and anything that processes the
>> results it generates. In the ecosystem we have more flexibility to define a
>> hardware profile and assign a BSP to it.
>> A hardware test result is of higher importance than a simulator result, and a
>> simulator test result is of higher importance than no test result. The
>> expected-fail state is feedback driven and we cannot set this state without a
>> suitable set of results. The tier a BSP resides in defines how you view the
>> test state for a specific BSP.
>> Here are some observations ...
>> 1. By definition an accurate simulator will always match the hardware test
>> results
> This depends on the goals of the simulator. In Qemu for example there is by
> design no preservation of the timing properties of the simulated hardware. We
> see this in the spintrcritical* tests for example.
No doubt, however the assertion still holds. Flagging a test as valid when it
passes on a simulator but fails on hardware does not make sense.
>> 2. We need to provide a workable way to capture the test results and feed
>> them back into the tiers
>> 3. The tiers need to be current and visible to be meaningful to our users
>> 4. We have no expected-fail states in any tests and we should ...
>> $ grep 'expected-fail' `find . -name \*.tcfg` | wc -l
> One approach to capture the actual expectations on a particular target system
> (e.g. real hardware H, simulator A, simulator B) would be to split the state
> definitions. The build system would just build the test or not.
I am not following why you would want this.
> A particular
> test runner which knows if the test runs on H, A, or B could use a test runner
> specific data set to get the runtime states (pass, expected-fail, user-input,
> indeterminate, benchmark).
We should always build and run a test that can be executed. We need to know if a
test flagged as expected-fail is now passing.
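A minimal sketch of that feedback step, as it might look in a results
post-processor for rtems-test (the function and the state labels for
unexpected outcomes are assumptions of this sketch, not existing rtems-test
behaviour):

```python
# Hypothetical post-processing of one test result: compare the actual
# outcome against the expected state for the platform the test ran on,
# and flag expected-fail tests that now pass so the expectation can be
# updated from real results.
def classify(expected, actual):
    """Return a report label for a single test result.

    expected: 'pass' or 'expected-fail', from a platform-specific data set
    actual:   'pass' or 'fail', from the test run
    """
    if expected == "expected-fail":
        # An expected failure that now passes is the feedback signal:
        # the recorded state is stale and should be revisited.
        return "unexpected-pass" if actual == "pass" else "expected-fail"
    return "unexpected-fail" if actual == "fail" else "pass"
```

A runner that knows whether it executed on H, A, or B would feed the matching expectations into this comparison, so every run either confirms the recorded states or surfaces the ones that need changing.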