Using model checking to do test generation for RTEMS

Chris Johns chrisj at
Mon Sep 7 03:21:37 UTC 2020

On 4/9/20 7:35 pm, Andrew Butterfield wrote:
> Dear Chris,
>  thanks for your feedback - much appreciated!
> Responses to your queries inline below.
>> On 3 Sep 2020, at 00:38, Chris Johns <chrisj at
>> <mailto:chrisj at>> wrote:
>> On 1/9/20 2:30 am, Andrew Butterfield wrote:
>>> Dear all,
>>> I am involved in the ESA-sponsored project RTEMS-SMP, which adds
>>> tools and data for software qualification. Our focus is on the use
>>> of formal techniques to assist in software verification.
>> Excellent :)
>>> We have developed a prototype of an approach that uses the SPIN
>>> model-checker ( <>) to produce formal models
>>> of RTEMS API
>>> behaviour, from which API tests can be generated. The formal models
>>> form one way of specifying the desired behaviour for these APIs.
>> Spin looks nice. Its maturity and wide host support looks great.
>>> We have developed a prototype demonstrator of this technique and are
>>> now seeking feedback from the community. It is expected that a
>>> suitably revised and updated version of this would become part of the
>>> proposed RTEMS qualification tool-chain.
>>> The demonstrator can be found at:
>> The repo looks great. The approach looks suitable for our project.
>>> Feedback on all aspects of this from a range of different perspectives
>>> would be very welcome.
>> Will the results of your efforts be merged into the rtems-central repo?
> Yes - I believe so.


>> Is there any relationship between the spec files Sebastian is creating for the
>> APIs and your work? If not, how would they be kept in sync, or, at a minimum,
>> how would we know they do not match?
> The plan is for our work to be fully integrated with what Sebastian is building.
> Our spin2test tool would be part of the qualification tools, and the tests we
> produce would be integrated with the handwritten test suites. The test results
> would feed into the test analysis tools (coverage, etc.) and that analysis
> would end up as part of the qualification data package.

Sounds good. I am looking forward to seeing the results.

>> How is the log from the test output used to validate the model?
> The test output shows what the test did, so someone familiar with the
> requirements can use it to judge whether the test's expectations of correct
> behaviour are themselves correct.
> So, yes, it can help in that regard.

Just so I understand: if the model and the API do not match, will the test
report a failure?

>> I would like to see this work performed on a tier-1 architecture [1]. At this
>> point in time SPARC (and LEON) is not tier-1 because there are no test results
>> from hardware posted to build @ <>. This leaves the
>> ESA pre-qual project
>> with some choices. You could select a suitable target like a BeagleBone Black and
>> post hardware test results or encourage other members of the pre-qual project to
>> run the testsuite on a LEON and post the results. We could then move SPARC and
>> LEON to tier-1, something I personally would love to see.
> In one sense the ESA project has no choice - we have to work with
> leon3/gr712rc/gr740.

That is fine. We understand that with flight software development few people
ever get to see the real hardware. We are relaxed about how an architecture can
reach tier-1; we need only one device from a family to run the tests with the
results being published. My desire is to automatically monitor the test results
and age out older results.

> I should point out that a test generated by us has been run (standalone) on
> gr740 hardware at ESTEC.
> Would it help if the results of that were posted?

The test output needs to be posted using the public rtems-tools.git `rtems-test`
command. For public test results to be meaningful, the tools used to generate the
results need to be public.

> My understanding is that as our project rolls out we will be running hardware tests.

This is understandable and what I would expect. The role of tiers in RTEMS is to
bring the results out into the public so all users can view where each
architecture and device family sits. I personally see value for hardware vendors
having devices in tier-1 and I see value for large organisations like ESA
expecting to see their selected hardware in tier-1. You and I will not make this

> However, the way I built the test executables was using waf configured for the
> above three BSPs, and so it should be possible to do it for another tier-1 architecture.

If the tests end up in the testsuite the tests will be required to pass on all
supported architectures. We consider the range of architectures we support a
real feature because it forces the code to be clean.

