fenv on RISC-V

Fri Aug 16 14:44:34 UTC 2019

On Thu, Aug 15, 2019 at 3:58 PM Joel Sherrill <joel at rtems.org> wrote:
>
> OK. I am going to throw an issue out and then paste another email
> from Jim Wilson. Let's see if we can figure this one out portably.
>
> (1) Every architecture now will have an implementation of fenv. It may
> be a stub, non-functional fenv implementation. If this is the case, many
> methods return "-ENOTSUP". If the test first checks for fegetenv()
> returning this, then the architecture has no support. The test could detect
> this as a first step and exit.
>
Again, if the implementation is non-functional, any test for it should
cause a failure. (Assuming that fenv is a "required feature" we want
to test.)

The portable solution is to write the test case against the standard.
Ignore all this other stuff about the implementation, other than to
triage the test failures to place blame on RTEMS, newlib, or "other"
(gcc, hardware, etc.)

> (2) RISC-V is the only architecture currently with an implementation. It
> has very limited support (return 0 mostly) when there is no hardware
> floating point. __riscv_flen is defined to 0 for no HW FP, 32 for single
> support and 64 for double support (128 coming for 128 bit). RTEMS'
> CPU_HARDWARE_FP is always false on this architecture since
> this trips code in the cpukit. I don't know how to flag whether the test
> should be run or not on this architecture.
>
I wouldn't filter, just run the test and fail.
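
That said, knowing how the target was built helps with triage. A minimal
sketch, assuming the macros and values Jim describes in the email quoted
further down (__riscv_flen of 0, 32, or 64, and the three
__riscv_float_abi_* macros): the test could print the configuration into
its log without using it to skip anything. The TARGET_* macros, the
function names, and the width tables are hypothetical. The counting
helper just restates Jim's rule that the hardware FP register size must
be greater than or equal to the ABI FP register size.

```c
#include <stdio.h>

/* Hardware FP register width in bits; treated as 0 when not defined. */
#if defined(__riscv_flen)
#define TARGET_FLEN __riscv_flen
#else
#define TARGET_FLEN 0
#endif

/* Width in bits of the FP registers used for argument passing. */
#if defined(__riscv_float_abi_double)
#define TARGET_FLOAT_ABI 64
#elif defined(__riscv_float_abi_single)
#define TARGET_FLOAT_ABI 32
#else
#define TARGET_FLOAT_ABI 0
#endif

/* Widths available today, and once 128-bit FP support is added. */
static const int current_widths[] = { 0, 32, 64 };
static const int future_widths[] = { 0, 32, 64, 128 };

/* The hardware register size must be at least the ABI register size. */
int fp_combination_is_valid(int hw_bits, int abi_bits)
{
  return hw_bits >= abi_bits;
}

/* Count the valid (hardware, ABI) pairs drawn from a table of widths. */
int count_valid_fp_combinations(const int *widths, int count)
{
  int valid = 0;

  for (int hw = 0; hw < count; ++hw) {
    for (int abi = 0; abi < count; ++abi) {
      if (fp_combination_is_valid(widths[hw], widths[abi])) {
        ++valid;
      }
    }
  }
  return valid;
}

/* Log the build configuration so a failure can be triaged afterwards. */
void print_target_fp_config(void)
{
  printf("flen=%d float_abi=%d valid=%d\n",
         TARGET_FLEN, TARGET_FLOAT_ABI,
         fp_combination_is_valid(TARGET_FLEN, TARGET_FLOAT_ABI));
  printf("valid combinations: %d now, %d with 128-bit\n",
         count_valid_fp_combinations(current_widths, 3),
         count_valid_fp_combinations(future_widths, 4));
}
```

The two counts come out to 6 and 10, matching the numbers in Jim's
email. The point is to record the configuration next to the result, not
to decide whether the test runs.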

We have to avoid the mindset that all failures are bad. There is a
certain fallacy in using test pass metrics as a proxy for software
quality, and especially trying to achieve 100% tests pass to claim
your product is good, when you exclude 99% of the tests because you
know they fail.

What is important from a project standpoint is that we can manage
expected-fail, known-to-fail, and bug. This should be done at a higher
level than the codebase itself, i.e., in the test infrastructure.

> (3) According to Jim, there are allowed variations on the exception
> method behavior. We may have to account for that.
>
Are they allowed variations, or are they simply variations caused by
implementation incompleteness?  It seems to me that the variations are
mostly due to insufficient support for exceptions in soft-float modes.
If I test for a particular exception in soft-float mode, and it
doesn't get raised, that is a test failure. It doesn't matter that the
test fails because of lacking support in, for example, gcc.

> My first thought is that embedding logic to know if the test applies
> or not is a bad idea since we could end up with multiple tests.
>
> Second thought is to add a .h file which contains the architecture
> specific logic and a static inline method which reports if the test
> should be attempted. Then any test which needs fenv can call it.
> The question is would this be good in the cpukit like "rtems/test_fenv.h"?
> That way an application could ask questions.
>
> Email from Jim below my signature. I suspect this should migrate to
> the CPU Supplement. I'm open to comments on this also.
>
> Thoughts?
>
> --joel
>
> =============================
>
> We have __riscv_float_abi_soft, __riscv_float_abi_single, and
> __riscv_float_abi_double which indicate the size of FP registers being
> used for argument passing, which can be none, single-float only, or
> single-float and double.  We also have __riscv_flen which can be zero
> for no float, 32 for single float only, and 64 for single and double
> float.  So each one of these has 3 values.  And there is already a
> formal 128-bit FP proposal, so at some point each of these will have 4
> possible values.  The hardware FP reg size does not have to match the
> ABI arg passing FP reg size.  It just needs to be greater than or
> equal to it.  So you can have a rv64gc target using the lp64 ABI,
> where we have FP regs that can hold double, but we don't use FP regs
> for argument passing.  This can be useful for upward compatibility,
> e.g. if the first version of the project has no FP regs, and then they
> are added later, and you want new code to be ABI compatible with old
> code.  So there are 6 different valid combinations of FP hardware reg
> size and FP ABI reg size currently, which will grow to 10 when 128-bit
> FP support is added.
>
> Note that if you have a single float target, then you will use soft
> float support for double, and hard float support for single.  If you
> have a single float target, and use the soft float ABI, and have code
> that uses float, then you will get code that uses the FP registers for
> arithmetic, but not for argument passing.  Also, another interesting
> bit here is that the libgcc soft-float code has a check for
> __riscv_flen, and will use fcsr for exceptions.  This case triggers if
> you have a single float target, and need soft-float double code, in
> which case the soft-float double routines will use fcsr to signal
> exceptions because it exists.  It will also use it for rounding modes.
> But if you have a target with no FP hardware, and need soft-float
> double code, then fcsr does not exist, and you get no rounding mode
> support and no exception support, though maybe that can be implemented
> in the future.  So the soft-float double support may or may not
> support rounding and exceptions depending on the target it is compiled
> for.
>
> =============================
>
>
> On Thu, Aug 15, 2019 at 4:31 PM Chris Johns <chrisj at rtems.org> wrote:
> >
> > On 16/8/19 1:38 am, Gedare Bloom wrote:
> > > On Wed, Aug 14, 2019 at 12:44 PM Joel Sherrill <joel at rtems.org> wrote:
> > >>
> > >> Hi
> > >>
> > >> I emailed Jim Wilson of SiFive and got a quick response. Much thanks
> > >> to him and this is his reply:
> > >>
> > >> ==================
> > >> I don't have any embedded hardware that I can use for testing.  I just
> > >> have linux and simulators (qemu, gdb sim).  I haven't seen gcc
> > >> testsuite failures related to fenv, but not sure if those tests are
> > >> being run for embedded elf targets.  The testcase doesn't work with
> > >> gdb sim, probably a gdb sim bug.  It does work with qemu, but only if
> > >> I use a -march that has FP support.  Checking the code, it looks
> > >> pretty obvious, fegetexceptflag for instance is
> > >>
> > >> int fegetexceptflag(fexcept_t *flagp, int excepts)
> > >> {
> > >> #if __riscv_flen
> > >> ...
> > >> #endif
> > >>   return 0;
> > >> }
> > >>
> > >> __riscv_flen is the number of bits (bytes?) in the FP registers, and
> > >> is zero if this target has no FP support.  So fenv for soft-float
> > >> targets hasn't been implemented.  Was probably implemented for hard
> > >> float targets because it is easy, we just directly read/write the fp
> > >> status register.
> > >>
> > >> feraiseexcept isn't fully supported, but that seems to be a standards
> > >> interpretation question.  RISC-V hardware doesn't have a way to
> > >> automatically trap when an exception is raised.  We can only set a bit
> > >> in the fp status register and something else needs to check that and
> > >> trap.  So is feraiseexcept fully implemented if we set the bit but
> > >> don't trap?  Newlib says no.  Glibc says yes.  But glibc has
> > >> feenableexcept and FE_NOMASK_ENV.  Newlib does not.  So on RISC-V, all
> > >> exceptions are masked, and that can't be changed.
> > >> ==================
> > >>
> > >> This test (and fenv in general) is unlikely to work on a target with
> > >> soft float per Jim and some newlib discussions. On those BSPs,
> > >> it may have to be marked as NA with a comment about soft float.
> > >> Unless the test is automatically disabled if the CPU_HARDWARE_FP
> > >> define in RTEMS is false. But this is always defined to FALSE on risc-v.
> > >>
> > >> Q: Is hardware FP even supported right now on RISC-V? I don't see the
> > >> context switch code assuming there are registers to switch.
> > >>
> > >> And as Jim notes, even on some hardware targets (yes RISC-V), some
> > >> things don't work.
> > >>
> > >> So let's try this on a HW FP target and see if the test works.
> > >>
> > >> Then address how to automatically disable this test with fall back to
> > >> updating a lot of .tcfg files.
> > >>
> > >
> > > The test should not be disabled. The test should report a failure,
> > > because the functionality is not implemented *but it could be.*
> > >
> > > We should not be using the tcfg filtering simply to disable tests that
> > > we know fail. As I understand it, the filter mechanism is mainly there
> > > to help us with tests that can't be executed on the target due to some
> > > missing resources e.g., insufficient memory, no cache, etc. Even
> > > though no one implemented the support for soft-float, it could be
> > > done, and a user might want to know that these tests fail for this
> > > target.
> > >
> > > This is not the first time I've noticed it mentioned to exclude tests
> > > that we know fail, but there is a difference between something we know
> > > is missing/never-to-be-implemented versus something that is impossible
> > > to implement. We should exclude the latter, but fail on the former.
> >
> > Well said and 100% correct.
> >
> > I am considering a tool that helps us manage these settings. My hope is that it
> > may generate some more activity in this area.
> >
> > Chris