fenv on RISC-V

Fri Aug 16 15:29:35 UTC 2019

On Fri, Aug 16, 2019 at 9:44 AM Gedare Bloom <gedare at rtems.org> wrote:
>
> On Thu, Aug 15, 2019 at 3:58 PM Joel Sherrill <joel at rtems.org> wrote:
> >
> > OK. I am going to throw an issue out and then paste another email
> > from Jim Wilson. Let's see if we can figure this one out portably.
> >
> > (1) Every architecture now will have an implementation of fenv. It may
> > be a stub, non-functional fenv implementation. If this is the case, many
> > methods return "-ENOTSUP". If the test first checks for fegetenv()
> > returning this, then the architecture has no support. The test could detect
> > this as a first step and exit.
> >
> Again, if the implementation is non-functional, any test for it should
> cause a failure. (Assuming that fenv is a "required feature" we want
> to test.)

The definition of fenv methods allows some to do nothing and
return 0. This lets a non-functional implementation slide by.
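
A test can catch this by checking for an observable effect rather than
only the return code. The sketch below is an illustration, not the
actual test: fenv_flag_round_trip() and the stub_*/model_* functions
are hypothetical names, and the function pointers stand in for
feclearexcept(), feraiseexcept(), and fetestexcept() so the check can
be shown against a do-nothing implementation.

```c
#include <fenv.h>

/* Pointer types matching feclearexcept(), feraiseexcept(), fetestexcept(). */
typedef int (*fe_clear_fn)(int);
typedef int (*fe_raise_fn)(int);
typedef int (*fe_test_fn)(int);

/*
 * Hypothetical helper. Returns:
 *    1  a raised flag can be read back (functional implementation)
 *    0  every call reports success but nothing changes (stub)
 *   -1  a call reports an error, the flag cannot be cleared,
 *       or no exception bit was requested
 */
static int fenv_flag_round_trip(fe_clear_fn clear_fn, fe_raise_fn raise_fn,
                                fe_test_fn test_fn, int except)
{
  if (except == 0) return -1;
  if (clear_fn(except) != 0) return -1;
  if (test_fn(except) != 0) return -1;  /* must read back as clear */
  if (raise_fn(except) != 0) return -1;
  return ((test_fn(except) & except) == except) ? 1 : 0;
}

/* Do-nothing stubs, shaped like the soft-float "return 0" bodies. */
static int stub_clear(int e) { (void)e; return 0; }
static int stub_raise(int e) { (void)e; return 0; }
static int stub_test(int e)  { (void)e; return 0; }

/* Tiny software model of a working flag register, for comparison. */
static int model_flags;
static int model_clear(int e) { model_flags &= ~e; return 0; }
static int model_raise(int e) { model_flags |= e;  return 0; }
static int model_test(int e)  { return model_flags & e; }
```

In a real test the three standard functions would be passed in. A
result of 0 identifies an implementation that returns success without
changing any state.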

Also the RISC-V implementation is functional for HW FP but
the ifdefs result in different behavior for soft FP.
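
One way a test could see which of the two behaviors it was built with
is a compile-time helper along these lines. This is a rough sketch:
the function name is hypothetical, and it relies only on the compiler's
predefined __riscv and __riscv_flen macros, treating an undefined or
zero __riscv_flen as "no FP hardware".

```c
/* Hypothetical helper, e.g. for a shared test header.
 * Returns 1 if built for RISC-V with FP registers, 0 if built for
 * RISC-V without them, and -1 on any other architecture (this sketch
 * makes no claim there). */
static inline int test_fenv_riscv_has_hw_fp(void)
{
#if defined(__riscv)
#if defined(__riscv_flen) && (__riscv_flen > 0)
  return 1;   /* FP registers and fcsr exist */
#else
  return 0;   /* soft float only: fenv bodies are compiled out */
#endif
#else
  return -1;  /* not RISC-V */
#endif
}
```

The helper only distinguishes zero from nonzero, so it does not depend
on whether __riscv_flen counts bits or bytes.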

>
> The portable solution is to write the test case against the standard.
> Ignore all this other stuff about the implementation, other than to
> triage the test failures to place blame on RTEMS, newlib, or "other"
> (gcc, hardware, etc.)

Yes, as long as the test framework marks the failure as expected.

> > (2) RISC-V is the only architecture currently with an implementation. It
> > has very limited support (return 0 mostly) when there is no hardware
> > floating point. __riscv_flen is defined to 0 for no HW FP, 32 for single
> > support and 64 for double support (128 coming for 128 bit). RTEMS'
> > CPU_HARDWARE_FP is always false on this architecture since
> > this trips code in the cpukit. I don't know how to flag whether the test
> > should be run or not on this architecture.
> >
> I wouldn't filter, just run the test and fail.
>
> We have to avoid the mindset that all failures are bad. There is a
> certain fallacy in using test pass metrics as a proxy for software
> quality, and especially trying to achieve 100% tests pass to claim
> your product is good, when you exclude 99% of the tests because you
> know they fail.
>
> What is important from a project standpoint is that we can manage
> expected-fail, known-to-fail, and bug. This should be done at a higher
> level than the codebase itself, i.e., in the test infrastructure.

These should be in the expected-fail category.

The test just shouldn't hang.

>
> > (3) According to Jim, there are allowed variations on the exception
> > method behavior. We may have to account for that.
> >
> Are they allowed variations, or are they simply variations caused by
> implementation incompleteness?  It seems to me that the variations are
> mostly due to insufficient support for exceptions in soft-float modes.
> If I test for a particular exception in soft-float mode, and it
> doesn't get raised, that is a test failure. It doesn't matter that the
> test fails because of lacking support in, for example, gcc.

You should read the POSIX specification. It allows for a lot of
exceptions to just be missing.

I understand what "raise exception" means at the HW level, but
I am not sure precisely what POSIX means by it. It could be a
SIGFPE, but searching with Google I can't find any hint in the
POSIX standard that SIGFPE and the fenv methods are related.

> > My first thought is that embedding logic to know if the test applies
> > or not is a bad idea since we could end up with multiple tests.
> >
> > Second thought is to add a .h file which contains the architecture
> > specific logic and a static inline method which reports if the test
> > should be attempted. Then any test which needs fenv can call it.
> > The question is would this be good in the cpukit like "rtems/test_fenv.h"?
> > That way an application could ask questions.
> >
> > Email from Jim below my signature. I suspect this should migrate to
> > the CPU Supplement. I'm open to comments on this also.
> >
> > Thoughts?
> >
> > --joel
> >
> > =============================
> >
> > We have __riscv_float_abi_soft, __riscv_float_abi_single, and
> > __riscv_float_abi_double which indicate the size of FP registers being
> > used for argument passing, which can be none, single-float only, or
> > single-float and double.  We also have __riscv_flen which can be zero
> > for no float, 32 for single float only, and 64 for single and double
> > float.  So each one of these has 3 values.  And there is already a
> > formal 128-bit FP proposal, so at some point each of these will have 4
> > possible values.  The hardware FP reg size does not have to match the
> > ABI arg passing FP reg size.  It just needs to be greater than or
> > equal to it.  So you can have a rv64gc target using the lp64 ABI,
> > where we have FP regs that can hold double, but we don't use FP regs
> > for argument passing.  This can be useful for upward compatibility,
> > e.g. if the first version of the project has no FP regs, and then they
> > are added later, and you want new code to be ABI compatible with old
> > code.  So there are 6 different valid combinations of FP hardware reg
> > size and FP ABI reg size currently, which will grow to 10 when 128-bit
> > FP support is added.
> >
> > Note that if you have a single float target, then you will use soft
> > float support for double, and hard float support for single.  If you
> > have a single float target, and use the soft float ABI, and have code
> > that uses float, then you will get code that uses the FP registers for
> > arithmetic, but not for argument passing.  Also, another interesting
> > bit here is that the libgcc soft-float code has a check for
> > __riscv_flen, and will use fcsr for exceptions.  This case triggers if
> > you have a single float target, and need soft-float double code, in
> > which case the soft-float double routines will use fcsr to signal
> > exceptions because it exists.  It will also use it for rounding modes.
> > But if you have a target with no FP hardware, and need soft-float
> > double code, then fcsr does not exist, and you get no rounding mode
> > support and no exception support, though maybe that can be implemented
> > in the future.  So the soft-float double support may or may not
> > support rounding and exceptions depending on the target it is compiled
> > for.
> >
> > =============================
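
The counts in the message above follow from the rule that the hardware
FP register size must be greater than or equal to the ABI FP register
size. A small sketch that enumerates the valid pairs (the function name
is made up for illustration):

```c
/* Count valid (hardware FP size, ABI FP size) pairs when each can take
 * one of n_levels ordered values (none < single < double < ...) and
 * the hardware size must be >= the ABI size. */
static int count_fp_combinations(int n_levels)
{
  int count = 0;
  for (int hw = 0; hw < n_levels; hw++) {
    for (int abi = 0; abi <= hw; abi++) {
      count++;
    }
  }
  return count;
}
```

With three levels this gives 6, and with a fourth (128-bit) level it
gives 10, matching the numbers in the message.
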
> >
> >
> > On Thu, Aug 15, 2019 at 4:31 PM Chris Johns <chrisj at rtems.org> wrote:
> > >
> > > On 16/8/19 1:38 am, Gedare Bloom wrote:
> > > > On Wed, Aug 14, 2019 at 12:44 PM Joel Sherrill <joel at rtems.org> wrote:
> > > >>
> > > >> Hi
> > > >>
> > > >> I emailed Jim Wilson of SiFive and got a quick response. Much thanks
> > > >> to him and this is his reply:
> > > >>
> > > >> ==================
> > > >> I don't have any embedded hardware that I can use for testing.  I just
> > > >> have linux and simulators (qemu, gdb sim).  I haven't seen gcc
> > > >> testsuite failures related to fenv, but not sure if those tests are
> > > >> being run for embedded elf targets.  The testcase doesn't work with
> > > >> gdb sim, probably a gdb sim bug.  It does work with qemu, but only if
> > > >> I use a -march that has FP support.  Checking the code, it looks
> > > >> pretty obvious, fegetexceptflag for instance is
> > > >>
> > > >> int fegetexceptflag(fexcept_t *flagp, int excepts)
> > > >> {
> > > >> #if __riscv_flen
> > > >> ...
> > > >> #endif
> > > >>   return 0;
> > > >> }
> > > >>
> > > >> __riscv_flen is the number of bits (bytes?) in the FP registers, and
> > > >> is zero if this target has no FP support.  So fenv for soft-float
> > > >> targets hasn't been implemented.  Was probably implemented for hard
> > > >> float targets because it is easy, we just directly read/write the fp
> > > >> status register.
> > > >>
> > > >> feraiseexcept isn't fully supported, but that seems to be a standards
> > > >> interpretation question.  RISC-V hardware doesn't have a way to
> > > >> automatically trap when an exception is raised.  We can only set a bit
> > > >> in the fp status register and something else needs to check that and
> > > >> trap.  So is feraiseexcept fully implemented if we set the bit but
> > > >> don't trap?  Newlib says no.  Glibc says yes.  But glibc has
> > > >> feenableexcept and FE_NOMASK_ENV.  Newlib does not.  So on RISC-V, all
> > > >> exceptions are masked, and that can't be changed.
> > > >> ==================
> > > >>
> > > >> This test (and fenv in general) is unlikely to work on a target with
> > > >> soft float per Jim and some newlib discussions. On those BSPs,
> > > >> it may have to be marked as NA with a comment about soft float.
> > > >> Unless the test is automatically disabled if the CPU_HARDWARE_FP
> > > >> define in RTEMS is false. But this is always defined to FALSE on risc-v.
> > > >>
> > > >> Q: Is hardware FP even supported right now on RISC-V? I don't see the
> > > >> context switch code assuming there are registers to switch.
> > > >>
> > > >> And as Jim notes, even on some hardware targets (yes RISC-V), some
> > > >> things don't work.
> > > >>
> > > >> So let's try this on a HW FP target and see if the test works.
> > > >>
> > > >> Then address how to automatically disable this test with fall back to
> > > >> updating a lot of .tcfg files.
> > > >>
> > > >
> > > > The test should not be disabled. The test should report a failure,
> > > > because the functionality is not implemented *but it could be.*
> > > >
> > > > We should not be using the tcfg filtering simply to disable tests that
> > > > we know fail. As I understand it, the filter mechanism is mainly there
> > > > to help us with tests that can't be executed on the target due to some
> > > > missing resources e.g., insufficient memory, no cache, etc. Even
> > > > though no one implemented the support for soft-float, it could be
> > > > done, and a user might want to know that these tests fail for this
> > > > target.
> > > >
> > > > This is not the first time I've noticed it mentioned to exclude tests
> > > > that we know fail, but there is a difference between something we know
> > > > is missing/never-to-be-implemented versus something that is impossible
> > > > to implement. We should exclude the latter, but fail on the former.
> > >
> > > Well said and 100% correct.
> > >
> > > I am considering a tool that helps us manage these settings. My hope is that it may
> > > generate some more activity in this area.
> > >
> > > Chris