ZynqMP and Versal crash clearing coherent cache memory with memset

Chris Johns chrisj at rtems.org
Mon Oct 18 22:11:26 UTC 2021


On 19/10/21 8:59 am, Joel Sherrill wrote:
> On Mon, Oct 18, 2021 at 4:28 PM Chris Johns <chrisj at rtems.org> wrote:
>>
>> On 19/10/21 3:53 am, Kinsey Moore wrote:
>>> On 10/18/2021 00:44, Chris Johns wrote:
>>>> Hi,
>>>>
>>>> I cannot run libbsd on real hardware because the Cadence rx descriptor cache
>>>> coherent allocation crashes in `memset`, which is used to clear the memory.
>>>>
>>>> The rtemsbsd allocator call optionally clears the memory and it seems the newlib
>>>> aarch64 memset code crashes when doing this. A basic loop with 8bit or 32bit
>>>> writes does not crash. The memset call happily clears an array in cached memory
>>>> with different offsets.
>>>>
>>>> I have posted a patch to spcache01 that generates the crash on Versal and ZynqMP
>>>> hardware. The crash dump is:
>>>>
>>>> test cache coherent allocation
>>>> clear cache coherent with memset: 0x1fe00050
>>>>
>>>>
>>>> *** FATAL ***
>>>> fatal source: 9 (RTEMS_FATAL_SOURCE_EXCEPTION)
>>>>
>>>>
>>>> X0   = 0x000000001fe00050 X17  = 0x000000000000000c
>>>> X1   = 0x0000000000000000 X18  = 0x00000000100007b0
>>>> X2   = 0x0000000000000110 X19  = 0x000000001fe00050
>>>> X3   = 0x000000001fe000c0 X20  = 0x000000001fdfff80
>>>> X4   = 0x000000001fe00250 X21  = 0x0000000010013ab0
>>>> X5   = 0x0000000000000004 X22  = 0x0000000000000000
>>>> X6   = 0x0000000000000001 X23  = 0x00000000ffffffff
>>>> X7   = 0x0000000000000000 X24  = 0x0000000010103140
>>>> X8   = 0x0000000000000000 X25  = 0x0000000000000000
>>>> X9   = 0xffffff80ffffffc8 X26  = 0x0000000000000000
>>>> X10  = 0x0000000000000000 X27  = 0x0000000000000000
>>>> X11  = 0x000000001010ca78 X28  = 0x0000000000000000
>>>> X12  = 0x0000000000000001 FP   = 0x000000001010cc30
>>>> X13  = 0x000000001fe00050 LR   = 0x0000000010001f94
>>>> X14  = 0x0000000000000000 SP   = 0x000000001010cc30
>>>> X15  = 0x0000000000000004 PC   = 0x00000000100125c0
>>>> X16  = 0x000000001000f700 DAIF = 0x00000000000003c0
>>>> VEC  = 0x0000000000000004 CPSR = 0x0000000060000005
>>>> ESR  = EC: 0b100101 IL: 0b1 ISS: 0b0000000000000000001100001
>>>>         Data Abort taken without a change in Exception level
>>>> FAR  = 0x000000001fe000c0
>>>> FPCR = 0x0000000000000000 FPSR = 0x0000000000000010
>>>>
>>>> The Versal (A72) fails in exactly the same way. The allocated address is
>>>> 0x1fe00050 and the FAR is 0x1fe000c0, so I wonder if memset handles the first
>>>> "0xc0 - 0x50" bytes while aligning the pointer to a larger word size for
>>>> better performance; that first part is OK but the wider access then breaks.
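
A minimal sketch of the address arithmetic in the paragraph above. It assumes a
64-byte DC ZVA block and that memset covers everything below "block base + 128"
with ordinary stores. Both are assumptions inferred from the register dump, not
checked against the newlib source, and `first_zva_address` is a hypothetical
helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: DC ZVA zeroes 64-byte blocks on these cores. */
#define ZVA_BLOCK 64u

/*
 * Hypothetical helper: predict where the first DC ZVA would land if the
 * bytes below (dst rounded down to a block) + 2 blocks are written with
 * ordinary stores first.
 */
static uint64_t first_zva_address(uint64_t dst)
{
  /* The mask trick below only works for a power-of-two block size. */
  assert((ZVA_BLOCK & (ZVA_BLOCK - 1u)) == 0);

  uint64_t block_base = dst & ~(uint64_t)(ZVA_BLOCK - 1u);
  return block_base + 2u * ZVA_BLOCK;
}
```

For the allocation at 0x1fe00050 this gives 0x1fe000c0, which is the value in
both FAR and X3 in the dump, so the fault would be on the first wide zeroing
access rather than on the head stores.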
>>>
>>> I'm running with a toolchain that was built with
>>> --targetcflags="-DPREFER_SIZE_OVER_SPEED" which affects the content of the
>>> memset function, so my memset is just loops of writes and seems to work fine.
>>
>> Oh. Maybe the eng manual needs a piece on this. Using flags on tool chains like
>> this is fine for a user because it is use at your own peril. However, I believe
>> patches need to be tested with the defaults for all tools. It is way too hard
>> to baseline a BSP if tweaks are needed here and there.
> 
> We did try to merge this to the RSB as a temporary workaround for ilp32 issues.
> Kinsey may have realized it had this impact also but I don't recall
> being aware of it.

Sure and we need to accommodate this but I think as a policy we need to make
sure patches are tested with default tool sets. I cannot see how we can make
things work without having this happen?

> We didn't want it to be a local hack. :)

It may have to be just that. It seems to me we have an ILP32 BSP that needs a
special set of tools and that constrains any other aarch64 BSPs if it became the
default. Do we want that? If the cached memory gets a performance boost from a
better memset, memcpy etc then I hope that is available to me by default.

>>> Just out of curiosity, what instruction was at that PC address? If it was "dc
>>> zva", then I had seen this a while back during initial AArch64 bringup and had
>>> assumed it was fixed since the addition of the MMU code since that instruction
>>> doesn't work on device memory.
>>
>> It is that instruction ...
>>
>>     100125c0:   d50b7423        dc      zva, x3
>>
>> Looks like it is not fixed.
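
The ESR in the dump seems consistent with this. A minimal sketch of decoding
the ISS field for a Data Abort, assuming the Armv8-A layout (WnR in bit 6, DFSC
in bits 5:0); the helper names are hypothetical. The ISS printed above is
0b1100001, which is 0x61.

```c
#include <assert.h>
#include <stdint.h>

/* DFSC encoding for an alignment fault in the Armv8-A Data Abort ISS. */
#define DFSC_ALIGNMENT_FAULT 0x21u

/* Hypothetical helper: bit 6 of the ISS is WnR, 1 means a write. */
static unsigned iss_wnr(uint32_t iss)
{
  return (iss >> 6) & 1u;
}

/* Hypothetical helper: bits 5:0 of the ISS are the data fault status code. */
static unsigned iss_dfsc(uint32_t iss)
{
  return iss & 0x3fu;
}

/* Hypothetical helper: true if the abort was a write alignment fault. */
static int iss_is_write_alignment_fault(uint32_t iss)
{
  return iss_wnr(iss) == 1u && iss_dfsc(iss) == DFSC_ALIGNMENT_FAULT;
}
```

With 0x61 this reports a write with DFSC 0x21, an alignment fault. As far as I
recall that is what DC ZVA raises when the target is Device memory, so it is
not a translation or permission problem.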
> 
> I think your suggestion that FreeBSD should not use memset for device memory
> is the right path though. But that could be in a lot of places. :(

I do not know. The allocation is under the bus space DMA allocator and that
interface is complicated. Maybe memset is not suitable?
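
For reference, a minimal sketch of the kind of basic loop that did not crash
for me. `clear_with_plain_stores` is a hypothetical name, not an existing
rtemsbsd or libbsd function. The pointer is volatile so the compiler has to
emit one store per byte and cannot turn the loop back into a memset call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: zero a region using only ordinary byte stores, so no
 * DC ZVA is ever issued. Slow, but safe for memory that is not Normal.
 */
static void clear_with_plain_stores(void *ptr, size_t len)
{
  volatile uint8_t *p = ptr;

  assert(ptr != NULL || len == 0);

  while (len > 0) {
    *p++ = 0;
    --len;
  }
}
```

Wider aligned stores should also be fine given the 32bit loop worked, but I
have only shown the byte version here.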

Chris
