What to do with rtems_cache_disable_data()?

Thu Jun 20 13:53:56 UTC 2024

On 18.06.24 17:32, Gedare Bloom wrote:
> On Mon, Jun 17, 2024 at 11:20 PM Chris Johns <chrisj at rtems.org> wrote:
>>
>> On 18/6/2024 12:02 am, Sebastian Huber wrote:
>>> On 17.06.24 03:35, Chris Johns wrote:
>>>> On 14/6/2024 10:42 pm, Peter Dufault wrote:
>>>>>
>>>>>
>>>>>> On Jun 14, 2024, at 5:47 AM, Sebastian Huber
>>>>>> <sebastian.huber at embedded-brains.de> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> an user noticed that for example on the Xilinx Zynq 7000 BSP, the
>>>>>> rtems_cache_disable_data() doesn't work.
>>>>>>
>>>>>> I had a look at this and managed to disable the L1 and L2 caches, however,
>>>>>> afterwards I got not that far. On the Cortex-A cores it seems at least the
>>>>>> L1 data cache is required to provide support for atomic operations:
>>>>>>
>>>>>> https://stackoverflow.com/questions/76207164/disabled-dcache-will-prevent-atomic-flag-from-being-set
>>>>>>
>>>>>> I guess we have this situation on most modern chips with caches since the
>>>>>> reservation granule is usually a cache line. How do we want to deal with
>>>>>> rtems_cache_disable_data() in this case? From my point of view, this
>>>>>> function as no real use case and adding it in the first place was a mistake.
>>>>>>
>>>>>> We have a couple of options:
>>>>>>
>>>>>> * Make the rtems_cache_disable_data() directive a no-operation. Afterwards
>>>>>> the cache is still enabled, and an user may get confused.
>>>>>>
>>>>>> * Issue a fatal error if someone calls rtems_cache_disable_data().
>>>>>>
>>>>>> * Really disable the cache and let the user figure out that the atomic
>>>>>> operations no longer work. Disabling the data cache can be a quite complex
>>>>>> thing to do.
>>>>>>
>>>>>> * Add a return status code to rtems_cache_disable_data() and let it return
>>>>>> RTEMS_UNSATISFIED for example.
>>>>>
>>>>> Assuming "this function has no real use case and adding it in the first place
>>>>> was a mistake" how about:
>>>>>
>>>>> - In the active release continue to disable the data cache but add a warning
>>>>> attribute 'warning ("Disabling data cache breaks atomic functionality")';
>>>>> - In the next release change it to an error attribute and change the function
>>>>> behavior to do nothing except return RTEMS_UNSATISFIED (in case someone
>>>>> somehow still calls it), or better change it to call an RTEMS fatal function.
>>>>>
>>>>
>>>> I am not so sure it can be easily simplified. I know of a Zynq or ZynqMP
>>>> application in a critical system (bare metal, no OS) with the caches disabled
>>>> for a specific reason. I cannot go into or remember the specific detail.
>>>
>>> Even in such a scenario you would probably not enable the data cache and then
>>> disable it with a function call? You would simply configure the system to not
>>> enable the cache at all.
>>
>> Yes this would be correct, it held off.
>>
>>>> We cannot silently disable functionality. A call must do what it says or return
>>>> or raise an error. If disabling the data cache is not implement there should be
>>>> an error. The error would draw the user to the documentation. It is not yet
>>>> clear we can disable the code in that function.
>>>>
>>>> If our kernel code uses atomics then the data cache cannot be disabled. However
>>>> we should look into the use of atomics and if it is SMP specific and then handle
>>>> the specific cases.
>>>
>>> The atomic operations are defined by the C/C++ standard. This is not an
>>> SMP-specific issue. It is also not about the usage of the cache in general. It
>>> is about the rtems_cache_disable_data() directive.
>>
>> My comments about not having silent things happens is common to all things we
>> do. We need an error if what we have is not workable on a platform.
>>
>>>> A user can use atomics even if the kernel does not but leaving that to the user
>>>> to figure out and deal is at best unhelpful and as an OS provider we should help
>>>> users in these complex corner cases. At a minimum our documentation should cover
>>>> this and we could update rtems-exeinfo to report a potential issue.
>>>
>>> Which of the above four options would be your favourite? Would there be another
>>> option to deal with rtems_cache_disable_data()?
>>>
>>> Currently, calling this function crashes the system on a Zynq 7000.
>>
>> The return code but is it realistic to make such a change for such a long
>> standing interface? I suppose it is then the fatal error. It is a major thing to
>> do disabling the cache.
>>
> Given the age of the interface, I would be fine with making it a fatal
> error if called with SMP enabled. Currently it gets called during some
> bsp restart procedures. most likely, invalidating the cache would be
> effective in those cases.

Yes, in the restart procedure it may make sense, but the bsp_restart() 
is used in system where RTEMS is also used as a boot loader. The more 
complex system use normally a third-party boot loader.
.
> 
> I would be in favor of deprecating the interface now, and make it
> obsolete in the future.
> 
>> Out of interest why do we have the interface? What is the use case?
>>
> Joel may know.

I added a ticket for this:

https://gitlab.rtems.org/rtems/rtos/rtems/-/issues/5050

-- 
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/