[PATCH 06/13] config: Add rtems_malloc_task_stack_for_idle()

Wed Oct 5 22:13:04 UTC 2022

On 5/10/2022 4:00 pm, Sebastian Huber wrote:
> On 04/10/2022 23:21, Chris Johns wrote:
>> On 5/10/2022 12:41 am, Sebastian Huber wrote:
>>> On 04/10/2022 15:21, Joel Sherrill wrote:
>>>> On Tue, Oct 4, 2022 at 12:40 AM Sebastian Huber
>>>> <sebastian.huber at embedded-brains.de
>>>> <mailto:sebastian.huber at embedded-brains.de>> wrote:
>>>>
>>>>      On 30/09/2022 23:39, Chris Johns wrote:
>>>>       > On 30/9/2022 7:21 pm, Sebastian Huber wrote:
>>>>       >> Update #4524.
>>>>       >> ---
>>>>       >>   cpukit/doxygen/appl-config.h                | 13 +++++
>>>>       >>   cpukit/include/rtems/rtems/config.h         | 29 +++++++++-
>>>>       >>   cpukit/include/rtems/score/interr.h         |  1 +
>>>>       >>   cpukit/sapi/src/interrtext.c                |  3 +-
>>>>       >>   cpukit/sapi/src/malloctaskstackforidle.c    | 59
>>>>      +++++++++++++++++++++
>>>>       >>   spec/build/cpukit/librtemscpu.yml           |  1 +
>>>>       >>   spec/build/testsuites/sptests/grp.yml       |  2 +
>>>>       >>   spec/build/testsuites/sptests/spfatal36.yml | 19 +++++++
>>>>       >>   testsuites/sptests/spfatal36/init.c         | 52
>>>>      ++++++++++++++++++
>>>>       >>   testsuites/sptests/spfatal36/spfatal36.doc  | 11 ++++
>>>>       >>   testsuites/sptests/spinternalerror02/init.c |  2 +-
>>>>       >>   testsuites/sptests/sptls04/init.c           |  2 +
>>>>       >>   12 files changed, 191 insertions(+), 3 deletions(-)
>>>>       >>   create mode 100644 cpukit/sapi/src/malloctaskstackforidle.c
>>>>       >>   create mode 100644 spec/build/testsuites/sptests/spfatal36.yml
>>>>       >>   create mode 100644 testsuites/sptests/spfatal36/init.c
>>>>       >>   create mode 100644 testsuites/sptests/spfatal36/spfatal36.doc
>>>>       >>
>>>>       >> diff --git a/cpukit/doxygen/appl-config.h
>>>>      b/cpukit/doxygen/appl-config.h
>>>>       >> index aa6dbae648..ee647dc961 100644
>>>>       >> --- a/cpukit/doxygen/appl-config.h
>>>>       >> +++ b/cpukit/doxygen/appl-config.h
>>>>       >> @@ -4842,6 +4842,19 @@
>>>>       >>    * configuration options.  It is assumed that any memory
>>>>      allocated for the
>>>>       >>    * stack of an IDLE task will not be from the RTEMS Workspace
>>>>      or the memory
>>>>       >>    * statically allocated by default.
>>>>       >> + *
>>>>       >> + * For applications with a thread-local storage size which is
>>>>      completely
>>>>       >> + * unknown at the time the application configuration is
>>>>      defined, RTEMS provides
>>>>       >> + * an IDLE task stack allocator which uses rtems_malloc().
>>>>       >
>>>>       > I thought the TLS size was static and set by the linker? Has this
>>>>      changed?
>>>>
>>>>      No, this didn't change.
>>>>
>>>>       >
>>>>       > I am confused about the relationship between an unknown TLS size
>>>>      and IDLE task?
>>>>       > And the relationship of the TLS size and stack size?
>>>>
>>>>      Currently, the IDLE task storage area is statically allocated. The size
>>>>      of this area is defined by CONFIGURE_IDLE_TASK_STACK_SIZE. So, this
>>>>      configuration option doesn't directly specify the IDLE task stack size.
>>>>      This size covers also the TLS area.
>>>>
>>>>
>>>> Thanks for speaking up Chris. I also don't like the idea that something that
>>>> has said and meant IDLE stack size was getting other items lumped into it.
>>>>
>>>>
>>>>       >
>>>>       >> * * The size of the
>>>>       >> + * allocated thread storage area is the sum of stack size
>>>>      defined by the
>>>>       >> + * #CONFIGURE_IDLE_TASK_STACK_SIZE configuration option and the
>>>>      actual
>>>>       >> + * thread-local storage size of the application.
>>>>       >
>>>>       > The label CONFIGURE_IDLE_TASK_STACK_SIZE provides no insight into
>>>>      it being
>>>>       > effected by the TLS size.
>>>>       >
>>>>       >> * * Define this configuration
>>>>       >> + * option to ``rtems_malloc_task_stack_for_idle`` to use this
>>>>      allocator.  If
>>>>       >> + * the memory allocation fails, then the system terminates with the
>>>>       >> + * INTERNAL_ERROR_CORE fatal source and the
>>>>       >> + * INTERNAL_ERROR_NO_MEMORY_FOR_IDLE_TASK_STACK fatal code
>>>>      during system
>>>>       >> + * initialization.
>>>>       >> + * @endparblock
>>>>       >
>>>>       > I am confused about the how and why I would use this change?
>>>>
>>>>      With the statically allocated storage area for the IDLE task you
>>>>      need at
>>>>      least an estimate of the size if you define the application
>>>>      configuration. If you can't estimate it, then one option is to simply
>>>>      allocate it dynamically.
>>>>
>>>>      Maybe a better approach is to allocate the IDLE task storage from the
>>>>      workspace by default and use the CONFIGURE_IDLE_TASK_STACK_SIZE really
>>>>      for the stack size and not the complete storage area. We could add a
>>>>      new
>>>>      configuration option (for example CONFIGURE_IDLE_TASK_STORAGE_SIZE) to
>>>>      enable the static allocation.
>>>>
>>>>
>>>>
>>>> CONFIGURE_IDLE_TASK_STORAGE_SIZE could default to the size of stack size
>>>> plus other items but it seems over complicated. Just have IDLE stack size and
>>>> the other
>>>> area for other items.
>>>>
>>>> I don't know when IDLE task storage size was introduced but it has been a long
>>>> time ago. Changing the semantics of it seems quite wrong.
>>>
>>> Ok, I will send a v2 of the patch which:
>>
>> I suggest we get a clearer understand of what is happening first.
>>
>>> * Changes the default IDLE task storage allocation to the workspace.  It will
>>> use CONFIGURE_IDLE_TASK_STACK_SIZE and this size will define the stack size.
>>
>> Does it still need to be a power of 2? If so why?
> 
> The CONFIGURE_IDLE_TASK_STACK_SIZE doesn't have to be a power of two. I only
> changed the CPU_STACK_MINIMUM_SIZE to be a power of two. This was already the
> case except for aarch64 and moxie. aarch64 used 10KiB for whatever reason. This
> is not strictly necessary, but I was a bit surprised that not all architectures
> used a power of two for the default.

Thanks. This makes sense.

>>> * Adds a new option CONFIGURE_IDLE_TASK_STORAGE_SIZE which will change the IDLE
>>> task storage allocation to a statically allocated area which will contain the
>>> task stack and the thread-local storage area.
>>
>> I do not understand the reason. Why is TLS effecting IDLE if it is just a loop
>> with may be a BSP specific halt or stop type instruction?
> 
> We could not equip the IDLE tasks with a TLS area, however, this leads to other
> issues, for example in user-provided thread switch extensions. The system
> initialization is done in the context of an IDLE task, so using errno would then
> be an issue.

Yes for both cases.

>> In this patch you discuss applications with "very dynamic thread-local storage
>> size"? I have no idea what this means and under what conditions such an app is
>> created. Could you please explain this?
> 
> A scenario is a bigger organization in which one department deals with the basic
> RTEMS and platform support and another department develops the application
> (without knowing much about RTEMS). The platform is a 64-bit processor with lots
> of RAM. The application is a high performance SMP development using modern
> software frameworks which use TLS.

If libdl is put aside, would the same mechanics used to size the TLS area for
any score thread be the same for IDLE?

>>> * Adds a new option CONFIGURE_IDLE_TASK_MINIMUM_STACK_SIZE which ensures that
>>> the IDLE task stack size has at least this size.
>>
>> Can you please provide a use case for these variables showing what is being
>> solved and when a user would need them?
> 
> With a statically allocated IDLE task storage area you need a lower bound for
> the stack size. This is missing right now. Currently, if you have a TLS size
> which is larger than CONFIGURE_IDLE_TASK_STACK_SIZE, then you have a memory
> corruption during system initialization.

Ah OK. Does this mean the static allocation cannot take into account the size of
the static TLS area?

> With the new CONFIGURE_IDLE_TASK_STORAGE_SIZE option we could also use
> CONFIGURE_IDLE_TASK_STACK_SIZE to ensure this minimum stack size. I will adjust
> the next patch to not introduce CONFIGURE_IDLE_TASK_MINIMUM_STACK_SIZE.

OK

>> The IDLE stack size was used on architectures where the interrupt stack was on
>> the executing task's stack. We have been able to move away from that model to
>> having a separate interrupt stack. As a result the stack usage for IDLE is
>> deterministic because the code makes no calls and is little more than a loop. As
>> a result the default IDLE stack size should be OK for all users expect those
>> adding their own IDLE thread?
> 
> There are some calls to reach the loop, for example the thread begin extensions
> and the thread switch extensions.

So this is a side effect of switching of newlib to TLS variables?

> Next proposal:
> 
> 
> * Change the default IDLE task storage allocation to the workspace.  It will use
> CONFIGURE_IDLE_TASK_STACK_SIZE and this size will define the stack size.
> 
> * Add a new option CONFIGURE_IDLE_TASK_STORAGE_SIZE which will change the IDLE
> task storage allocation to a statically allocated area which will contain the
> task stack and the thread-local storage area.
> 
> * Use CONFIGURE_IDLE_TASK_STACK_SIZE which ensures that the IDLE task stack size
> has at least this size in case the statically allocated area is used.
> 
> With this change you don't have to worry about the TLS size in the default
> configuration. If you enable the static allocation, then you don't get undefined
> behaviour if your actual TLS size exceeds the configured limits.

Sounds good. Thanks for taking the time to explain the change.

Will the IDLE TLS size be based on the
CONFIGURE_MAXIMUM_THREAD_LOCAL_STORAGE_SIZE if it is not zero? This effects
libdl once it supports loading TLS based code.

[ while looking at the doco ... ]

Should CONFIGURE_MAXIMUM_THREAD_LOCAL_STORAGE_SIZE be under General System
Configuration because it applies to all threads?

Chris