[PATCH 1/2] cpukit/aarch64: Keep state across context switch

Sebastian Huber sebastian.huber at embedded-brains.de
Tue Mar 8 08:52:04 UTC 2022


On 28/02/2022 20:18, Kinsey Moore wrote:
> 
> On 2/28/2022 12:19, Sebastian Huber wrote:
>> On 26/02/2022 08:03, Kinsey Moore wrote:
>>> On 2/26/2022 00:53, Sebastian Huber wrote:
>>>> On 26/02/2022 00:41, Kinsey Moore wrote:
>>>>> This may also be an issue for ARM, RISC-V and others, as it doesn't 
>>>>> appear that ARM saves CPSR during a context switch and I couldn't 
>>>>> tell whether RISC-V does so either, though I'm less familiar with it.
>>>>
>>>> This doesn't look like the right way to fix this issue.
>>>>
>>>> There is currently the assumption that all processors start 
>>>> multitasking with a context switch to _Thread_Handler() which sets 
>>>> the interrupt level. It is possible to construct a scenario in which 
>>>> we start multitasking with a migration of a thread which already 
>>>> executed the _Thread_Handler() prologue. This would result in an 
>>>> execution with disabled interrupts. I think the proper fix for this 
>>>> scenario is to enable interrupts in 
>>>> _CPU_SMP_Prepare_start_multitasking().
>>>>
>>>> Doing a context switch with interrupts disabled is a fatal 
>>>> application error on all architectures with
>>>>
>>>> #define CPU_ENABLE_ROBUST_THREAD_DISPATCH TRUE
>>>>
>>>> or with SMP support enabled.
>>>>
>>> Ok, great. I was wondering if that was the case and this is 
>>> definitely the kind of feedback I was looking for. I'll adjust the 
>>> patch set to reflect that. I still wonder if this is an issue on 
>>> other SMP CPU ports, though, since most of them don't implement that 
>>> hook, either.
>>
>> I would like to have a closer look at this next week when I am back 
>> from holidays.
>>
>> Enabling interrupts in _CPU_SMP_Prepare_start_multitasking() would not 
>> work since we use the interrupt stack at this point. We should add a 
>> ticket and a test case for this (I can do this next week). How did you 
>> observe this bug?
>>
> I was only able to observe this bug once the 2/2 patch is applied. That 
> optimization opens a race condition in the sppercpudata01 test on SMP 
> configurations, since the task is migrating across CPUs as CPUs are 
> coming online (adding a few no-ops to the Per_CPU_Control accessor 
> prevents it from appearing). The race condition resolves nominally in 
> 90% of cases, so while the failure is not frequent, it is 
> reproducible.

I added a ticket and a test case:

http://devel.rtems.org/ticket/4627

Could you please check if the test case fails currently on your aarch64 
target?
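
For readers following the thread, the scenario discussed above can be 
sketched as a toy model in C. This is only an illustration under stated 
assumptions, not RTEMS code: every model_* name is hypothetical, the 
interrupt state is reduced to a single flag, and model_thread_entry() 
merely stands in for the part of the _Thread_Handler() prologue that sets 
the interrupt level.

```c
/* Toy model only: none of the model_* names are RTEMS APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
  bool interrupts_enabled; /* false while the processor is starting up */
} model_cpu;

typedef struct {
  bool prologue_done; /* true once the entry prologue has already run */
} model_thread;

/* Stands in for the prologue of _Thread_Handler(): it sets the
   interrupt level, modelled here as simply enabling interrupts. */
void model_thread_entry(model_cpu *cpu, model_thread *thread)
{
  cpu->interrupts_enabled = true;
  thread->prologue_done = true;
}

/* Models the first context switch on a processor that starts
   multitasking with interrupts disabled. */
void model_start_multitasking(model_cpu *cpu, model_thread *heir)
{
  assert(cpu != NULL && heir != NULL);
  cpu->interrupts_enabled = false;

  if (!heir->prologue_done) {
    /* Usual assumption: a fresh thread begins in its entry prologue. */
    model_thread_entry(cpu, heir);
  }
  /* Otherwise a migrated thread resumes past the prologue, and nothing
     on this path changes the interrupt state. */
}

/* Models the robust thread dispatch rule: a context switch with
   interrupts disabled is treated as an error. */
bool model_dispatch_is_allowed(const model_cpu *cpu)
{
  return cpu->interrupts_enabled;
}
```

In this model, starting with a fresh thread leaves interrupts enabled, 
while starting with a thread whose prologue_done flag is already set 
leaves them disabled, so model_dispatch_is_allowed() returns false for 
that processor.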

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber at embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

