Improve SMP support for PC BSP
Jan.Sommer at dlr.de
Tue Jun 1 11:01:01 UTC 2021
Hello,
Currently the pc BSP in SMP mode uses a timer interrupt on cpu0 which is then distributed to all other cores via an IPI for the scheduler tick.
This means that if cpu0 locally disables its interrupts (e.g. in a driver), the schedulers on all cores lose their sense of time.
In the worst case this can lead to a hanging system, as in the test smpclock01 (https://devel.rtems.org/ticket/4008).
The clock driver uses the PIT and, as far as I understand it, the PIT can only generate a global timer interrupt, not simultaneous interrupts for each core (please correct me if I am wrong).
I started reading into possible solutions and, as with everything x86-related, it is a bit messy.
So my proposed steps forward would be the following:
1. Update the CPU feature recognition:
--------------------------------------
The next steps need to know about the availability of more modern features.
So, I would port the identcpu.c (and possibly the tsc.c) from FreeBSD to RTEMS.
I have already done this for testing, and it seems to be quite straightforward.
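
To illustrate what the later steps need from the feature detection, here is a minimal sketch in C that maps raw CPUID register values to a tick source. The bit positions are the documented CPUID feature flags; the enum, the function name and the fallback order are assumptions made for illustration and are not taken from FreeBSD's identcpu.c or from RTEMS.

```c
#include <stdint.h>

/* CPUID.01H:EDX bit 9: on-chip local APIC is present. */
#define CPUID_01_EDX_APIC            (UINT32_C(1) << 9)
/* CPUID.01H:ECX bit 24: LAPIC timer supports TSC-deadline mode. */
#define CPUID_01_ECX_TSC_DEADLINE    (UINT32_C(1) << 24)
/* CPUID.80000007H:EDX bit 8: invariant TSC. */
#define CPUID_80000007_EDX_INV_TSC   (UINT32_C(1) << 8)

/* Hypothetical names for the three options discussed in this mail. */
typedef enum {
  TICK_SRC_PIT_IPI,            /* current behaviour: PIT on cpu0 plus IPIs */
  TICK_SRC_LAPIC_PERIODIC,     /* step 3 */
  TICK_SRC_LAPIC_TSC_DEADLINE  /* step 2 */
} tick_source;

/*
 * Pick a tick source from the raw register values returned by CPUID.
 * The caller is expected to execute CPUID itself and pass the results in.
 */
tick_source choose_tick_source(
  uint32_t leaf_01_ecx,
  uint32_t leaf_01_edx,
  uint32_t leaf_80000007_edx
)
{
  if ((leaf_01_edx & CPUID_01_EDX_APIC) == 0) {
    return TICK_SRC_PIT_IPI;
  }

  if ((leaf_01_ecx & CPUID_01_ECX_TSC_DEADLINE) != 0 &&
      (leaf_80000007_edx & CPUID_80000007_EDX_INV_TSC) != 0) {
    return TICK_SRC_LAPIC_TSC_DEADLINE;
  }

  return TICK_SRC_LAPIC_PERIODIC;
}
```

Keeping the decision separate from the CPUID instruction itself means the logic can be checked on any host with canned register values.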
2. For RTEMS_SMP: Implement clockdriver for the Local APIC with TSC_DEADLINE mode
---------------------------------------------------------------------------------
If the TSC_DEADLINE mode is available (i.e. on newer Intel processors, 2013+), this is probably the easiest and most accurate method.
The TSC rate on these processors is invariant and synchronized across all cores during reset (unless the BIOS does something stupid after reset).
Creating a clock driver for that using the Local APIC should be fairly straightforward.
Maybe the TSC calibration routine could be improved a bit.
We also have corresponding hardware here for testing.
In general I think for SMP we should require invariant TSCs, so that rtems_get_uptime and similar methods are in line across cores.
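
A rough sketch of the arithmetic such a driver would need, assuming the TSC frequency has already been calibrated. The MSR address and the LVT timer mode bits are the architectural values; the function names are hypothetical, and the actual MSR and LAPIC register writes are left out so that the code stays host-independent.

```c
#include <stdint.h>

/* IA32_TSC_DEADLINE MSR: the timer fires when the TSC reaches this value. */
#define IA32_TSC_DEADLINE_MSR        UINT32_C(0x6E0)
/* LVT timer register, bits 18:17 = 10b selects TSC-deadline mode. */
#define LAPIC_LVT_TIMER_TSC_DEADLINE (UINT32_C(2) << 17)

/* Value for the LVT timer register: TSC-deadline mode, unmasked. */
uint32_t lvt_timer_value(uint8_t vector)
{
  return LAPIC_LVT_TIMER_TSC_DEADLINE | vector;
}

/* Number of TSC increments per clock tick. */
uint64_t tsc_per_tick(uint64_t tsc_hz, uint32_t us_per_tick)
{
  return tsc_hz * us_per_tick / UINT64_C(1000000);
}

/*
 * Next value to write to IA32_TSC_DEADLINE from the tick interrupt.
 * Advancing from the previous deadline (not from "now") avoids drift.
 * If interrupts were disabled for too long and deadlines were missed,
 * skip ahead to the first deadline that is still in the future.
 * Assumes interval > 0.
 */
uint64_t next_deadline(uint64_t prev_deadline, uint64_t now, uint64_t interval)
{
  uint64_t next = prev_deadline + interval;

  if (next <= now) {
    uint64_t missed = (now - next) / interval + 1;
    next += missed * interval;
  }

  return next;
}
```

Since every core would compute its deadlines from the same synchronized TSC, the ticks stay aligned across cores without any IPI. A real driver would also have to decide how to account for the skipped ticks.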
3. For RTEMS_SMP: Implement clockdriver for Local APIC without TSC_DEADLINE mode
--------------------------------------------------------------------------------
For older or non-Intel processors the TSC_DEADLINE mode is not available.
As far as I see it, the most reasonable option would be to use the LAPIC timer in periodic mode.
To do that, I would first need to calibrate the rate of the Local APIC timer against the PIT, similar to the TSC. The rate of the LAPIC timer is the same across all cpu cores; the trouble is then the synchronization of the timer interrupt between the LAPICs of all cores.
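
The calibration itself is simple arithmetic. A minimal sketch, assuming the LAPIC timer current-count register is sampled before and after a known number of PIT input clocks; the function names are hypothetical, and the PIT input frequency is the standard 1.193182 MHz.

```c
#include <stdint.h>

/* Input clock of the legacy PIT (8253/8254) in Hz. */
#define PIT_INPUT_HZ UINT64_C(1193182)

/*
 * Derive the LAPIC timer rate from two samples of its current-count
 * register. The LAPIC timer counts down, so elapsed = start - end.
 * pit_ticks is the number of PIT input clocks between the two samples
 * and must be greater than zero. The result is the rate after the
 * LAPIC divider that was configured during the measurement.
 */
uint64_t lapic_timer_hz(uint32_t lapic_start, uint32_t lapic_end, uint32_t pit_ticks)
{
  uint64_t elapsed = (uint32_t) (lapic_start - lapic_end);

  return elapsed * PIT_INPUT_HZ / pit_ticks;
}

/* Initial-count value for periodic mode with the given tick length. */
uint32_t lapic_initial_count(uint64_t timer_hz, uint32_t us_per_tick)
{
  return (uint32_t) (timer_hz * us_per_tick / UINT64_C(1000000));
}
```

For example, 50000000 LAPIC counts observed over 596591 PIT clocks (half a second) give a timer rate of 100 MHz, and a 10 ms tick then needs an initial count of 1000000.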
I found this master's thesis, which discusses synchronization via atomic variables with formulas derived from PTP:
https://core.ac.uk/download/pdf/302914733.pdf
It looks like this could synchronize the LAPIC timers across the cores reasonably well.
At the very least it should be an improvement compared to the distribution using the IPIs.
However, this work would not be required by our projects, so I have to see how to find some spare time for the implementation.
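
For reference, the textbook PTP offset and delay computation looks roughly like the sketch below. This is the standard two-way exchange and not necessarily the exact formulation used in the thesis; the struct and function names are hypothetical, and the timestamps are assumed to be exchanged between two cores through shared (atomic) variables.

```c
#include <stdint.h>

/* Result of one two-way timestamp exchange. */
typedef struct {
  int64_t offset; /* secondary clock minus reference clock */
  int64_t delay;  /* one-way delay, assumed symmetric */
} ptp_estimate;

/*
 * t1: reference core publishes a timestamp    (reference clock)
 * t2: secondary core observes it              (secondary clock)
 * t3: secondary core publishes a reply        (secondary clock)
 * t4: reference core observes the reply       (reference clock)
 */
ptp_estimate ptp_compute(int64_t t1, int64_t t2, int64_t t3, int64_t t4)
{
  ptp_estimate e;

  e.offset = ((t2 - t1) - (t4 - t3)) / 2;
  e.delay = ((t2 - t1) + (t4 - t3)) / 2;

  return e;
}
```

With a secondary clock that is 100 counts ahead and a one-way delay of 10 counts, the exchange t1 = 1000, t2 = 1110, t3 = 1200, t4 = 1110 yields an offset of 100 and a delay of 10. The secondary core could then shift its next LAPIC timer period by the offset to line up with the reference core.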
Anyways, would you see this as a viable solution? Or do you see any pitfalls I might have overlooked?
Best regards,
Jan
Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR)
German Aerospace Center
Institute for Software Technology | Software for Space Systems and Interactive Visualization | Lilienthalplatz 7 | 38108 Braunschweig | Germany