Help on how to configure for user-defined memory protection support (GSoC 2020)

Mon May 18 15:07:49 UTC 2020

On Mon, May 18, 2020 at 4:31 AM Utkarsh Rai <utkarsh.rai60 at gmail.com> wrote:
>
>
>
>
> On Sat, May 16, 2020 at 9:16 PM Joel Sherrill <joel at rtems.org> wrote:
>>
>>
>>
>> On Sat, May 16, 2020 at 10:14 AM Gedare Bloom <gedare at rtems.org> wrote:
>>>
>>> Utkarsh,
>>>
>>> What do you mean by "This would although mean that we would have page tables of  1MB."?
>>>
>>> Please check that you use plain text when inlining a reply; at the very least, your last message broke the reply format.
>>>
>>> Gedare
>>>
>>> On Fri, May 15, 2020, 6:04 PM Utkarsh Rai <utkarsh.rai60 at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Thu, May 14, 2020 at 10:23 AM Sebastian Huber <sebastian.huber at embedded-brains.de> wrote:
>>>>>
>>>>> Hello Utkarsh Rai,
>>>>>
>>>>> On 13/05/2020 14:30, Utkarsh Rai wrote:
>>>>> > Hello,
>>>>> > My GSoC project,  providing thread stack protection support, has to be
>>>>> > a user-configurable feature.
>>>>> > My question is,  what would be the best way to implement this, my idea
>>>>> > was to model it based on the existing system configuration
>>>>> > <https://docs.rtems.org/branches/master/c-user/config/intro.html>, but
>>>>> > Dr. Gedare pointed out that configuration is undergoing heavy changes
>>>>> > and may look completely different in future releases. Kindly advise me
>>>>> > as to what would be the best way to proceed.
>>>>> Before we start with an implementation, it would be good to define what
>>>>> thread stack protection support is supposed to do.
>>>>
>>>>
>>>> The thread stack protection mechanism will protect against stack overflow errors and will completely isolate the thread stacks from each other. Sharing of thread stack will be possible only when the user makes explicit calls to do so. More details about this can be found in this thread.
>>>>>
>>>>> Then there should
>>>>> be a concept for systems with a Memory Protection Unit (MPU) and a
>>>>> concept for systems with a Memory Management Unit (MMU). MMUs may
>>>>> provide normal 4KiB Pages, large Pages (for example 1MiB) or something
>>>>> more flexible. We should identify BSPs which should have support for
>>>>> this. There should be a concept for each BSP. Then we should think about
>>>>> how a user can configure this feature.
>>>>>
>>>>> For memory protection we will have a 1:1 VA-PA address translation, which means a 4KiB page size will be set for both the MPU and MMU; a 1:1 mapping will ensure we have to do fewer page table walks. This would although mean that we would have page tables of  1MB. I will first provide support for ARMv7-based BSPs (RPi, BBB, etc. have MMU support); then, when I have a working example, I will move on to provide support for RISC-V, which has MPU support.
>>
>>
>> I think Sebastian is asking exactly what I did. What are the processor (specific CPU) requirements to support thread stack protection?
>
>
> For thread stack protection the processor should have the option of paging along with appropriate 'access bits' settings. Both RISC-V and ARMv7-A (the ones that I will be focusing on in my project) have the option of defining pages of 4KiB size with appropriate access bits.
>
>>
>>
>> For example, to be effective, I imagine a 1MB granularity might be sufficient to protect code versus data/bss. But it is likely insufficient to protect thread stacks.
>>
>> Similarly, a processor with a limited number of "protection areas" would be unsuitable as a basis for implementing thread stack protection. Here I am thinking of the PowerPC with a handful of TLB registers. You would have to turn on paging.
>
>
> I agree, most processors have between 8 and 16 protection regions, and in some cases as few as 4. For stack protection, paging with 4KiB pages is the best option: it is applicable to processors with an MPU or MMU and is optimal, in the sense that we would have an appropriate number and size of pages for thread stacks.
>

We should have a clear understanding of the design requirements
before we can make such a statement about "optimal" and "best".

The proposal has some good ideas in it, but I think the project has
some implied expectations or assumptions, on both your side and from
mentors/stakeholders. Here are some ideas that should start to hint at
requirements. Maybe you can propose some design requirements. I'm not
too good at writing requirements myself, but here goes:
1. Memory protection is optional. The default is no memory protection.
2. The basic protection isolates the text, rodata, and rwdata from
each other. There is no notion of task-specific protection domains,
and tasks should not incur any additional overhead due to this
protection.
3. The advanced protection strongly isolates all tasks' stacks.
Sharing is done explicitly via POSIX/RTEMS APIs, and the heap and
executive (kernel/RTEMS) memory are globally shared. A task shall only
incur additional overhead in context switches and the first access to
a protected region (other task's stack it shares) after a context
switch.

I'm sure there are more you can draw out from your proposal and we can
discuss. #2 provides a useful option for systems with MPU or similar
hardware that is insufficient to support #3.

Mainly I wanted to get to driving at #3. One implication of it is that
for a task that doesn't access any other task's stack there should
only be TLB faults when there is a context switch. Another implication
is that all entries for a task's protection domain should fit in the
TLB. If you use 4 KiB pages with an 8-entry TLB, you can only have
32 KiB active in the TLB at a time. Some protection domains may be
larger than that (e.g., very large code bases might have more than
32 KiB in the text segment), and then you could not guarantee #3. This is
the kind of analysis you need to think about before you can make
design decisions. Large pages lead to internal fragmentation, which
would cause its own problem. If you can mix-and-match sizes, it may be
sensible to use large pages for statically shared regions (.text,
.data, .bss) and only the smaller pages for stacks.
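
To make that arithmetic concrete, here is a rough sketch in C. It is
only an illustration: the type and function names (example_tlb,
example_tlb_reach, and so on) are made up for this message and are not
RTEMS or BSP APIs, and real hardware may support several page sizes at
once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a TLB with a single page size. */
typedef struct {
  size_t entries;   /* number of TLB entries */
  size_t page_size; /* bytes mapped by one entry */
} example_tlb;

/* Bytes that can be mapped by the TLB at one time. */
static size_t example_tlb_reach(const example_tlb *tlb)
{
  return tlb->entries * tlb->page_size;
}

/* Pages needed to cover size bytes, rounded up to whole pages. */
static size_t example_pages_needed(size_t size, size_t page_size)
{
  return (size + page_size - 1) / page_size;
}

/* True if a protection domain of domain_size bytes fits in the TLB. */
static bool example_domain_fits(const example_tlb *tlb, size_t domain_size)
{
  return example_pages_needed(domain_size, tlb->page_size) <= tlb->entries;
}

/* Bytes mapped but unused when size is rounded up to whole pages. */
static size_t example_internal_fragmentation(size_t size, size_t page_size)
{
  return example_pages_needed(size, page_size) * page_size - size;
}
```

With 8 entries of 4 KiB the reach is 32 KiB, so a 40 KiB text segment
does not fit, while covering the same segment with one 1 MiB page would
leave 984 KiB of that page unused.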

Some of the complexity may be punted to the user configuration also.
Possibly, the upcoming rewrite of configuration may make this even
more flexible, but I haven't looked deeply enough at the details yet.
I'm thinking of something like a configuration option (macro,
specification) to set the number of shared task stack protection
domains, or some API methods to custom-tailor the stack allocations
for sharing tasks. Consider, for example, that task stacks could be
just 1 KiB, then you could pack 4 in the same 4KiB page in a shared
protection domain.
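
A small sketch of that packing arithmetic, assuming a fixed 4 KiB page
and stacks that are placed back to back inside a shared protection
domain. The names (EXAMPLE_PAGE_SIZE, example_stacks_per_page,
example_pages_for_domain) are hypothetical and do not correspond to any
existing RTEMS configuration option.

```c
#include <stddef.h>

/* Assumed page size for this illustration. */
#define EXAMPLE_PAGE_SIZE ((size_t) 4096)

/* Number of whole stacks of stack_size bytes that fit in one page. */
static size_t example_stacks_per_page(size_t stack_size)
{
  if (stack_size == 0 || stack_size > EXAMPLE_PAGE_SIZE) {
    return 0;
  }

  return EXAMPLE_PAGE_SIZE / stack_size;
}

/*
 * Pages needed for a shared protection domain that holds task_count
 * stacks of stack_size bytes each, packed contiguously.
 */
static size_t example_pages_for_domain(size_t task_count, size_t stack_size)
{
  size_t bytes = task_count * stack_size;

  return (bytes + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE;
}
```

Four 1 KiB stacks share a single page (one TLB entry); a fifth task in
the same domain would need a second page.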

I'm not convinced that you should be thinking about the implementation
as providing a "page table" mechanism, because most of the regions
(text, rodata, .data, .bss) are globally shared. Yes, it can be
implemented by a traditional page table mechanism, but I think that is
overkill. A simple, stupid implementation could just put all the
task-specific protected regions in a linked list and walk it,
installing them in the TLB, during the context_restore. Or maybe
pushing to a list/stack on context_save, and popping on restore. The
only real complication is TLB shootdown of the entries that change
between contexts, and whether you have to install again the globally
shared ones.  In a real-time system, it is far superior to pay for a
fixed, known cost at a context switch than it is to take costs (even
if they are smaller!) at a random point in task execution.
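
As a sketch of the "simple, stupid" variant, assuming each task keeps a
singly linked list of its task-specific regions. Everything here is
hypothetical: the structures, the access field, and
example_tlb_install()/example_tlb_remove() are stand-ins for whatever
the BSP would provide, and they only count calls so the cost is
visible.

```c
#include <stddef.h>
#include <stdint.h>

/* One task-specific protected region (own stack or a shared stack). */
typedef struct example_region {
  uintptr_t              begin;  /* start address */
  size_t                 size;   /* length in bytes */
  uint32_t               access; /* hypothetical access-permission bits */
  struct example_region *next;
} example_region;

/* Per-task protection state: just the head of the region list. */
typedef struct {
  example_region *regions;
} example_task_protection;

/* Stand-ins for BSP-specific TLB/MPU operations; they count calls. */
static unsigned example_installed;
static unsigned example_removed;

static void example_tlb_install(const example_region *region)
{
  (void) region;
  ++example_installed;
}

static void example_tlb_remove(const example_region *region)
{
  (void) region;
  ++example_removed;
}

/*
 * Called during the context switch: drop the regions of the task being
 * switched out, then install the regions of the task being restored.
 * The globally shared regions are never touched.
 */
static void example_switch_protection(
  const example_task_protection *from,
  const example_task_protection *to
)
{
  const example_region *r;

  for (r = from->regions; r != NULL; r = r->next) {
    example_tlb_remove(r);
  }

  for (r = to->regions; r != NULL; r = r->next) {
    example_tlb_install(r);
  }
}
```

The work per switch is proportional to the number of task-specific
regions of the two tasks involved, which is known up front, so the cost
is paid at the context switch rather than as faults during task
execution.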

There is a lot yet for you to unravel in this topic. Don't be afraid
to keep asking questions and digging. We should try to lay out the
design and aim to be future-proof, and establish some simple steps
toward implementation that allow you to make incremental progress over
the summer. I think the past efforts at this project were not merged
completely because they did not provide sufficiently good APIs and
default use cases--instead they tried to fashion a completely custom
solution for managing memory protection regions, although they did
provide useful advancements to BSP support for memory protection. My
hope is that by focusing on a single type of region--task stacks--your
project can achieve a more successful integration in the upper layers
of RTEMS and not just advance our lower-level BSP support.

Gedare

>>
>> This is the general guidance that needs to be provided so anyone can evaluate how much protection they really can have on their target.
>>
>> --joel
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel at rtems.org
>>>> http://lists.rtems.org/mailman/listinfo/devel