Memory Protection project interface details (GSoC 2020)

Wed May 13 18:33:23 UTC 2020

On Wed, May 13, 2020 at 10:27 PM Joel Sherrill <joel at rtems.org> wrote:

>
>
> On Wed, May 13, 2020 at 9:41 AM Gedare Bloom <gedare at rtems.org> wrote:
>
>> On Wed, May 13, 2020 at 6:12 AM Utkarsh Rai <utkarsh.rai60 at gmail.com>
>> wrote:
>> >
>> >
>> >
>> > On Wed, May 13, 2020 at 3:56 AM Joel Sherrill <joel at rtems.org> wrote:
>> >>
>> >> I hate to top post but I'm not sure where to insert this.
>> >>
>> >> As a first step, I would ensure that paging catches stack overflow
>> >> errors.
>> >
>> >
>> > I understand that any thread-stack protection mechanism has to guard
>> > against stack overflow errors. Adding this to my plan would mean that I
>> > have to re-adjust my goals, one of them being support for typed-memory
>> > objects. I would appreciate your opinion on which features are more
>> > valuable to the community so I can focus on them.
>> >
>>
>> When Joel says stack overflow he means a thread writing beyond the
>> thread's stack area. I don't think this would require TYM? Usually the
>> way this is done is to add a gap between stacks when you allocate
>> them, which is a bit wasteful of physical memory.
>>
>
> A gap in the virtual address?  I guess if you map virtual:physical 1:1,
> then it is a gap in both. But technically, it doesn't have to be 1:1.
>
> Independent of being pedantic on that, enlighten me how using the MMU
> for stack overflow protection isn't a logical step on the path of this
> project?
> It is just a matter of mapping thread stacks and letting every thread have
> access to all threads' memory, but leaving gaps between the stacks.
>
> I have seen the basic benefits of an MMU as follows (increasing in
> complexity and dynamism):
>
> + Mark regions of memory as read/write, execute, read only, no cache,
>   and for stack use only. What you can do depends on the architecture.
>   This is a coarse form of memory protection using large areas of memory.
>   This is usually done at BSP startup and left alone.
>
> + Provide each thread a protected area where overflow/underflow is
>   detected by the MMU. This requires finer grain control but is still
>   simple since it is only changed at run-time by the thread stack
>   allocator/deallocator.
>
> + Finer grain control of which threads can access which other threads'
>   stacks. This will require more management and something has to happen
>   at context switch.
>
> + Very fine grain control of objects like CHERI and other similar
>   approaches.
>
> This project is aiming for the third. Wouldn't it need to move through the
> first two capabilities?
>

Thank you. Stack-overflow protection should be the obvious first step when
implementing this feature; your analysis has helped clear my doubts.
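
The gap-between-stacks idea discussed above can be illustrated with a
minimal sketch. It assumes a POSIX host with mmap() and mprotect(), a
downward-growing stack, and a hypothetical guarded_stack helper type; none
of these names are RTEMS APIs, and an RTEMS implementation would program
the MMU through BSP-specific code instead.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping type for this sketch; not an RTEMS API. */
typedef struct {
  void  *base;        /* start of the mapping, i.e. the guard page */
  char  *stack_low;   /* lowest usable stack address, just above the guard */
  size_t stack_size;  /* usable size, rounded up to whole pages */
  size_t total_size;  /* guard page plus usable stack */
} guarded_stack;

/* Reserve a stack with one inaccessible page below it. Returns 0 on success. */
static int guarded_stack_alloc(guarded_stack *gs, size_t min_size)
{
  long page = sysconf(_SC_PAGESIZE);

  assert(gs != NULL);
  if (page <= 0 || min_size == 0) {
    return -1;
  }

  size_t pg = (size_t) page;
  size_t usable = (min_size + pg - 1) / pg * pg; /* round up to pages */
  size_t total = usable + pg;                    /* add the guard page */

  void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) {
    return -1;
  }

  /* Assumption: the stack grows downward, so the guard sits at the low end.
     Any access to this page faults instead of silently corrupting memory. */
  if (mprotect(base, pg, PROT_NONE) != 0) {
    munmap(base, total);
    return -1;
  }

  gs->base = base;
  gs->stack_low = (char *) base + pg;
  gs->stack_size = usable;
  gs->total_size = total;
  return 0;
}

/* Release the guard page and the stack together. */
static void guarded_stack_free(guarded_stack *gs)
{
  munmap(gs->base, gs->total_size);
}
```

The guard costs one page of address space per stack; with a 1:1
virtual:physical mapping it also costs one page of physical memory, which
is the waste mentioned above.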

>
>
>>
>> >> Then move on to an optional capability where threads cannot access the
>> >> stacks of other threads. POSIX does not say anything about whether that
>> >> should work or not but there are cases (especially in RTEMS) where if a
>> >> blocking thread has a message queue buffer on a stack and another thread
>> >> does a send, then it will write to another thread's stack.
>> >
>> >
>> > One of the most important parts of this project will be proper
>> > documentation, so that users understand the cases where enabling this
>> > feature will cause problems and how to handle them. (In every case where
>> > users need stack sharing, they will have to explicitly make certain calls
>> > to share stacks.)
>> >>
>> >> This is accepted programming practice.
>> >
>> >
>> >>
>> >>
>> >> I'm not opposed to an option where per-thread stack protection is
>> >> available. But making it mandatory is a bad step. Using the MMU to detect
>> >> stack overflow is good.
>> >
>> >
>> > Yes, the thread-stack protection will be user-configurable.
>> >>
>> >>
>> >> As Hesham mentioned, this is hard on some architectures if you don't
>> >> have a nice page management system. Can you make sure the minimum
>> >> processor architectural requirements are documented? Not just in an
>> >> email. This will ultimately be information in the CPU Supplement.
>> >
>> > Noted.
>> >>
>> >>
>> >> --joel
>> >>
>> >> On Tue, May 12, 2020 at 5:02 PM Hesham Almatary <hesham.almatary at cl.cam.ac.uk> wrote:
>> >>>
>> >>> On Tue, 12 May 2020 at 04:57, Gedare Bloom <gedare at rtems.org> wrote:
>> >>> >
>> >>> > On Thu, May 7, 2020 at 9:59 PM Hesham Almatary
>> >>> > <hesham.almatary at cl.cam.ac.uk> wrote:
>> >>> > >
>> >>> > > Hello Utkarsh,
>> >>> > >
>> >>> > > I'd suggest you don't spend too much effort on setting up BBB
>> >>> > > hardware if you haven't already. Debugging on QEMU with GDB is way
>> >>> > > easier, and you can consider either qemu-xilinx-zynq-a9 or rpi2
>> >>> > > BSPs. Later, you can move your code to BBB if you want, since both
>> >>> > > are based on ARMv7.
>> >>> > +1
>> >>> >
>> >>> > Past work has also used psim successfully I thought? Or am I
>> >>> > mistaken there.
>> >>> >
>> >>> Before my 2012 project (and part of it, yes), we used psim. The use of
>> >>> software TLBs wasn't ideal or easy though, so we moved to ARM in
>> >>> 2013. The development/testing was mainly on a RPi board. I don't
>> >>> remember there being a QEMU model for it yet.
>> >>>
>> >>>
>> >>> > >
>> >>> > > On Thu, 7 May 2020 at 18:26, Utkarsh Rai <utkarsh.rai60 at gmail.com> wrote:
>> >>> > > >
>> >>> > > > Hello,
>> >>> > > > This is to ensure that all the interested parties are on the
>> >>> > > > same page before I start my project and can give their
>> >>> > > > invaluable feedback.
>> >>> > Excellent, thank you for taking the initiative.
>> >>> >
>> >>> > I'll be taking on the primary mentorship for your project, with
>> >>> > support from the co-mentors (Peter, Hesham, Sebastian). For now, I
>> >>> > prefer you to continue to maintain your presence on the mailing
>> >>> > list. We will establish other forms of communication as needed and
>> >>> > will take on IRC meetings once coding begins in earnest.
>> >>> >
>> >>> > > > My GSoC project, providing user-configurable thread stack
>> >>> > > > protection, requires adding architecture-specific low-level
>> >>> > > > support as well as high-level API support. I will be starting my
>> >>> > > > project with the ARMv7-A (on BBB) MMU since RTEMS already has
>> >>> > > > quite mature support for it. As already mentioned in my proposal,
>> >>> > > > I will be focusing more on the high-level interface and let it
>> >>> > > > drive whatever further low-level support is needed.
>> >>> > > > Once the application uses the MMU for thread stack address
>> >>> > > > generation, each thread will be automatically protected, as the
>> >>> > > > page tables other than that of the executing thread would be made
>> >>> > > > dormant. When the user has to share thread stacks, they will have
>> >>> > > > to obtain the stack attributes of the threads to be shared with
>> >>> > > > pthread_attr_getstack(), then get a file descriptor for the
>> >>> > > > memory to be mapped by a call to shm_open(), and finally map this
>> >>> > > > to the stack of the other thread through mmap(). This is the
>> >>> > > > POSIX-compliant way I could think of. At the low level, it means
>> >>> > > > mapping the page table of the thread to be shared into the
>> >>> > > > address space of the executing thread. This is an area where the
>> >>> > > > low-level support has to be provided. At the high level, this
>> >>> > > > means adding support to the mmap and shared-memory interfaces, as
>> >>> > > > mmap provides support for a file by simply copying the memory
>> >>> > > > from the file to the destination. For shared memory objects it
>> >>> > > > can provide read/write access but cannot restrict write or read
>> >>> > > > access. One of the areas that I have to look into in more detail
>> >>> > > > is the thread context switch, as after every context switch the
>> >>> > > > TLBs need to be flushed and reinitialized lest we get an invalid
>> >>> > > > address for the executing thread. Since the context switch is
>> >>> > > > low-level and architecture-specific, this also has to be provided
>> >>> > > > with more support.
>> >>> >
>> >>> > This is really dense text. Try to break apart your writing a little
>> >>> > bit to help clarify your thoughts.  You should also translate some
>> >>> > of your proposal into a wiki page if you haven't started that yet,
>> >>> > and a blog post. Both of those will help to focus your thoughts into
>> >>> > words.
>> >>> >
>> >>> > "mapping the page table" is not meaningful to me. I think you mean
>> >>> > something like "mapping a page from the page table"?  Will the
>> >>> > design support sharing task stacks using MPUs with 4 regions? 8?
>> >>> > (It seems challenging to me, but might be possible in some limited
>> >>> > configurations. Having support for those kinds of targets might
>> >>> > still be useful, with the caveat that sharing stacks is not
>> >>> > possible.)
>> >>> >
>> >>> > The first step is to get a BSP running that has sufficient
>> >>> > capabilities for you to test out memory protection with. Do a little
>> >>> > bit of digging, but definitely simulation is the way to go.
>> >>> >
>> >>> > The second step from my perspective is to determine how to introduce
>> >>> > strict isolation between task stacks. Don't worry about sharing at
>> >>> > this stage, but rather can you completely isolate tasks? Then you
>> >>> > can start to poke holes in the isolation.
>> >>> >
>> >>> > As you say, you'll also need to start to understand the context
>> >>> > switch code. Start looking into it to determine where you might
>> >>> > think to implement changing the address space of the executing
>> >>> > thread. Another challenge is that RTEMS can dispatch to a new task
>> >>> > from the interrupt handler, which may cause some problems for you as
>> >>> > well to handle.
>> >>> >
>> >>> > Have you figured out where in the code thread stacks are allocated?
>> >>> > How do you plan to make the thread stacks known to other threads?
>> >>> >
>> >>> > TLB shootdown can be extremely expensive. Try to find ways to
>> >>> > optimize that cost earlier rather than later. (One of those cases
>> >>> > where premature optimization will be acceptable.) Tagged TLB
>> >>> > architectures or those with "superpages" may incur less overhead if
>> >>> > you can selectively shoot-down the entry (entries) used for task
>> >>> > stacks.
>> >>> >
>> >>> > A final thought is that a method to configure this support is
>> >>> > necessary. Configuration is undergoing some heavy changes lately,
>> >>> > and application-level configuration is going to be completely
>> >>> > different in rtems6. You may want to consider raising a new thread
>> >>> > with CC to Sebastian to get his input on what the best way to
>> >>> > configure something like this might look like, now and in the
>> >>> > future. I would have leaned toward a high-level configure switch
>> >>> > (--enable-task-protection) in the past, but now I don't know.  This
>> >>> > capability is however something that should be considered disabled
>> >>> > by default due to the extra overhead.
>> >>> >
>> >>> > Gedare
>> >>> >
>> >>> > > > Kindly provide your feedback if I have missed something or I
>> >>> > > > have a wrong idea about it.
>> >>> > > >
>> >>> > > > Regards,
>> >>> > > > Utkarsh Rai.
>> >>> > > >
>> >>> > > > _______________________________________________
>> >>> > > > devel mailing list
>> >>> > > > devel at rtems.org
>> >>> > > > http://lists.rtems.org/mailman/listinfo/devel
>>
>
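
The shm_open()/mmap() sharing path described in the original proposal
quoted above can be sketched on a POSIX host as follows. This is only an
illustration under stated assumptions: shared_region and its two functions
are hypothetical names, the host mmap() is assumed to honour the prot
argument (the proposal notes that the RTEMS mmap cannot restrict access to
shared memory objects), and the step of obtaining a thread's stack with
pthread_attr_getstack() is left out.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical type: two views of one shared memory object. */
typedef struct {
  void  *owner_view;  /* read/write mapping, as used by the owning thread */
  void  *peer_view;   /* read-only mapping, as handed to another thread */
  size_t size;
} shared_region;

/* Create a shared memory object and map it twice. Returns 0 on success. */
static int shared_region_create(shared_region *r, const char *name,
                                size_t size)
{
  assert(r != NULL && name != NULL && size > 0);

  int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
  if (fd < 0) {
    return -1;
  }
  /* The name is only needed to obtain the descriptor. */
  shm_unlink(name);

  if (ftruncate(fd, (off_t) size) != 0) {
    close(fd);
    return -1;
  }

  void *rw = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  void *ro = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd); /* the mappings keep the object alive */

  if (rw == MAP_FAILED || ro == MAP_FAILED) {
    if (rw != MAP_FAILED) munmap(rw, size);
    if (ro != MAP_FAILED) munmap(ro, size);
    return -1;
  }

  r->owner_view = rw;
  r->peer_view = ro;
  r->size = size;
  return 0;
}

/* Drop both mappings; the object is freed once the last one is gone. */
static void shared_region_destroy(shared_region *r)
{
  munmap(r->owner_view, r->size);
  munmap(r->peer_view, r->size);
}
```

Data written through owner_view is visible through peer_view, while a
write through peer_view would fault. That per-mapping restriction is the
capability the proposal identifies as missing for shared memory objects.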