[PATCH] icpp remedy

Gedare Bloom gedare at rtems.org
Wed Sep 21 21:50:34 UTC 2022


CC: devel at rtems.org

Preserving this discussion for posterity.

On Fri, Sep 16, 2022 at 3:09 PM Jian-Jia Chen
<jian-jia.chen at cs.uni-dortmund.de> wrote:
>
> Dear Gedare, dear Sebastian,
>   I would like to clarify a few things.
>
> == historical perspectives ==
>   First of all, let me shortly talk about the history of PCP and the extension of it. PCP was proposed in 1989 with two advantages in comparison to PIP
>
> Under PCP, the access of resources in a nested manner is deadlock-free
> Under PCP, a higher-priority job J_i is blocked by at most one lower-priority critical section whose priority ceiling has a higher priority than or the same priority as J_i.
>
>
>   PCP therefore has become an interesting result. However, in its original design, a lower-priority job only inherits the priority of a higher-priority job that it blocks. It was later noted that the original PCP design can be extended to the so-called immediate ceiling priority protocol (ICPP) which promotes the priority of a lower-priority job immediately to its priority ceiling after the lock. This extension ensures that the two important properties of PCP remain valid.
>
>   Now, in our discussion, we should further clarify whether the implementation of RTEMS is the ICPP described above or not. If yes, then it should be deadlock-free under any nested resource access sequence. Otherwise, it is not supposed to be the ICPP described above. You may want to call it “restrictive ICPP” or anything alike.
>
>   When we submitted the paper, one reviewer also asked a question whether the issue we found is a bug or a feature. I believe that it is both because there is no documentation. However, in case there is no documentation on such a limitation, then it is a bug as the implementation does not comply to the ICPP; otherwise, it is a feature for the implemented “restrictive ICPP”.
>
> == designers’ perspectives ==
>
>   Now, coming to the discussion whether “acquiring mutexes in an arbitrary priority order is a good application design” or not, I think we have to also clarify two things:
>
> The program's semantics of resource access: it is supposed to lock first resource R1 (guarded by mutex M1) and then potentially resource R2 (guarded by mutex M2) in a nested manner, depending on the condition needed.
> The resource priority ceiling: it is unrelated to the program context but just used to ensure that there is no deadlock in any request order and there is at most one blocking from the lower-priority jobs.
>
>
>   So, (1) is about the context of the program(s) and (2) is about the resource management schemes.
>
>   One can of course enforce the sequence of locks, but as soon as the designer changes the priority assignment of the tasks or adds a new task, the program may have to be rewritten in their statements and may not be able to keep the same program behavior. Therefore, I think the beauty of PCP and ICPP for achieving the deadlock-free property is destroyed if the mutex usage has to follow a strict order to ensure deadlock free. In fact, under this strict order, as there is no circular waiting, there is by definition no deadlock.
>
> == ending ==
>
>   Furthermore, no matter which approach is taken, it is important to document the assertions/assumptions that must hold to lead to the correctness of the implemented IPCC.
>
>
>   Best Regards,
>   Jian-Jia Chen
>
>
>
> On Sep 16, 2022, at 9:47 AM, Kuan-Hsun Chen <c0066c at gmail.com> wrote:
>
>
>
> ---------- Forwarded message ---------
> From: Sebastian Huber <sebastian.huber at embedded-brains.de>
>
> Date: Fri, Sep 16, 2022 at 08:09
> Subject: Re: [PATCH] icpp remedy
> To: Gedare Bloom <gedare at rtems.org>, Kuan-Hsun Chen <c0066c at gmail.com>
> Cc: Strange369 <junjie.shi at tu-dortmund.de>, rtems-devel at rtems.org <devel at rtems.org>
>
>
> On 15.09.22 00:06, Gedare Bloom wrote:
> > On Tue, Sep 13, 2022 at 12:42 PM Kuan-Hsun Chen<c0066c at gmail.com>  wrote:
> >> Thanks for the prompt reply. Yes, I will guide Junjie to make a ticket and go through the issue.
> >>
> >>> Is there a test case for this?
> >> Yes, a test case is also ready to be reviewed and can be part of the testsuite to test out ICPP (MrsP should also pass).
> >>
> >>> If it applies to 5 or 4.11, there needs to be another ticket to get the fix back ported.
> >> So each release version with one ticket? We only check 5.1 in this work. My intuitive guess is that if the functionality does not change over the versions, the remedy should be similar.
> >> Let us figure it out.
> >>
> >> On Tue, Sep 13, 2022 at 8:21 PM Joel Sherrill<joel at rtems.org>  wrote:
> >>>
> >>>
> >>> On Tue, Sep 13, 2022, 1:14 PM Kuan-Hsun Chen<c0066c at gmail.com>  wrote:
> >>>> (Oh, I just noticed that the patch was submitted faster than my mail to share this patch :P).
> >>>
> >>> No problem. That's why I asked what the issue was.
> >>>
> >>> It needs at least a much better log message. It needs a ticket given the gravity of the issue. That ticket can explain the situation and include links. Probably needs a comment above the change explaining what's going on.
> >>>
> >>> Is there a test case for this?
> >>>
> >>> If it applies to 5 or 4.11, there needs to be another ticket to get the fix back ported.
> >>>
> >>> Great catch. Just need to do a good job on the log, test, ticket(s), etc
> >>>
> >>> --joel
> >>>
> >>>> Hi Joel,
> >>>>
> >>>> Let me clarify what happened here. In our recent study, we adopt Frama-C to formally verify the implementation of ICPP (also MrsP) and note that the current flow may lead to a deadlock. The patch is a simple remedy to fix the issue.
> >>>>
> >>>> The resource’s ceiling is not checked against the thread’s base priority, but against its current dynamic priority derived from the task’s priority aggregation. However, a resource’s ceiling is required to be set as the highest base priority of all tasks that are requesting it. This mismatch may lead to a deadlock by denying legitimate nested resource access if resources are requested in descending order of priority. So this adaption is needed to resolve the issue. A similar issue can also be found in MrsP due to the usage of this function. A detailed explanation can be found at the end of Section V in the attached preprint, and a toy example is provided there to reveal the issue.
> >>>>
> > By changing this to check the base priority instead of the dynamic
> > priority, the mutex seize will now allow a task to acquire a ceiling
> > mutex even though its dynamic priority is higher than the ceiling. The
> > specification to check against the base priority of tasks comes from
> > the ICPP without nested lock behavior as I recall, and I'm not 100%
> > sure where the requirement comes from that "a resource’s ceiling is
> > required to be set as the highest base priority of all tasks that are
> > requesting it" -- is that requirement proved anywhere for ICPP with
> > nested resources? I'm a little concerned about making this kind of
> > fundamental change deep within the (shared) locking protocol code,
> > especially since it primarily seems to cater to a uniprocessor corner
> > case.
> >
> > This is a tricky situation, and clearly a lot of work has already gone
> > into discovering it. I think we need to discuss this patch a bit more.
> > If I understood correctly, this deadlock can't happen if locks are
> > always acquired in the increasing priority order. Changing this code
> > allows for applications to acquire locks in arbitrary priority order.
> > I'm not sure that is permissible in multiprocessor systems to still
> > ensure correctness of the locking protocol correctness. It would be
> > therefore also valid to state this as an explicit requirement for
> > applications to make correct use of ICPP. The same thing is also true
> > for MrsP, and for multiprocessor locking protocols in general I
> > believe it is required for locks to be acquired in priority order? I'd
> > be inclined to instead say that we should confirm that locks are
> > acquired in the proper priority ordering as a way to avoid the
> > identified deadlock situation.
>
> The check against the dynamic priority is there for a long time (at
> least RTEMS 3.6.0). The questions is how much policy we want to enforce
> through such checks. I am not sure if acquiring mutexes in an arbitrary
> priority order is a good application design. However, from an
> implementation point of view we could remove the check completely. Each
> priority ceiling mutex provides a priority node for the task, so it is
> independent of other priority contributions. We also have a dedicated
> deadlock detection.
>
> >
> > Either way this discussion/patch goes, we should document these issues
> > inhttps://docs.rtems.org/branches/master/c-user/key_concepts.html#priorityceiling
>
> Yes.
>
> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: sebastian.huber at embedded-brains.de
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/
> --
> Diese Mail wurde mobil geschrieben. Etwaige Rechtschreibfehler sind volle Absicht und als großzügiges Geschenk zu verstehen.
>
>


More information about the devel mailing list