[PATCH] icpp remedy

shijunjie junjie.shi at tu-dortmund.de
Thu Nov 17 09:47:02 UTC 2022


Dear all,

To formally describe the bug/mismatch in the implemented protocol(s),
we prepared a ticket at https://devel.rtems.org/ticket/4742#no2, which
contains the bug description and a counterexample.
I hope it gives us a clear overview of the bug/mismatch that we found.

Best,
Junjie

Gedare Bloom <gedare at rtems.org> wrote on Wednesday, September 21, 2022 at 23:50:

> CC: devel at rtems.org
>
> Preserving this discussion for posterity.
>
> On Fri, Sep 16, 2022 at 3:09 PM Jian-Jia Chen
> <jian-jia.chen at cs.uni-dortmund.de> wrote:
> >
> > Dear Gedare, dear Sebastian,
> >   I would like to clarify a few things.
> >
> > == historical perspectives ==
> >   First of all, let me briefly talk about the history of PCP and its
> > extensions. PCP was proposed in 1989 with two advantages in comparison
> > to PIP:
> >
> > Under PCP, nested resource access is deadlock-free.
> > Under PCP, a higher-priority job J_i is blocked by at most one
> > lower-priority critical section whose priority ceiling is higher than
> > or equal to the priority of J_i.
> >
> >
> >   PCP has therefore become an interesting result. However, in its
> > original design, a lower-priority job only inherits the priority of a
> > higher-priority job that it blocks. It was later noted that the
> > original PCP design can be extended to the so-called immediate ceiling
> > priority protocol (ICPP), which promotes the priority of a
> > lower-priority job to its priority ceiling immediately upon taking the
> > lock. This extension ensures that the two important properties of PCP
> > remain valid.
> >
> >   Now, in our discussion, we should further clarify whether the RTEMS
> > implementation is the ICPP described above or not. If it is, then it
> > should be deadlock-free under any nested resource access sequence.
> > Otherwise, it is not the ICPP described above, and you may want to
> > call it a "restrictive ICPP" or something similar.
> >
> >   When we submitted the paper, one reviewer also asked whether the
> > issue we found is a bug or a feature. I believe the answer depends on
> > the documentation: if there is no documentation of such a limitation,
> > then it is a bug, as the implementation does not comply with the ICPP;
> > otherwise, it is a feature of the implemented "restrictive ICPP".
> >
> > == designers’ perspectives ==
> >
> >   Now, coming to the discussion of whether "acquiring mutexes in an
> > arbitrary priority order is a good application design" or not, I think
> > we also have to clarify two things:
> >
> > The program's semantics of resource access: the program is supposed
> > to lock resource R1 (guarded by mutex M1) first and then potentially
> > resource R2 (guarded by mutex M2) in a nested manner, depending on the
> > condition at hand.
> > The resource priority ceiling: it is unrelated to the program context
> > and is used only to ensure that there is no deadlock under any request
> > order and at most one instance of blocking from lower-priority jobs.
> >
> >
> >   So, (1) is about the context of the program(s) and (2) is about the
> resource management schemes.
> >
> >   One can of course enforce the sequence of locks, but as soon as the
> > designer changes the priority assignment of the tasks or adds a new
> > task, the program statements may have to be rewritten and may no
> > longer preserve the same program behavior. Therefore, I think the
> > beauty of PCP and ICPP for achieving deadlock freedom is destroyed if
> > mutex usage has to follow a strict order to remain deadlock-free. In
> > fact, under such a strict order there is no circular waiting, and
> > hence by definition no deadlock anyway.
> >
> > == ending ==
> >
> >   Furthermore, no matter which approach is taken, it is important to
> > document the assertions/assumptions that must hold for the implemented
> > ICPP to be correct.
> >
> >
> >   Best Regards,
> >   Jian-Jia Chen
> >
> >
> >
> > On Sep 16, 2022, at 9:47 AM, Kuan-Hsun Chen <c0066c at gmail.com> wrote:
> >
> >
> >
> > ---------- Forwarded message ---------
> > From: Sebastian Huber <sebastian.huber at embedded-brains.de>
> >
> > Date: Fri, Sep 16, 2022 at 08:09
> > Subject: Re: [PATCH] icpp remedy
> > To: Gedare Bloom <gedare at rtems.org>, Kuan-Hsun Chen <c0066c at gmail.com>
> > Cc: Strange369 <junjie.shi at tu-dortmund.de>, rtems-devel at rtems.org <
> devel at rtems.org>
> >
> >
> > On 15.09.22 00:06, Gedare Bloom wrote:
> > > On Tue, Sep 13, 2022 at 12:42 PM Kuan-Hsun Chen<c0066c at gmail.com>
> wrote:
> > >> Thanks for the prompt reply. Yes, I will guide Junjie to make a
> ticket and go through the issue.
> > >>
> > >>> Is there a test case for this?
> > >> Yes, a test case is also ready to be reviewed and can be part of the
> testsuite to test out ICPP (MrsP should also pass).
> > >>
> > >>> If it applies to 5 or 4.11, there needs to be another ticket to get
> the fix back ported.
> > >> So one ticket per release version? We only checked 5.1 in this
> > >> work. My intuitive guess is that if the functionality does not
> > >> change across versions, the remedy should be similar.
> > >> Let us figure it out.
> > >>
> > >> On Tue, Sep 13, 2022 at 8:21 PM Joel Sherrill<joel at rtems.org>  wrote:
> > >>>
> > >>>
> > >>> On Tue, Sep 13, 2022, 1:14 PM Kuan-Hsun Chen<c0066c at gmail.com>
> wrote:
> > >>>> (Oh, I just noticed that the patch was submitted faster than my
> mail to share this patch :P).
> > >>>
> > >>> No problem. That's why I asked what the issue was.
> > >>>
> > >>> It needs at least a much better log message. It needs a ticket given
> the gravity of the issue. That ticket can explain the situation and include
> links. Probably needs a comment above the change explaining what's going on.
> > >>>
> > >>> Is there a test case for this?
> > >>>
> > >>> If it applies to 5 or 4.11, there needs to be another ticket to get
> the fix back ported.
> > >>>
> > >>> Great catch. Just need to do a good job on the log, test, ticket(s),
> etc
> > >>>
> > >>> --joel
> > >>>
> > >>>> Hi Joel,
> > >>>>
> > >>>> Let me clarify what happened here. In our recent study, we
> > >>>> adopted Frama-C to formally verify the implementation of ICPP
> > >>>> (and also MrsP) and noticed that the current flow may lead to a
> > >>>> deadlock. The patch is a simple remedy to fix the issue.
> > >>>>
> > >>>> The resource's ceiling is not checked against the thread's base
> > >>>> priority, but against its current dynamic priority derived from
> > >>>> the task's priority aggregation. However, a resource's ceiling is
> > >>>> required to be set to the highest base priority of all tasks that
> > >>>> request it. This mismatch may lead to a deadlock by denying
> > >>>> legitimate nested resource access when resources are requested in
> > >>>> descending order of priority, so this adaptation is needed to
> > >>>> resolve the issue. A similar issue can also be found in MrsP due
> > >>>> to its use of this function. A detailed explanation can be found
> > >>>> at the end of Section V of the attached preprint, where a toy
> > >>>> example is provided to reveal the issue.
> > >>>>
> > > By changing this to check the base priority instead of the dynamic
> > > priority, the mutex seize will now allow a task to acquire a ceiling
> > > mutex even though its dynamic priority is higher than the ceiling. The
> > > specification to check against the base priority of tasks comes from
> > > the ICPP without nested lock behavior as I recall, and I'm not 100%
> > > sure where the requirement comes from that "a resource’s ceiling is
> > > required to be set as the highest base priority of all tasks that are
> > > requesting it" -- is that requirement proved anywhere for ICPP with
> > > nested resources? I'm a little concerned about making this kind of
> > > fundamental change deep within the (shared) locking protocol code,
> > > especially since it primarily seems to cater to a uniprocessor corner
> > > case.
> > >
> > > This is a tricky situation, and clearly a lot of work has already gone
> > > into discovering it. I think we need to discuss this patch a bit more.
> > > If I understood correctly, this deadlock can't happen if locks are
> > > always acquired in the increasing priority order. Changing this code
> > > allows for applications to acquire locks in arbitrary priority order.
> > > I'm not sure that is permissible in multiprocessor systems while
> > > still ensuring correctness of the locking protocol. It would
> > > therefore also be valid to state this as an explicit requirement for
> > > applications to make correct use of ICPP. The same thing is also true
> > > for MrsP, and for multiprocessor locking protocols in general I
> > > believe it is required for locks to be acquired in priority order? I'd
> > > be inclined to instead say that we should confirm that locks are
> > > acquired in the proper priority ordering as a way to avoid the
> > > identified deadlock situation.
> >
> > The check against the dynamic priority has been there for a long time
> > (at least since RTEMS 3.6.0). The question is how much policy we want
> > to enforce through such checks. I am not sure if acquiring mutexes in
> > an arbitrary priority order is good application design. However, from
> > an implementation point of view, we could remove the check completely.
> > Each priority ceiling mutex provides a priority node for the task, so
> > it is independent of other priority contributions. We also have
> > dedicated deadlock detection.
> >
> > >
> > > Either way this discussion/patch goes, we should document these issues
> > > in https://docs.rtems.org/branches/master/c-user/key_concepts.html#priorityceiling
> >
> > Yes.
> >
> > --
> > embedded brains GmbH
> > Herr Sebastian HUBER
> > Dornierstr. 4
> > 82178 Puchheim
> > Germany
> > email: sebastian.huber at embedded-brains.de
> > phone: +49-89-18 94 741 - 16
> > fax:   +49-89-18 94 741 - 08
> >
> > Register court: Amtsgericht München
> > Registration number: HRB 157899
> > Managing directors with power of representation: Peter Rasmussen,
> > Thomas Dörfler
> > Our privacy policy can be found here:
> > https://embedded-brains.de/datenschutzerklaerung/
> > --
> > This mail was written on a mobile device. Any spelling mistakes are
> fully intentional and to be understood as a generous gift.
> >
> >
>


-- 
Junjie Shi, M.Sc.
Technische Universität Dortmund
Informatik, Lehrstuhl 12
Design Automation of Embedded Systems
Otto-Hahn-Str. 16, E05
44227 Dortmund, Germany

Phone:  +49 0231 755-6121
Fax:    +49 0231 755-6116

