[RTEMS Project] #2811: More robust thread dispatching on SMP and ARM Cortex-M
RTEMS trac
trac at rtems.org
Wed Nov 16 08:07:30 UTC 2016
#2811: More robust thread dispatching on SMP and ARM Cortex-M
-----------------------------+------------------------------
Reporter: sebastian.huber | Owner: sebastian.huber
Type: enhancement | Status: new
Priority: normal | Milestone: 4.12
Component: cpukit | Version: 4.11
Severity: normal | Resolution:
Keywords: |
-----------------------------+------------------------------
Comment (by sebastian.huber):
Replying to [comment:5 chrisj]:
> Replying to [comment:4 sebastian.huber]:
> > I think a fatal error is more appropriate here.
> >
> > * Applications which have this usage error need to be fixed at
compile-time. It makes no sense to ship an SMP application with this bug.
>
> A fatal error is still run-time and not a compile time error so you have
lost me here.
It is an error that must be fixed during development. Otherwise you have
a broken product.
>
> >
> > * Return codes can be ignored. I definitely have seen code like this
before:
> > {{{
> > #!c
> > /* This cannot fail, we know the identifier is valid */
> > (void) pthread_mutex_lock(&mtx);
> > }}}
> >
>
> This is a different issue and a change of topic. We provide the means
for errors to be analyzed and that is our boundary.
>
> > * This ticket is a result of porting a real world application from
uni-processor to SMP. If you are not an expert in the operating system
internals and your application has this bug, then you can easily need a
couple of days to figure out the problem. So, it is important to make sure it
gets detected.
>
> I agree with detecting the issue and there being an error. It is the
delivery we are discussing.
>
> The error code should provide some help just like the fatal error code.
If one can, the other can.
>
> How many fatal error instances are there in the RTEMS kernel? Not the
number of error codes, but the specific locations where a fatal error can
appear, i.e. code/line pairs? I have never audited this.
See Internal_errors_Core_list, we have a test for every fatal internal
error.
>
> >
> > * To figure out what caused a fatal error is easy. The (source, error)
pair uniquely identifies the source code location of the error.
>
> The source location is a line in the kernel's core code which means users
need to step into this code and figure out the answer. I have been hit by
this with SMP and it is hard.
>
> > With a stack trace and the executing thread you get enough information
to locate the problem in the code. There is no need for a thread aware
debugger.
>
> This implies testing will highlight the issue because you have a
debugger to give you this data. Currently RTEMS standard or default stack
traces that get called on a fatal error provide little if any information
that could be used to resolve the exact source, e.g. the id of the executing
thread or, even better, an unwinder (dreaming here). Better support for tier 1
archs would help.
Improved fatal error diagnostics are a different topic. With a debugger it
is a matter of seconds to find the location that triggered a fatal error.
>
> >
> > * This is a new constraint specific to SMP. Existing software may be
simply unaware of this issue. However, it is important to detect this
constraint violation.
>
> I agree it is important.
>
> > * _Thread_Do_dispatch() has no return value. Adding this check to
other places would be much more difficult and error prone, with more space
and time overhead, and labour intensive to test.
>
> There are no other similar tests happening now on the blocking paths?
No, this is a weak area in RTEMS. For example, if you call
rtems_task_wake_after() in an interrupt service routine, you don't get
any status information telling you that this is wrong.
For now, I think a fatal error is sufficient. In case there is really a
problem with this in the field, we can still improve things. What matters
is that this constraint violation gets detected; otherwise you can spend
hours debugging.
--
Ticket URL: <http://devel.rtems.org/ticket/2811#comment:6>
RTEMS Project <http://www.rtems.org/>