Bug in termios
Eric Norum
norume at aps.anl.gov
Thu Oct 19 14:44:55 UTC 2006
On Oct 19, 2006, at 9:28 AM, Joel Sherrill wrote:
> jennifer at oarcorp.com wrote:
>> I have a system with 12 serial ports. The driver is configured to be
>> interrupt driven, with one character arrival per interrupt.
>> Furthermore, I am using one task which is set up to poll the serial
>> ports in raw mode at a 10 ms rate and distribute any data that has
>> arrived. To do this I have set VMIN and VTIME to 0. The problem occurs
>> on high-bandwidth devices (after they have run a while), when the
>> device is turned off. Turning off the device results in the system
>> locking up for several seconds and then recovering. Analysis has shown
>> that the read command is locking the system during this time. It
>> appears to me that the problem is inside termios.c, in the
>> fillBufferQueue routine. While the device is running there are always
>> characters available, so rawInBuf.Head and rawInBuf.Tail are never
>> equal. Then ccount is always >= c_cc[VMIN], which results in wait
>> being set to 0 and the semaphore never being decremented, even though
>> it is being incremented approximately 20 times during the 10 ms
>> application task poll period. When the serial device is turned off and
>> the characters are emptied out of the termios buffer, this results in
>> a spin loop obtaining the rawInBufSemaphore several hundred thousand
>> times.
>>
> I looked at the code and believe this semaphore was intended to be
> used as a counting semaphore. I think each increment is for an
> interrupt occurrence -- not for a single character. It is simply not
> decremented via obtain unless a task is willing to block. In
> Jennifer's case, this is very rare, so the count is VERY high when she
> finally doesn't have any data and has to wait.
>
> I suggested that this probably needs to be a simple binary semaphore,
> but she had already tried that and it broke something else. She is now
> trying to do a flush on this semaphore just before checking the buffer
> counts. Technically, this is a condition mutex and we should be able
> to get away with that. You don't care about it until there is no data
> and then you want to block on it until just the next interrupt.
Right -- which is why I would have thought that a binary semaphore
would work. Can you describe what breaks when a binary semaphore
is used?
Flushing the semaphore count doesn't sound like a good fix to me. I
worry a lot about race conditions with that sort of approach.
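
To make the counting-versus-binary difference concrete, here is a toy
model in plain C. It is only a sketch and is not the termios code: the
names model_sem, model_release, model_try_obtain and spurious_obtains
are made up for illustration, and the semaphore is just an integer. It
assumes the reader never obtains the semaphore while data is available,
and that with VMIN = VTIME = 0 the obtain does not block, so every
leftover count turns into one more trip around the read loop.

```c
#include <stdbool.h>

/* Toy semaphore: a bare count, optionally capped at 1 (binary). */
typedef struct {
    unsigned long count;
    bool          binary;
} model_sem;

/* Models the ISR side: one release per receive interrupt. */
static void model_release(model_sem *s)
{
    if (s->binary)
        s->count = 1;   /* a binary semaphore cannot accumulate */
    else
        s->count++;     /* a counting semaphore remembers every release */
}

/* Models a non-blocking obtain: succeeds only if the count is nonzero. */
static bool model_try_obtain(model_sem *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/*
 * Simulate `polls` reads during which data is always available (so the
 * reader never touches the semaphore), each preceded by `irqs_per_poll`
 * interrupts. Then simulate the first read after the device goes quiet
 * and return how many obtains succeed immediately, each one a wasted
 * pass through the read loop.
 */
static unsigned long spurious_obtains(bool binary,
                                      unsigned long polls,
                                      unsigned long irqs_per_poll)
{
    model_sem sem = { 0, binary };
    unsigned long spins = 0;

    for (unsigned long p = 0; p < polls; p++)
        for (unsigned long i = 0; i < irqs_per_poll; i++)
            model_release(&sem);

    while (model_try_obtain(&sem))
        spins++;

    return spins;
}
```

With 20 interrupts per poll (the figure from the report) and 6000 polls
(an assumed one minute at 10 ms), spurious_obtains(false, 6000, 20)
gives 120000 wasted passes, while spurious_obtains(true, 6000, 20)
gives 1. The model says nothing about what else a binary semaphore
might break in the real driver.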
>
> Eric.. do you see what is going on? I know it has been years but you
> originally wrote this code.
>
Well, I wrote some of it. All the flow-control stuff is new.
--
Eric Norum <norume at aps.anl.gov>
Advanced Photon Source
Argonne National Laboratory
(630) 252-4793