Bug in termios

Joel Sherrill joel.sherrill at oarcorp.com
Thu Oct 19 15:28:51 UTC 2006


jennifer at oarcorp.com wrote:
> Eric,
>
> Changing to a simple binary semaphore works.  What I had checked before was
> changing it to a binary semaphore, not a simple binary.  We are going to
> run over the weekend, but this appears to have solved the problem.
>   

Jennifer, if the test passes all weekend, we need to get a patch merged.

FWIW Jennifer is helping transition a large PowerPC application to a new
board.  It pushes very hard on just about everything in RTEMS it uses. :)

--joel
> Thanks
> Jennifer
>
>   
>> I'm rebuilding the system with this changed to a simple binary semaphore
>> in order to see exactly what breaks.
>>
>> Jennifer
>>
>>     
>>> On Oct 19, 2006, at 9:28 AM, Joel Sherrill wrote:
>>>
>>>       
>>>> jennifer at oarcorp.com wrote:
>>>>         
>>>>> I have a system with 12 serial ports.  The driver is configured to be
>>>>> interrupt driven with one character arrival per interrupt.  Furthermore,
>>>>> I am using 1 task which is set up to poll the serial ports in raw mode
>>>>> at a 10 ms rate and distribute any data that has arrived.  To do this I
>>>>> have set VMIN and VTIME to 0.  The problem occurs on high bandwidth
>>>>> devices (after they have run a while), when the device is turned off.
>>>>> Turning off the device results in the system locking up for several
>>>>> seconds then recovering.  Analysis has shown that the read command is
>>>>> locking the system during this time.  It appears to me that the problem
>>>>> is happening inside the fillBufferQueue routine in termios.c.  While the
>>>>> device is running there are always characters available so rawInBuf.Head
>>>>> and rawInBuf.Tail are never equal.  Then ccount is always >= c_cc[VMIN]
>>>>> which results in wait being set to 0 and the semaphore never being
>>>>> decremented even though it is being incremented approximately 20 times
>>>>> during each 10 ms application task poll period.  When the serial device
>>>>> is turned off and the characters are emptied out of the termios buffer,
>>>>> this results in a spin loop that obtains rawInBuf.Semaphore several
>>>>> hundred thousand times.
>>>>>
>>>>>           
>>>> I looked at the code and believe this semaphore was intended to be used
>>>> as a counting semaphore. I think each increment is for an interrupt
>>>> occurrence -- not for a single character. It is simply not decremented
>>>> via obtain unless a task is willing to block. In Jennifer's case, this
>>>> is very rare so the count is VERY high when she finally doesn't have any
>>>> data and has to wait.
>>>>
>>>> I suggested that this probably needs to be a simple binary semaphore but
>>>> she had already tried that and it broke something else. She is now
>>>> trying to do a flush on this semaphore just before checking the buffer
>>>> counts. Technically, this is a condition mutex and we should be able to
>>>> get away with that. You don't care about it until there is no data and
>>>> then you want to block on it until just the next interrupt.
>>>>         
>>> Right -- which is why I would have thought that a binary semaphore
>>> would work.  Can you describe what breaks when a binary semaphore
>>> is used?
>>>
>>> Flushing the semaphore count doesn't sound like a good fix to me.  I
>>> worry a lot about race conditions with that sort of approach.
>>>
>>>       
>>>> Eric.. do you see what is going on? I know it has been years but you
>>>> originally
>>>> wrote this code.
>>>>
>>>>         
>>> Well, I wrote some of it.  All the flow-control stuff is new.
>>>
>>> --
>>> Eric Norum <norume at aps.anl.gov>
>>> Advanced Photon Source
>>> Argonne National Laboratory
>>> (630) 252-4793
>>>
>>>
>>>
>>>       
>>     
>
>   
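
For anyone following the thread, here is a toy model of the behaviour
discussed above. It is a sketch under stated assumptions, not the termios
code and not the RTEMS semaphore API: toy_sem, toy_release, toy_try_obtain
and spurious_obtains are hypothetical names made up for illustration. It
assumes the receive interrupt releases the semaphore about 20 times per
10 ms poll, and that the reader never obtains it while data is buffered.
Once the device goes quiet, the reader has to drain whatever count has
built up before it can really wait. Capping the count at 1 stands in for
a simple binary semaphore.

```c
#include <assert.h>
#include <limits.h>

/* Toy semaphore model (hypothetical, not the RTEMS API).
 * max_count == 1 stands in for a simple binary semaphore;
 * max_count == UINT_MAX stands in for a counting semaphore. */
typedef struct {
    unsigned count;
    unsigned max_count;
} toy_sem;

/* Release, as the receive interrupt would: bump the count up to the cap. */
static void toy_release(toy_sem *s)
{
    if (s->count < s->max_count)
        s->count++;
}

/* Non-blocking obtain: returns 1 and decrements if a unit is available,
 * otherwise returns 0 (the point where a real reader would block or
 * give up). */
static int toy_try_obtain(toy_sem *s)
{
    if (s->count == 0)
        return 0;
    s->count--;
    return 1;
}

/* Count how many times a reader obtains the semaphore with no data
 * present, after `polls` polling periods in which `irqs_per_poll`
 * interrupts each released the semaphore and the reader never had to
 * obtain it. */
static unsigned long spurious_obtains(unsigned max_count,
                                      unsigned polls,
                                      unsigned irqs_per_poll)
{
    toy_sem sem = { 0, max_count };
    unsigned long spins = 0;

    assert(max_count > 0);

    /* Device running: data is always buffered, so obtain is skipped. */
    for (unsigned p = 0; p < polls; p++)
        for (unsigned i = 0; i < irqs_per_poll; i++)
            toy_release(&sem);

    /* Device off, buffer empty: every successful obtain is a wasted pass. */
    while (toy_try_obtain(&sem))
        spins++;

    return spins;
}
```

With an illustrative 30000 polls (five minutes at 10 ms) and 20 releases
per poll, spurious_obtains(UINT_MAX, 30000, 20) models the counting case
and spurious_obtains(1, 30000, 20) models the capped case. The gap
between the two results is the wasted looping described in the report.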



