Really need some help with RTEMS networking semaphores
Eric Norum
norume at aps.anl.gov
Wed Oct 18 18:23:02 UTC 2006
On Oct 18, 2006, at 12:45 PM, gregory.menke at gsfc.nasa.gov wrote:
>
> Eric Norum writes:
>> On Oct 18, 2006, at 11:00 AM, gregory.menke at gsfc.nasa.gov wrote:
>>
>>>
>>> I've been rewriting the Coldfire FEC network driver for a week now
>>> to try to make it stable under a significant network load, and I'm
>>> running into considerable trouble with deadlocks and network
>>> semaphore issues.  The next 2 days are important; if I can't get
>>> the driver stable I will have to abandon the network stack and try
>>> to kludge something up with message queues.
>>
>> This is the driver from which BSP?
>> The uC5282 driver has been pretty solid here.
>
> We took a copy of the uC5282 network.c from the 4.7 CVS for our BSP.
>
>>>
>>> I have the network task priority == 1, all other tasks lower.  256k
>>> in both the mbuf_bytecount and mbuf_cluster_bytecount.
>>>
>>> The problems mostly manifest in TCP receives by the RTEMS ftpd, but
>>> rapid UDP sends also seem to lock up the stack.
>>>
>>> The tx task always clears the tx queue, loading packets onto the
>>> card until it's full and dumping the rest.  The rx task receives
>>> packets; once an mbuf allocation (done with M_DONTWAIT) fails, all
>>> remaining rx packets on the card are dumped.  Thus the driver
>>> (theoretically) never queues tx buffers and will not stall the card
>>> waiting for rx mbufs.
>>
>> Having the driver throw away transmit buffers doesn't sound like a
>> good idea to me.
>
> I'm trying all options to try and keep the stack on its feet.
>
>
>>>
>>> Is it true that the rx and tx tasks can allocate and free mbufs as
>>> needed when they have the network semaphore, OR must additional
>>> semaphore release/obtain invocations be used for each and every
>>> mbuf manipulation?
>>
>> The rule is that if a task makes calls to any of the BSD network
>> code it must ensure that it holds the semaphore.  The network
>> receive and transmit tasks are started with the semaphore held and
>> call rtems_bsdnet_event_receive to wait for an event.  This call
>> releases the semaphore, waits for an event and then reobtains the
>> semaphore before returning.  In this way the driver never has to
>> explicitly deal with the network semaphore.  By way of example, have
>> a look at c/src/lib/libbsp/m68k/uC5282/network/network.c -- there is
>> no code that manipulates the network semaphore.
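To make that pattern concrete, a receive task built this way looks
roughly like the sketch below.  INTERRUPT_EVENT and fec_retire_rx are
just placeholder names, not symbols from the real driver:

#include <rtems.h>
#include <rtems/rtems_bsdnet.h>

#define INTERRUPT_EVENT RTEMS_EVENT_1   /* driver-private event bit */

static void fec_retire_rx (void *arg);  /* placeholder: process filled
                                           descriptors, hand frames to
                                           ether_input(), refill ring */

/* Sketch only: the task is created by the stack with the network
 * semaphore already held, so the only place the semaphore is touched
 * is inside rtems_bsdnet_event_receive(), which releases it while
 * blocked and reobtains it before returning.
 */
static void
fec_rxTask (void *arg)
{
    rtems_event_set events;

    for (;;) {
        /* Sleep until the ISR sends INTERRUPT_EVENT; the network
         * semaphore is released for the duration of the wait. */
        rtems_bsdnet_event_receive (INTERRUPT_EVENT,
                                    RTEMS_WAIT | RTEMS_EVENT_ANY,
                                    RTEMS_NO_TIMEOUT,
                                    &events);

        /* Semaphore is held again here, so it is safe to call any
         * BSD code (ether_input, m_freem, MGETHDR, ...). */
        fec_retire_rx (arg);
    }
}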
>
> The driver tasks only use rtems_bsdnet_event_receive.  But for some
> reason I'm still getting the "failed to release" message.  Is there a
> way that can be triggered by m_freem()'ing an mbuf that the driver is
> finished with?
>
> Also, how should the rx task request buffers; is it OK to use
> M_DONTWAIT so the rx task can dump the rx queue on an allocation
> failure?
Yes.
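For what it's worth, the allocation part of that approach looks
something like this sketch (fec_newRxMbuf is a made-up helper name;
real drivers tend to inline the equivalent code):

#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <net/if.h>

/* Allocate a cluster-backed receive mbuf without blocking.  A NULL
 * return tells the caller to stop refilling and dump the rest of the
 * received frames rather than stall the controller.
 */
static struct mbuf *
fec_newRxMbuf (struct ifnet *ifp)
{
    struct mbuf *m;

    MGETHDR (m, M_DONTWAIT, MT_DATA);
    if (m == NULL)
        return NULL;
    MCLGET (m, M_DONTWAIT);
    if ((m->m_flags & M_EXT) == 0) {
        m_freem (m);            /* got a header but no cluster */
        return NULL;
    }
    m->m_pkthdr.rcvif = ifp;
    return m;
}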
>
>
>>>
>>> Under what conditions does the stack deadlock, and what can drivers
>>> do to help prevent it from doing so?
>>
>> Running out of mbufs is never a good thing. In the UDP send case you
>> might reduce the maximum length of the socket queue.
>
> Does that mean a too-long UDP send queue can starve for mbufs and
> deadlock the stack?
I suspect that this could happen, yes.
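If you want to try it, shrinking the socket's send buffer is one way
to bound that queue -- something along these lines (8 kbytes is just
an illustrative figure, not a recommendation):

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

/* Cap the amount of data that can sit on a UDP socket's send queue
 * so a fast sender hits the limit before it ties up the mbuf pools.
 */
static void
limitSendQueue (int sock)
{
    int sndbuf = 8 * 1024;

    if (setsockopt (sock, SOL_SOCKET, SO_SNDBUF,
                    &sndbuf, sizeof sndbuf) < 0)
        perror ("setsockopt(SO_SNDBUF)");
}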
>
>
>
>>> What is the functional relationship between the mbuf_bytecount and
>>> mbuf_cluster_bytecount?
>>
>> 'regular' (small) mbufs are allocated from the pool sized by
>> mbuf_bytecount. mbuf clusters (2k each) are allocated from the pool
>> sized by mbuf_cluster_bytecount.
>>
>>>
>>> What should their relative sizings be?
>>
>> Depends on your application. Which type are you running out of?
>> For my EPICS applications here I've got:
>> 180*1024, /* MBUF space */
>> 350*1024, /* MBUF cluster space */
>
>
> How do I tell which I'm running out of?
rtems_bsdnet_show_mbuf_stats ();
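Call it from a debug task or a shell command once things have wedged,
e.g. something like this (netDiag is just a name I made up):

#include <rtems/rtems_bsdnet.h>

void
netDiag (void)
{
    rtems_bsdnet_show_mbuf_stats ();    /* which pool is exhausted?  */
    rtems_bsdnet_show_if_stats ();      /* interface error counters  */
}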
>
> I've tried everything from 64k & 128k up to 256k & 256k, with some
> sort of problem in all cases.  Could you give examples of how mbuf
> buffer sizing relates to the type of application?
The only example I can give is what seems to be working here.
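In terms of the application's rtems_bsdnet_config structure, those two
numbers are the mbuf_bytecount and mbuf_cluster_bytecount members,
i.e. roughly this (netdriver_config stands in for whatever interface
entry your BSP provides; the other members are omitted):

#include <rtems/rtems_bsdnet.h>

extern struct rtems_bsdnet_ifconfig netdriver_config;  /* BSP/app supplied */

struct rtems_bsdnet_config rtems_bsdnet_config = {
    .ifconfig               = &netdriver_config,
    .mbuf_bytecount         = 180*1024,     /* small-mbuf pool      */
    .mbuf_cluster_bytecount = 350*1024,     /* 2k mbuf cluster pool */
};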
>
You're sure that you don't have half/full duplex problems?
--
Eric Norum <norume at aps.anl.gov>
Advanced Photon Source
Argonne National Laboratory
(630) 252-4793