Really need some help with RTEMS networking semaphores

Wed Oct 18 17:06:44 UTC 2006

On Oct 18, 2006, at 11:00 AM, gregory.menke at gsfc.nasa.gov wrote:

>
> I've been rewriting the Coldfire FEC network driver for a week now to
> try to make it stable under a significant network load, and I'm running
> into considerable trouble with deadlocks and network semaphore issues.
> The next 2 days are important: if I can't get the driver stable, I
> will have to abandon the network stack and try to kludge something up
> with message queues.

This is the driver from which BSP?
The uC5282 driver has been pretty solid here.

>
> The rx and tx tasks both sleep on rtems_bsdnet_event_receive().  When
> they respectively awake, now holding the semaphore, they both perform
> their expected work and then go back to sleep on the event again: the
> rx task allocates mbufs and sends them to ether_receive, the tx task
> copies the mbuf data to the transmit buffers, etc.  This is all fine
> when the I/O rate is low, but when it picks up the stack has issues: it
> either silently deadlocks, prints that it's waiting for mbufs (and waits
> forever, repeating the message every few seconds), or says it can't
> release the network semaphore and also deadlocks.
>
> Version is RTEMS 4.6.6
>
>
> I have the network task priority == 1, all other tasks lower.  256k in
> both the mbuf_bytecount and mbuf_cluster_bytecount.
>
> The problems mostly manifest in tcp receives by the RTEMS ftpd, but
> rapid UDP sends also seem to lock up the stack.
>
> The tx task always clears the tx queue, loading packets onto the card
> until it's full and dumping the rest.  The rx task receives packets;
> once an mbuf allocation (done with M_DONTWAIT) fails, all remaining rx
> packets on the card are dumped.  Thus the driver (theoretically) never
> queues tx buffers and will not stall the card waiting for rx mbufs.

Having the driver throw away transmit buffers doesn't sound like a
good idea to me.

>
>
>
> The questions that I'd <really> like to find answers to are:
>
> Are there SPECIFIC guidelines anywhere about how driver rx & tx tasks
> may and may not work with mbufs and how the network semaphore is to be
> used?  The Network supplement is very general and quite old at this
> point.
>
> Is it true that the rx and tx tasks can allocate and free mbufs as
> needed while they hold the network semaphore, OR must additional
> semaphore release/obtain invocations be used for each and every mbuf
> manipulation?

The rule is that if a task makes calls to any of the BSD network code,
it must ensure that it holds the semaphore.  The network receive and
transmit tasks are started with the semaphore held and call
rtems_bsdnet_event_receive to wait for an event.  This call releases
the semaphore, waits for an event and then reobtains the semaphore
before returning.  In this way the driver never has to deal explicitly
with the network semaphore.  By way of example, have a look at
c/src/lib/libbsp/m68k/uC5282/network/network.c: there is no code there
that manipulates the network semaphore.
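
To make the release / wait / reobtain sequence concrete, here is a
small host-side model of it.  This is only a sketch and not RTEMS code:
every model_* name is made up for illustration, the "semaphore" is a
plain flag, and the "wait" simply consumes whatever event bits have
already been posted instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side stand-ins; none of these names exist in RTEMS. */
static bool model_sem_held = false;   /* true while the "network semaphore" is held */
static unsigned model_pending = 0;    /* event bits posted by a pretend ISR */

static void model_sem_obtain(void)
{
    assert(!model_sem_held);          /* a second holder would be a bug */
    model_sem_held = true;
}

static void model_sem_release(void)
{
    assert(model_sem_held);           /* releasing without holding is a bug */
    model_sem_held = false;
}

/* Pretend interrupt handler: posts an event without touching the semaphore. */
static void model_post_event(unsigned bits)
{
    model_pending |= bits;
}

/* Models the three steps described for rtems_bsdnet_event_receive. */
static unsigned model_event_receive(unsigned wanted)
{
    unsigned got;

    model_sem_release();              /* 1. let other network tasks run */
    got = model_pending & wanted;     /* 2. a real task would block here */
    model_pending &= ~got;
    model_sem_obtain();               /* 3. hold the semaphore again */
    return got;
}

/* One pass of a driver daemon body: entered and left with the semaphore held. */
static unsigned model_daemon_pass(unsigned wanted)
{
    unsigned got;

    assert(model_sem_held);
    got = model_event_receive(wanted);
    assert(model_sem_held);           /* mbuf work would go here, semaphore held */
    return got;
}
```

The point of the model is the invariant checked in model_daemon_pass:
the daemon body always runs with the semaphore held, so it needs no
obtain/release calls of its own around mbuf work.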

>
> Under what conditions does the stack deadlock and what can drivers do
> to help prevent it from doing so?

Running out of mbufs is never a good thing.  In the UDP send case you
might reduce the maximum length of the socket queue.

>
> What is the functional relationship between the mbuf_bytecount and
> mbuf_cluster_bytecount?

'Regular' (small) mbufs are allocated from the pool sized by
mbuf_bytecount.  Mbuf clusters (2k each) are allocated from the pool
sized by mbuf_cluster_bytecount.
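
As a rough worked example of what a pool size buys, the sketch below
divides a pool's byte count by the size of one unit.  The 2k cluster
size is the figure given above; the 128-byte small-mbuf size is an
assumption (check MSIZE in your tree), and pool_units is a made-up
helper, not part of the stack.  It ignores any allocator overhead.

```c
#include <assert.h>
#include <stddef.h>

#define ASSUMED_MBUF_BYTES 128u    /* assumption: size of one small mbuf */
#define CLUSTER_BYTES      2048u   /* 2k per cluster, as stated above */

/* Upper bound on how many whole units fit in a pool of pool_bytes. */
static size_t pool_units(size_t pool_bytes, size_t unit_bytes)
{
    return pool_bytes / unit_bytes;
}
```

With the 256k quoted earlier for both pools, pool_units(256 * 1024,
CLUSTER_BYTES) gives 128 clusters, and under the 128-byte assumption
the small-mbuf pool holds 2048 mbufs.  If each full-size received
frame takes a cluster, 128 is the ceiling on frames held anywhere in
the stack at once, which is easy to hit under sustained load.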

>
> What should their relative sizings be?

Depends on your application.  Which type are you running out of?
For my EPICS applications here I've got:
     180*1024,              /* MBUF space */
     350*1024,              /* MBUF cluster space */

-- 
Eric Norum <norume at aps.anl.gov>
Advanced Photon Source
Argonne National Laboratory
(630) 252-4793