Really need some help with RTEMS networking semaphores

Wed Oct 18 17:45:49 UTC 2006

Eric Norum writes:
 > On Oct 18, 2006, at 11:00 AM, gregory.menke at gsfc.nasa.gov wrote:
 > 
 > >
 > > I've been rewriting the Coldfire fec network driver for a week now to
 > > try and make it stable under a significant network load and I'm running
 > > into considerable trouble with deadlocks and network semaphore issues.
 > > The next 2 days are important, if I can't get the driver stable then I
 > > will have to abandon the network stack and try to kludge something up
 > > with message queues.
 > 
 > This is the driver from which BSP?
 > The uC5282 driver has been pretty solid here.

We took a copy of the uC5282 network.c from the 4.7 CVS for our BSP.

 > >
 > > I have the network task priority == 1, all other tasks lower.  256k in
 > > both the mbuf_bytecount and mbuf_cluster_bytecount.
 > >
 > > The problems mostly manifest in tcp receives by the RTEMS ftpd, but
 > > rapid UDP sends also seem to lock up the stack.
 > >
 > > The tx task always clears the tx queue, loading packets onto the card
 > > until it's full and dumping the rest.  The rx task receives packets; once
 > > an mbuf allocation (done with M_DONTWAIT) fails, all remaining rx packets
 > > on the card are dumped.  Thus the driver (theoretically) never queues tx
 > > buffers and will not stall the card waiting for rx mbufs.
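
To make the receive-side policy described above concrete, here is a minimal host-side sketch. It is not the driver's code: the pool, the frame counter and every `model_` name are hypothetical stand-ins, and the non-blocking allocator only imitates the spirit of an M_DONTWAIT request (fail immediately instead of waiting).

```c
#include <assert.h>

/* Hypothetical model: a simple counter stands in for the mbuf pool. */
struct model_pool {
    int free_bufs;
};

/* Result of draining the modeled card. */
struct rx_result {
    int delivered;  /* frames that got a buffer and went up the stack */
    int dropped;    /* frames discarded after the first allocation failure */
};

/* Non-blocking allocation: returns 1 on success, 0 if the pool is empty.
 * It never waits, which is the property the M_DONTWAIT request relies on. */
static int model_alloc_nowait(struct model_pool *p)
{
    if (p->free_bufs == 0)
        return 0;
    p->free_bufs--;
    return 1;
}

/* Drain `pending` frames from the modeled card.  On the first failed
 * allocation, every frame still on the card is dropped, so the card is
 * never left stalled waiting for a buffer. */
static struct rx_result model_rx_drain(struct model_pool *p, int pending)
{
    struct rx_result r = { 0, 0 };

    while (pending > 0) {
        if (!model_alloc_nowait(p)) {
            r.dropped = pending;  /* dump everything that is left */
            break;
        }
        r.delivered++;
        pending--;
    }
    assert(r.delivered >= 0 && r.dropped >= 0);
    return r;
}
```

With a pool of 3 buffers and 5 pending frames, the model delivers 3 and drops 2. The point of the sketch is only that the loop always terminates with the card empty, whatever the pool state.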
 > 
 > Having the driver throw away transmit buffers doesn't sound like a good
 > idea to me.

I'm trying every option to keep the stack on its feet.

 > >
 > > Is it true that the rx and tx tasks can allocate and free mbufs as
 > > needed when they have the network semaphore, OR must additional
 > > semaphore release/obtain invocations be used for each and every mbuf
 > > manipulation?
 > 
 > The rule is that if a task makes calls to any of the BSD network code
 > it must ensure that it holds the semaphore.  The network receive and
 > transmit tasks are started with the semaphore held and call
 > rtems_bsdnet_event_receive to wait for an event.  This call releases
 > the semaphore, waits for an event and then reobtains the semaphore
 > before returning.  In this way the driver never has to explicitly
 > deal with the network semaphore.  By way of example, have a look at
 > c/src/lib/libbsp/m68k/uC5282/network/network.c -- there is no code that
 > manipulates the network semaphore.
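
The hand-off described above can be modeled on a host in a few lines. This is only a single-threaded sketch of the rule, not RTEMS code: every `model_` name is hypothetical, and `model_event_receive` merely imitates the release / wait / reobtain sequence attributed to rtems_bsdnet_event_receive.

```c
#include <assert.h>

/* Hypothetical model state: 1 while the task holds the network semaphore. */
static int model_sem_held;
static int model_obtains;
static int model_releases;

static void model_sem_obtain(void)
{
    assert(!model_sem_held);  /* obtaining twice would be a bug */
    model_sem_held = 1;
    model_obtains++;
}

static void model_sem_release(void)
{
    assert(model_sem_held);   /* releasing without holding is the failure case */
    model_sem_held = 0;
    model_releases++;
}

/* Stand-in for the event wait: release, block, reobtain. */
static void model_event_receive(void)
{
    model_sem_release();
    /* a real task would block here until an event arrives */
    model_sem_obtain();
}

/* Stand-in for any call into the stack (allocating or freeing an mbuf,
 * passing a packet up): legal only while the semaphore is held. */
static void model_stack_call(void)
{
    assert(model_sem_held);
}

/* Shape of a driver task: it starts with the semaphore held and never
 * touches the semaphore directly inside its loop. */
static int model_driver_task(int events)
{
    int handled = 0;

    model_sem_obtain();       /* models "started with the semaphore held" */
    while (events-- > 0) {
        model_event_receive();
        model_stack_call();
        handled++;
    }
    return handled;
}
```

In this model a release can only fail if something between two event waits has already dropped the semaphore, which is one place to look when a release failure is reported.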

The driver tasks only use rtems_bsdnet_event_receive.  But for some
reason I'm still getting the "failed to release" message.  Can that be
triggered by calling m_freem() on an mbuf that the driver is finished
with?

Also, how should the rx task request buffers; is it OK to use M_DONTWAIT
so the rx task can dump the rx queue on an allocation failure?

 > >
 > > Under what conditions does the stack deadlock and what can drivers do to
 > > help prevent it from doing so?
 > 
 > Running out of mbufs is never a good thing.  In the UDP send case you
 > might reduce the maximum length of the socket queue.

Does that mean a too-long UDP send queue can starve the stack of mbufs and
deadlock it?

 > > What is the functional relationship between the mbuf_bytecount and
 > > mbuf_cluster_bytecount?
 > 
 > 'Regular' (small) mbufs are allocated from the pool sized by
 > mbuf_bytecount.  mbuf clusters (2k each) are allocated from the pool
 > sized by mbuf_cluster_bytecount.
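
As a rough illustration of what the two byte counts buy, the arithmetic can be sketched as below. The 2k cluster size comes from the description above; the 128-byte size for a small mbuf is an assumption made for this sketch and should be checked against the MSIZE value in the stack being used. The `MODEL_` and `model_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MSIZE    128u   /* ASSUMPTION: bytes per small mbuf */
#define MODEL_MCLBYTES 2048u  /* 2k per cluster, as stated above */

/* Number of small mbufs that fit in a pool of mbuf_bytecount bytes. */
static size_t model_mbuf_count(size_t mbuf_bytecount)
{
    return mbuf_bytecount / MODEL_MSIZE;
}

/* Number of clusters that fit in a pool of mbuf_cluster_bytecount bytes. */
static size_t model_cluster_count(size_t mbuf_cluster_bytecount)
{
    return mbuf_cluster_bytecount / MODEL_MCLBYTES;
}
```

Under these assumptions a 256k cluster pool holds 128 clusters, so a burst of more than 128 full-size frames held at once would exhaust it, while a 256k mbuf pool holds 2048 small mbufs.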
 > 
 > >
 > > What should their relative sizings be?
 > 
 > Depends on your application.  Which type are you running out of?
 > For my EPICS applications here I've got:
 >      180*1024,              /* MBUF space */
 >      350*1024,              /* MBUF cluster space */

How do I tell which I'm running out of?  

I've tried everything from 64k & 128k up to 256k & 256k, with problems of
some sort in every case.  Could you give examples of how mbuf buffer
sizing relates to the type of application?

Thanks,

Greg