problems with network processing

Eugene Denisov dea at sendmail.ru
Fri Feb 13 13:01:32 UTC 2004


Thank you very much.
One more question: I recently tried to work around this problem by
increasing the 'HeapSize' parameter in the Makefile (per the RTEMS
documentation):

CFLAGS_LD += -Wl,--defsym -Wl,HeapSize=0xa00000

I found that changing this parameter helps: the socket stalls more
rarely (I raised it from 0x500000, and the system now seems to run
stably). Do you have any comments on this?

----- Original Message ----- 
From: "Sergei Organov" <osv at topconrd.ru>
To: "Eugene Denisov" <dea at sendmail.ru>
Cc: <rtems-users at rtems.com>
Sent: Friday, February 13, 2004 3:29 PM
Subject: Re: problems with network processing


> "Eugene Denisov" <dea at sendmail.ru> writes:
> > Hello, all.
> >
> > Can't any help me to localize the probem I got with RTEMS?
> > I develop client-server application where RTEMS run server process and
Java
> > client communicates with server via socket connection. After some time
of
> > success processing the server under RTEMS blocks network socket. On
RTEMS
> > console I see  the repeating messages:
> >
> > Still waiting for mbuf cluster.
> > Still waiting for mbuf cluster.
> > Still waiting for mbuf cluster.
> > Still waiting for mbuf cluster.
> > ...
>
> That has been discussed a lot before...
>
> Try to visit <http://www.rtems.com/ml/rtems-users/> and enter 'mbuf' as a
> search string.
>
> Below is my answer to a similar problem that for whatever reason I
> can't find in the mailing list archives anymore :-(
>
> Rolf Schroedter <Rolf.Schroedter at dlr.de> writes:
> > My RTEMS application communicates with a Linux-PC via connectionless UDP
> > socket, receiving and replying UDP packets every ~20 milliseconds.
> > After a few minutes the application stops communicating.
> > The application is still running, but no more UDP packets are received.
> >
> > From time to time a message is displayed on the console:
> >     "Still waiting for mbuf cluster."
> > With ethereal I see that correct UDP packets are coming from the PC.
> > Additionally, the PC is sending ARP requests:
> >     "Who has 192.168.0.100? Tell 192.168.0.1"
> > which are also not answered by RTEMS.
> >
> > What does this mean?
>
> It means that the stack ran out of mbuf clusters.
>
> > How can the UDP communication be restarted?
>
> It depends on the cause of the problem. Maybe it will help to close
> the socket and then reopen it, but I'd suggest fixing the cause of the
> problem rather than battling with the consequences.
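>
> As a rough illustration of the close-and-reopen workaround (a sketch
> only; the socket setup details are application specific and the helper
> name is made up):
>
> #include <sys/socket.h>
> #include <netinet/in.h>
> #include <unistd.h>
>
> static int reopen_udp_socket(int old_sd, const struct sockaddr_in *addr)
> {
>     close(old_sd);  /* gives the socket's buffered mbufs back to the stack */
>
>     int sd = socket(AF_INET, SOCK_DGRAM, 0);
>     if (sd < 0)
>         return -1;
>     if (bind(sd, (const struct sockaddr *)addr, sizeof(*addr)) < 0) {
>         close(sd);
>         return -1;
>     }
>     return sd;      /* the caller resumes I/O on the new descriptor */
> }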
>
> As others mentioned, it could be an mbuf cluster leak in the Ethernet
> driver. That is not very likely if you use one of the drivers provided
> with RTEMS. If you use your own Ethernet driver, make sure there is no
> mbuf leak somewhere in the code. A good check for a leak would be to
> call 'rtems_bsdnet_show_mbuf_stats()' before you open the socket, then
> let the socket work for some time, then close it (while it is still
> operational), then call the stats routine again and compare the
> results. If there is a leak, the bug needs to be fixed.
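>
> A rough sketch of that check (the helper name is made up; the traffic
> phase is application specific):
>
> #include <rtems/rtems_bsdnet.h>
> #include <sys/socket.h>
> #include <unistd.h>
>
> void check_for_mbuf_leak(void)
> {
>     rtems_bsdnet_show_mbuf_stats();      /* baseline report */
>
>     int sd = socket(AF_INET, SOCK_DGRAM, 0);
>     /* ... bind the socket and exchange traffic for a while ... */
>     close(sd);                           /* close while still operational */
>
>     rtems_bsdnet_show_mbuf_stats();      /* compare with the baseline */
> }
>
> If the usage numbers keep growing between the two reports even though
> the socket was closed, something is leaking mbufs.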
>
> It could also be that you either don't read all the data from the
> socket, don't read fast enough, or write to the socket too fast. In
> that case the socket can eat all the mbufs/clusters available to the
> stack and then stall. I believe this is the result of a general
> problem with the default stack configuration in RTEMS: the default
> configuration may not be suitable even for a single TCP/UDP socket
> opened by an application :( At least my recent investigation of a
> similar problem with TCP sockets led me to this conclusion.
>
> To understand why the problem can arise and how to eliminate it, let's
> consider some numbers. By default, RTEMS allocates 64Kb for mbufs and
> 128Kb for clusters for the stack. These defaults are set by the
> 'nmbuf' and 'nmbclusters' variables found in
> 'libnetworking/rtems/rtems_glue.c'.
>
> [BTW, the next variable defined in this file, 'unsigned long
> sb_efficiency = 8;', is unused and misleading, as its name suggests it
> has something to do with the actual 'sb_efficiency' variable, which is
> in fact static in the file 'libnetworking/kern/uipc_socket2.c'.]
>
> Now, for a UDP socket, the stack has receive and transmit buffer size
> limits defined by the variables 'udp_recvspace' and 'udp_sendspace'
> found in the file 'libnetworking/netinet/udp_usrreq.c'. They are 41200
> and 9216 bytes, respectively, so the total configured space for a UDP
> socket is 50416 bytes. However, the actual maximum space that the
> stack may allocate for the socket is higher, as defined by the
> multiplication factor 'sb_efficiency' found in the file
> 'libnetworking/kern/uipc_socket2.c'. Its default value is 8.
>
> Thus, the actual maximum space that the stack could allocate for a UDP
> socket in the worst case is 50416*8 = 403328 bytes, which is about 4
> times more(!) than the total number of bytes available to the stack in
> clusters (128Kb). Well, the calculations above are not strict and the
> actual upper bound is somewhat lower, but the estimate at least gives
> us an idea. Anyway, the BSD stack starts to behave badly as soon as it
> runs out of mbufs, because throughout the stack code mbufs are
> allocated without timeouts. As a result, read/write timeouts on the
> socket may stop working, and even total, unrecoverable stack stalls
> are possible. In addition, it seems that once an mbuf/cluster is
> allocated to a socket buffer, it is not returned to the stack until
> the socket is closed.
>
> As for TCP sockets, the send/receive buffers are 16Kb each, i.e.,
> 32Kb*8 = 256Kb per socket maximum, which is also greater than the
> total memory allocated for the stack by default. I myself have run
> into trouble ("Still waiting for mbufs") after opening three sockets
> and sending a lot of data to them in rather small chunks (about 50
> bytes each). Such chunks fit into mbufs, so the sockets use no mbuf
> clusters at all. Eventually the result is a total stall: it seems that
> at some point the TCP sockets eat all the mbufs, so there are no spare
> mbufs left to store incoming packets and pass them to the stack.
>
> Now, what can you do about it? If you have plenty of spare RAM on your
> board, just increase the space allocated for mbuf clusters and/or
> mbufs. As the calculations above show, allocating 512Kb for clusters
> would definitely be enough in your case.
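>
> With the usual application-side network configuration that could look
> like this (a sketch only; the driver configuration name is BSP
> specific and the byte counts are just the ones discussed above):
>
> #include <rtems/rtems_bsdnet.h>
>
> extern struct rtems_bsdnet_ifconfig netdriver_config; /* BSP specific */
>
> struct rtems_bsdnet_config rtems_bsdnet_config = {
>     .ifconfig               = &netdriver_config,
>     .mbuf_bytecount         = 128 * 1024, /* space reserved for mbufs */
>     .mbuf_cluster_bytecount = 512 * 1024, /* space reserved for clusters */
>     /* remaining fields (hostname, gateway, ...) keep their defaults */
> };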
>
> To decrease the memory requirements, you can lower the value of
> 'sb_efficiency' and/or 'udp_recvspace'. My problem went away as soon
> as I set sb_efficiency to 1 and tcp_[send|recv]space to 8Kb while
> increasing the space for mbufs to 128Kb.
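>
> A sketch of overriding those tunables from application code, before
> rtems_bsdnet_initialize_network() is called (the extern declarations
> assume the variables are plain u_long globals, as in the BSD sources
> this stack derives from; sb_efficiency itself is static, as noted
> above, so lowering it means editing uipc_socket2.c):
>
> #include <sys/types.h>
>
> extern u_long udp_sendspace;  /* default 9216 bytes, see above */
> extern u_long udp_recvspace;  /* default 41200 bytes, see above */
> extern u_long tcp_sendspace;  /* default 16Kb */
> extern u_long tcp_recvspace;  /* default 16Kb */
>
> static void tune_socket_buffer_limits(void)
> {
>     udp_recvspace = 16 * 1024;
>     tcp_sendspace =  8 * 1024;
>     tcp_recvspace =  8 * 1024;
> }
>
> Alternatively, the limits can be shrunk per socket with the standard
> setsockopt(sd, SOL_SOCKET, SO_SNDBUF, ...) and SO_RCVBUF calls,
> without touching the stack globals at all.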
>
> Hope it helps.
>
>
> -- 
> Sergei.
>



