MBUF Cluster Network freeze problem
Gene.Smith at sea.siemens.com
Wed May 2 13:19:42 UTC 2001
>From: Rosimildo da Silva [mailto:rdasilva at connecttel.com]
>Sent: Wednesday, May 02, 2001 8:12 AM
>Cc: rtems-users at oarcorp.com
>Subject: Re: MBUF Cluster Network freeze problem
>From: "bob" <bobwis at asczone.com>
>To: "'Smith, Gene'" <Gene.Smith at sea.siemens.com>
>Cc: <rtems-users at oarcorp.com>
>Sent: Wednesday, May 02, 2001 2:43 AM
>Subject: RE: MBUF Cluster Network freeze problem
>> Your hypothesis/description certainly sounds credible to me.
>> pain, we eventually came to the conclusion that the network stack was
>> and the problems were all at the application level. However,
>> with in the network code itself and it is possible that this
>> CPU, not giving the App time to empty the other stuff from
>the MBUF pool.
>> am not sure, but it could be that the ping handling runs as
>one of the
>> networks tasks and these run at high priority compared with
>> priority. With RTEMS hard scheduling algorithm the lower
>> never get a slice of CPU time.
>I have been developing a SOAP server for embedded systems,
>and it seems to trigger this problem easily under RTEMS.
>When I put the system under heavy load, a SOAP client makes exactly 504
>requests, and "freezes" for about a minute ( client's timeout ). This
>times out, and system goes on for another 504 requests. I see
>on the RTEMS' console about running low of "MBUFS" before the freeze.
>This problem goes away if I do "socket connection pooling" ( reuse the
>on the client side ( using Keep-Alive of HTTP 1.1 ).
>I would say that for some reason the RTEMS does not "free"
>right away the
>MBUFS on closed sockets, if the system is extremely busy.
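For reference, the client-side workaround quoted above (reusing one connection through HTTP/1.1 Keep-Alive instead of opening a socket per request) might look roughly like the sketch below. This is only an illustration using Python's standard http.client; the function name fetch_all and its parameters are made up for this example, and it assumes the server also keeps the connection open. It is not the SOAP client discussed in the thread.

```python
import http.client

def fetch_all(host, port, paths):
    """Issue one GET per path over a single persistent HTTP/1.1 connection.

    Hypothetical helper for illustration; host, port and paths are
    whatever the caller supplies.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    statuses = []
    try:
        for path in paths:
            conn.request("GET", path)   # HTTP/1.1 connections are persistent by default
            resp = conn.getresponse()
            resp.read()                 # drain the body so the socket can be reused
            statuses.append(resp.status)
    finally:
        conn.close()                    # one close for the whole batch
    return statuses
```

With this pattern the server sees one connection opened and one closed per batch rather than per request, which would fit the observation that the freeze disappears when sockets are reused.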
The message I see is actually "Still waiting for mbuf clusters" and there is
no disconnection of clients. Also, when I see the message, my system never
recovers and has to be reset. Did you have to reboot when your problem
Also, when I do 'ping -f' to a similar non-rtems unit, it drops about 60% of
the pings and slows down greatly from its primary task, but it does not
lock-up like my rtems-based unit does. The rtems-based unit seems to devote
all its energy to responding to pings and stops doing its primary task
entirely. When the clients (64 of them) time-out after 20 sec and re-send
their messages (while the ping -f is going on) is when the lock-up occurs.
As long as I stop the ping before the clients re-send, I do not see a
lock-up. (The clients never intentionally disconnect.)
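
To make the starvation idea concrete, here is a toy model of what is described above: a high-priority network task that is always ready during a ping flood, a lower-priority application task that only runs when the network task is idle, and a fixed pool of clusters. It is a sketch under stated assumptions, not RTEMS code. The function simulate, the tick granularity, the pool size and the arrival rate are all invented for illustration, and the model does not establish what actually happens inside the stack.

```python
def simulate(ticks, pool_size, ping_flood, data_every=4):
    """Toy model: one high-priority network task, one low-priority app task.

    All names and numbers are hypothetical. Returns a tuple
    (app_runs, held, stalls).
    """
    held = 0       # clusters parked in socket buffers, waiting for the app
    app_runs = 0   # ticks on which the app task got the CPU
    stalls = 0     # ticks on which the network task found no free cluster
    for tick in range(ticks):
        data_arrives = (tick % data_every == 0)
        if ping_flood or data_arrives:
            # Strict priority: the network task is ready, so it always wins.
            if held >= pool_size:
                stalls += 1        # pool exhausted: still waiting for clusters
            elif data_arrives:
                held += 1          # client data stays queued until the app reads it
            # A ping reply borrows a cluster and returns it within the same tick.
        else:
            app_runs += 1          # network task idle: the app finally runs
            if held > 0:
                held -= 1          # app reads one segment, freeing its cluster
    return app_runs, held, stalls

if __name__ == "__main__":
    print("no flood:", simulate(400, 32, ping_flood=False))
    print("flood:   ", simulate(400, 32, ping_flood=True))
```

In this model the application never runs while the flood lasts, so every client segment that arrives pins a cluster until the pool is empty. Stopping the flood before more client data arrives lets the application drain the queue, which matches the behaviour seen when the ping is stopped before the clients re-send.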