The "Out of mbuf clusters" problem, resolved

Wed Sep 15 20:08:00 UTC 2004

Could you send your patch to me? I have a similar problem with version 4.5.0.
Thanks in advance.

> -----Original Message-----
> From: Phil Torre [mailto:ptorre at zetron.com]
> Sent: Wednesday, September 15, 2004 2:05 PM
> To: RTEMS User List
> Subject: The "Out of mbuf clusters" problem, resolved
>
>
> In reference to my previous message, here's what I ended up doing to
> "fix" it.
>
> The deadlocked state that I was observing was caused when the RTEMS
> system was doing sustained file transmission via FTP, and receiving
> a mix of TCP ACKs and broadcast traffic (from chatty MS Windows boxes
> on our LAN).  With the default mbuf/cluster pool sizes, we quickly
> run out of clusters.  (Our Ethernet driver only allocates clusters
> for receive data, which makes matters even worse.)
>
> As soon as all clusters are exhausted, the receive task goes into
> its "waiting for clusters" loop.  As incoming ACKs are processed,
> outbound packets are freed from the sockbuf by TCP, which frees up
> some clusters.  But, there is a race condition between the receive
> thread and the application writing to the socket; they both want
> clusters, and the application is winning too much of the time.  So,
> the incoming ACKs get lost, the outbound packets stay in the sockbuf
> pending retransmission, and there we sit.
>
> I expected that TCP would eventually time out and drop the connection,
> which should bring us back to life.  It does, but manages not to free
> the outbound packets from the sockbuf.  (This makes no sense to me,
> as it seems to guarantee that we will leak memory if a remote client
> hangs.  But, it sat there wedged for 16 hours without recovering.
> That's close enough to forever for me.)
>
> So, I applied two fixes:
>
> 1) Deadlock recovery.  I shortened tcp_keepidle to 30 seconds,
>    tcp_keepintvl to 10 seconds, and set always_keepalive.  This
>    makes the connection time out in a few minutes rather than many
>    hours.  Then I modified tcp_drop() so that if the connection is
>    being dropped due to timeout, both receive and send sockbufs and
>    any mbufs/clusters are explicitly freed.
>
> 2) Deadlock avoidance.  To resolve the "receive thread is losing the
>    fight for clusters" problem, I modified m_clalloc() to respect a
>    global flag set by the receive thread when it is waiting for a
>    cluster.  No one but the receive thread can get a cluster so long
>    as that flag is true.
>
> With those two changes, my application is now rock-solid even under
> sustained heavy load with default pool sizes.  I can offer patches if
> anyone is interested; I don't know if these changes are something
> that would be desirable to merge into RTEMS or not.
>
> -Phil
>
>
> --
>
> =====================================================================
> Phil Torre                               phone: 425-820-6363 x234
> Design Engineer                          email: ptorre at zetron.com
> Switching Systems Group                    fax: 425-820-7031
> Zetron, Inc.                               web: http://www.zetron.com
>