The "Out of mbuf clusters" problem, resolved

Wed Sep 15 18:11:52 UTC 2004

Eric Norum, could you please look at Phil's fixes and see what you think?

Phil Torre wrote:
> In reference to my previous message, here's what I ended up doing to
> "fix" it.
> 
> The deadlocked state that I was observing was caused when the RTEMS
> system was doing sustained file transmission via FTP, and receiving
> a mix of TCP ACKs and broadcast traffic (from chatty MS Windows boxes
> on our LAN).  With the default mbuf/cluster pool sizes, we quickly
> run out of clusters.  (Our Ethernet driver only allocates clusters
> for receive data, which makes matters even worse.)
> 
> As soon as all clusters are exhausted, the receive task goes into
> its "waiting for clusters" loop.  As incoming ACKs are processed,
> outbound packets are freed from the sockbuf by TCP, which frees up
> some clusters.  But, there is a race condition between the receive
> thread and the application writing to the socket; they both want
> clusters, and the application is winning too much of the time.  So,
> the incoming ACKs get lost, the outbound packets stay in the sockbuf
> pending retransmission, and there we sit.
> 
> I expected that TCP would eventually time out and drop the connection,
> which should bring us back to life.  It does, but manages not to free
> the outbound packets from the sockbuf.  (This makes no sense to me,
> as it seems to guarantee that we will leak memory if a remote client
> hangs.  But, it sat there wedged for 16 hours without recovering.  
> That's close enough to forever for me.)
> 
> So, I applied two fixes:
> 
> 1) Deadlock recovery.  I shortened tcp_keepidle to 30 seconds, 
>    tcp_keepintvl to 10 seconds, and set always_keepalive.  This
>    makes the connection time out in a few minutes rather than many
>    hours.  Then I modified tcp_drop() so that if the connection is
>    being dropped due to timeout, both receive and send sockbufs and
>    any mbufs/clusters are explicitly freed.
> 
> 2) Deadlock avoidance.  To resolve the "receive thread is losing the
>    fight for clusters" problem, I modified m_clalloc() to respect a
>    global flag set by the receive thread when it is waiting for a
>    cluster.  No one but the receive thread can get a cluster so long
>    as that flag is true.
> 
> With those two changes, my application is now rock-solid even under
> sustained heavy load with default pool sizes.  I can offer patches if
> anyone is interested; I don't know if these changes are something 
> that would be desirable to merge into RTEMS or not.
> 
> -Phil
> 
> 
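For anyone checking the arithmetic behind fix (1), here is a rough sketch of how the shortened keepalive settings bound the time before a dead peer is dropped. It assumes the 4.4BSD-style rule that an idle connection is dropped once it has been idle for tcp_keepidle plus (probe count * tcp_keepintvl), and it assumes a probe count of 8; Phil's message states neither, and the function name is made up for illustration.

```c
/*
 * Sketch only: worst-case time before an unresponsive peer is dropped,
 * assuming the drop threshold is keepidle + probes * keepintvl.
 * The probe count is an assumption, not a value from the message above.
 */
unsigned keepalive_drop_seconds(unsigned keepidle_s,
                                unsigned keepintvl_s,
                                unsigned probes)
{
    return keepidle_s + keepintvl_s * probes;
}
```

Under those assumptions, keepalive_drop_seconds(30, 10, 8) comes to 110 seconds, the same order of magnitude as the "few minutes" Phil reports. The real variables in a BSD-derived stack are usually kept in timer ticks rather than seconds, so the values actually written into the stack may differ.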

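The idea in fix (2) can be sketched with a toy model. This is not Phil's patch and not the real m_clalloc(): the pool only counts clusters, and every name below (cluster_pool, pool_alloc, rx_waiting, and so on) is hypothetical. It only illustrates the rule "while the receive task is waiting, nobody else gets a cluster".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pool: tracks a count of free clusters, not real memory. */
struct cluster_pool {
    size_t free_count;  /* clusters currently available */
    size_t total;       /* pool capacity */
    bool   rx_waiting;  /* true while the receive task is starved */
};

enum requester { REQ_RX_TASK, REQ_OTHER };

void pool_init(struct cluster_pool *p, size_t total)
{
    p->free_count = total;
    p->total = total;
    p->rx_waiting = false;
}

/* Returns true if a cluster was granted to the requester. */
bool pool_alloc(struct cluster_pool *p, enum requester who)
{
    /* While the receive task is waiting, refuse everyone else. */
    if (p->rx_waiting && who != REQ_RX_TASK)
        return false;

    if (p->free_count == 0) {
        /* A starved receive task raises the reservation flag. */
        if (who == REQ_RX_TASK)
            p->rx_waiting = true;
        return false;
    }

    p->free_count--;
    /* The receive task got its cluster, so drop the reservation. */
    if (who == REQ_RX_TASK)
        p->rx_waiting = false;
    return true;
}

/* Return one cluster to the pool. */
void pool_free(struct cluster_pool *p)
{
    assert(p->free_count < p->total);
    p->free_count++;
}
```

In this model, once the receive task fails an allocation, a cluster freed by an incoming ACK can only go back to the receive task, so the application writing to the socket can no longer win the race. A real implementation would also need whatever locking the stack already uses around the cluster pool.
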
-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel at OARcorp.com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
    Support Available             (256) 722-9985