Questions about MBUF allocation in bsdnet

Mon Mar 23 13:14:55 UTC 2015

Chris Johns <chrisj at rtems.org> writes:

> On 23/03/2015 4:13 am, Albert Huang wrote:
>> Hi, all,
>>
>> While developing and tracing an Ethernet device driver for ARM, I also
>> set up an i386 RTEMS on qemu with NE2000 for comparison. I found that
>> MBUF allocation does not behave the same on the two hosts, which might
>> be one of the sources of my issues. One strange behavior is that, with
>> MSIZE=128 on both hosts, 98-byte packets are not segmented on i386 but
>> are segmented on ARM.
>
> Each architecture has different alignment constraints and on top of
> this a specific driver can require further constraints so the
> in-memory layout of a specific packet can vary between different
> types of hardware.
>
>> Another issue is that M_LEADINGSPACE for 84-byte/98-byte packets is
>> sometimes 0 or negative.
>
> This is the free space in front of the data in an mbuf and is used
> when deciding how to pre-pend protocol headers to the user's data.
>
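
As a rough illustration of the leading-space idea, here is a minimal
sketch using a toy structure. Everything named toy_* or TOY_* is
hypothetical and is not the real struct mbuf or the real macro; the
header overhead is an assumed number. The rule of reporting 0 for
external storage is modeled on the 4.4BSD code described in TCP/IP
Illustrated Vol 2, so the mbuf.h in a given tree should be checked
before relying on it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MSIZE  128                       /* the MSIZE value quoted in this thread */
#define TOY_HDRLEN 28                        /* assumed header overhead, illustrative only */
#define TOY_DATLEN (TOY_MSIZE - TOY_HDRLEN)  /* internal data area of the toy mbuf */
#define TOY_F_EXT  0x1                       /* hypothetical stand-in for M_EXT */

/* Toy model of an mbuf; NOT the real BSD layout. */
struct toy_mbuf {
    unsigned char *data;                     /* start of the valid data */
    size_t len;                              /* number of valid bytes */
    int flags;
    unsigned char dat[TOY_DATLEN];           /* internal storage */
};

/*
 * Bytes available in front of the data for pre-pending a header.
 * For external storage the model reports 0 (assumption, see above).
 */
static long toy_leadingspace(const struct toy_mbuf *m)
{
    if (m->flags & TOY_F_EXT)
        return 0;
    return (long)(m->data - m->dat);
}

/* Small self-check showing how the result follows the data pointer. */
static void toy_leadingspace_demo(void)
{
    struct toy_mbuf m = { 0 };

    m.data = m.dat + 16;                     /* 16 bytes reserved for headers */
    assert(toy_leadingspace(&m) == 16);

    m.data = m.dat;                          /* data starts at the very front */
    assert(toy_leadingspace(&m) == 0);
}
```

In this model a result of 0 means either that the data starts at the
very front of the internal buffer or that the storage is external. A
negative result can only appear if the data pointer sits before the
buffer the computation assumes.
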
>>
>> In order to clarify these issues, I would like to know more about
>> MBUF. So my questions about MBUF are:
>>
>> 1. Is MSIZE adjustable? Currently, it is set to 128, and I couldn't find
>> where to modify it. Do I need to modify it to support larger Ethernet
>> segment size?
>
> It is fixed and in mbuf.h. I strongly suggest you do not touch this
> value as it is tuned for the stack we currently have.
>
>>
>> 2. What does M_EXT mean exactly? They're different on both hosts as
>> well.
>
> TCP/IP Illustrated Vol 2 says it is set when an external buffer or
> cluster is used.

I had misunderstood what "external" means. After reading the mbuf
chapter and its figures, I understand it now.
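
To make the "external" idea concrete, here is a minimal sketch under
stated assumptions. The toy_* names, the 100-byte internal area and the
2048-byte cluster size are all hypothetical; toy_clget() only imitates
the general pattern of MCLGET(), where a cluster is attached to an
existing mbuf and the caller checks the flag afterwards.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_DATLEN  100      /* assumed internal data area */
#define TOY_CLBYTES 2048     /* assumed cluster size */
#define TOY_F_EXT   0x1      /* hypothetical stand-in for M_EXT */

/* Toy model of an mbuf; NOT the real BSD layout. */
struct toy_mbuf {
    unsigned char *data;             /* start of the valid data */
    size_t len;
    int flags;
    unsigned char *ext_buf;          /* external storage, NULL if none */
    unsigned char dat[TOY_DATLEN];   /* internal storage */
};

static void toy_init(struct toy_mbuf *m)
{
    m->data = m->dat;
    m->len = 0;
    m->flags = 0;
    m->ext_buf = NULL;
}

/*
 * Attach an external buffer to an existing toy mbuf. On success the
 * flag is set and the data pointer moves to the external buffer; on
 * failure the mbuf is left unchanged, so the caller tests the flag.
 */
static void toy_clget(struct toy_mbuf *m)
{
    unsigned char *buf = malloc(TOY_CLBYTES);

    if (buf == NULL)
        return;
    m->ext_buf = buf;
    m->data = buf;
    m->flags |= TOY_F_EXT;
}

/* Storage capacity depends on where the data lives. */
static size_t toy_capacity(const struct toy_mbuf *m)
{
    return (m->flags & TOY_F_EXT) ? TOY_CLBYTES : TOY_DATLEN;
}

/* Drop the external buffer, if any, and reset the toy mbuf. */
static void toy_release(struct toy_mbuf *m)
{
    free(m->ext_buf);
    toy_init(m);
}
```

The point of the sketch is that there is still one mbuf after the
cluster is attached: the header stays where it was and only the data
pointer refers to the larger external buffer.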

>
>>
>> 3. If a packet is larger than MSIZE, would MCLGET() give me two
>> MBUFs, or a "cluster," meaning a larger chunk of MBUFs that are
>> grouped and treated as one MBUF?
>
> It depends on whether you are pre-pending or appending data, whether
> you already have available space, and how much data you are looking
> to add.
>
> The idea is not to copy data into new buffers each time you change the
> size. When user data is placed in an mbuf and/or cluster at the socket
> layer, the amount of data to be pre-pended is not known. As the data
> heads down to the driver, various pieces are added and, rather than
> copy the data to a new larger buffer, the stack links a buffer to the
> start of
> the list of buffers. This "scatters" the data around the memory. The
> driver "gathers" this data as it sends it down the wire.

Thank you very much!
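
The scatter/gather behavior described above can be sketched as follows.
This is a toy model, not the real stack: the toy_* names and the
100-byte data area are assumptions, and toy_prepend() only imitates the
general idea of pre-pending in place when there is room and linking a
new buffer in front of the chain when there is not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DATLEN 100               /* assumed internal data area */

/* Toy model of an mbuf; NOT the real BSD layout. */
struct toy_mbuf {
    struct toy_mbuf *next;           /* next buffer in the chain */
    unsigned char *data;             /* start of the valid data */
    size_t len;
    unsigned char dat[TOY_DATLEN];
};

/* Allocate an empty toy mbuf with `lead` bytes reserved in front. */
static struct toy_mbuf *toy_get(size_t lead)
{
    struct toy_mbuf *m = calloc(1, sizeof(*m));

    if (m != NULL)
        m->data = m->dat + lead;
    return m;
}

/* Pre-pend a header; returns the new head of the chain or NULL. */
static struct toy_mbuf *toy_prepend(struct toy_mbuf *m, const void *hdr,
                                    size_t hdr_len)
{
    size_t lead = (size_t)(m->data - m->dat);

    if (hdr_len > TOY_DATLEN)
        return NULL;
    if (lead < hdr_len) {
        /* No room: link a new buffer in front instead of copying. */
        struct toy_mbuf *n = toy_get(TOY_DATLEN);

        if (n == NULL)
            return NULL;
        n->next = m;
        m = n;
    }
    m->data -= hdr_len;
    m->len += hdr_len;
    memcpy(m->data, hdr, hdr_len);
    return m;
}

/* "Gather": copy the chain into one frame and count the segments. */
static size_t toy_gather(const struct toy_mbuf *m, unsigned char *out,
                         size_t cap, int *segments)
{
    size_t total = 0;

    *segments = 0;
    for (; m != NULL; m = m->next) {
        if (total + m->len > cap)
            return 0;
        memcpy(out + total, m->data, m->len);
        total += m->len;
        (*segments)++;
    }
    return total;
}

/* 84-byte payload plus 14-byte header; returns the segment count. */
static int toy_demo_segments(size_t lead)
{
    unsigned char hdr[14] = { 0 }, frame[256];
    struct toy_mbuf *m = toy_get(lead), *head;
    int segments = -1;

    if (m == NULL)
        return -1;
    m->len = 84;
    head = toy_prepend(m, hdr, sizeof(hdr));
    if (head == NULL) {
        free(m);
        return -1;
    }
    if (toy_gather(head, frame, sizeof(frame), &segments) != 98)
        segments = -1;
    while (head != NULL) {
        struct toy_mbuf *next = head->next;
        free(head);
        head = next;
    }
    return segments;
}
```

In this model toy_demo_segments(0) yields two segments for the 98-byte
frame and toy_demo_segments(16) yields one, so the same packet can be
contiguous or split depending only on how much room was left in front
of the payload when it was allocated. Whether that is what differs
between the two targets in this thread is not something the sketch can
show.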

>
>> Am I correct on this part? Then is it normal that I saw a 98-byte
>> packet segmented into two Ethernet segments?
>>
>> I would appreciate any idea/suggestion. Thanks in advance.
>>
>
> I suggest you look around for some documentation on the topic of mbufs
> in BSD networking software. One of the advantages of RTEMS using the
> FreeBSD stack is the available documentation. Stevens' book "TCP/IP
> Illustrated Vol 2" is showing its age but it is still worth a read.
>
> Chris

Hi, Chris,

Thanks for the information. I've read the mbuf chapter in "TCP/IP
Illustrated Vol 2" and now know more about mbufs.

Our network device driver is modified from
c/src/libchip/network/dwmac*.c, but our Synopsys Ethernet MAC IP is a
more recent version with a different register set. Now that I know more
about mbufs, I think I'll trace down into the device driver, because
those behaviors are normal.

BTW, good to know that "TCP/IP Illustrated Vol 2" is still valuable.
The last time I "skimmed" through it was about 10 years ago, and since
then I've dug deep into physical-layer work for about ten years. Now
I'm glad to be back in the open source community. Thanks again for your
help.

Albert