Throughput/Goodput analysis on RTEMS

Chris Johns chrisj at rtems.org
Mon Jul 12 00:42:40 UTC 2021


On 10/7/21 1:28 pm, Vijay Kumar Banerjee wrote:
> On Fri, Jul 9, 2021 at 8:25 PM Chris Johns <chrisj at rtems.org> wrote:
>>
>> On 2/7/21 4:40 am, Vijay Kumar Banerjee wrote:
>>> I'm planning to do a throughput analysis on the RTEMS network stacks
>>> and I'm looking for some suggestions on the tools/applications for
>>> that if anyone has done something like that before.
>>
>> This is a great project.
>>
>>> If such application has not been used with RTEMS, then I might be
>>> willing to port it to RTEMS or write one from scratch as a net-demo
>>> maybe. Any resource/advice for that is welcome and much appreciated.
>>
>> Throughput is one parameter when dealing with networking, and it is important
>> in some applications, while latency is another parameter that can be critical.
>> Consider a voice switch with operators using Push To Talk (PTT) buttons to
>> activate a radio's transmitter; the enable packet needs to be delivered within
>> 10 msec in all cases.
>>
> This is an interesting point. Thanks for mentioning latency.

In RTEMS, and in an RTOS in general, I also consider the latency of separate
socket paths to separate readers and/or concurrent readers and writers to be
important. A lot of stacks maintain a single giant lock for the stack, and that
serialises all processing through the stack. Libbsd does not have this
limitation thanks to its fine-grained locking. A protocol such as PTP benefits
from deterministic handling of packets.
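
As a rough sketch of what I mean (the UDP port numbers, the thread set up and
the missing error handling are placeholders, not a recommended benchmark), two
readers on separate sockets let you see whether one path's latency grows while
the other path is busy:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

static int64_t now_ns(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* One reader per socket. Each datagram carries the sender's monotonic
 * timestamp, which is valid over loopback because sender and receiver
 * share the same clock, so the print out is the delivery latency through
 * the stack for that socket path. */
static void *reader(void *arg)
{
  uint16_t port = (uint16_t)(uintptr_t) arg;
  int sd = socket(AF_INET, SOCK_DGRAM, 0);
  struct sockaddr_in addr;
  int64_t sent;

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  bind(sd, (struct sockaddr *)&addr, sizeof(addr));

  for (;;) {
    if (recv(sd, &sent, sizeof(sent), 0) == (ssize_t) sizeof(sent))
      printf("port %u: delivery latency %lld ns\n",
             (unsigned) port, (long long) (now_ns() - sent));
  }
  return NULL;
}

void start_latency_probes(void)
{
  pthread_t t1, t2;
  pthread_create(&t1, NULL, reader, (void *)(uintptr_t) 9001);
  pthread_create(&t2, NULL, reader, (void *)(uintptr_t) 9002);
  /* A sender thread (not shown) writes now_ns() to 127.0.0.1:9001 and
   * 127.0.0.1:9002 over the loopback interface. */
}

With a giant-lock stack the two paths serialise and one reader's numbers grow
while the other is loaded; with fine-grained locking they should stay largely
independent on an SMP target.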

>> I would look at removing the networking fabric from the analysis because it is
>> subjective and can be affected by the NIC hardware, PHY configuration and any
>> externally connected equipment.
>>
> I'm approaching this by running different network stacks on the same
> hardware. I have a uC5282 that runs the legacy networking stack, I have
> ported the driver to a libbsd nexus device (DHCP doesn't work yet), and
> I'm able to see some difference in the round-trip time over loopback
> with the different stacks.
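
As a point of comparison, something along these lines (UDP rather than ICMP,
with an arbitrary port, packet size and loop count, and assuming the loopback
interface is configured) times a traversal of the transmit and receive paths
without touching a NIC:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define ECHO_PORT  7777
#define PKT_SIZE   64
#define ITERATIONS 1000

static int64_t now_ns(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Send a datagram to ourselves over 127.0.0.1 and time how long it takes
 * to come back up the receive path; no echo server is needed because the
 * receive socket is on the same node. */
void loopback_traversal_time(void)
{
  int tx = socket(AF_INET, SOCK_DGRAM, 0);
  int rx = socket(AF_INET, SOCK_DGRAM, 0);
  struct sockaddr_in to;
  char pkt[PKT_SIZE];
  int64_t total = 0;
  int i;

  memset(&to, 0, sizeof(to));
  memset(pkt, 0, sizeof(pkt));
  to.sin_family = AF_INET;
  to.sin_port = htons(ECHO_PORT);
  to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  bind(rx, (struct sockaddr *)&to, sizeof(to));

  for (i = 0; i < ITERATIONS; ++i) {
    int64_t t0 = now_ns();
    sendto(tx, pkt, sizeof(pkt), 0, (struct sockaddr *)&to, sizeof(to));
    recv(rx, pkt, sizeof(pkt), 0);
    total += now_ns() - t0;
  }
  printf("mean loopback traversal: %lld ns\n",
         (long long)(total / ITERATIONS));
  close(tx);
  close(rx);
}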

Will your analysis look at architectures that support SMP and have various cache
configurations? On a Zynq, libbsd creates over 4000 locks, so there may be a
performance hit on smaller single-core systems compared with multi-core
architectures that can run concurrent threads in the stack at the same time.

I think the 5282 may bias the results in one particular way and other
architectures will bias them in a different way. I do not see this as wrong or
a problem; rather, it is something that needs to be accounted for so readers of
the analysis do not get the wrong impression.

>> In terms of network stack performance for RTEMS you need to consider the
>> protocol, the buffering used, the size of the buffer pool, the transmit and
>> receive paths (they will have different characteristics), and the target's
>> memory effects such as caches. On top of this there is filtering, i.e. packet
>> filtering, and the types of access.
>>
> Thanks for these interesting attributes to consider. I have done a
> preliminary analysis over ICMP with packets of the same size across the
> different network stacks on the same board.
> 
>> With libbsd there are a number of sysctl settings that affect the performance.
>> How will you manage these?
>>
> I'm not sure. I didn't know about the possible performance difference
> based on sysctl settings. This would be relevant for any user and I
> would like to explore it. Could you please point to some resources or
> examples?

There are a lot of posts on the net on this topic (search: "freebsd tune network
sysctl") and the values can be technical, but they do make a difference for
different data workflows. There is no single set of values that works
universally, and for libbsd this is OK because I encourage users to explore the
settings and see what works for them. There can be a memory-use versus
performance trade-off. We support the sysctl API and there is a sysctl command
for the shell.
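
As a sketch only, reading and then adjusting a couple of tunables from
application code looks something like this. The names are the standard FreeBSD
ones; whether a particular one is present in a given libbsd build, and what
values are sensible, depends on the BSP and the workload:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

/* Read the current value of an integer tunable and then try to set it.
 * The new values below are illustrative only; larger socket buffers trade
 * memory for throughput. */
static void show_and_set(const char *name, int new_value)
{
  int value = 0;
  size_t len = sizeof(value);

  if (sysctlbyname(name, &value, &len, NULL, 0) == 0)
    printf("%s = %d\n", name, value);
  if (sysctlbyname(name, NULL, NULL, &new_value, sizeof(new_value)) != 0)
    printf("%s: not present or not settable in this build\n", name);
}

void tune_tcp_buffers(void)
{
  show_and_set("net.inet.tcp.sendbuf_max", 2 * 1024 * 1024);
  show_and_set("net.inet.tcp.recvbuf_max", 2 * 1024 * 1024);
}

The same names can be inspected and changed interactively with the shell's
sysctl command.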

Chris

