RTEMS 4.X.X gen68360 Ethernet Tx Reliability

Thu Feb 3 09:24:18 UTC 2000

Greetings to all RTEMS list-readers from us here at ASC in the UK. After
reading the list for almost a year, we are now having a go at writing! After
seeing all the fantastic innovation going on with RTEMS at the moment I'm
afraid we have a mundane issue to solve which is spoiling all the fun. I
hope you can help us.

We have a problem with the Tx Ethernet reliability on the MC68360 and have
been working on it for weeks to identify the cause. We found one issue in
the driver which Eric Norum, the author, has looked at, but we still have
one *big* problem. The problem we see is in 4.0.0 and the latest (20000118)
snapshot. Frankly, we are struggling to find it.

The standard BSP tests do not seem to tease it out, but a little
modification to the netdemo program makes our target fall on its back with
its legs in the air every time. Basically, we are just trying to send a few
more multi-frame buffers than the standard settings in the netdemo example
test program allow, and we get a complete lockup (no messages).

We need to know if it is something about our port, or an RTEMS/BSD/driver
problem. One way we can determine this is by asking if you can see the
problem as well. We have no idea how widespread the user base is for RTEMS
on the 68360 (using Ethernet) so maybe you could say Hi! - even if you can't
help. The more replies we get the more certain we can become. Please help if
you can.

The full story, and the suggested modification to the netdemo test program,
follow:-

Our Environment:-
RTEMS:-  Heap 512K, RTEMS workspace 256K, Stack 64K.
Network Driver network.c v 1.8 (or earlier) and SCC buffer allocations as:
#define RX_BUF_COUNT     2
#define TX_BUF_COUNT     4
#define TX_BD_PER_BUF    3

These SCC buffer allocation sizes *do* seem to influence the problem. The
problem is easy to reproduce:-

Application netdemo: test.c v 1.6 (or earlier)
New code inserted in transmitUdp() just before the close() at the end.

#if 1
        {
                /* Ad-hoc stress test: send 'loops' datagrams of 'bufsize' bytes.
                   bigbuf must be at least 'bufsize' bytes long. */
                int bufsize, loops;
                printf("What size bigbuf? ");
                scanf("%d", &bufsize);
                printf("How many loops? ");
                scanf("%d", &loops);
                printf("Starting %d loop(s) of %d bytes\n", loops, bufsize);
                for (i = 1 ; i <= loops ; i++) {
                        if (sendto (s, bigbuf, bufsize, 0,
                                    (struct sockaddr *)&farAddr, sizeof farAddr) < 0) {
                                /* Report the failing loop and dump the driver statistics. */
                                printf ("transmitUdp1: Loop %d, Can't send: %s\n",
                                        i, strerror (errno));
                                showStatistics ();
                                break;
                        }
                }
                printf("Done.\n");
        }
#endif

To reproduce the lock-up, we typically send 30 loops of 3000 bytes. Other
combinations of tens of loops with buffers spanning several frames (up to
9000 bytes) produce the same effect (not exhaustively determined). The test
passes for any number of small buffers (say 1000 loops of 300 bytes),
or for just a few loops of large ones.
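
For anyone relating these buffer sizes to frames on the wire, here is a
rough sketch of the arithmetic. It assumes a standard 1500-byte Ethernet
MTU, a 20-byte IPv4 header with no options and an 8-byte UDP header; none of
these figures come from the driver source, and udp_frame_count() is just a
name made up for illustration (it is not part of RTEMS or netdemo).

```c
/* Assumed values, not taken from the gen68360 driver: */
#define ETH_MTU      1500u   /* standard Ethernet MTU           */
#define IP_HDR_LEN     20u   /* IPv4 header without options     */
#define UDP_HDR_LEN     8u   /* UDP header                      */

/*
 * Hypothetical helper: number of IP fragments (one Ethernet frame each)
 * needed to carry a single UDP datagram of 'payload' bytes.
 */
static unsigned int udp_frame_count(unsigned int payload)
{
    /* Every fragment except the last must carry a multiple of 8 bytes. */
    unsigned int per_frag = ((ETH_MTU - IP_HDR_LEN) / 8u) * 8u;  /* 1480 */
    /* The UDP header travels in the first fragment's data area. */
    unsigned int total = payload + UDP_HDR_LEN;

    return (total + per_frag - 1u) / per_frag;  /* round up */
}
```

Under those assumptions udp_frame_count(300) is 1, udp_frame_count(3000) is
3 and udp_frame_count(9000) is 7, so each 3000-byte sendto() hands the
driver three frames back to back, while the 300-byte case never needs more
than one frame per call.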

If the volume of data sent is very large, we have occasionally seen the
sendto() call fail with the appropriate "out of space" error message, as one
would expect. However, no buffers are recovered after this, so the Ethernet
is effectively dead. We are ignoring this special case for now.

When it fails, the program doesn't get back to printing "Done." (or anything
else, for that matter) - the whole plot dies belly up. We think timing might
be an issue.

Sorry for the long posting, and thanks for reading it! Yours very gratefully,

Bob Wisdom
bobwis at ascweb.co.uk