FW: RTEMS 4.X.X gen68360 Ethernet Tx Reliability

Fri Feb 4 09:28:03 UTC 2000

Hi Jake,
Many thanks for your reply, and HI! It is great to know we are not alone!
We have seen the out of MBUFs problem - usually it returns to the
application layer and fails on send(). This usually happens when we just give
it too much data to send. Our problem is either in the TCP stack (unlikely)
or at glue/driver level (possible) or below. All the problems seem to relate
to multi-frame chains. For data blocks of less than, say, 1.5K there are no
problems. We found a clue yesterday afternoon to do with these empty MBUFs -
keep watching the mailing list; I hope to post more info soon.
Cheers!
Bob

-----Original Message-----
From: Jake Janovetz [mailto:janovetz at tempest.ece.uiuc.edu]
Sent: Thursday, February 03, 2000 2:34 PM
To: bobwis at ascweb.co.uk
Subject: Re: RTEMS 4.X.X gen68360 Ethernet Tx Reliability

Bob,

   I'm replying to you mainly just to "say Hi" as you said.  I
use the 68360 ethernet on two boards, one of which I am actively
developing in my spare time.  Two notes here:

1. The previous board had ethernet lockup problems, but only when
   I exceeded the mbuf setting by opening up many, many TCP connections
   before they had time to clear out.  Increasing mbufs solved that
   problem and it is clearly not your problem.
2. I'm presently having trouble with the GoAhead webserver when
   serving largish ASP pages.  The suggested fix for this by GoAhead
   did not solve my problem.  Again, I don't think your problem is
   related.

   I won't have time to check things out until perhaps next week
(maybe even Saturday).  But I wanted to "say Hi" anyhow.

    Cheers,
    Jake

On Thu, Feb 03, 2000 at 09:24:18AM -0000, Bob Wisdom wrote:
> Greetings to all RTEMS list-readers from us here at ASC in the UK. After
> reading the list for almost a year, we are now having a go at writing! After
> seeing all the fantastic innovation going on with RTEMS at the moment I'm
> afraid we have a mundane issue to solve which is spoiling all the fun. I
> hope you can help us.....
>
> We have a problem with the Tx Ethernet reliability on the MC68360 and have
> been working on it for weeks to identify the cause. We found one issue in
> the driver which Eric Norum, the author, has looked at, but we still have
> one *big* problem. The problem we see is in 4.0.0 and the latest (20000118)
> snapshot. Frankly, we are struggling to find it.
>
> The standard BSP tests do not seem to tease it out, but a little
> modification to the netdemo program makes our target fall on its back with
> its legs in the air every time. Basically we are just trying to send a few
> more multi-frame buffers than the standard settings in the netdemo example
> test prog and we get a big lockup (no messages).
>
> We need to know if it is something about our port, or a RTEMS/BSD/Driver
> problem. One way we can determine this is by asking if you can see the
> problem as well. We have no idea how widespread the user base is for RTEMS
> on the 68360 (using Ethernet) so maybe you could say Hi! - even if you can't
> help. The more replies we get the more certain we can become. Please help if
> you can.
>
> The full story, and the suggested modification to the netdemo test program
> follows:-
>
> Our Environment:-
> RTEMS:-  Heap 512K, RTEMS workspace 256K, Stack 64K.
> Network Driver network.c v 1.8 (or earlier) and SCC buffer allocations as:
> #define RX_BUF_COUNT     2
> #define TX_BUF_COUNT     4
> #define TX_BD_PER_BUF    3
>
>
> These SCC buffer allocation sizes *do* seem to influence the problem. The
> problem is easy to reproduce:-
>
> Application netdemo: test.c v 1.6 (or earlier)
> New code inserted in transmitUdp() just before the close() at the end.
>
> #if 1
>         {
>                 /* Unsigned to match the %u conversions used below. */
>                 unsigned int bufsize, loops;
>                 printf("What size bigbuf? ");
>                 scanf("%u", &bufsize);
>                 printf("How many loops? ");
>                 scanf("%u", &loops);
>                 printf("Starting %u loop(s) of %u bytes\n", loops, bufsize);
>                 /* i, s, bigbuf and farAddr come from the enclosing transmitUdp(). */
>                 for (i = 1 ; i <= loops ; i++) {
>                         if (sendto (s, bigbuf, bufsize, 0,
>                                     (struct sockaddr *)&farAddr,
>                                     sizeof farAddr) < 0) {
>                                 printf ("transmitUdp1: Loop %u, Can't send: %s\n",
>                                         (unsigned int)i, strerror (errno));
>                                 showStatistics ();
>                                 break;
>                         }
>                 }
>                 printf("Done.\n");
>         }
> #endif
>
>
> To reproduce the lock-up, typically we send 30 loops of 3000 bytes. Other
> combinations of tens of loops with multi frame size buffers (up to 9000
> bytes) also produce the same effect (not exhaustively determined). The test
> passes for any number of small buffer sizes (say 1000 loops of 300 chars),
> or just a few loops of large ones.
>
> If the volume of sending data is very large, we have occasionally seen the
> send() call fail with the appropriate "out of space" error message, as one
> would expect - but no recovery of buffers happens after this so effectively
> the Ethernet is dead - but we are ignoring this special case for now.
>
> When it fails, the program doesn't get back to printing "Done." (or anything
> for that matter) - the whole plot dies belly up. We think timing might be an
> issue.
> Sorry for the long posting, thanks for reading it! Yours very gratefully,
>
> Bob Wisdom
> bobwis at ascweb.co.uk
>
>
>

--
   janovetz at uiuc.edu    | How can it be that mathematics, being after all a
 University of Illinois | product of human thought independent of experience,
                        | is so admirably adapted to the objects of reality?
        PP-ASEL         |                                  - Albert Einstein

Disclaimer: The policies of this University certainly do not reflect my
            own opinions, objectives, or agenda.
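
Note on the "multi-frame" sizes mentioned in the thread: the sketch below
estimates how many Ethernet frames a single UDP sendto() of a given size
turns into once IP fragments it. It is only an illustration, not code from
the driver or from netdemo. The function name udp_fragment_count is
hypothetical, and it assumes plain IPv4 with a 20-byte header (no options),
an 8-byte UDP header, and a standard 1500-byte Ethernet MTU.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header sizes: IPv4 without options, plus the UDP header. */
enum { IPV4_HEADER_BYTES = 20, UDP_HEADER_BYTES = 8 };

/*
 * Hypothetical helper: number of IPv4 fragments (one Ethernet frame each)
 * needed to carry one UDP datagram with the given payload size.
 */
size_t udp_fragment_count(size_t udp_payload_bytes, size_t mtu_bytes)
{
    /* The MTU must leave room for the IP header and at least 8 data bytes. */
    assert(mtu_bytes >= IPV4_HEADER_BYTES + 8);

    /* Every fragment except the last carries a multiple of 8 data bytes. */
    size_t per_fragment = ((mtu_bytes - IPV4_HEADER_BYTES) / 8) * 8;

    /* The UDP header travels once, inside the first fragment. */
    size_t ip_payload = udp_payload_bytes + UDP_HEADER_BYTES;

    /* Round up to whole fragments. */
    return (ip_payload + per_fragment - 1) / per_fragment;
}
```

Under those assumptions, the 300-byte sends that pass fit in one frame each,
the 3000-byte sends that trigger the lock-up need three frames each, and a
9000-byte send needs seven. That matches the observation that trouble starts
above roughly 1.5K, although it says nothing about where in the stack or
driver the fault lies.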