memcpy performance
Joel Sherrill
joel at oarcorp.com
Tue Dec 9 18:24:00 UTC 1997
I have forwarded this to the newlib maintainers list for comments.
I have already been told there is a new and improved set of portable
memory functions in the current newlib source.
And before anyone asks .. no I don't have it either. :)
--joel
Joel Sherrill Director of Research & Development
joel at OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (205) 722-9985
On Tue, 9 Dec 1997, Eric Norum wrote:
> It's even worse than just a byte-by-byte copy!
>
> On the 971024 snapshot (gen68360 BSP) a call to memcpy produces:
> 1) A call to bcopy
> 2) The bcopy routine links a stack frame and calls memmove
> 3) The memmove routine:
> a) links a stack frame
> b) checks for overlap
> c) does a byte-by-byte copy
> 5 instructions/byte on a CPU32 processor!
>
> There's a heck of a lot of unnecessary code here:
> Two extra function calls
> Two extra stack frames
> Extra code to check for overlap
> A very inefficient loop
>
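A minimal sketch of the call chain described above, for illustration
only. The layered_* names are hypothetical and this is not the newlib
source; it just reproduces the reported structure (memcpy calls bcopy,
bcopy calls memmove, memmove checks for overlap and copies one byte
at a time).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: these mirror the call chain reported above,
   not the actual newlib implementation. */

/* Innermost routine: overlap check, then a byte-at-a-time copy. */
void *layered_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d - (uintptr_t)s < n) {
        /* dst starts inside [src, src + n): copy backwards so the
           source bytes are read before they are overwritten. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    } else {
        while (n--)
            *d++ = *s++;
    }
    return dst;
}

/* Middle layer: bcopy takes (src, dst), the reverse of memcpy. */
void layered_bcopy(const void *src, void *dst, size_t n)
{
    layered_memmove(dst, src, n);
}

/* Outer layer: every memcpy pays for two further calls and an
   overlap check it does not need. */
void *layered_memcpy(void *dst, const void *src, size_t n)
{
    layered_bcopy(src, dst, n);
    return dst;
}
```

The result is correct; the cost is in the two extra calls, the two
extra stack frames and the overlap test, all paid on every memcpy.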
> Processor-independent improvements required:
> 1) There should be an explicit memcpy routine.
> 2) The library should be compiled with aggressive optimization.
>
> Processor-dependent improvements that would be nice:
> M68k - The loop in memmove should be done in such a way that
> processors like the CPU32 can go into loop mode.
>
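A sketch of what an explicit memcpy could look like, assuming a plain
byte loop is acceptable as a starting point. The name direct_memcpy
is hypothetical, chosen to avoid clashing with the library routine.
The loop is written as a post-increment copy with a count-down test,
the shape that can map onto a one-instruction move followed by a
decrement-and-branch on M68k. Whether a given compiler actually emits
that sequence depends on the compiler version and optimization flags,
so treat that part as an assumption to check in the generated
assembly.

```c
#include <stddef.h>

/* Hypothetical name, to avoid clashing with the library's memcpy. */
void *direct_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* One copy and one count-down test per byte: no overlap check
       and no calls into other routines. As with the standard
       memcpy, the regions must not overlap. */
    while (n--)
        *d++ = *s++;
    return dst;
}
```

A word-at-a-time copy for aligned buffers would be the obvious next
step, but it is left out here to keep the sketch short.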
> Now all we need is a willing volunteer......
>
> ---
> Eric Norum eric at skatter.usask.ca
> Saskatchewan Accelerator Laboratory Phone: (306) 966-6308
> University of Saskatchewan FAX: (306) 966-6058
> Saskatoon, Canada.
>