placing hot spot functions in on-chip ram

Hill, Jeff johill at lanl.gov
Fri Apr 12 16:31:06 UTC 2013


With option 1 we have something like this in the linker script:

	.fast_text : {
		*(.bsp_fast_text)
	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
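
On the C side, option 1 could look roughly like the sketch below. The macro name NIOS2_FAST_TEXT and the function hot_sum16 are made up for illustration; only the section name .bsp_fast_text comes from the linker script fragment above. The function uses only arguments and locals, so nothing of it ends up in rwdata or bss.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical macro name. The section name matches the input section
 * collected by the .fast_text output section in the linker script.
 * On non-ELF or non-GNU toolchains the macro expands to nothing.
 */
#if defined(__GNUC__) && defined(__ELF__)
#define NIOS2_FAST_TEXT __attribute__((section(".bsp_fast_text")))
#else
#define NIOS2_FAST_TEXT
#endif

/*
 * Illustrative hot-spot function: a 16-bit ones'-complement style sum.
 * It touches only its arguments and locals, so it carries no rwdata
 * or bss of its own.
 */
NIOS2_FAST_TEXT uint32_t hot_sum16(const uint16_t *words, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < count; ++i) {
        sum += words[i];
    }

    /* Fold any carries back into the low 16 bits. */
    while ((sum >> 16) != 0) {
        sum = (sum & 0xffffu) + (sum >> 16);
    }

    return sum;
}
```

Other functions and data in the same source file stay in their usual sections; only the tagged function is emitted into .bsp_fast_text.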

With option 2 we have something like this in the linker script:

	.fast_text : {
		fast_text.o(.text .rodata)
	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD

The problem is that there appears to be a bug in the GNU ld linker script parser,
so with option 2 one can't use long object file names like this:

	.fast_text : {
		libscorecpu_a-nios2-iic-low-level.* (.text .rodata)
	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD

> 2. Rename the .text section of selected object files to .bsp_fast_text (link
> time option).  Attached is an example script.

I see, yes, this _is_ a third option that I was unaware of. Thanks for the
suggestion. I will need to consider it further.

> In general such optimizations are highly application dependent.  

Yes, and worse yet, the amount of on-chip RAM available is also quite application
dependent. The user will need to choose which classes of hotspot functions are
mapped to on-chip RAM, depending on their situation. I suppose that this could be
managed easily by choosing an appropriate set of section names {hotspot_in_cksum,
hotspot_memcpy, hotspot_exception_entry, ...}.
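
A minimal sketch of how such per-class section names might be selected from C follows. Everything named here is an assumption for illustration: the HOTSPOT_SECTION helper, the BSP_HOTSPOT_*_IN_FAST_RAM switches, and the hot_copy function. A class whose switch is not defined simply stays in the normal text section.

```c
#include <stddef.h>

/*
 * Hypothetical helper: builds the section name "hotspot_<name>" by
 * string concatenation. Expands to nothing on non-ELF toolchains.
 */
#if defined(__GNUC__) && defined(__ELF__)
#define HOTSPOT_SECTION(name) __attribute__((section("hotspot_" #name)))
#else
#define HOTSPOT_SECTION(name)
#endif

/*
 * Hypothetical per-class switches. The user defines only the ones
 * that fit into the available on-chip RAM.
 */
#ifdef BSP_HOTSPOT_MEMCPY_IN_FAST_RAM
#define HOTSPOT_MEMCPY HOTSPOT_SECTION(memcpy)
#else
#define HOTSPOT_MEMCPY
#endif

#ifdef BSP_HOTSPOT_IN_CKSUM_IN_FAST_RAM
#define HOTSPOT_IN_CKSUM HOTSPOT_SECTION(in_cksum)
#else
#define HOTSPOT_IN_CKSUM
#endif

/* Illustrative member of the "memcpy" hotspot class. */
HOTSPOT_MEMCPY void *hot_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- != 0) {
        *d++ = *s++;
    }

    return dst;
}
```

The linker script would then need one input-section pattern per class, so that each class can be placed in on-chip RAM or in normal text independently.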

> I don't think we should use the attributes approach in the cpukit, because in
> this case we must ensure that every linker command file can cope with these
> sections.

Perhaps some of this is OK in cpukit if the only functions tagged for a hotspot section
are Nios2-specific, so that they affect only Nios2 linker scripts and do not break
backwards compatibility for architectures in wide use. It is also quite easy to
move the hotspot sections into the normal text section if on-chip RAM isn't provided,
and in fact the Nios2 BSPs could (probably should) map such hotspot functions to the
normal text section by default. We could also provide a commented-out section for
tightly coupled RAM that the more sophisticated user might adapt.

For example, nios2-iic-low-level.S is easily moved to a special section since it is
assembler, and this may be a good idea because it contains the processor exception
entry point. Currently, as I recall, the BSP copies a small piece of code to the
exception entry point address during startup, and that code jumps indirectly to
nios2-iic-low-level.S. This adds some ISR latency, which could be eliminated if
nios2-iic-low-level.S were in a private section placed directly at the processor's
exception entry address. If the user moves the exception entry point when configuring
the Nios2 instance, then I believe they must also be aware of what is happening in the
linker script. Presumably the Altera-provided flash boot loader inserted by elf2flash
can be relied upon to copy all of the sections to their proper destinations (including
the exception vector section), so we don't need to implement that type of copying in
RTEMS startup code (there are many other tasks we can work on).
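
For reference, if the boot loader could not be relied upon, the copy that startup code would have to do is small. The sketch below is only an illustration: copy_section is a made-up helper, and on a target its arguments would come from linker script symbols such as the bsp_section_fast_text_* ones quoted further down.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a section from its load address (flash)
 * to its run address (on-chip RAM). Nothing is copied if the section
 * is empty or already runs from where it was loaded.
 *
 * On a target the call might look like this, assuming the linker
 * script defines these symbols:
 *
 *   extern char bsp_section_fast_text_begin[];
 *   extern char bsp_section_fast_text_load_begin[];
 *   extern char bsp_section_fast_text_size[];
 *
 *   copy_section(bsp_section_fast_text_begin,
 *                bsp_section_fast_text_load_begin,
 *                (size_t) bsp_section_fast_text_size);
 */
void copy_section(void *run_begin, const void *load_begin, size_t size)
{
    if (run_begin != load_begin && size != 0) {
        memcpy(run_begin, load_begin, size);
    }
}
```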

Thanks for your comments,

Jeff

> -----Original Message-----
> From: Sebastian Huber [mailto:sebastian.huber at embedded-brains.de]
> Sent: Friday, April 12, 2013 1:03 AM
> To: Hill, Jeff
> Cc: rtems-users at rtems.org
> Subject: Re: placing hot spot functions in on-chip ram
> 
> Hello Jeffrey,
> 
> On 04/12/2013 12:38 AM, Hill, Jeff wrote:
> > Hello Sebastian,
> >
> > Just a quick note to identify my current direction so that you will have an
> > opportunity to provide some comments.
> >
> > Today I had a closer look at techniques for placing certain Nios2 hotspot
> > functions in on-chip ram. I see only two options for implementing this cleanly.
> > 1) Move code using its section name in the linker script
> >    o Mark the C function with a section attribute. The function shouldn't
> >      contain rwdata or bss variables.
> >    o Mark the assembler code with a section attribute. Such code shouldn't
> >      contain rwdata or bss variables.
> > 2) Place all hot spot functions in separate object files, and manipulate the
> >    location of object files in the linker script
> >    o The entire object file shouldn't contain any rwdata or bss variables.
> 
> a third option is to add a hotspot section to the linker command file for code.
>   We have this in the basic linker command files for ARM and PowerPC:
> 
> http://git.rtems.org/rtems/tree/c/src/lib/libbsp/arm/shared/startup/linkcmds.base
> 
> 	.fast_text : {
> 		bsp_section_fast_text_begin = .;
> 		*(.bsp_fast_text)
> 		bsp_section_fast_text_end = .;
> 	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
> 	bsp_section_fast_text_size = bsp_section_fast_text_end - bsp_section_fast_text_begin;
> 	bsp_section_fast_text_load_begin = LOADADDR (.fast_text);
> 	bsp_section_fast_text_load_end = bsp_section_fast_text_load_begin + bsp_section_fast_text_size;
> 
> Now you can do two things:
> 
> 1. Use a section attribute (compile time option):
> 
> #define BSP_FAST_TEXT_SECTION __attribute__((section(".bsp_fast_text")))
> 
> 2. Rename the .text section of selected object files to .bsp_fast_text (link
> time option).  Attached is an example script.
> 
> In general such optimizations are highly application dependent.  In the example
> script the modules are determined by run-time traces obtained with a hardware
> tracer from Lauterbach.
> 
> I don't think we should use the attributes approach in the cpukit, because in
> this case we must ensure that every linker command file can cope with these
> sections.
> 
> >
> > Until today I have been gravitating towards option 2. However now I have
> > bumped into a nasty bug in the gnu ld where the linker script parsing code
> > will find a short object file name in its input stream, but will not find a long
> > name such as "libscorecpu_a-nios2-iic-low-level.o". Since the file name
> > characters are chopped off somewhere in ld, maybe at about 10 characters
> > from the right hand side of the above name, then it's difficult to work around
> > this issue with wild cards; I end up with file name ambiguities, and too much
> > goes in the tightly coupled code section.
> >
> > Furthermore, I see that with option 1 we have another benefit being that
> > some rwdata or bss can be in the object file as long as it isn't in the function.
> > Of course the BSS and RWDATA can't be in on-chip ram if we are to survive a
> > reset without reloading everything from flash.
> >
> > Therefore, my current plan is to make a Nios2 private macro for the C
> > source code gnu section name attribute. This implies that we can initially
> > place only Nios2 specific codes in on-chip ram of course.
> >
> > Another option would be to just fix the gnu ld program, but finding the
> > exact location of this bug in the source initially appears to be a time
> > consuming task. Perhaps this is even a bug originating from Altera, I don't
> > know.
> >
> > Open to your suggestions.
> >
> > Jeff
> >
> 
> Did you try out the latest GNU Binutils (not the release, the CVS head)?  They
> support now Nios II and it works well for me.
> 
> --
> Sebastian Huber, embedded brains GmbH
> 
> Address : Dornierstr. 4, D-82178 Puchheim, Germany
> Phone   : +49 89 189 47 41-16
> Fax     : +49 89 189 47 41-09
> E-Mail  : sebastian.huber at embedded-brains.de
> PGP     : Public key available on request.
> 
> This message is not a business communication within the meaning of the EHUG.



