placing hot spot functions in on-chip ram

Hill, Jeff johill at lanl.gov
Fri Apr 12 16:53:01 UTC 2013


Furthermore, if the section applied to hotspot functions in cpukit were
named something like ".text.hotspot_in_cksum", then such text would
presumably be placed automatically in the normal text section by almost
all of the existing linker scripts, without breaking backwards
compatibility, because their .text output sections typically already
collect the .text.* input pattern. See below.

	.text : {
		bsp_section_text_begin = .;
		*(.text.unlikely .text.*_unlikely)
		*(.text .stub .text.* .gnu.linkonce.t.*)
		/* .gnu.warning sections are handled specially by elf32.em.  */
		*(.gnu.warning)
		*(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)
	} > REGION_TEXT AT > REGION_TEXT_LOAD
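
As a rough sketch of how a function might be tagged so that the .text.*
pattern above picks it up: the macro name HOTSPOT_TEXT and the function
hotspot_example_cksum are hypothetical (not existing RTEMS names), and this
assumes GCC or Clang on an ELF target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro, not an existing RTEMS API: places the tagged function
 * in an input section named ".text.hotspot_<name>" (GCC/Clang, ELF). */
#define HOTSPOT_TEXT(name) __attribute__((section(".text.hotspot_" #name)))

/* Illustrative stand-in for in_cksum: one's-complement sum of 16-bit words. */
HOTSPOT_TEXT(in_cksum)
uint16_t hotspot_example_cksum(const uint16_t *words, size_t count)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < count; ++i) {
        sum += words[i];
    }

    /* Fold any carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }

    return (uint16_t)~sum;
}
```

With a typical linker script the function should land in .text through the
.text.* pattern. A BSP with on-chip ram could instead collect
*(.text.hotspot_*) in a dedicated output section, which would probably need
to appear before the generic .text rule since ld uses the first match.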

> -----Original Message-----
> From: Hill, Jeff
> Sent: Friday, April 12, 2013 10:31 AM
> To: 'Sebastian Huber'
> Cc: rtems-users at rtems.org
> Subject: RE: placing hot spot functions in on-chip ram
> 
> 
> With option 1 we have something like this in the linker script:
> 
> 	.fast_text : {
> 		*(.bsp_fast_text)
> 	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
> 
> With option 2 we have something like this in the linker script:
> 
> 	.fast_text : {
> 		fast_text.o(.text .rodata)
> 	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
> 
> The problem is that there appears to be some kind of bug in the GNU ld linker
> script parser, so with option 2 one can't use long object file names like
> this:
> 
> 	.fast_text : {
> 		libscorecpu_a-nios2-iic-low-level.* (.text .rodata)
> 	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
> 
> > 2. Rename the .text section of selected object files to .bsp_fast_text (link
> > time option).  Attached is an example script.
> 
> I see, yes, this _is_ a 3rd option, that I was unaware of. Thanks for that
> suggestion. I will need to consider this option further.
> 
> > In general such optimizations are highly application dependent.
> 
> Yes, and worse yet, the amount of on-chip ram is also quite application
> dependent. The problem I suppose is that the user will need to choose
> which classes of hotspot functions need to be mapped to on-chip ram,
> or not, depending on their situation. I suppose that this could be managed
> easily by appropriately choosing a set of section names {hotspot_in_cksum,
> hotspot_memcpy, hotspot_exception_entry, ...}.
> 
> > I don't think we should use the attributes approach in the cpukit, because in
> > this case we must ensure that every linker command file can cope with these
> > sections.
> 
> Perhaps some of this in cpukit is ok if the functions tagged for a hotspot
> section are only Nios2 specific functions, so they will impact only Nios2
> linker scripts, and not break backwards compatibility for architectures widely
> in use. Perhaps this is ok because it is quite easy to just move the hotspot
> sections to the normal text section if on-chip ram isn't provided, and in fact
> the Nios2 BSPs could (probably should) just map such hotspot functions to the
> normal text section by default. We could also provide a commented out section
> for tightly coupled ram that the more sophisticated user might manipulate.
> 
> For example, nios2-iic-low-level.S is easily moved to a special section since
> it is assembler, and this is maybe a good idea because it contains the
> processor exception entry point. Currently, as I recall, the BSP copies a
> small bit of code to the exception entry point address during startup which
> indirectly jumps to nios2-iic-low-level.S. This adds some ISR latency which
> could be eliminated if nios2-iic-low-level.S were in a private section which
> is just placed in memory directly at the processor's exception entry address.
> If the user moves the exception entry point around in memory when configuring
> the Nios2 instance then I believe that they must also be aware of what is
> happening in the linker script. Presumably, the Altera provided flash boot
> loader inserted by elf2flash can be relied upon to copy all of the sections to
> their proper destinations (including the exception vector section) and so we
> don't need to implement that type of copying in RTEMS startup code (there are
> many other tasks we can work on).
> 
> Thanks for your comments,
> 
> Jeff
> 
> > -----Original Message-----
> > From: Sebastian Huber [mailto:sebastian.huber at embedded-brains.de]
> > Sent: Friday, April 12, 2013 1:03 AM
> > To: Hill, Jeff
> > Cc: rtems-users at rtems.org
> > Subject: Re: placing hot spot functions in on-chip ram
> >
> > Hello Jeffrey,
> >
> > On 04/12/2013 12:38 AM, Hill, Jeff wrote:
> > > Hello Sebastian,
> > >
> > > Just a quick note to identify my current direction so that you will have an
> > > opportunity to provide some comments.
> > >
> > > Today I had a closer look at techniques for placing certain Nios2 hotspot
> > > functions in on-chip ram. I see only two options for implementing this
> > > cleanly.
> > > 1) Move code using its section name in the linker script
> > >    o Mark the C function with a section attribute. The function shouldn't
> > >      contain rwdata or bss variables.
> > >    o Mark the assembler code with a section attribute. Such code shouldn't
> > >      contain rwdata or bss variables.
> > > 2) Place all hot spot functions in separate object files, and manipulate the
> > >    location of object files in the linker script
> > >    o The entire object file shouldn't contain any rwdata or bss variables.
> >
> > A third option is to add a hotspot section for code to the linker command
> > file. We have this in the basic linker command files for ARM and PowerPC:
> >
> > http://git.rtems.org/rtems/tree/c/src/lib/libbsp/arm/shared/startup/linkcmds.base
> >
> > 	.fast_text : {
> > 		bsp_section_fast_text_begin = .;
> > 		*(.bsp_fast_text)
> > 		bsp_section_fast_text_end = .;
> > 	} > REGION_FAST_TEXT AT > REGION_FAST_TEXT_LOAD
> > 	bsp_section_fast_text_size = bsp_section_fast_text_end - bsp_section_fast_text_begin;
> > 	bsp_section_fast_text_load_begin = LOADADDR (.fast_text);
> > 	bsp_section_fast_text_load_end = bsp_section_fast_text_load_begin + bsp_section_fast_text_size;
> >
> > Now you can do two things:
> >
> > 1. Use a section attribute (compile time option):
> >
> > #define BSP_FAST_TEXT_SECTION __attribute__((section(".bsp_fast_text")))
> >
> > 2. Rename the .text section of selected object files to .bsp_fast_text (link
> > time option).  Attached is an example script.
> >
> > In general such optimizations are highly application dependent.  In the
> > example script the modules are determined by run-time traces obtained with a
> > hardware tracer from Lauterbach.
> >
> > I don't think we should use the attributes approach in the cpukit, because in
> > this case we must ensure that every linker command file can cope with these
> > sections.
> >
> > >
> > > Until today I have been gravitating towards option 2. However, I have now
> > > bumped into a nasty bug in GNU ld where the linker script parsing code
> > > will find a short object file name in its input stream, but will not find a
> > > long name such as "libscorecpu_a-nios2-iic-low-level.o". Since the file name
> > > characters are chopped off somewhere in ld, maybe at about 10 characters
> > > from the right hand side of the above name, it's difficult to work around
> > > this issue with wild cards; I end up with file name ambiguities, and too
> > > much goes in the tightly coupled code section.
> > >
> > > Furthermore, I see that with option 1 we have another benefit being that
> > > some rwdata or bss can be in the object file as long as it isn't in the
> > > function. Of course the BSS and RWDATA can't be in on-chip ram if we are to
> > > survive a reset without reloading everything from flash.
> > >
> > > Therefore, my current plan is to make a Nios2 private macro for the C
> > > source code GNU section name attribute. This implies, of course, that we
> > > can initially place only Nios2 specific code in on-chip ram.
> > >
> > > Another option would be to just fix the GNU ld program, but finding the
> > > exact location of this bug in the source initially appears to be a time
> > > consuming task. Perhaps this is even a bug originating from Altera, I don't
> > > know.
> > >
> > > Open to your suggestions.
> > >
> > > Jeff
> > >
> >
> > Did you try out the latest GNU Binutils (not the release, the CVS head)?
> > They now support Nios II and it works well for me.
> >
> > --
> > Sebastian Huber, embedded brains GmbH
> >
> > Address : Dornierstr. 4, D-82178 Puchheim, Germany
> > Phone   : +49 89 189 47 41-16
> > Fax     : +49 89 189 47 41-09
> > E-Mail  : sebastian.huber at embedded-brains.de
> > PGP     : Public key available on request.
> >
> > This message is not a business communication within the meaning of the EHUG.



