Unable to run run-time loaded code on ZedBoard

Mon Oct 26 22:23:11 UTC 2015

Hello Chris and Patrick,

On Monday 26 of October 2015 22:25:20 Chris Johns wrote:
> On 27/10/2015 8:15 am, Patrick Gauvin wrote:
> >> Hmm maybe a data cache flush and instruction cache invalidate is needed.
> >>
> >> Maybe the L2 cache support being added has broken this test.
> >
> > I added calls to rtems_cache_invalidate_entire_data and
> > rtems_cache_invalidate_entire_instruction before calling the loaded
> > functions (dl-load.c:54) and they now work as expected.
>
> Many thanks for testing this and reporting back. Can you please raise a
> ticket in Trac for this against me?
>
> > I haven't looked too much at the internals of the RTL code, but should
> > instruction and data cache invalidates be performed in the RTL code on
> > the memory regions where objects are loaded, or is this a bug
> > somewhere else? I'm willing to help with patching/testing for this
> > since I have the hardware easily available.
>
> The fix will be more complex because the memory used may need to be
> cache line aligned to avoid corrupting surrounding areas when
> invalidating. Another complication is not all boards have caches so I
> need to review the cache API to see what is needed. I may extend the
> allocator interface to add caching support.

Invalidating even the entire instruction cache is a safe operation,
but it costs time to invalidate the cache and reload the instructions:

rtems_cache_invalidate_entire_instruction()

Calling rtems_cache_invalidate_entire_data() in a running system,
however, is certainly wrong. All changes not yet written back to
main memory are lost, so some part of the code written by the CPU
is probably lost as well.

rtems_cache_flush_entire_data() should be more appropriate and is
mostly safe, but it too can cause significant latency while the
cache is reloaded. It can cause problems when a DMA transfer from
a device to memory is in progress, but only if the DMA target area
holds modified data in the CPU cache. Such a situation, caused by
a write to the DMA target area after the DMA has started, is a
coding error. There can also be a problem when the DMA area is not
allocated cache line aligned and its start or end shares a cache
line with regular modified variables. But supporting such memory
use is generally problematic anyway. So the sequence

  rtems_cache_flush_entire_data()
  rtems_cache_invalidate_entire_instruction()

should be safe. If the area is smaller than the cache size, then
using the region operations should take less time and generally
causes less latency for unrelated code and data:

  void rtems_cache_flush_multiple_data_lines( const void *, size_t );

  void rtems_cache_invalidate_multiple_instruction_lines( const void *, size_t );
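
As a rough sketch of the region-based variant (not the actual RTL
code), a loader could do something like the following once an object
has been written to memory. The helper name sync_loaded_code() is
hypothetical, and the non-RTEMS branch contains host-side stand-ins
that exist only so the fragment compiles off-target. Only the two
rtems_cache_* region calls named above are real API.

```c
#include <stddef.h>

#ifdef __rtems__
#include <rtems.h>
#else
/*
 * Host-side stand-ins (assumption, illustration only): they record the
 * order and arguments of the calls instead of touching any cache.
 */
static char        call_log[8];
static size_t      call_count;
static const void *last_addr;
static size_t      last_size;

static void record_call(char tag, const void *addr, size_t size)
{
  if (call_count < sizeof(call_log))
    call_log[call_count] = tag;
  ++call_count;
  last_addr = addr;
  last_size = size;
}

static void rtems_cache_flush_multiple_data_lines(const void *addr, size_t size)
{
  record_call('F', addr, size);
}

static void rtems_cache_invalidate_multiple_instruction_lines(const void *addr, size_t size)
{
  record_call('I', addr, size);
}
#endif

/*
 * Hypothetical helper: make code that the CPU has just written through
 * the data cache visible to instruction fetch.
 */
static void sync_loaded_code(const void *base, size_t size)
{
  if (base == NULL || size == 0)
    return;

  /* Write the freshly stored code back from the data cache to memory. */
  rtems_cache_flush_multiple_data_lines(base, size);

  /* Drop stale instruction cache lines that cover the same range. */
  rtems_cache_invalidate_multiple_instruction_lines(base, size);
}
```

The idea would be to call sync_loaded_code(base, size) for each loaded
text region before the first jump into it. Where exactly that belongs
in the dlopen path is an open question here, not something this sketch
settles.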

So it would be great if you could test rtems_cache_flush_entire_data() ...
and then the region-based approach. It would also be great if the code
that ensures correct behavior on cache-equipped systems were included
in the general dlopen path.

Best wishes and thanks for the information,

              Pavel