[PATCH v3] arm/xilinx_zynq: Disable the MMU and the data and instruction cache on boot.

Chris Johns chrisj at rtems.org
Thu Sep 1 01:48:57 UTC 2016


On 01/09/2016 00:30, Pavel Pisa wrote:
> Hello Chris,
>
> I am trying to think about consequences.
>

Thank you for this.

> On Wednesday 31 of August 2016 04:16:34 Chris Johns wrote:
>> The u-boot loader can enable the MMU plus the data and instruction caches.
>> Disable them and if the data cache is enabled clear it before turn the
>> caches off.
>>
>> Closes #2774.
>
> I think that this is critical problem and has to be solved.
> May be, this should be committed because it solves problems
> for your actual boot and is probably OK for UP.
>
> But for SMP, I think that there are more problems. When I look into
>
> rtems/c/src/lib/libbsp/arm/shared/include/arm-a9mpcore-start.h
>
> I see
>
>> #ifdef RTEMS_SMP
>> BSP_START_TEXT_SECTION static inline void
>> arm_a9mpcore_start_on_secondary_processor(void)
>> {
>>    uint32_t ctrl;
>>
>>    arm_a9mpcore_start_set_vector_base();
>>
>>    arm_gic_irq_initialize_secondary_cpu();
>>
>>    ctrl = arm_cp15_start_setup_mmu_and_cache(
>>      0,
>>      ARM_CP15_CTRL_AFE | ARM_CP15_CTRL_Z
>>    );
>
> If you add arm_cp15_data_cache_clean_all_levels() into
> arm_cp15_start_setup_mmu_and_cache() then according to my understanding
> of case there can be problems if cache is already or still enabled
> on primary CPU.
>
> I have started to check how are the caches implemented on Zynq
> and my debug code shows that arm_cp15_data_cache_clean_all_levels()
> sees only the first cache level on the CPU. It seem that controler
> is not part of the CPU architecture. So that means that for Zynq
> use of arm_cp15_data_cache_clean_all_levels() is OK.
>
> The result
>
> cache levels:
>    clidr 0x09200003 loc 1
>    level 0 ctype 3
>    level 0 ccsidr 0x701FE019
>    line_power 0x05 associativity 0x04 way_shift 0x1E
>    level 1 ctype 0
>    level 2 ctype 0
>
> This means no cache above level 1 are detected by CP15 code and level
> of coherency is 1 anyway so safe for Zynq. But RPi2 shows multiple levels
> for example. So I would vote to move code into per BSP part the same
> way as Cyclone has that.

OK.

>
>>    arm_cp15_set_domain_access_control(
>>      ARM_CP15_DAC_DOMAIN(ARM_MMU_DEFAULT_CLIENT_DOMAIN, ARM_CP15_DAC_CLIENT)
>>    );
>>
>>    /* FIXME: Sharing the translation table between processors is brittle */
>>    arm_cp15_set_translation_table_base(
>>      (uint32_t *) bsp_translation_table_base
>>    );
>>
>>    ctrl |= ARM_CP15_CTRL_I | ARM_CP15_CTRL_C | ARM_CP15_CTRL_M;
>>    arm_cp15_set_control(ctrl);
>>
>>    _SMP_Start_multitasking_on_secondary_processor();
>> }
>
> Further analysis for generic case.
>
> If it is ensured that cache is disabled on primary CPU during start
> of all secondary CPUs and then sequence
>    invalidate cache
>    wait on barrier with primary
>    enable cache
>    wait on barrier with promary
>    start multitasking
> should be OK.
>
> But mixing can be bad.
>
> When I look into "altera-cyclone-v" which I expect that it is the
> most tested one by Sebastian, then it solves this problem
> by explicitly putting invalidate into bsp_start_hook_0()
>
> rtems/c/src/lib/libbsp/arm/altera-cyclone-v/startup/bspstarthooks.c
>
> BSP_START_TEXT_SECTION void bsp_start_hook_0( void )
> {
>    arm_cp15_instruction_cache_invalidate();
>    arm_cp15_data_cache_invalidate_all_levels();
>    arm_a9mpcore_start_hook_0();
> }
>
> This is called even on secondary CPU and L1 caches setup can
> be mixed for short while. But it is not problem because
> L2 cache should be enabled only after
>
> rtems/c/src/lib/libbsp/arm/altera-cyclone-v/startup/bspsmp.c:_CPU_SMP_Start_processor()
>      started = _Per_CPU_State_wait_for_non_initial_state(cpu_index, 0);
>
> finishes according to the comment.
>
> But when I try to analyze order by traceback
>
> _CPU_SMP_Start_processor
> _SMP_Start_processor
> _SMP_Handler_initialize
> rtems_initialize_data_structures
>
> RTEMS_SYSINIT_ITEM(
>    rtems_initialize_data_structures,
>    RTEMS_SYSINIT_DATA_STRUCTURES,
>    RTEMS_SYSINIT_ORDER_MIDDLE
>
>
> RTEMS_SYSINIT_BSP_WORK_AREAS
> RTEMS_SYSINIT_BSP_START
> RTEMS_SYSINIT_INITIAL_EXTENSIONS
> RTEMS_SYSINIT_DATA_STRUCTURES
>    so there are enabled other CPUs and there is wait for sync
> RTEMS_SYSINIT_USER_EXTENSIONS
> RTEMS_SYSINIT_CPU_SET
>
> L2 cache is enabled in bsp_start_hook_1() on Cyclone
>
>    /* Enable unified L2 cache */
>    rtems_cache_enable_data();
>
> But bsp_start_hook_1() is called early in the boot process even
> on Cyclone from libbsp/arm/shared/start/start.S. So I see even there
> problems to clarify that sequence is completely OK.
>
> I am not expert for SMP and ARM for sure. But I have doubts that
> all that is OK. I try follow with some patch which i consider correct.
> But not sure.

I have tested the patch you posted and it is working. I have manually 
run a few of the single core and SMP tests including the capture engine 
one I have been fixing.

I am going to push your fix and the 2nd core patch of mine. If there are 
further issues identified we can work on those. I think we need to rule 
a line here to work from.

>
> There is another problem which I have noticed on RPi.
> Cache invalidate when U-boots starts application with cache enabled
> can lead to the lost of some data or code, because they are held
> only in cache before cache invalidate and disabled.
>
> So I suggest to flush data there before cache disable.
>
> Best wishes,
>
>                  Pavel
>
> PS: comments are little convoluted, because I have spent quite large amount
>      of time by rethinking and generally looked at that during coordination
>      of more other projects done by studnets at the university.
>

Sure.

I have a concern we are pushing specialised fixes out and down in the 
BSPs and this is has a long term effect of making further changes harder 
because the spread of testing plus it makes it more difficult for others 
to provide support for similar or close processor variants.

Chris


More information about the devel mailing list