GSoC 2015: Raspberry Pi 2 Support

Alan Cudmore alan.cudmore at gmail.com
Wed Jun 3 01:42:04 UTC 2015


The RPi 1 BSP enables the caches. The RPi 2 BSP executes the same code, but that is evidently not sufficient to set up the caches on the Pi 2.
I have been reading through this long thread, and it is very informative:
https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904

I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly:
Quote from above thread
------------------------------
Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has.

Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, or 32 fps.

PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect.
-------------------------
End of quote
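
For reference, the two configurations in the quote map roughly onto the ARMv7-A SCTLR enable bits and short-descriptor 1 MiB section entries shown below. This is only a minimal sketch, not the code from the thread and not the RTEMS BSP code. The helper names (sctlr_caches_only, sctlr_caches_and_mmu, section_desc, cache_policy) are hypothetical, and it assumes TEX remap is disabled (SCTLR.TRE = 0), domain 0 is set to client in the DACR, and the sections are non-shareable with full read/write access.

```c
#include <stdint.h>

/* SCTLR (CP15 c1, c0, 0) enable bits on ARMv7-A. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* First configuration in the quote: I/D caches and branch prediction. */
static uint32_t sctlr_caches_only(void)
{
    return SCTLR_C | SCTLR_Z | SCTLR_I;
}

/* Second configuration: the same bits plus the MMU. */
static uint32_t sctlr_caches_and_mmu(void)
{
    return sctlr_caches_only() | SCTLR_M;
}

/* Short-descriptor first-level section entry fields. */
#define SECT_TYPE    (2u << 0)               /* bits[1:0] = 0b10: section */
#define SECT_B       (1u << 2)
#define SECT_C       (1u << 3)
#define SECT_AP_RW   (3u << 10)              /* AP[1:0] = 0b11, AP[2] = 0 */
#define SECT_TEX(x)  ((uint32_t)(x) << 12)   /* TEX[2:0] in bits[14:12] */

/* Cache policy encodings used when TEX[2] = 1. */
enum cache_policy {
    NON_CACHEABLE = 0,
    WB_WA         = 1,   /* write-back, write-allocate */
    WT_NO_WA      = 2,   /* write-through, no write-allocate */
    WB_NO_WA      = 3    /* write-back, no write-allocate */
};

/*
 * Build a section descriptor for normal memory with separate inner and
 * outer policies: TEX[2] = 1, TEX[1:0] = outer policy, C:B = inner policy.
 * Domain 0, executable, non-shareable, global (all of those bits are zero).
 */
static uint32_t section_desc(uint32_t phys_addr,
                             enum cache_policy inner,
                             enum cache_policy outer)
{
    uint32_t d = (phys_addr & 0xFFF00000u) | SECT_TYPE | SECT_AP_RW;

    d |= SECT_TEX(0x4u | (uint32_t)outer);
    if ((uint32_t)inner & 2u)
        d |= SECT_C;
    if ((uint32_t)inner & 1u)
        d |= SECT_B;
    return d;
}
```

With these helpers, ordinary RAM as described in the quote would be section_desc(addr, WB_WA, WB_WA) and the framebuffer would be section_desc(fb_addr, WT_NO_WA, WB_WA), where fb_addr stands for whatever address the firmware reports for the framebuffer. On the target, the SCTLR value would be read and written with mrc/mcr p15, 0, <Rt>, c1, c0, 0 after the translation table base has been set up.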

The person who posted the above comment (mrvn) posted the code here:
https://github.com/mrvn/test/blob/master/mmu.cc


Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop that constantly accesses the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 0.
 arm_control=0x1000
It would be worth trying that option to see if the benchmark speeds up.


Alan

> On Jun 2, 2015, at 8:05 AM, Hesham ALMatary <heshamelmatary at gmail.com> wrote:
> 
> On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni <krohini1593 at gmail.com> wrote:
>> From what I saw, they have to be enabled separately. Cache/mmu are disabled
>> upon reset.
>> 
> For the existing Raspberry Pi BSP [1] there is code for MMU/cache init;
> however, I don't know about the Pi2 or where its code is.
> 
> [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi
>> On 2 Jun 2015 16:59, "Hesham ALMatary" <heshamelmatary at gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Aren't the MMU/Caches enabled by default for RPi [1]?
>>> 
>>> [1]
>>> https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c
>>> 
>>> On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill
>>> <joel.sherrill at oarcorp.com> wrote:
>>>> 
>>>> 
>>>> On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni <krohini1593 at gmail.com>
>>>> wrote:
>>>>> Dr. Joel,
>>>>> 
>>>>> So we can't say something solely on the basis of this result?
>>>> 
>>>> I don't think so. If Linux performs the same, then what you did is as
>>>> good as it gets.
>>>> 
>>>> However, if Linux is faster then some setting still isn't right.
>>>> 
>>>> You need a reference measurement to have any confidence. It is possible
>>>> you did something but didn't actually turn the cache (or all the cache) on.
>>>> 
>>>>> On 2 Jun 2015 16:28, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>>>> 
>>>>> I have not run it under linux on pi2 yet. Will have to run and check
>>>>> the result.
>>>>> 
>>>>> On 2 Jun 2015 16:16, "Joel Sherrill" <joel.sherrill at oarcorp.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni <krohini1593 at gmail.com>
>>>>> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I tried running the Dhrystone benchmark with some changes for
>>>>>> cache/MMU setup.
>>>>>> 
>>>>>> However, the output shows a reduction in performance.
>>>>>> The time to run through the dhrystone has increased from 12 to 13 and
>>>>>> dhrystones run per second decreased.
>>>>>> 
>>>>>> According to this result, things were better with caches disabled.
>>>>>> 
>>>>>> 
>>>>>> I have been working on this for two days and could not figure out an
>>>>>> improvement. Any pointers?
>>>>> 
>>>>> How did it do under Linux on the Pi2?
>>>>> 
>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni
>>>>>> <krohini1593 at gmail.com> wrote:
>>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> I have to implement the cache coherency support for Cortex A7. But for
>>>>>> A7 MPCore, unlike for A9, I am not able to find any register
>>>>>> description for the Snoop Control Unit from the TRM.
>>>>>> 
>>>>>> I need help here on how to proceed.
>>>>>> 
>>>>>> Additionally, for A9 there is a single bit in the Auxiliary Control
>>>>>> Register which enables cache broadcast operations. The register
>>>>>> format is different for A7, and again I am unable to find how to
>>>>>> achieve the same for A7.
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill
>>>>>> <joel.sherrill at oarcorp.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am working with the code for BSP hooks. I am referring to existing
>>>>>> ARM multicore BSP code, mainly Zynq.
>>>>>> 
>>>>>> 1. There are existing hooks for the Raspberry Pi. Where should the
>>>>>> code for the Pi2 hooks be added?
>>>>>> 
>>>>>> The Pi and Pi2 are remarkably similar so Pi2 should be placed inside
>>>>>> the Pi BSP directory.
>>>>>> There is already a Pi2 variant of that code built. But we know
>>>>>> specific places where there are variances. Depending on the scope of
>>>>>> what is different, it can be as simple as a cpp conditional in a .h
>>>>>> to select a value or two implementations of a single method and the
>>>>>> Makefile.am picking the right file to build based on the board
>>>>>> variant.
>>>>>> 
>>>>>> The big question to always ask is: Is this specific to the Pi2 and
>>>>>> incompatible with the Pi?
>>>>>> 
>>>>>> Since the Pi BSP is still missing capabilities, it is likely code
>>>>>> common to both will be added this summer. For example, did the
>>>>>> mailbox interface change? I don't know but would guess that it
>>>>>> didn't. Each new capability added needs that question asked.
>>>>>> 
>>>>>> And any differences need to be analyzed to pick the least intrusive
>>>>>> way to provide alternate implementations. Or enable special code like
>>>>>> the Pi2 SMP support which is dependent on --enable-smp and being a
>>>>>> Pi2.
>>>>>> 
>>>>>> 2. Am I right in understanding that I will have to implement A7
>>>>>> specific functions as have been for A9? I am referring specifically to
>>>>>> the arm-a9mpcore-start.h
>>>>>> 
>>>>>> Yes.
>>>>>> 
>>>>>> If the code is very similar between the a7 and a9, then a discussion
>>>>>> on devel@ should occur to decide the best way to minimize duplication.
>>>>>> 
>>>>>> If you end up with a7 specific code, you should follow the location
>>>>>> and naming patterns already established. That places it in
>>>>>> libbsp/arm/shared/...
>>>>>> so it can be used by any BSP with the right SMP core.
>>>>>> 
>>>>>> 
>>>>>> I am referring to existing code to locate and understand what needs
>>>>>> to be done in the hooks. However, being new to such implementations, I
>>>>>> am taking longer to understand the details. Any suggestions that might
>>>>>> help here are welcome.
>>>>>> 
>>>>>> The answer will depend on the factors listed above. When code can
>>>>>> be shared, we want to share it across as many BSPs as makes sense.
>>>>>> When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then
>>>>>> you want to find the way to account for the variation in the least
>>>>>> intrusive code way possible.
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> On 1 May 2015 12:45, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Excited to be a part of  this edition of GSoC! Thanks to everybody for
>>>>>> helping me get here and congratulations to all the participating
>>>>>> students!
>>>>>> 
>>>>>> So, now getting to work, firstly I wish to know, specifically from my
>>>>>> mentors, any changes that must be made to my proposed project or
>>>>>> schedule.
>>>>>> 
>>>>>> Secondly, are there any specifics for the development blog that we
>>>>>> need to create for the project? Over time, what is the blog expected
>>>>>> to convey?
>>>>>> 
>>>>>> Also, I have to create a new wiki page for my project as none exists.
>>>>>> I want to know how to add one.
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Rohini Kulkarni
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Joel Sherrill, Ph.D.             Director of Research & Development
>>>>>> joel.sherrill at OARcorp.com     On-Line Applications Research
>>>>>> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>>>>> Support Available                (256) 722-9985
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Rohini Kulkarni
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Rohini Kulkarni
>>>>> 
>>>>> --joel
>>>> 
>>>> --joel
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel at rtems.org
>>>> http://lists.rtems.org/mailman/listinfo/devel
>>> 
>>> 
>>> 
>>> --
>>> Hesham
> 
> 
> 
> -- 
> Hesham