GSoC 2015: Raspberry Pi 2 Support

Rohini Kulkarni krohini1593 at gmail.com
Sat Jun 20 20:02:17 UTC 2015


Hi,

I have added an SMP-related post to my blog that defines where exactly in the
code I need to work. Some feedback on whether I am identifying the work
area correctly would be very helpful!

Thanks!
 On 18 Jun 2015 03:37, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:

> Hi all,
>
> I have updated my blog to reflect my understanding of the cache
> performance issue and my attempts to fix it.
>
> Lately I have been experimenting with the memory attributes in the
> mm_config_table. One set of configurations for cacheable memory (inner and
> outer levels) ended up reducing performance further (I really thought it
> would improve it). So this table setup certainly affects performance.
>
> The results are not improving after turning on the cache, so memory sections
> are perhaps not even getting cached.
> I have a feeling this has to do with the mm_config_table.
>
> Updates from the github code and blog might help in further discussion.
>
> Link to GitHub code: https://github.com/krohini1593/rtems/tree/rohini
>
> Link to Blog <http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html>
>
> Thanks!
>
> On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore <alan.cudmore at gmail.com>
> wrote:
>
>> Hi,
>> Some of the code examples may give you some clues. Like this one:
>> https://github.com/mrvn/test/blob/master/smp.cc
>>
>> Or this:
>> https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT
>>
>> If you still can't figure it out, you can always join the raspberrypi.org
>> forums and ask on this thread:
>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>
>> When it comes to the Pi 2 and SMP, you are our RTEMS expert :)
>>
>> Thanks,
>> Alan
>>
>>
>> On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni <krohini1593 at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> This is regarding Pi 2 SMP support. After powering on, the secondary
>>> cores read one of their four mailbox registers and wait for a non-zero
>>> value to be written. This value is the physical address of the
>>> location from which the cores are expected to start execution.
>>>
>>> I am stuck at figuring out this address. How should I go about
>>> determining it?
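
[Editor's sketch] The wake-up step described above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the BSP's actual code: it assumes the BCM2836 ARM-local peripheral block sits at 0x40000000 and that the secondary cores poll mailbox 3, whose write-set register for core N is at 0x4000008C + 0x10 * N. The function names are hypothetical. The value written is simply the physical address of whatever routine the secondary core should begin executing, for example an SMP start-up entry point in the BSP.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Pi 2 (BCM2836) ARM-local mailbox 3 "write-set" register for
   core 0; each further core adds a 0x10 stride. */
#define LOCAL_MAILBOX3_SET_BASE   0x4000008Cu
#define LOCAL_MAILBOX_CORE_STRIDE 0x10u

/* Hypothetical helper: address of the mailbox 3 write-set register of a core. */
static uint32_t core_mailbox3_set_addr(unsigned core)
{
    assert(core <= 3u);
    return LOCAL_MAILBOX3_SET_BASE + LOCAL_MAILBOX_CORE_STRIDE * core;
}

/* Hypothetical helper: publish the entry point a parked core should jump to.
   The register is passed as a pointer so the logic can be exercised on a
   host machine with an ordinary variable standing in for the hardware. */
static void release_secondary_core(volatile uint32_t *mailbox_set_reg,
                                   uint32_t entry_phys_addr)
{
    /* The parked core waits for a non-zero value, so zero is not usable. */
    assert(entry_phys_addr != 0u);
    *mailbox_set_reg = entry_phys_addr;
}
```

On real hardware the pointer would be formed from core_mailbox3_set_addr(core), and the write would likely need a data barrier (and an SEV if the cores wait in WFE) before the secondary core observes it.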
>>>
>>> Thanks!
>>> On 3 Jun 2015 19:44, "Gedare Bloom" <gedare at gwu.edu> wrote:
>>>
>>>> On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni <krohini1593 at gmail.com>
>>>> wrote:
>>>> > But, I can't say cache configurations have a role here.
>>>> >
>>>> > I'll push my code to my github project soon.
>>>> >
>>>> > P.S. The Pi2 board I possess seems to have broken down. It just isn't
>>>> > turning on. Unable to test further. Will order one immediately.
>>>> >
>>>> Ouch. Make sure you put it in a safe space for development, clear of
>>>> threats like moisture, static shock, and cats.
>>>>
>>>> > On 3 Jun 2015 09:03, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> Alan, your suggestion has resulted in much improvement.
>>>> >>
>>>> >> arm_control=0x1000
>>>> >>
>>>> >> This has simply worked! Looks like the other cores were taking up
>>>> >> plenty of time.
>>>> >> I was aware from references that the other cores run a WFI, but
>>>> >> did not appreciate its impact.
>>>> >> Time for each dhrystone has reduced to 7 from 13 and the number of
>>>> >> dhrystones per second has also increased.
>>>> >>
>>>> >> But this is a change only in config.txt, not actually in the boot
>>>> >> code.
>>>> >>
>>>> >> Thanks
>>>> >>
>>>> >> Rohini
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore <alan.cudmore at gmail.com>
>>>> >> wrote:
>>>> >>>
>>>> >>> The caches are being enabled on the RPI 1 BSP. The same code is being
>>>> >>> executed by the RPI 2 BSP, but obviously it’s not sufficient for the
>>>> >>> cache setup.
>>>> >>> I have been reading through this long thread, and it is very
>>>> >>> informative:
>>>> >>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>>> >>>
>>>> >>> I am starting to understand the setup that is required to enable caches
>>>> >>> on the RPI 2. For example this message near the bottom of page 3 gives a
>>>> >>> good indication of the speedup available by configuring the MMU and
>>>> >>> caches correctly:
>>>> >>> Quote from above thread
>>>> >>> ------------------------------
>>>> >>> Enabling I/D caches and branch prediction, just like the julia demo
>>>> >>> uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a
>>>> >>> much smaller loop than the julia demo has.
>>>> >>>
>>>> >>> Enabling the MMU and mapping memory inner/outer write-back, write
>>>> >>> allocate and the framebuffer inner write-through, no write allocate
>>>> >>> + outer write-back, write-allocate it takes ~8 seconds, or 32 fps.
>>>> >>>
>>>> >>> PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2
>>>> >>> cache effect.
>>>> >>> -------------------------
>>>> >>> End of quote
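
[Editor's sketch] The memory attributes named in the quote (inner/outer write-back write-allocate for normal memory, inner write-through plus outer write-back for the framebuffer) map onto the ARMv7-A short-descriptor section entry roughly as below. This is a sketch under assumptions, not the code from the forum post or from the BSP: it assumes 1 MiB sections, TEX remap disabled (SCTLR.TRE = 0), domain 0, full read/write access and executable memory. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cache policy encodings used by the TEX = 0b1BB, C:B = AA scheme
   (BB = outer policy, AA = inner policy) when TEX remap is disabled. */
enum cache_policy {
    CACHE_NONE     = 0, /* non-cacheable */
    CACHE_WB_WA    = 1, /* write-back, write-allocate */
    CACHE_WT_NO_WA = 2, /* write-through, no write-allocate */
    CACHE_WB_NO_WA = 3  /* write-back, no write-allocate */
};

/* Hypothetical helper: build a first-level 1 MiB section descriptor. */
static uint32_t section_entry(uint32_t phys_base,
                              enum cache_policy inner,
                              enum cache_policy outer,
                              int shareable)
{
    uint32_t entry;

    /* A section maps a 1 MiB aligned physical region. */
    assert((phys_base & 0x000FFFFFu) == 0u);

    entry  = phys_base;                              /* bits [31:20]: base */
    entry |= 0x2u;                                   /* bits [1:0]: section */
    entry |= ((uint32_t)inner & 0x1u) << 2;          /* B: inner policy low bit */
    entry |= (((uint32_t)inner >> 1) & 0x1u) << 3;   /* C: inner policy high bit */
    entry |= 0x3u << 10;                             /* AP[1:0] = 0b11, full access */
    entry |= (0x4u | ((uint32_t)outer & 0x3u)) << 12;/* TEX = 0b1BB, outer policy */
    if (shareable) {
        entry |= 1u << 16;                           /* S bit */
    }
    return entry;
}
```

With these assumptions, a shareable inner/outer write-back write-allocate section at physical address 0 encodes as 0x00015C06, and a non-shareable framebuffer-style section (inner write-through, outer write-back write-allocate) at 0x3D000000 encodes as 0x3D005C0A. Comparing such values against the entries that mm_config_table actually produces is one way to check whether memory is really being marked cacheable.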
>>>> >>>
>>>> >>> The person who posted the above comment (mrvn) posted the code here:
>>>> >>> https://github.com/mrvn/test/blob/master/mmu.cc
>>>> >>>
>>>> >>>
>>>> >>> Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a
>>>> >>> wait loop always accessing the bus. By putting this option in the
>>>> >>> config.txt file you can put the other cores to sleep, speeding up the
>>>> >>> code on core 0.
>>>> >>>  arm_control=0x1000
>>>> >>> It would be worth trying that option to see if the benchmark speeds
>>>> >>> up.
>>>> >>>
>>>> >>>
>>>> >>> Alan
>>>> >>>
>>>> >>> On Jun 2, 2015, at 8:05 AM, Hesham ALMatary
>>>> >>> <heshamelmatary at gmail.com> wrote:
>>>> >>>
>>>> >>> On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni
>>>> >>> <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>> From what I saw, they have to be enabled separately. Cache/MMU are
>>>> >>> disabled upon reset.
>>>> >>>
>>>> >>> For the existing Raspberry Pi BSP [1] there is code for MMU/cache
>>>> >>> init, however I don't know about the Pi 2 and where its code is.
>>>> >>>
>>>> >>> [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi
>>>> >>>
>>>> >>> On 2 Jun 2015 16:59, "Hesham ALMatary" <heshamelmatary at gmail.com> wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> Aren't the MMU/Caches enabled by default for RPi [1]?
>>>> >>>
>>>> >>> [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c
>>>> >>>
>>>> >>> On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill
>>>> >>> <joel.sherrill at oarcorp.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni
>>>> >>> <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>> Dr. Joel,
>>>> >>>
>>>> >>> So we can't say something solely on the basis of this result?
>>>> >>>
>>>> >>>
>>>> >>> I don't think so. If Linux performs the same, then what you did is as
>>>> >>> good as it gets.
>>>> >>>
>>>> >>> However, if Linux is faster then some setting still isn't right.
>>>> >>>
>>>> >>> You need a reference measurement to have any confidence. It is possible
>>>> >>> you did something but didn't actually turn the cache (or all the cache)
>>>> >>> on.
>>>> >>>
>>>> >>> On 2 Jun 2015 16:28, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>> I have not run it under Linux on the Pi 2 yet. I will have to run it and
>>>> >>> check the result.
>>>> >>>
>>>> >>> On 2 Jun 2015 16:16, "Joel Sherrill" <joel.sherrill at oarcorp.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni
>>>> >>> <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> I tried running the dhrystone benchmark with some changes for
>>>> >>> cache/mmu set up.
>>>> >>>
>>>> >>> However, the output shows a reduction in performance.
>>>> >>> The time to run through the dhrystone has increased from 12 to 13 and
>>>> >>> dhrystones run per second decreased.
>>>> >>>
>>>> >>> According to this result, things were better with caches disabled.
>>>> >>>
>>>> >>>
>>>> >>> I have been working on this for two days and could not figure out an
>>>> >>> improvement. Any pointers?
>>>> >>>
>>>> >>>
>>>> >>> How did it do under Linux on the Pi2?
>>>> >>>
>>>> >>>
>>>> >>> Thanks.
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni
>>>> >>> <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>> Hi All,
>>>> >>>
>>>> >>> I have to implement the cache coherency support for Cortex A7. But for
>>>> >>> A7 MPCore, unlike for A9, I am not able to find any register
>>>> >>> description for the Snoop Control Unit in the TRM.
>>>> >>>
>>>> >>> I need help here on how to proceed.
>>>> >>>
>>>> >>> Additionally, for A9 there is a single bit in the Auxiliary
>>>> >>> Control Register which enables cache broadcast operations. The
>>>> >>> register format is different for A7 and again I am unable to find how
>>>> >>> to achieve the same for A7.
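
[Editor's sketch] One possible direction for the question above, offered as an assumption rather than a confirmed answer from this thread: the Cortex-A7 MPCore does not expose a memory-mapped SCU register block the way the A9 does, and a core's participation in hardware coherency is, as far as the Cortex-A7 TRM describes it, controlled by the SMP bit (bit 6) of the Auxiliary Control Register (ACTLR). The helper name below is hypothetical, and the coprocessor access is shown only as a comment so the bit manipulation can be checked on a host machine.

```c
#include <stdint.h>

/* Assumption: ACTLR.SMP is bit 6 on the Cortex-A7. */
#define CORTEX_A7_ACTLR_SMP (1u << 6)

/* Hypothetical helper: return the ACTLR value with coherency enabled.
   On the target, the value would be read and written with:
     mrc p15, 0, <Rt>, c1, c0, 1   (read ACTLR)
     mcr p15, 0, <Rt>, c1, c0, 1   (write ACTLR)
   and the bit should be set before the caches and MMU are enabled. */
static uint32_t actlr_with_smp(uint32_t actlr)
{
    return actlr | CORTEX_A7_ACTLR_SMP;
}
```

Whether any further broadcast-enable bit is needed on the A7 should be confirmed against the TRM; this sketch covers only the SMP bit.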
>>>> >>>
>>>> >>> Thanks!
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill
>>>> >>> <joel.sherrill at oarcorp.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> I am working with the code for bsp hooks. I am referring to existing
>>>> >>> ARM multicore BSP code, mainly Zynq.
>>>> >>>
>>>> >>> 1. There are existing hooks for the Raspberry Pi. Where should the
>>>> >>> code for the Pi2 hooks be added?
>>>> >>>
>>>> >>> The Pi and Pi2 are remarkably similar so Pi2 should be placed inside
>>>> >>> the Pi BSP directory.
>>>> >>> There is already a Pi2 variant of that code built. But we know
>>>> >>> specific places where there are variances. Depending on the scope of
>>>> >>> what is different, it can be as simple as a cpp conditional in a .h to
>>>> >>> select a value or two implementations of a single method and the
>>>> >>> Makefile.am picking the right file to build based on the board
>>>> >>> variant.
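
[Editor's sketch] The "cpp conditional in a .h to select a value" approach could look roughly like this. It is a minimal sketch, not the BSP's actual header: the macro and function names are hypothetical, and the variant flag is defaulted to 1 here only so the fragment compiles on its own (in a real BSP the build system would define it). The two base addresses are the peripheral bases of the original Pi (0x20000000) and the Pi 2 (0x3F000000).

```c
#include <stdint.h>

/* Hypothetical variant flag; normally supplied by the build configuration. */
#ifndef BSP_IS_RPI2
#define BSP_IS_RPI2 1
#endif

/* One value differs between the variants, so a conditional is enough. */
#if BSP_IS_RPI2
#define RPI_PERIPHERAL_BASE 0x3F000000u
#else
#define RPI_PERIPHERAL_BASE 0x20000000u
#endif

/* Shared code computes register addresses from the selected base. */
static uint32_t rpi_peripheral_addr(uint32_t offset)
{
    return RPI_PERIPHERAL_BASE + offset;
}
```

Code that differs in behaviour rather than in a single constant is better served by the second option mentioned above: separate source files selected by Makefile.am.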
>>>> >>>
>>>> >>> The big question to always ask is: Is this specific to the Pi2 and
>>>> >>> incompatible with the Pi?
>>>> >>>
>>>> >>> Since the Pi BSP is still missing capabilities, it is likely code
>>>> >>> common to both will be added this summer. For example, did the mailbox
>>>> >>> interface change? I don't know but would guess that it didn't. Each new
>>>> >>> capability added needs that question asked.
>>>> >>>
>>>> >>> And any differences need to be analyzed to pick the least intrusive
>>>> >>> way to provide alternate implementations. Or enable special code like
>>>> >>> the Pi2 SMP support which is dependent on --enable-smp and being a Pi2.
>>>> >>>
>>>> >>> 2. Am I right in understanding that I will have to implement A7
>>>> >>> specific functions as has been done for A9? I am referring
>>>> >>> specifically to arm-a9mpcore-start.h
>>>> >>>
>>>> >>> Yes.
>>>> >>>
>>>> >>> If the code is very similar between the a7 and a9, then a discussion
>>>> >>> on devel@ should occur to decide the best way to minimize duplication.
>>>> >>>
>>>> >>> If you end up with a7 specific code, you should follow the location
>>>> >>> and naming patterns already established. That places it in
>>>> >>> libbsp/arm/shared/...
>>>> >>> so it can be used by any BSP with the right SMP core.
>>>> >>>
>>>> >>>
>>>> >>> I am referring to existing code to locate and get hold of what needs
>>>> >>> to be done in the hooks. However, being new to such implementations, I
>>>> >>> am taking longer to understand the details. Any suggestions that might
>>>> >>> help here are welcome.
>>>> >>>
>>>> >>> The answer will depend on the factors listed above. When code can
>>>> >>> be shared, we want to share it across as many BSPs as makes sense.
>>>> >>> When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then
>>>> >>> you want to find the way to account for the variation that is least
>>>> >>> intrusive to the code.
>>>> >>>
>>>> >>> Thanks!
>>>> >>>
>>>> >>> On 1 May 2015 12:45, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> Excited to be a part of this edition of GSoC! Thanks to everybody for
>>>> >>> helping me get here and congratulations to all the participating
>>>> >>> students!
>>>> >>>
>>>> >>> So, now getting to work, firstly I wish to know, specifically from my
>>>> >>> mentors, any changes that must be made to my proposed project or
>>>> >>> schedule.
>>>> >>>
>>>> >>> Secondly, are there any specifics for the development blog that we
>>>> >>> need to create for the project? Over time, what is the blog expected to
>>>> >>> convey?
>>>> >>>
>>>> >>> Also, I have to create a new wiki page for my project as none exists.
>>>> >>> I want to know how to add one.
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> Rohini Kulkarni
>>>> >>>
>>>> >>>
>>>> >>> -- Joel Sherrill, Ph.D. Director of Research & Development
>>>> >>> joel.sherrill at OARcorp.com On-Line Applications Research Ask me about
>>>> >>> RTEMS: a free RTOS Huntsville AL 35805 Support Available (256) 722-9985
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> Rohini Kulkarni
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> Rohini Kulkarni
>>>> >>>
>>>> >>>
>>>> >>> --joel
>>>> >>>
>>>> >>>
>>>> >>> --joel
>>>> >>> _______________________________________________
>>>> >>> devel mailing list
>>>> >>> devel at rtems.org
>>>> >>> http://lists.rtems.org/mailman/listinfo/devel
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Hesham
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Hesham
>>>> >>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Rohini Kulkarni
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>>
>
>
> --
> Rohini Kulkarni
>