GSoC 2015: Raspberry Pi 2 Support

Rohini Kulkarni krohini1593 at gmail.com
Sun Jun 21 19:04:28 UTC 2015


I missed mentioning the number of dhrystones in the previous mail.

Originally it was 1 million.
The new number of dhrystones I executed is 100 million.
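
The figures in the quoted mail below can be cross-checked with a little arithmetic. This is only a sketch. It assumes the times 12 and 0.4 are the benchmark's "microseconds for one run through Dhrystone" value (the mail does not state units), and that the 0.4 figure comes from the 100 million run. The function names are made up for illustration.

```python
# Assumption: the quoted times (12 and 0.4) are microseconds per Dhrystone run.
MICROSECONDS_PER_SECOND = 1_000_000


def dhrystones_per_second(us_per_run):
    """Convert a per-run time in microseconds into runs per second."""
    return MICROSECONDS_PER_SECOND / us_per_run


def total_seconds(runs, us_per_run):
    """Wall-clock time spent in the whole benchmark loop, in seconds."""
    return runs * us_per_run / MICROSECONDS_PER_SECOND


if __name__ == "__main__":
    # (label, number of runs, microseconds per run)
    for label, runs, us in (("before", 1_000_000, 12.0),
                            ("after", 100_000_000, 0.4)):
        print(label,
              int(dhrystones_per_second(us)), "dhrystones/s,",
              round(total_seconds(runs, us), 1), "s total")
```

Under these assumptions, 1 million runs at 0.4 microseconds each take well under a second in total, which would explain the "too small to be measured" message and the need for a larger run count.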

On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni <krohini1593 at gmail.com>
wrote:

> Hi all,
>
> I have managed to get a significant performance improvement with some
> changes in configurations.
>
> The measured time for dhrystones was reduced from 12 to "too small to be
> measured".
>
> For dhrystones the time was 0.4.
>
> The number of dhrystones per second increased from approximately 83333 to
> 2500000 :)
>
> Thanks!
>
> On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni <krohini1593 at gmail.com>
> wrote:
>
>> Hi,
>>
>> I have added an SMP related post to my blog to define where exactly in
>> the code I need to work. Some feedback to indicate if I am identifying the
>> work area correctly would be very helpful!
>>
>> Thanks!
>>  On 18 Jun 2015 03:37, "Rohini Kulkarni" <krohini1593 at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have updated my blog to reflect my understanding and attempts for
>>> cache performance issue.
>>>
>>> Lately I have been experimenting with memory attributes for the
>>> mm_config_table. One set of configurations for cacheable memory (inner and
>>> outer levels) ended up reducing performance further (which I really thought
>>> would improve). So this table setup certainly controls performance.
>>>
>>> The results are not improving after turning on cache. So memory sections
>>> are perhaps not even getting cached.
>>> I get a feeling it has got to do with this mm_config_table.
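
To make the "memory attributes" discussion concrete, here is a rough sketch of how cacheability bits are packed into an ARMv7-A short-descriptor first-level section entry. It is not the RTEMS mm_config_table format and does not use the RTEMS flag names. The function and constant names are hypothetical, and it assumes TEX remap is disabled, so TEX/C/B are interpreted directly.

```python
SECTION_SIZE = 1 << 20  # each first-level section entry maps 1 MiB

# Attribute presets (TEX, C, B) with TEX remap disabled (assumption).
STRONGLY_ORDERED = dict(tex=0b000, c=0, b=0)
NORMAL_WB_WA = dict(tex=0b001, c=1, b=1)          # inner/outer write-back, write-allocate
INNER_WT_OUTER_WB_WA = dict(tex=0b101, c=1, b=0)  # inner write-through, outer WB/WA


def section_descriptor(base, tex, c, b, shareable=False,
                       ap=0b011, domain=0, xn=False):
    """Build a 32-bit section descriptor (hypothetical helper)."""
    if base % SECTION_SIZE or not 0 <= base < (1 << 32):
        raise ValueError("section base must be a 1 MiB aligned 32-bit address")
    return (
        base                         # bits [31:20]: section base address
        | 0b10                       # bits [1:0]: entry type = section
        | ((b & 1) << 2)             # B bit
        | ((c & 1) << 3)             # C bit
        | (int(xn) << 4)             # execute-never
        | ((domain & 0xF) << 5)      # domain number
        | ((ap & 0b11) << 10)        # AP[1:0]; 0b011 overall = full access
        | ((tex & 0b111) << 12)      # TEX[2:0]
        | (((ap >> 2) & 1) << 15)    # AP[2]
        | (int(shareable) << 16)     # S bit, relevant for SMP coherency
    )


if __name__ == "__main__":
    print(hex(section_descriptor(0x00100000, shareable=True, **NORMAL_WB_WA)))
```

If a region that should be normal cacheable memory ends up with C and B both clear, enabling the caches in the control register would not be expected to change the benchmark result, which may be worth checking against the table entries.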
>>>
>>> Updates from the github code and blog might help in further discussion.
>>>
>>> Link to github code: https://github.com/krohini1593/rtems/tree/rohini
>>>
>>> Link to Blog <http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html>
>>>
>>> Thanks!
>>>
>>> On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore <alan.cudmore at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> Some of the code examples may give you some clues. Like this one:
>>>> https://github.com/mrvn/test/blob/master/smp.cc
>>>>
>>>> Or this:
>>>> https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT
>>>>
>>>> If you still can't figure it out, you can always join the
>>>> raspberrypi.org forums and ask on this thread:
>>>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>>>
>>>> When it comes to the Pi 2 and SMP, you are our RTEMS expert :)
>>>>
>>>> Thanks,
>>>> Alan
>>>>
>>>>
>>>> On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni <krohini1593 at gmail.com
>>>> > wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is regarding Pi 2 SMP support. After powering on, the secondary
>>>>> cores read one of their four mailbox registers and wait for non-zero
>>>>> content to be written. This content is to be the physical address of the
>>>>> location from which the cores are expected to start execution.
>>>>>
>>>>> I am stuck at figuring out this address. How should I go about
>>>>> understanding this?
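
A sketch of where those mailbox registers live may help here. It assumes the BCM2836 ARM-local peripheral layout as I understand it: base 0x40000000, per-core mailbox write-set registers starting at offset 0x80, and read/clear registers starting at offset 0xC0. It also assumes the firmware's wait loop polls mailbox 3 of each secondary core. Both assumptions should be checked against the Broadcom documentation and the linked examples. The helper names are hypothetical. The value written is not a fixed hardware address: it would simply be the physical address of whatever entry routine the secondary core should run.

```python
LOCAL_PERIPHERALS_BASE = 0x4000_0000   # assumed BCM2836 ARM-local base
MAILBOX_WRITE_SET_OFFSET = 0x80        # assumed offset of core 0, mailbox 0 (write-set)
MAILBOX_READ_CLEAR_OFFSET = 0xC0       # assumed offset of core 0, mailbox 0 (read/clear)
CORE_COUNT = 4
MAILBOXES_PER_CORE = 4
START_MAILBOX = 3                      # assumption: the mailbox the boot stub polls


def _mailbox_offset(core, mailbox):
    """Byte offset of a mailbox within its register bank."""
    if not 0 <= core < CORE_COUNT:
        raise ValueError("core must be 0..3")
    if not 0 <= mailbox < MAILBOXES_PER_CORE:
        raise ValueError("mailbox must be 0..3")
    # Each core owns four consecutive 32-bit registers (0x10 bytes).
    return 0x10 * core + 4 * mailbox


def mailbox_write_set_address(core, mailbox=START_MAILBOX):
    """Address to write the start address to, in order to release a core."""
    return LOCAL_PERIPHERALS_BASE + MAILBOX_WRITE_SET_OFFSET + _mailbox_offset(core, mailbox)


def mailbox_read_clear_address(core, mailbox=START_MAILBOX):
    """Address the waiting core reads (and clears) its start address from."""
    return LOCAL_PERIPHERALS_BASE + MAILBOX_READ_CLEAR_OFFSET + _mailbox_offset(core, mailbox)


if __name__ == "__main__":
    for core in range(1, CORE_COUNT):
        print("core", core, "write-set at", hex(mailbox_write_set_address(core)))
```

In a BSP this would translate into a 32-bit store of the secondary entry symbol's address to the write-set register of each core, followed by whatever barrier or event instruction the wait loop needs to notice the write.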
>>>>>
>>>>> Thanks!
>>>>> On 3 Jun 2015 19:44, "Gedare Bloom" <gedare at gwu.edu> wrote:
>>>>>
>>>>>> On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni <
>>>>>> krohini1593 at gmail.com> wrote:
>>>>>> > But, I can't say cache configurations have a role here.
>>>>>> >
>>>>>> > I'll push my code to my github project soon.
>>>>>> >
>>>>>> > P.S. The Pi2 board I possess seems to have broken down. It just
>>>>>> isn't
>>>>>> > turning on. Unable to test further. Will order one immediately.
>>>>>> >
>>>>>> Ouch. Make sure you put it in a safe space for development, clear of
>>>>>> threats like moisture, static shock, and cats.
>>>>>>
>>>>>> > On 3 Jun 2015 09:03, "Rohini Kulkarni" <krohini1593 at gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >> Alan, your suggestion has resulted in much improvement
>>>>>> >>
>>>>>> >> arm_control=0x1000
>>>>>> >>
>>>>>> >> This has simply worked! Looks like the other cores were taking up plenty
>>>>>> >> of time.
>>>>>> >> I was aware from references that the other cores run a WFI, but did
>>>>>> >> not understand its impact.
>>>>>> >> Time for each dhrystone has reduced to 7 from 13 and the number of
>>>>>> >> dhrystones per second also increased.
>>>>>> >>
>>>>>> >> But this is a change only in the config.txt, not actually in the boot code.
>>>>>> >> Thanks
>>>>>> >>
>>>>>> >> Rohini
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore <
>>>>>> alan.cudmore at gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> The caches are being enabled on the RPI 1 BSP. The same code is
>>>>>> being
>>>>>> >>> executed by the RPI 2 BSP, but obviously it’s not sufficient for
>>>>>> the cache
>>>>>> >>> setup.
>>>>>> >>> I have been reading through this long thread, and it is very
>>>>>> informative:
>>>>>> >>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>>>>> >>>
>>>>>> >>> I am starting to understand the setup that is required to enable
>>>>>> caches
>>>>>> >>> on the RPI 2. For example this message near the bottom of page 3
>>>>>> gives a
>>>>>> >>> good indication of the speedup available by configuring the MMU
>>>>>> and caches
>>>>>> >>> correctly:
>>>>>> >>> Quote from above thread
>>>>>> >>> ------------------------------
>>>>>> >>> Enabling I/D caches and branch prediction, just like the julia
>>>>>> demo uses,
>>>>>> >>> it takes ~12 seconds, or ~21 fps. It's just one core but also a
>>>>>> much smaller
>>>>>> >>> loop than the julia demo has.
>>>>>> >>>
>>>>>> >>> Enabling the MMU and mapping memory inner/outer write-back, write
>>>>>> >>> allocate and the framebuffer inner write-through, no write allocate + outer
>>>>>> >>> write-back, write-allocate it takes ~8 seconds, or 32 fps.
>>>>>> >>>
>>>>>> >>> PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2
>>>>>> cache
>>>>>> >>> effect.
>>>>>> >>> -------------------------
>>>>>> >>> End of quote
>>>>>> >>>
>>>>>> >>> The person who posted the above comment (mrvn) posted the code
>>>>>> here:
>>>>>> >>> https://github.com/mrvn/test/blob/master/mmu.cc
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait
>>>>>> >>> loop always accessing the bus. By putting this option in the config.txt file
>>>>>> >>> you can put the other cores to sleep, speeding up the code on core 1.
>>>>>> >>>  arm_control=0x1000
>>>>>> >>> It would be worth trying that option to see if the benchmark speeds up.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Alan
>>>>>> >>>
>>>>>> >>> On Jun 2, 2015, at 8:05 AM, Hesham ALMatary <
>>>>>> heshamelmatary at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni <
>>>>>> krohini1593 at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> From what I saw, they have to be enabled separately. Cache/mmu are
>>>>>> >>> disabled
>>>>>> >>> upon reset.
>>>>>> >>>
>>>>>> >>> For the existing Raspberry BSP [1] there's a code for MMU/Cache
>>>>>> init,
>>>>>> >>> however I don't know about Pi2 and where its code is.
>>>>>> >>>
>>>>>> >>> [1]
>>>>>> >>>
>>>>>> https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:59, "Hesham ALMatary" <heshamelmatary at gmail.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> Aren't the MMU/Caches enabled by default for RPi [1]?
>>>>>> >>>
>>>>>> >>> [1]
>>>>>> >>>
>>>>>> >>>
>>>>>> https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c
>>>>>> >>>
>>>>>> >>> On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill
>>>>>> >>> <joel.sherrill at oarcorp.com> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni <
>>>>>> krohini1593 at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> Dr. Joel,
>>>>>> >>>
>>>>>> >>> So we can't say something solely on the basis of this result?
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I don't think so. If Linux performs the same, then what you did
>>>>>> is as
>>>>>> >>> good as it gets.
>>>>>> >>>
>>>>>> >>> However, if Linux is faster then some setting still isn't right.
>>>>>> >>>
>>>>>> >>> You need a reference measurement to have any confidence. It is
>>>>>> possible
>>>>>> >>> you did something but didn't actually turn the cache (or all the
>>>>>> cache)
>>>>>> >>> on.
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:28, "Rohini Kulkarni" <krohini1593 at gmail.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> I have not run it under linux on pi2 yet. Will have to run and
>>>>>> check
>>>>>> >>> the result.
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:16, "Joel Sherrill" <joel.sherrill at oarcorp.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni <
>>>>>> krohini1593 at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> HI,
>>>>>> >>>
>>>>>> >>> I tried running the dhrystone benchmark with some changes for
>>>>>> >>>
>>>>>> >>> cache/mmu
>>>>>> >>>
>>>>>> >>> set up.
>>>>>> >>>
>>>>>> >>> However, the output shows a reduction in performance.
>>>>>> >>> The time to run through the dhrystone has increased from 12 to 13
>>>>>> and
>>>>>> >>> dhrystones run per second decreased.
>>>>>> >>>
>>>>>> >>> According to this result, things were better with caches disabled.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I have been working on this for two days and could not figure out an
>>>>>> >>> improvement. Any pointers?
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> How did it do under Linux on the Pi2?
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Thanks.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni
>>>>>> >>> <krohini1593 at gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Hi All,
>>>>>> >>>
>>>>>> >>> I have to implement the cache coherency support for Cortex A7.
>>>>>> But for
>>>>>> >>> A7 MPCore, unlike for A9, I am not able to find any register
>>>>>> >>> description for the Snoop Control Unit from the TRM.
>>>>>> >>>
>>>>>> >>> I need help here on how to proceed.
>>>>>> >>>
>>>>>> >>> Additionally for A9 there is a single bit for A9 in the Auxiliary
>>>>>> >>> Control Register which enables cache broadcast operations. The
>>>>>> >>>
>>>>>> >>> register
>>>>>> >>>
>>>>>> >>> format is different for A7 and again I am unable to find how to
>>>>>> >>>
>>>>>> >>> achieve
>>>>>> >>>
>>>>>> >>> the same for A7.
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill
>>>>>> >>> <joel.sherrill at oarcorp.com> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> I am working with the code for bsp hooks. I am referring to existing
>>>>>> >>> ARM multicore BSP code, mainly Zynq.
>>>>>> >>>
>>>>>> >>> 1. There are existing hooks for the raspberry pi. Where should the
>>>>>> >>>
>>>>>> >>> code
>>>>>> >>>
>>>>>> >>> for the  Pi2 hooks be added?
>>>>>> >>>
>>>>>> >>> The Pi and Pi2 are remarkably similar so Pi2 should be placed
>>>>>> inside
>>>>>> >>> the Pi BSP directory.
>>>>>> >>> There is already a Pi2 variant of that code built. But we know
>>>>>> >>>
>>>>>> >>> specific
>>>>>> >>>
>>>>>> >>> places where there
>>>>>> >>> are variances. Depending on the scope of what is different, it
>>>>>> can be
>>>>>> >>> as simple as
>>>>>> >>> a cpp conditional in a .h to select a value or two
>>>>>> implementations of
>>>>>> >>>
>>>>>> >>> a
>>>>>> >>>
>>>>>> >>> single method
>>>>>> >>> and the Makefile.am picking the right file to build based on the
>>>>>> board
>>>>>> >>> variant.
>>>>>> >>>
>>>>>> >>> The big question to always ask is: Is this specific to the Pi2 and
>>>>>> >>> incompatible with the Pi?
>>>>>> >>>
>>>>>> >>> Since the Pi BSP is still missing capabilities, it is likely code
>>>>>> >>> common to both will
>>>>>> >>> be added this summer. For example, did the mailbox interface
>>>>>> change? I
>>>>>> >>> don't know
>>>>>> >>> but would guess that it didn't.  Each new capability added needs
>>>>>> that
>>>>>> >>> added.
>>>>>> >>>
>>>>>> >>> And any differences need to be analyzed to pick the least
>>>>>> intrusive
>>>>>> >>>
>>>>>> >>> way
>>>>>> >>>
>>>>>> >>> to provide
>>>>>> >>> alternate implementations. Or enable special code like the Pi2 SMP
>>>>>> >>> support which
>>>>>> >>> is dependent on --enable-smp and being a Pi2.
>>>>>> >>>
>>>>>> >>> 2. Am I right in understanding that I will have to implement A7
>>>>>> >>> specific functions as have been for A9? I am referring
>>>>>> specifically to
>>>>>> >>> the arm-a9mpcore-start.h
>>>>>> >>>
>>>>>> >>> Yes.
>>>>>> >>>
>>>>>> >>> If the code is very similar between the a7 and a9, then a
>>>>>> discussion
>>>>>> >>> on devel@ should occur to decide the best way to minimize
>>>>>> duplication.
>>>>>> >>>
>>>>>> >>> If you end up with a7 specific code, you should follow the
>>>>>> location
>>>>>> >>>
>>>>>> >>> and
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> naming patterns already established. That places it in
>>>>>> >>> libbsp/arm/shared/...
>>>>>> >>> so it can be used by any BSP with the right SMP core.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I am referring to existing codes to locate and get hold of what
>>>>>> needs
>>>>>> >>> to be done in the hooks. However, being new to such
>>>>>> implementations, I
>>>>>> >>> am taking longer to understand the details. Any suggestions that
>>>>>> might
>>>>>> >>> help here are welcome
>>>>>> >>>
>>>>>> >>> The answer will depend on the factors listed above. When code can
>>>>>> >>> be shared, we want to share it across as many BSPs as makes sense.
>>>>>> >>> When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2),
>>>>>> then
>>>>>> >>> you want to find the way to account for the variation in the least
>>>>>> >>> intrusive code way possible.
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>>
>>>>>> >>> On 1 May 2015 12:45, "Rohini Kulkarni" <krohini1593 at gmail.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> Excited to be a part of  this edition of GSoC! Thanks to
>>>>>> everybody for
>>>>>> >>> helping me get here and congratulations to all the participating
>>>>>> >>> students!
>>>>>> >>>
>>>>>> >>> So, now getting to work, firstly I wish to know, specifically
>>>>>> from my
>>>>>> >>> mentors, any changes that must be made to my proposed project or
>>>>>> >>> schedule.
>>>>>> >>>
>>>>>> >>> Secondly, are there any specifics for the development blog that we
>>>>>> >>>
>>>>>> >>> need
>>>>>> >>>
>>>>>> >>> to create for the project? Over time what is the blog expected to
>>>>>> >>> convey.
>>>>>> >>>
>>>>>> >>> Also, I have to create a new wiki page for my project as none
>>>>>> exists.
>>>>>> >>>
>>>>>> >>> I
>>>>>> >>>
>>>>>> >>> want to know how to add one.
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> -- Joel Sherrill, Ph.D. Director of Research & Development
>>>>>> >>> joel.sherrill at OARcorp.com On-Line Applications Research Ask me
>>>>>> about
>>>>>> >>> RTEMS: a free RTOS Huntsville AL 35805 Support Available (256)
>>>>>> >>>
>>>>>> >>> 722-9985
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --joel
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --joel
>>>>>> >>> _______________________________________________
>>>>>> >>> devel mailing list
>>>>>> >>> devel at rtems.org
>>>>>> >>> http://lists.rtems.org/mailman/listinfo/devel
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Hesham
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Hesham
>>>>>> >>>
>>>>>> >>>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Rohini Kulkarni
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Rohini Kulkarni
>>>
>>
>
>
> --
> Rohini Kulkarni
>



-- 
Rohini Kulkarni