RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS

Christian Mauderer christian.mauderer at embedded-brains.de
Fri May 3 09:04:22 UTC 2019


----- Original Message -----
> From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
> To: "Christian Mauderer" <christian.mauderer at embedded-brains.de>
> CC: "RTEMS Users" <users at rtems.org>, "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
> <shekhar.s at mistralsolutions.com>
> Sent: Friday, 3 May 2019 10:37:20
> Subject: Re: RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS

> Hi Christian,
> 
> Sorry for the formatting issue. From now on I will follow the format you
> suggested.
> 
>>Hello Amarnath,
>>
>>just a note up front: Could you try to keep the '>' signs (some mail
>>programs show them as a solid line) at the front of the lines intact and
>>not add any to your newly written lines? Otherwise it's a little hard to
>>see the answers. There are normally no strict formatting requirements on
>>this list (although we theoretically have a policy:
>>https://devel.rtems.org/wiki/RTEMSMailingLists#Policies), but please try to
>>keep the mails readable.
>>
>>I tried to fix the quote markers in my answer below. I hope that I
>>caught all your remarks.
> 
> Thanks.
> 
>>----- Original Message -----
>>> From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
>>> To: "Christian Mauderer" <christian.mauderer at embedded-brains.de>
>>> CC: "RTEMS Users" <users at rtems.org>, "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
>>> <shekhar.s at mistralsolutions.com>
>>> Sent: Friday, 3 May 2019 09:35:27
>>Then it would be best to find the reason for the exception and avoid the
>>interrupt lock if possible.
> 
> We are discussing this issue with our FPGA team and hope to resolve it soon.
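
For reference, the interrupt lock discussed here could look roughly like the following. This is only a minimal sketch, assuming a uniprocessor configuration. rtems_interrupt_disable() and rtems_interrupt_enable() are the RTEMS directives and take a level variable; nor_erase_fn, nor_erase_locked(), demo_erase() and the host stand-ins used when not building with an RTEMS toolchain are hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __rtems__
#include <rtems.h>
#else
/* Host stand-ins (assumption) so the sketch also builds off-target.
 * On RTEMS the real macros from <rtems.h> are used instead. */
typedef uint32_t rtems_interrupt_level;
static int host_irq_lock_depth;
#define rtems_interrupt_disable(level) \
  do { (level) = 0; ++host_irq_lock_depth; } while (0)
#define rtems_interrupt_enable(level) \
  do { (void) (level); --host_irq_lock_depth; } while (0)
#endif

/* Hypothetical driver hook: erases one sector, returns true on success. */
typedef bool (*nor_erase_fn)(uint32_t sector_addr);

/* Run the erase with interrupts disabled on the current processor. */
static bool nor_erase_locked(nor_erase_fn erase, uint32_t sector_addr)
{
  rtems_interrupt_level level;
  bool ok;

  rtems_interrupt_disable(level);
  ok = erase(sector_addr);       /* no tick or device ISR runs in here */
  rtems_interrupt_enable(level);

  return ok;
}

#ifndef __rtems__
/* Host-only demo hook: records the lock depth seen during the "erase". */
static int demo_depth_seen = -1;

static bool demo_erase(uint32_t sector_addr)
{
  (void) sector_addr;
  demo_depth_seen = host_irq_lock_depth;
  return true;
}
#endif
```

Because interrupts stay disabled for the whole erase, this is only reasonable for rare operations such as firmware updates; for frequent writes (log files) the root cause still has to be found.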
> 
>>>
>>>> > The flash controller is implemented using the AHB to external SRAM
>>>> > interface in the Cortex-M System Design Kit, and there are no pins
>>>> > shared between RAM and NOR flash. We don't think the flash is blocking
>>>> > bus accesses for the RAM, because the test executes fine in U-Boot
>>>> > running from RAM.
>>>>
>>>> So it's a custom FPGA-based design? Most likely your RAM and flash
>>>> controller are both connected to the same system bus (AHB in that case).
>>>> I don't know the modules you mentioned, but you might want to check in
>>>> the documentation whether they lock the AHB or not.
>>>>
>>>  Yes, it's a custom FPGA-based design. Sure, I will go through the
>>> documentation.
>>
>>If the flash controller really blocks all other bus accesses (which would
>>be an odd design), you might be able to move the RAM and flash onto
>>independent buses or maybe add some kind of bus bridge to separate the
>>flash. But I'm not that deep into the ARM blocks.
> 
> Our FPGA team has to see whether it is feasible.
> 
>>>
>>>> U-Boot most likely doesn't use interrupts. I assume you are waiting for
>>>> a flash operation to finish in some busy-wait loop there (polling a flag
>>>> or similar). That loop can fit entirely into the processor cache (if you
>>>> have one, which is quite likely). So in your U-Boot test case, the
>>>> external RAM most likely isn't accessed during the flash operations. If
>>>> you have a problem with conflicting bus accesses, you may only see it
>>>> there if you disable all caches.
>>>
>>> You are correct, we are waiting in a loop polling the busy status of the
>>> NOR flash. Currently, the caches are enabled in both U-Boot and RTEMS.
>>>
>>
>>Maybe you can try to disable them in U-Boot and see whether the behavior
>>changes? It could be a bug in disguise if the U-Boot code only works due to
>>caches. Such bugs are quite nasty because they sometimes depend on the
>>exact position of the code. In most cases it works because the loop is in
>>the cache, but if for some reason the processor has to load a new cache
>>line for the loop, it doesn't work. That is really hard to find because you
>>can change something somewhere completely different, which moves your code
>>and triggers the bug.
> 
> Yes, sure. I will run the same test code on U-Boot with caches disabled.
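
To illustrate the kind of loop in question, here is a minimal sketch of a busy-wait that polls a flash status word. The register layout (NOR_STATUS_BUSY), the function name nor_wait_ready() and the bounded poll count are assumptions for illustration and not taken from the actual driver. The loop body is only a few instructions, so with caches enabled it can run without any instruction fetch from external RAM; with caches disabled every iteration fetches from RAM while the flash is busy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit that reads as 1 while the NOR flash is busy. */
#define NOR_STATUS_BUSY 0x0001u

/*
 * Poll the status word until the busy bit clears or max_polls reads
 * have been made. Returns true if the flash became ready in time.
 */
static bool nor_wait_ready(const volatile uint16_t *status_reg,
                           uint32_t max_polls)
{
  while (max_polls-- > 0u) {
    /* volatile forces a real bus read on every iteration */
    if ((*status_reg & NOR_STATUS_BUSY) == 0u) {
      return true;
    }
  }

  return false; /* still busy: report a timeout instead of hanging */
}
```

The poll limit is there so that a stuck controller shows up as an error return rather than as a hang that looks like a different bug.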
> 
>>>> In RTEMS, on the other hand, you are running with interrupts and task
>>>> switches enabled. So RAM accesses are more or less guaranteed during a
>>>> longer flash operation if you don't lock the interrupts.
>>>
>>> Okay, got it.
>>>
>>>> >
>>>> > You had mentioned there are a lot of possible reasons for a
>>>> > RTEMS_FATAL_SOURCE_EXCEPTION. Is it documented somewhere which
>>>> > reasons can cause this exception? Any references would be very helpful.
>>>>
>>>> Basically for an ARM system, you have that source:
>>>>
>>>> cpukit/score/cpu/arm/arm-exception-default.c:24:  rtems_fatal(
>>>> RTEMS_FATAL_SOURCE_EXCEPTION, (rtems_fatal_code) frame );
>>>>
>>>> That is the default handler for all exceptions where no extra handler is
>>>> installed. So basically every exception that is listed in the ARM manual
>>>> can be the source.
>>>>
>>> Thank you that was helpful.
>>>
>>
>>If you have a debugger connected, you can set a breakpoint on that handler
>>and maybe on _Terminate. Sometimes you can work out the source with the
>>help of your processor's reference manual.
> 
> Sure, we will look into it.
> 
>>Maybe take a look at your linker command file or use objdump to analyze
>>your ELF file. Something seems odd there.
> Below is the content of linkcmd:
> MEMORY {
>    RAM      : ORIGIN = 0x40008000,  LENGTH = 64M - 512k - 32k
>    RAM_MMU  : ORIGIN = 0x40008000 + 64M - 512k - 32k, LENGTH = 32k
> }
> REGION_ALIAS ("REGION_START", RAM);
> REGION_ALIAS ("REGION_VECTOR", RAM);
> REGION_ALIAS ("REGION_TEXT", RAM);
> REGION_ALIAS ("REGION_TEXT_LOAD", RAM);
> REGION_ALIAS ("REGION_RODATA", RAM);
> REGION_ALIAS ("REGION_RODATA_LOAD", RAM);
> REGION_ALIAS ("REGION_DATA", RAM);
> REGION_ALIAS ("REGION_DATA_LOAD", RAM);
> REGION_ALIAS ("REGION_FAST_TEXT", RAM);
> REGION_ALIAS ("REGION_FAST_TEXT_LOAD", RAM);
> REGION_ALIAS ("REGION_FAST_DATA", RAM);
> REGION_ALIAS ("REGION_FAST_DATA_LOAD", RAM);
> REGION_ALIAS ("REGION_BSS", RAM);
> REGION_ALIAS ("REGION_WORK", RAM);
> REGION_ALIAS ("REGION_STACK", RAM);
> REGION_ALIAS ("REGION_NOCACHE", RAM);
> REGION_ALIAS ("REGION_NOCACHE_LOAD", RAM);
> 
> bsp_stack_irq_size = DEFINED (bsp_stack_irq_size) ? bsp_stack_irq_size : 4096;
> bsp_stack_abt_size = DEFINED (bsp_stack_abt_size) ? bsp_stack_abt_size : 1024;
> 
> bsp_section_rwbarrier_align = DEFINED (bsp_section_rwbarrier_align) ? bsp_section_rwbarrier_align : 1M;
> 

Looks OK to me. It's still odd that the PC was at a flash address. I don't think I can help you much more at this point, so I'll let you investigate that topic further.
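
To make the oddity concrete, here is a small host-side sketch that classifies the values from the exception frame quoted further down. The memory map (flash 0x0 - 0x007FFFFF, RAM 0x40000000 - 0x7FFFFFFF) and the RAM region of the linkcmd come from this thread. The helper names are made up, and the sketch assumes that VEC follows the architectural ARM vector order and that the CPSR bits are laid out as on classic ARM cores such as the ARM926EJ-S.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory map as stated in this thread (flash starts at address 0). */
#define FLASH_END 0x007FFFFFu /* inclusive */
#define RAM_BEGIN 0x40000000u
#define RAM_END   0x7FFFFFFFu /* inclusive */

/* RAM region from linkcmd: ORIGIN = 0x40008000, LENGTH = 64M - 512k - 32k */
#define LINK_RAM_ORIGIN 0x40008000u
#define LINK_RAM_LENGTH ((64u << 20) - (512u << 10) - (32u << 10))

static bool in_flash(uint32_t addr) { return addr <= FLASH_END; }

static bool in_ram(uint32_t addr)
{
  return addr >= RAM_BEGIN && addr <= RAM_END;
}

static bool in_linked_ram(uint32_t addr)
{
  return addr >= LINK_RAM_ORIGIN
    && (addr - LINK_RAM_ORIGIN) < LINK_RAM_LENGTH;
}

/* CPSR[4:0] holds the processor mode. */
static const char *cpsr_mode_name(uint32_t cpsr)
{
  switch (cpsr & 0x1Fu) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "unknown";
  }
}

static bool cpsr_irq_masked(uint32_t cpsr) { return (cpsr & 0x80u) != 0u; }
static bool cpsr_fiq_masked(uint32_t cpsr) { return (cpsr & 0x40u) != 0u; }
static bool cpsr_thumb(uint32_t cpsr)      { return (cpsr & 0x20u) != 0u; }

/* Assumption: VEC is the index into the ARM exception vector table. */
static const char *arm_vector_name(uint32_t vec)
{
  static const char *const names[] = {
    "Reset", "Undefined instruction", "Software interrupt",
    "Prefetch abort", "Data abort", "Reserved", "IRQ", "FIQ"
  };

  return vec < 8u ? names[vec] : "unknown";
}
```

With the values from the frame, PC = 0x003fde80 falls into the flash range while LR = 0x4000a36c lies inside the linked RAM region. CPSR = 0x200000d2 decodes to IRQ mode with IRQ and FIQ masked in ARM state, and VEC = 1 would be an undefined instruction under the assumption above.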

> *Thank you & Regards,*
> *Amarnath MB*
> 
> On Fri, May 3, 2019 at 1:37 PM Christian Mauderer <
> christian.mauderer at embedded-brains.de> wrote:
> 
>> Hello Amarnath,
>>
>> just a note up front: Could you try to keep the '>' signs (some mail
>> programs show them as a solid line) at the front of the lines intact and
>> not add any to your newly written lines? Otherwise it's a little hard to
>> see the answers. There are normally no strict formatting requirements on
>> this list (although we theoretically have a policy:
>> https://devel.rtems.org/wiki/RTEMSMailingLists#Policies), but please try
>> to keep the mails readable.
>>
>> I tried to fix the quote markers in my answer below. I hope that I
>> caught all your remarks.
>>
>> ----- Original Message -----
>> > From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
>> > To: "Christian Mauderer" <christian.mauderer at embedded-brains.de>
>> > CC: "RTEMS Users" <users at rtems.org>, "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
>> > <shekhar.s at mistralsolutions.com>
>> > Sent: Friday, 3 May 2019 09:35:27
>> > Subject: Re: RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS
>>
>> > On Thu, May 2, 2019 at 3:35 PM Christian Mauderer <
>> > christian.mauderer at embedded-brains.de> wrote:
>> >
>> >> ----- Original Message -----
>> >> > From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
>> >> > To: "Christian Mauderer" <christian.mauderer at embedded-brains.de>
>> >> > CC: "RTEMS Users" <users at rtems.org>, "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
>> >> > <shekhar.s at mistralsolutions.com>
>> >> > Sent: Thursday, 2 May 2019 11:18:48
>> >> > Subject: Re: RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS
>> >>
>> >> > Hi Christian,
>> >> >
>> >> > As per your suggestion, I tried disabling interrupts globally before
>> >> > my erase call and enabling them afterwards. With this setup the test
>> >> > routine passes without raising an exception. So I think the interrupt
>> >> > was the issue here.
>> >> >
>> >>
>> >> Hello Amarnath,
>> >>
>> >> good. Then you have at least a hint toward the problem. I wouldn't see
>> >> it as the solution. As already discussed, depending on your application,
>> >> this can lead to missed interrupts. If you have a very specific case for
>> >> writing the flash that isn't part of your normal operation (for example
>> >> firmware updates), you are most likely fine with that solution. But if
>> >> you write log files, for example, you might get quite unpredictable
>> >> behavior later.
>> >>
>> > Hi Christian,
>> >
>> > Thank you for the detailed explanation. Even though we are using NOR
>> > flash for firmware storage, there is a chance that it may be used for log
>> > files later.
>> >
>>
>> Then it would be best to find the reason for the exception and avoid the
>> interrupt lock if possible.
>>
>> >
>> >> > The flash controller is implemented using the AHB to external SRAM
>> >> > interface in the Cortex-M System Design Kit, and there are no pins
>> >> > shared between RAM and NOR flash. We don't think the flash is blocking
>> >> > bus accesses for the RAM, because the test executes fine in U-Boot
>> >> > running from RAM.
>> >>
>> >> So it's a custom FPGA-based design? Most likely your RAM and flash
>> >> controller are both connected to the same system bus (AHB in that case).
>> >> I don't know the modules you mentioned, but you might want to check in
>> >> the documentation whether they lock the AHB or not.
>> >>
>> >  Yes, it's a custom FPGA-based design. Sure, I will go through the
>> > documentation.
>>
>> If the flash controller really blocks all other bus accesses (which would
>> be an odd design), you might be able to move the RAM and flash onto
>> independent buses or maybe add some kind of bus bridge to separate the
>> flash. But I'm not that deep into the ARM blocks.
>>
>> >
>> >> U-Boot most likely doesn't use interrupts. I assume you are waiting for
>> >> a flash operation to finish in some busy-wait loop there (polling a flag
>> >> or similar). That loop can fit entirely into the processor cache (if you
>> >> have one, which is quite likely). So in your U-Boot test case, the
>> >> external RAM most likely isn't accessed during the flash operations. If
>> >> you have a problem with conflicting bus accesses, you may only see it
>> >> there if you disable all caches.
>> >
>> > You are correct, we are waiting in a loop polling the busy status of the
>> > NOR flash. Currently, the caches are enabled in both U-Boot and RTEMS.
>> >
>>
>> Maybe you can try to disable them in U-Boot and see whether the behavior
>> changes? It could be a bug in disguise if the U-Boot code only works due to
>> caches. Such bugs are quite nasty because they sometimes depend on the
>> exact position of the code. In most cases it works because the loop is in
>> the cache, but if for some reason the processor has to load a new cache
>> line for the loop, it doesn't work. That is really hard to find because you
>> can change something somewhere completely different, which moves your code
>> and triggers the bug.
>>
>> >> In RTEMS, on the other hand, you are running with interrupts and task
>> >> switches enabled. So RAM accesses are more or less guaranteed during a
>> >> longer flash operation if you don't lock the interrupts.
>> >
>> > Okay, got it.
>> >
>> >> >
>> >> > You had mentioned there are a lot of possible reasons for a
>> >> > RTEMS_FATAL_SOURCE_EXCEPTION. Is it documented somewhere which
>> >> > reasons can cause this exception? Any references would be very helpful.
>> >>
>> >> Basically for an ARM system, you have that source:
>> >>
>> >> cpukit/score/cpu/arm/arm-exception-default.c:24:  rtems_fatal(
>> >> RTEMS_FATAL_SOURCE_EXCEPTION, (rtems_fatal_code) frame );
>> >>
>> >> That is the default handler for all exceptions where no extra handler is
>> >> installed. So basically every exception that is listed in the ARM manual
>> >> can be the source.
>> >>
>> > Thank you, that was helpful.
>> >
>>
>> If you have a debugger connected, you can set a breakpoint on that handler
>> and maybe on _Terminate. Sometimes you can work out the source with the
>> help of your processor's reference manual.
>>
>> >
>> >> >
>> >> > Thank you once again for your prompt response, it really saved a lot
>> >> > of our time.
>> >> >
>> >> > Below is the exception frame we got.
>> >> > FYI, 0x0 - 0x007FFFFF is NOR flash and 0x40000000 - 0x7FFFFFFF is the
>> >> > RAM.
>> >>
>> >> Just a note: Putting memory at 0 might hide some NULL-pointer accesses.
>> >> If you can do that in your design, you might want to think about putting
>> >> some boot ROM there that isn't accessible by the application (for
>> >> example locked via MPU).
>> >>
>> > I got it. We have kept U-Boot at 0x0 and RTEMS at offset 0x40000 of the
>> > flash, and we have restricted access from the application to these
>> > U-Boot and RTEMS sectors.
>> >
>> >> Of course there can be worse types of memory than flash (which should be
>> >> mostly read-only memory). I once had a controller with some RAM there.
>> >> Finding NULL-pointer accesses on such a system is really nasty.
>> >>
>> >> >
>> >> > *** FATAL ***
>> >> > fatal source: 9 (RTEMS_FATAL_SOURCE_EXCEPTION)
>> >> >
>> >> > R0   = 0x00000040 R8  = 0x00100000
>> >> > R1   = 0x00080000 R9  = 0x00000001
>> >> > R2   = 0x00000048 R10 = 0x4010a3e8
>> >> > R3   = 0x00000048 R11 = 0x4011737c
>> >> > R4   = 0x00000a00 R12 = 0x00000000
>> >> > R5   = 0x00000500 SP  = 0x40100c40
>> >> > R6   = 0x00000055 LR  = 0x4000a36c
>> >> > R7   = 0x000000aa PC  = 0x003fde80
>> >> > CPSR = 0x200000d2 VEC = 0x00000001
>> >> > RTEMS version: 5.0.0.
>> >> > RTEMS tools: 7.4.0 20181206 (RTEMS 5, RSB
>> >> > 40ae056f12e1cbe530f76a3ebd1e2ac745a888ef, Newlib
>> >> > dc6e94551f09d3a983afd571478d63a09d6f66fa)
>> >> > executing thread ID: 0x08a010002
>> >> > executing thread name: Alpi
>> >>
>> >> Your program counter (PC) points to a flash address here. Are you sure
>> >> that your application runs entirely from RAM?
>> >>
>> > We are sure that RTEMS is executing entirely from RAM; we are debugging
>> > this issue.
>>
>> Maybe take a look at your linker command file or use objdump to analyze
>> your ELF file. Something seems odd there.
>>
>> Best regards
>>
>> Christian
>>
>> >>
>> >> Best regards
>> >>
>> >> Christian
>> >>
>> >> >
>> >> > *Thank you & Regards,*
>> >> > *Amarnath MB*
>> >> >
>> >> >
>> >> >
>> >> > On Thu, May 2, 2019 at 12:49 AM Christian Mauderer <
>> >> > christian.mauderer at embedded-brains.de> wrote:
>> >> >
>> >> >>
>> >> >> ----- Original Message -----
>> >> >> > From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
>> >> >> > To: "Christian Mauderer" <christian.mauderer at embedded-brains.de>
>> >> >> > CC: "RTEMS Users" <users at rtems.org>, "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
>> >> >> > <shekhar.s at mistralsolutions.com>
>> >> >> > Sent: Wednesday, 1 May 2019 17:07:13
>> >> >> > Subject: Re: RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS
>> >> >>
>> >> >> > Hi Christian,
>> >> >> >
>> >> >> > Thanks for your quick response.
>> >> >> > Our boot process is as follows: first U-Boot is loaded from the NOR
>> >> >> > flash, and then U-Boot copies the RTEMS application from NOR flash
>> >> >> > to the RAM. After that RTEMS executes entirely from RAM.
>> >> >> >
>> >> >> > How can I issue a global interrupt disable? Is it using
>> >> >> > *rtems_interrupt_disable(0)*?
>> >> >> > One more question: if I issue a global interrupt disable, is there
>> >> >> > a chance that I miss a few clock ticks?
>> >> >> >
>> >> >> > *Thank you & Regards,*
>> >> >> > *Amarnath MB*
>> >> >> >
>> >> >>
>> >> >> Hello Amarnath,
>> >> >>
>> >> >> with a global interrupt lock, all kinds of events can be missed,
>> >> >> including clock ticks. Most should be processed just a little late,
>> >> >> but some interfaces might lose packets or data (depending on the data
>> >> >> rate / flash access times). So if it isn't necessary, it's not a good
>> >> >> solution.
>> >> >>
>> >> >> The rtems_interrupt_disable() function should be called with a level
>> >> >> argument. See the example at:
>> >> >> https://docs.rtems.org/branches/master/c-user/interrupt_manager.html#rtems-interrupt-disable
>> >> >> Note that for SMP configurations, you might have to use other
>> >> >> functions.
>> >> >>
>> >> >> But if your application runs entirely from RAM, then the bus access
>> >> >> shouldn't be the problem. So the interrupt might not be the right
>> >> >> guess.
>> >> >>
>> >> >> Do you have any more information on the exception and where it
>> >> >> happens? Some output or a stack trace from a debugger?
>> >> >>
>> >> >> What kind of flash controller is used? Can it block bus accesses for
>> >> >> the RAM? Does your RAM share pins with the flash?
>> >> >>
>> >> >> With kind regards
>> >> >>
>> >> >> Christian
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > On Wed, May 1, 2019 at 8:16 PM Christian Mauderer <
>> >> >> > christian.mauderer at embedded-brains.de> wrote:
>> >> >> >
>> >> >> >> ----- Original Message -----
>> >> >> >> > From: "Amarnath MB" <amarnath.mb at mistralsolutions.com>
>> >> >> >> > To: "RTEMS Users" <users at rtems.org>
>> >> >> >> > CC: "Ravi G Patil" <ravigp at mistralsolutions.com>, "Shekhar Suman Singh"
>> >> >> >> > <shekhar.s at mistralsolutions.com>
>> >> >> >> > Sent: Wednesday, 1 May 2019 16:28:23
>> >> >> >> > Subject: RTEMS_FATAL_SOURCE_EXCEPTION in RTEMS
>> >> >> >>
>> >> >> >> > Hi All,
>> >> >> >> >
>> >> >> >> > I'm developing an RTEMS 5.0.0 BSP and device drivers for a custom
>> >> >> >> > ARM926EJ-S core. I was successful in porting and building the BSP
>> >> >> >> > for the ARM core.
>> >> >> >> >
>> >> >> >> > We are facing a strange issue with the generic NOR flash driver we
>> >> >> >> > have developed. Our device drivers are designed so that they can
>> >> >> >> > be used with RTEMS as well as with bare-metal programs.
>> >> >> >> > Our driver works fine in a bare-metal program (erase, read, write,
>> >> >> >> > everything), but when we use the driver with an RTEMS application
>> >> >> >> > and issue an erase call, the application raises
>> >> >> >> > RTEMS_FATAL_SOURCE_EXCEPTION.
>> >> >> >> >
>> >> >> >> > For testing the driver in RTEMS, we have added each test routine as
>> >> >> >> > a custom shell command using rtems_shell_add_cmd().
>> >> >> >> >
>> >> >> >> > For your info, we also tested the same driver with U-Boot and it
>> >> >> >> > works fine.
>> >> >> >> > Can anyone guide me on this issue?
>> >> >> >> >
>> >> >> >> > *Thank you & Regards,*
>> >> >> >> > *Amarnath MB*
>> >> >> >> >
>> >> >> >>
>> >> >> >> Hello Amarnath,
>> >> >> >>
>> >> >> >> it's a little hard to pinpoint a concrete problem with the given
>> >> >> >> information. There are a lot of possible reasons for a
>> >> >> >> RTEMS_FATAL_SOURCE_EXCEPTION.
>> >> >> >>
>> >> >> >> From your description, my guess would be some problem with an
>> >> >> >> interrupt. Most U-Boot code avoids interrupts. I don't know about
>> >> >> >> your bare-metal application, but maybe in a test application you
>> >> >> >> don't have many interrupts either. In RTEMS you most likely have at
>> >> >> >> least a periodic tick interrupt.
>> >> >> >>
>> >> >> >> Do you execute your code from the same flash or do you keep some
>> >> >> >> data in the flash? In that case it could be an access error during a
>> >> >> >> flash erase or write (for example caused by an interrupt). In that
>> >> >> >> case you might have to do a global interrupt disable before you
>> >> >> >> enter certain routines.
>> >> >> >>
>> >> >> >> Best regards
>> >> >> >>
>> >> >> >> Christian
>> >> >>
>> >>
>> --
>> --------------------------------------------
>> embedded brains GmbH
>> Christian Mauderer
>> Dornierstr. 4
>> D-82178 Puchheim
>> Germany
>> email: christian.mauderer at embedded-brains.de
>> Phone: +49-89-18 94 741 - 18
>> Fax:   +49-89-18 94 741 - 08
>> PGP: Public key available on request.
>>
>> This message is not a business communication within the meaning of the EHUG.

-- 
--------------------------------------------
embedded brains GmbH
Christian Mauderer
Dornierstr. 4
D-82178 Puchheim
Germany
email: christian.mauderer at embedded-brains.de
Phone: +49-89-18 94 741 - 18
Fax:   +49-89-18 94 741 - 08
PGP: Public key available on request.

This message is not a business communication within the meaning of the EHUG.


