RTEMS 5.1 pc686 BSP malloc_info problem?

Alan Cudmore alan.cudmore at gmail.com
Mon Oct 12 16:14:43 UTC 2020


Hi Chris,
I'm not sure that I can easily create a test to show that this
condition exists. I think the rtems_rfs_bitmap_create_search function
works as intended, but during the last iteration of the for loop, if
'size' is zero and 'bit' is 31, the 'search_map' pointer is
incremented once more, and the value of RTEMS_RFS_BITMAP_ELEMENT_CLEAR
(0xFFFFFFFF) is written to that location. This location is one element
beyond the memory that was allocated for the search_map in
rtems_rfs_bitmap_open.
I guess that most of the time this is a silent side effect, but my
application just happened to have memory lined up such that the extra
write causes the malloc Allocator mutex to fail, causing the
malloc_info call to block indefinitely. I would consider this a lucky
break!
The exact same example application does not fail on RTEMS 4.11. I
think the problem still exists there, but in that case, the word that
gets overwritten is different.

Here are some debug printfs from rtems_rfs_bitmap_open and
rtems_rfs_bitmap_create_search:

From rtems_rfs_bitmap_open:
RFS - rtems_rfs_bitmap_open - search_bits malloced size = 16 bytes
RFS - rtems_rfs_bitmap_open - addr of search_bits = 0x00C03814
RFS -> size of search_map = 4
RFS -> control->size = 4095

From the subsequent call to rtems_rfs_bitmap_create_search:
These printfs are in the if clause where bit == 31 (line 633)
RFS --> search_map before increment addr 00C03814, size = 3071
RFS --> search_map after increment -> writing RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03818
RFS --> search_map before increment addr 00C03818, size = 2047
RFS --> search_map after increment -> writing RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C0381C
RFS --> search_map before increment addr 00C0381C, size = 1023
RFS --> search_map after increment -> writing RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03820
RFS --> search_map before increment addr 00C03820, size = 0
RFS --> search_map after increment -> writing RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03824

It's this last write to 00C03824 that causes the problem. I think the
fix just involves checking to see if size == 0 before executing the if
clause. I wanted to be sure that this extra write was not needed.
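
To illustrate the overrun, here is a minimal sketch of the pointer
bookkeeping only. It is not the actual RFS code: the names
ELEMENT_BITS, elements_for and highest_search_index are hypothetical,
and it assumes 32-bit elements and a loop that consumes one map element
per iteration, as the debug output above suggests. Instead of writing
through a pointer, it returns the highest search_map index that would
be written, so the out-of-bounds case can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 32 bits per bitmap element, matching bit == 31 above. */
#define ELEMENT_BITS 32u

/* Number of elements needed to hold `bits` bits (hypothetical helper). */
size_t elements_for(size_t bits)
{
  return (bits + ELEMENT_BITS - 1) / ELEMENT_BITS;
}

/*
 * Model of the search-map walk. Returns the highest search_map index
 * that gets written. When skip_when_done is non-zero, the increment
 * and write are skipped once size has reached zero (the proposed fix).
 */
size_t highest_search_index(size_t size, int skip_when_done)
{
  size_t index = 0; /* element 0 is written before the loop starts */
  size_t bit = 0;

  assert(size > 0);

  while (size > 0) {
    /* Each iteration consumes one map element, or the remainder. */
    size_t available = size < ELEMENT_BITS ? size : ELEMENT_BITS;
    size -= available;

    if (bit == ELEMENT_BITS - 1) {
      bit = 0;
      if (!skip_when_done || size > 0)
        index++; /* models search_map++ followed by the CLEAR write */
    } else {
      bit++;
    }
  }
  return index;
}
```

With control->size = 4095 the map needs elements_for(4095) = 128
elements, so the search map needs elements_for(128) = 4 elements
(16 bytes, valid indices 0 to 3). Under these assumptions
highest_search_index(4095, 0) returns 4, which corresponds to the write
at 0x00C03814 + 4 * 4 = 0x00C03824, while highest_search_index(4095, 1)
returns 3 and stays inside the allocation.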

If you have an idea for a test case, I can work on it, but if you
think that this is good enough, I can propose a patch.

Also, thanks for the idea of using RTEMS_DEBUG, Sebastian. I need to
upgrade my RTEMS toolbox with the latest techniques.

Alan


On Sun, Oct 11, 2020 at 6:20 PM Chris Johns <chrisj at rtems.org> wrote:
>
> On 10/10/20 7:35 am, Alan Cudmore wrote:
> > After doing a lot of tracing through my application, it looks like
> > malloc_info works fine before we start our cFS application, but it
> > blocks after the cFS is initialized. This suggests some sort of memory
> > corruption.
> > I started by instrumenting our code to call malloc info during various
> > stages of application initialization, and finally narrowed it down to
> > the code where we create a RAM Disk and format it with RFS.
> > (skipping a bunch of other malloc based troubleshooting.. )
> > After I followed the issue into the RFS init, I was able to narrow
> > down the place where malloc_info stopped working to here:
> > https://git.rtems.org/rtems/tree/cpukit/libfs/src/rfs/rtems-rfs-bitmaps.c?h=5#n637
> > During the RFS format process.
> > In this section of the code, the size variable is 0, meaning it will
> > exit the for loop and then return from the function, but it increments
> > the "search_map" variable and writes to memory through the pointer one
> > more time before exiting the loop and function. It's at this point
> > where malloc_info starts blocking.
> >
> > It seems to me that this if block should be skipped when size == 0. I
> > tried that and the malloc_info issue seems to be fixed.
>
> Would you be able to create a test case for this? The test is ..
>
> https://git.rtems.org/rtems/tree/testsuites/fstests/fsrfsbitmap01/test.c
>
> Or if you could please provide the values in `control` I can add the test.
>
> > Is this an RFS bug writing into other memory, or is this last write
> > needed before the function updates?
>
> It would seem so.
>
> > If this looks like a bug, should I write a ticket and provide a patch?
>
> Yes please. It would be nice to have a test case that fails so we can isolate
> the cause.
>
> Chris

