<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 12, 2020 at 11:15 AM Alan Cudmore <<a href="mailto:alan.cudmore@gmail.com">alan.cudmore@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Chris,<br>
I'm not sure that I can easily create a test to show that this<br>
condition exists. I think the rtems_rfs_bitmap_create_search function<br>
works as it is intended to, but during the last iteration of the for<br>
loop, if 'size' is zero and 'bit' is 31, the 'search_map' variable is<br>
incremented once more, and the value of RTEMS_RFS_BITMAP_ELEMENT_CLEAR<br>
(0xFFFFFFFF) is written to that location. This location is one address<br>
beyond the memory that was allocated for the search_map in<br>
rtems_rfs_bitmap_open.<br>
I guess that most of the time this is a silent side effect, but my<br>
application just happened to have memory lined up such that the extra<br>
write causes the malloc Allocator mutex to fail, causing the<br>
malloc_info call to block indefinitely. I would consider this a lucky<br>
break!<br>
The exact same example application does not fail on RTEMS 4.11. I<br>
think the problem still exists, but in that case, the word that gets<br>
written is different.<br>
<br>
Here are some debug printfs from rtems_rfs_bitmap_open and<br>
rtems_rfs_bitmap_create_search:<br>
<br>
From rtems_rfs_bitmap_open:<br>
RFS - rtems_rfs_bitmap_open - search_bits malloced size = 16 bytes<br>
RFS - rtems_rfs_bitmap_open - addr of search_bits = 0x00C03814<br>
RFS -> size of search_map = 4<br>
RFS -> control->size = 4095<br>
<br>
From the subsequent call to rtems_rfs_bitmap_create_search:<br>
These printfs are in the if clause where bit == 31 (line 633)<br>
RFS --> search_map before increment addr 00C03814, size = 3071<br>
RFS --> search_map after increment -> writing<br>
RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03818<br>
RFS --> search_map before increment addr 00C03818, size = 2047<br>
RFS --> search_map after increment -> writing<br>
RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C0381C<br>
RFS --> search_map before increment addr 00C0381C, size = 1023<br>
RFS --> search_map after increment -> writing<br>
RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03820<br>
RFS --> search_map before increment addr 00C03820, size = 0<br>
RFS --> search_map after increment -> writing<br>
RTEMS_RFS_BITMAP_ELEMENT_CLEAR (-1) to addr 00C03824<br>
<br>
It's this last write to 00C03824 that causes the problem. I think the<br>
fix just involves checking to see if size == 0 before executing the if<br>
clause. I wanted to be sure that this extra write was not needed.<br>
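<br>
As a rough sketch of the arithmetic in the log above, the loop can be modelled<br>
with an index in place of the search_map pointer. This is a simplified model,<br>
not the RTEMS source: ELEMENT_BITS, search_elements, highest_index_written and<br>
the skip_when_done flag are hypothetical names, and only the size/bit bookkeeping<br>
is reproduced.<br>

```c
/* Simplified model of the search-map walk; NOT the RTEMS implementation. */
#define ELEMENT_BITS 32u /* assumed: bits per bitmap element */

/* Number of search-map elements needed for a bitmap of size_bits bits. */
static unsigned search_elements(unsigned size_bits)
{
  unsigned map_elements = (size_bits + ELEMENT_BITS - 1u) / ELEMENT_BITS;
  return (map_elements + ELEMENT_BITS - 1u) / ELEMENT_BITS;
}

/*
 * Returns the highest search-map index that gets written.
 * skip_when_done models the proposed fix: do not advance and
 * write once size has reached zero.
 */
static unsigned highest_index_written(unsigned size_bits, int skip_when_done)
{
  unsigned index = 0u; /* element 0 is cleared before the loop starts */
  unsigned bit = 0u;
  unsigned size = size_bits;

  while (size != 0u) {
    /* Consume one map element, or whatever is left of the last one. */
    unsigned available = size > ELEMENT_BITS ? ELEMENT_BITS : size;
    size -= available;

    if (bit != ELEMENT_BITS - 1u) {
      bit++; /* still inside the current search element */
    } else if (size != 0u || !skip_when_done) {
      bit = 0u;
      index++; /* models search_map++ followed by the CLEAR write */
    }
  }
  return index;
}
```

With control->size = 4095, search_elements gives 4, so the valid indices are<br>
0 to 3. The unguarded walk reaches index 4, which corresponds to the write to<br>
00C03824; with skip_when_done set it stops at index 3.<br>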
<br>
If you have an idea for a test case, I can work on it, but if you<br>
think that this is good enough, I can propose a patch.<br>
<br>
Also, thanks for the idea of using RTEMS_DEBUG Sebastian, I need to<br>
upgrade my RTEMS toolbox with the latest techniques.<br></blockquote><div><br></div><div>If, while analysing this issue, you came up with any consistency checks</div><div>or assertions that can be added to the code when debug is enabled, </div><div>those would be welcomed. It is hard to go back and add them without </div><div>the analysis like you did hunting this bug.</div><div><br></div><div>--joel</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Alan<br>
<br>
<br>
On Sun, Oct 11, 2020 at 6:20 PM Chris Johns <<a href="mailto:chrisj@rtems.org" target="_blank">chrisj@rtems.org</a>> wrote:<br>
><br>
> On 10/10/20 7:35 am, Alan Cudmore wrote:<br>
> > After doing a lot of tracing through my application, it looks like<br>
> > malloc_info works fine before we start our cFS application, but it<br>
> > blocks after the cFS is initialized. This suggests some sort of memory<br>
> > corruption.<br>
> > I started by instrumenting our code to call malloc_info during various<br>
> > stages of application initialization, and finally narrowed it down to<br>
> > the code where we create a RAM Disk and format it with RFS.<br>
> > (skipping a bunch of other malloc-based troubleshooting...)<br>
> > After I followed the issue into the RFS init, I was able to narrow<br>
> > down the place where malloc_info stopped working to here:<br>
> > <a href="https://git.rtems.org/rtems/tree/cpukit/libfs/src/rfs/rtems-rfs-bitmaps.c?h=5#n637" rel="noreferrer" target="_blank">https://git.rtems.org/rtems/tree/cpukit/libfs/src/rfs/rtems-rfs-bitmaps.c?h=5#n637</a><br>
> > During the RFS format process.<br>
> > In this section of the code, the size variable is 0, meaning it will<br>
> > exit the for loop and then return from the function, but it increments<br>
> > the "search_map" variable and writes to memory through the pointer one<br>
> > more time before exiting the loop and function. It's at this point<br>
> > where malloc_info starts blocking.<br>
> ><br>
> > It seems to me that this if block should be skipped when size == 0. I<br>
> > tried that and the malloc_info issue seems to be fixed.<br>
><br>
> Would you be able to create a test case for this? The test is ..<br>
><br>
> <a href="https://git.rtems.org/rtems/tree/testsuites/fstests/fsrfsbitmap01/test.c" rel="noreferrer" target="_blank">https://git.rtems.org/rtems/tree/testsuites/fstests/fsrfsbitmap01/test.c</a><br>
><br>
> Or if you could please provide the values in `control` I can add the test.<br>
><br>
> > Is this an RFS bug writing into other memory, or is this last write<br>
> > needed before the function updates?<br>
><br>
> It would seem so.<br>
><br>
> > If this looks like a bug, should I write a ticket and provide a patch?<br>
><br>
> Yes please. It would be nice to have a test case that fails so we can isolate<br>
> the cause.<br>
><br>
> Chris<br>
</blockquote></div></div>