[PATCH 2/9] cpukit/jffs2: Protect the inode cache

Wed Dec 13 20:29:31 UTC 2023

On Wed, Dec 13, 2023 at 1:32 PM Sebastian Huber <
sebastian.huber at embedded-brains.de> wrote:

> On 13.12.23 19:41, Kinsey Moore wrote:
> > On Wed, Dec 13, 2023 at 8:35 AM Sebastian Huber
> > <sebastian.huber at embedded-brains.de
> > <mailto:sebastian.huber at embedded-brains.de>> wrote:
> >
> >     On 13.12.23 15:27, Kinsey Moore wrote:
> >      > On Wed, Dec 13, 2023 at 12:26 AM Sebastian Huber
> >      > <sebastian.huber at embedded-brains.de
> >     <mailto:sebastian.huber at embedded-brains.de>
> >      > <mailto:sebastian.huber at embedded-brains.de
> >     <mailto:sebastian.huber at embedded-brains.de>>> wrote:
> >      >
> >      >     On 09.12.23 03:31, Kinsey Moore wrote:
> >      >      > The inode cache can be altered and queried by multiple
> >     threads of
> >      >      > execution, even before the introduction of delayed write
> >     support for
> >      >      > NAND. This provides a new lock to prevent simultaneous
> >      >     modification of
> >      >      > the cache.
> >      >
> >      >     Under which condition is the inode cache accessed without the
> >     file
> >      >     system instance lock for normal operations (no delayed works
> >     stuff)?
> >      >
> >      >     Your new code still has no test cases and the configuration
> >     option is
> >      >     not documented (http://devel.rtems.org/ticket/4961
> >     <http://devel.rtems.org/ticket/4961>
> >      >     <http://devel.rtems.org/ticket/4961
> >     <http://devel.rtems.org/ticket/4961>>).
> >      >
> >      >     I am still in favour of an alternative locking approach:
> >      >
> >      >     1. The delayed work support uses a mutex D and a condition
> >     variable C
> >      >     used with D.
> >      >
> >      >     2. Add a queue for the delayed work to the fs information and
> >     a node to
> >      >     register the info in the delayed work support.
> >      >
> >      >     3. The first delayed work request of a JFFS2 instance
> >     registers the fs
> >      >     information in the delayed work support and uses C to signal
> the
> >      >     work to
> >      >     the delayed work task.
> >      >
> >      >     4. Further requests just get enqueued and signaled using D
> and C.
> >      >
> >      >     5. When a instance is unmounted, drain the delayed work queue
> >     using
> >      >     D and C.
> >      >
> >      >     The delayed work uses the fs info mutex to protect the work.
> >     You need
> >      >     also reference count for the fs info to control the work and
> >     the drain
> >      >     during unmount.
> >      >
> >      >
> >      > Using the FS information lock at the level of delayed work
> callback
> >      > isn't workable with the current API exposed/consumed by the JFFS2
> >      > library as this information is not exposed to the thread calling
> the
> >      > delayed work without modification of the JFFS2 library itself or
> >     abusing
> >      > macros to pull in information that isn't actually provided to
> >     them (and
> >      > would require that local variable naming be extremely consistent
> >     across
> >      > usages of this abusive macro).All that is available is the
> callback
> >      > function pointer and an opaque void pointer argument.
> >
> >     I don't understand the problem. If you need JFFS2 specific details,
> why
> >     don't you implement this part to the JFFS2 area?
> >
> >
> > I could, but I prefer to minimize changes to external code where
> > possible. It's not strictly necessary in this case, so I'm avoiding it.
> >
> >
> >      > Other
> >      > implementations that use this library achieve safe locking
> >     without the
> >      > FS information lock.
> >
> >     What is "this library"?
> >
> >
> > The JFFS2 library.
> >
> >
> >     Before you started with adding some locks here and some locks there,
> >     the
> >     complete JFFS2 state was protected by a single instance lock. This is
> >     not great in terms of SMP performance, however, it is very simple
> >     and it
> >     works. I don't know why you can't get the instance lock, do the
> delayed
> >     work, and then release the instance lock.
> >
> >
> > The lock is not available to the delayed work caller without modifying
> > the JFFS2 code and, while I'm sure it would work fine from a data
> > integrity perspective, it was not intended to operate that way. If I
> > were going to go this direction to reduce complexity, it might make more
> > sense to disable delayed write support and force all writes to be
> > immediate such that it behaves like NOR. The downside to reduced locking
> > granularity or delayed write removal would be additional wear on the
> > NAND flash.
>
> In which place in the code do you have problems to get the fs info
> block? I am absolutely not in favour of having the internal locking
> enabled for JFFS2. We use this file system on lower end controllers.
>
> The call to submit delayed work does not provide the FS info as a
parameter:
void jffs2_queue_delayed_work(struct delayed_work *work, int delay_ms);
This is wrapped as:
#define queue_delayed_work(workqueue, delayed_work, delay_ms) ...

The initialization call for delayed work does not have direct access to the
FS info, either:
#define INIT_DELAYED_WORK(delayed_work, delayed_workqueue_callback) ...

> Independent of this, for NAND I would use YAFFS2.
>

Unfortunately, YAFFS2 is licensed GPL2 and not suitable for inclusion in an
RTEMS application without commercial licensing, dynamic loading via libdl,
or writing an implementation of that file system from scratch. I would
agree that JFFS2 isn't ideal for modern NAND given the 4GB file system
limit.

Kinsey
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rtems.org/pipermail/devel/attachments/20231213/fae987e4/attachment.htm>