IMFS and "real storage", etc.

Tue Nov 7 18:30:27 UTC 2000

Since this was going to be buried in the reply to Matthew's message,
I thought I would put it to the list as a new thread.

>>>>>> untar filesystem

I could not resist putting in another plug for a read-only filesystem
that uses a tarball as its "underlying storage".  This would be
exceptionally cool for use as a ROM-filesystem.  
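
As a rough illustration of the tarball idea, the sketch below
looks up a file by name in a tar image that already sits in
memory (e.g. in ROM).  It relies only on the standard tar header
layout: 512-byte blocks, the name in the first 100 bytes of the
header, and the size as octal text at offset 124.  The names
tar_find and struct tar_entry are made up for this example and
are not part of RTEMS; the ustar "prefix" field and checksum
are ignored.

```c
#include <stddef.h>
#include <string.h>

#define TAR_BLOCK 512

/* Hypothetical result type: points straight into the tar image. */
struct tar_entry {
  const unsigned char *data;
  size_t               size;
};

/* Parse an octal text field; assumes the value fits in size_t. */
static size_t tar_octal(const unsigned char *field, size_t len)
{
  size_t value = 0, i = 0;

  while (i < len && field[i] == ' ')   /* some writers pad with spaces */
    i++;
  for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
    value = value * 8 + (size_t)(field[i] - '0');
  return value;
}

/* Walk the headers looking for name.  Returns 0 if found, -1 if not. */
int tar_find(const unsigned char *image, size_t image_len,
             const char *name, struct tar_entry *out)
{
  size_t off = 0;

  while (off + TAR_BLOCK <= image_len) {
    const unsigned char *hdr = image + off;
    size_t size, span;

    if (hdr[0] == '\0')                /* zero block: end of archive */
      break;
    size = tar_octal(hdr + 124, 12);
    span = (size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
    if (span > image_len - off - TAR_BLOCK)
      break;                           /* truncated image */
    if (strncmp((const char *)hdr, name, 100) == 0) {
      out->data = hdr + TAR_BLOCK;     /* contents follow the header */
      out->size = size;
      return 0;
    }
    off += TAR_BLOCK + span;
  }
  return -1;
}
```

Nothing is copied: the entry points into the image itself, which
is what makes this attractive for a read-only ROM filesystem.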

> My ideas are for a pretty full filesystem, it would provide a proper
> abstraction layer in the same way that imfs does (i.e user would just do
> read/write etc). I was not going to do any caching i was just going to lock
> the device and have the user re-try (during the erase),..)
> 
> But i have got a quite well defined allocation methord. If you could send,..
> me the guidelines and and plans you have i will add what i can.

The basic problem with the IMFS is that it assumes that it can
write directly into "storage".  Since it has malloc'ed memory,
this assumption is true for all directory entries and file
contents.  In essence, the IMFS is a data structure that mimics
a filesystem.  But it does do some things just like a real filesystem
would.  File contents are block oriented -- not assumed to be
contiguous blocks of memory.

As discussed earlier on the list, if you could dynamically 
associate malloc/free/memcpy with an "instantiation" (i.e. 
each mount point) of the IMFS, then you could use RAM 
that is not from the program heap.  This would be cool
for using the IMFS with SRAM, battery-backed memory, etc.  But
there is one minor glitch here.  I believe the IMFS assumes
that a mount point starts out empty.  There would have to be
some tinkering with mount() to make the IMFS work with
persistent storage even if the storage were battery-backed RAM.
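
A minimal sketch of what "associating malloc/free/memcpy with
a mount point" might look like is a small table of function
pointers carried by each instantiation.  Everything here
(struct imfs_mem_ops, struct region, imfs_region_ops) is
hypothetical; the IMFS has no such hooks today.  The region
allocator is a deliberately simple bump allocator that assumes
a suitably aligned base address and never reclaims memory, and
it does not solve the "mount point starts out empty" problem.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-mount memory hooks. */
struct imfs_mem_ops {
  void *(*alloc)(void *ctx, size_t size);
  void  (*release)(void *ctx, void *ptr);
  void *(*copy)(void *dst, const void *src, size_t n);
  void  *ctx;                      /* passed back to alloc/release */
};

/* Default behaviour: the program heap. */
static void *heap_alloc(void *ctx, size_t size)
{
  (void)ctx;
  return malloc(size);
}

static void heap_release(void *ctx, void *ptr)
{
  (void)ctx;
  free(ptr);
}

static void *plain_copy(void *dst, const void *src, size_t n)
{
  return memcpy(dst, src, n);
}

const struct imfs_mem_ops imfs_heap_ops = {
  heap_alloc, heap_release, plain_copy, NULL
};

/* A fixed region outside the heap, e.g. SRAM or battery-backed RAM. */
struct region {
  unsigned char *base;             /* assumed pointer-aligned */
  size_t         size;
  size_t         used;
};

static void *region_alloc(void *ctx, size_t size)
{
  struct region *r = ctx;
  size_t aligned = (size + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
  void *p;

  if (aligned < size || aligned > r->size - r->used)
    return NULL;                   /* overflow or region exhausted */
  p = r->base + r->used;
  r->used += aligned;
  return p;
}

static void region_release(void *ctx, void *ptr)
{
  (void)ctx;
  (void)ptr;                       /* bump allocator: nothing to do */
}

/* Build the hook table for one mount point backed by region r. */
struct imfs_mem_ops imfs_region_ops(struct region *r)
{
  struct imfs_mem_ops ops = { region_alloc, region_release, plain_copy, NULL };

  ops.ctx = r;
  return ops;
}
```

For persistent memory, the used offset and the root directory
would also have to live inside the region so that mount() could
find them again instead of starting from an empty filesystem.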

To use the IMFS with "real storage" (e.g. flash, hard disk,
or floppy), you would have to address a bit more than the
previous paragraph talks about.  You have to provide an
interface to a "block manager" that would allocate and
deallocate blocks of memory.  This is logically equivalent to
the malloc/free above EXCEPT that the IMFS does NOT
allocate the same size block everywhere.  Each directory
entry is malloc'ed separately.  This is not like a true
filesystem where multiple directory entries would be in 
a single disk block.  

On top of that, at each place where the IMFS writes into a
file or directory entry, the code may need to be modified
to honor the fact that it cannot write directly into
permanent storage.  This could be a very
simple modification if we assume that we cache disk
blocks being modified and that the IMFS is writing
into the cached blocks.  Then all it has to do is
mark them dirty and needing to be written.  [I thought
fixing this would be awful but Jennifer and I did a
review on this and determined the IMFS was not as
bad as it could be.]
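
The "write into the cached block and mark it dirty" idea can be
sketched roughly as follows.  All names (struct cached_block,
cache_write, cache_flush, block_store_fn) are invented for this
example, the 512-byte block size is an assumption, and
counting_store is only a stand-in for a real driver.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512

/* One cached copy of a storage block. */
struct cached_block {
  unsigned long number;            /* which block on the device */
  int           dirty;             /* modified since last flush? */
  unsigned char data[BLOCK_SIZE];
};

/* The filesystem writes here instead of into permanent storage. */
int cache_write(struct cached_block *b, size_t offset,
                const void *src, size_t n)
{
  if (offset > BLOCK_SIZE || n > BLOCK_SIZE - offset)
    return -1;                     /* would run past the block */
  memcpy(b->data + offset, src, n);
  b->dirty = 1;                    /* needs to be written back */
  return (int)n;
}

/* Driver callback that stores one whole block; 0 means success. */
typedef int (*block_store_fn)(unsigned long number,
                              const unsigned char *data, void *ctx);

/* Write back every dirty block.  Returns how many were written,
   or -1 as soon as the driver reports a failure. */
int cache_flush(struct cached_block *blocks, size_t count,
                block_store_fn store, void *ctx)
{
  int written = 0;
  size_t i;

  for (i = 0; i < count; i++) {
    if (!blocks[i].dirty)
      continue;
    if (store(blocks[i].number, blocks[i].data, ctx) != 0)
      return -1;
    blocks[i].dirty = 0;
    written++;
  }
  return written;
}

/* Stand-in driver for experiments: only counts the stores. */
int counting_store(unsigned long number, const unsigned char *data,
                   void *ctx)
{
  (void)number;
  (void)data;
  ++*(int *)ctx;
  return 0;
}
```

With this split, the IMFS code paths only ever call something
like cache_write; when and how dirty blocks reach the device is
decided below the filesystem.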

[NOTE: A nice implementation/debug technique would
be to provide a "block-oriented IMFS" which interfaced
to a "block layer" that was still malloc/free at the 
core.  This would let one work out the kinks without
having a tough driver to write/debug.  Plus all the
IMFS modifications could be tested on a simulator.]
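
A "block layer" that is still malloc/free at the core could be
as small as the sketch below.  The names (struct
ram_block_layer, ramblk_alloc and friends), the 512-byte block
size, and the limit of 16 blocks are all assumptions made for
illustration.  The point is that the filesystem only sees block
numbers and whole-block transfers, never a pointer into the
storage.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RAMBLK_SIZE  512
#define RAMBLK_COUNT 16

/* Heap-backed stand-in for a real device; NULL slot = free block. */
struct ram_block_layer {
  unsigned char *blocks[RAMBLK_COUNT];
};

/* Allocate a zeroed block.  Returns its number, or -1 if none left. */
long ramblk_alloc(struct ram_block_layer *l)
{
  long i;

  for (i = 0; i < RAMBLK_COUNT; i++) {
    if (l->blocks[i] == NULL) {
      l->blocks[i] = calloc(1, RAMBLK_SIZE);
      return l->blocks[i] != NULL ? i : -1;
    }
  }
  return -1;
}

/* Give a block back; out-of-range numbers are ignored. */
void ramblk_free(struct ram_block_layer *l, long n)
{
  if (n >= 0 && n < RAMBLK_COUNT) {
    free(l->blocks[n]);
    l->blocks[n] = NULL;
  }
}

/* Copy one whole block out to buf.  0 on success, -1 on a bad block. */
int ramblk_read(const struct ram_block_layer *l, long n, void *buf)
{
  if (n < 0 || n >= RAMBLK_COUNT || l->blocks[n] == NULL)
    return -1;
  memcpy(buf, l->blocks[n], RAMBLK_SIZE);
  return 0;
}

/* Copy one whole block in from buf.  0 on success, -1 on a bad block. */
int ramblk_write(struct ram_block_layer *l, long n, const void *buf)
{
  if (n < 0 || n >= RAMBLK_COUNT || l->blocks[n] == NULL)
    return -1;
  memcpy(l->blocks[n], buf, RAMBLK_SIZE);
  return 0;
}
```

Swapping these four functions for ones that talk to flash or a
disk driver would leave the filesystem side untouched, which is
what makes this useful for debugging on a simulator first.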

But notice that we now have added a "block abstraction".
It doesn't matter where those blocks reside if we
manage the interface correctly.  Define the interface
between filesystem implementations and the next layer
such that it will work independent of actual storage 
type (flash or hard disk).  Then the "block interface"
can plug into a cache manager which in turn can 
plug into a scheduler component.  This scheduler
could be a disk scheduler for a hard disk or an access
scheduler for flash.  Flash has timing and rewrite
constraints that could be managed nicely this way.
By placing it logically underneath a scheduler, operations
can be done in priority or FIFO order.
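
To illustrate the scheduler component, here is a small sketch
of a request queue whose service order is selected per device.
The names (struct io_queue, ioq_submit, ioq_next), the fixed
queue depth, and the convention that a lower number means a
more urgent priority are all assumptions for the example.

```c
#include <stddef.h>

#define IOQ_MAX 8

enum ioq_policy { IOQ_FIFO, IOQ_PRIORITY };

struct io_request {
  unsigned long block;
  int           priority;          /* lower value = more urgent */
  unsigned long seq;               /* arrival order */
};

struct io_queue {
  enum ioq_policy   policy;
  struct io_request pending[IOQ_MAX];
  size_t            count;
  unsigned long     next_seq;
};

void ioq_init(struct io_queue *q, enum ioq_policy policy)
{
  q->policy = policy;
  q->count = 0;
  q->next_seq = 0;
}

/* Queue a request.  Returns 0, or -1 if the queue is full. */
int ioq_submit(struct io_queue *q, unsigned long block, int priority)
{
  struct io_request *r;

  if (q->count == IOQ_MAX)
    return -1;
  r = &q->pending[q->count++];
  r->block = block;
  r->priority = priority;
  r->seq = q->next_seq++;
  return 0;
}

/* Remove the next request to service according to the policy.
   Returns 0 and fills *out, or -1 if nothing is pending. */
int ioq_next(struct io_queue *q, struct io_request *out)
{
  size_t best = 0, i;

  if (q->count == 0)
    return -1;
  for (i = 1; i < q->count; i++) {
    const struct io_request *a = &q->pending[i];
    const struct io_request *b = &q->pending[best];

    if (q->policy == IOQ_PRIORITY && a->priority != b->priority) {
      if (a->priority < b->priority)
        best = i;
    } else if (a->seq < b->seq) {  /* FIFO, or tie on priority */
      best = i;
    }
  }
  *out = q->pending[best];
  q->pending[best] = q->pending[--q->count];  /* fill the hole */
  return 0;
}
```

A flash "access scheduler" could sit behind the same two calls
and additionally hold back requests while an erase is running.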

[NOTE: Doing the "block oriented IMFS" would let
you plug it into the above layers and debug them also.]

One of the things I would like to see is the ability to
specify the caching algorithm on a per-filesystem basis
and the scheduling algorithm on a per-physical-device basis.
My Ph.D. research was in real-time disk caching
and scheduling algorithms.  Depending on your
application access mix and requirements, there are
a variety of algorithms to choose from.  For example,
N-Step C-SCAN is reasonably fair and efficient.  But
it leads to priority inversion.  Simple priority 
avoids priority inversion but can be horribly 
inefficient.  I created disk scheduling and 
caching algorithms that try to strike a balance
for real-time systems.  The bottom line is that
there are probably dozens of caching and scheduling
algorithms.  (I identified a total of about 12 that
would be worth selecting between in a real system.)
Being able to configure the actual algorithm 
used is a nice feature.
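
For readers who have not met it, the ordering step of textbook
C-SCAN can be sketched as below.  This is only the classic
algorithm applied to one already-frozen batch of N requests (the
"N-step" part), not any of the real-time variants mentioned
above; cscan_order is a made-up name.  It also shows where the
priority inversion comes from: requests are ordered purely by
block number, with no regard to who issued them.

```c
#include <stddef.h>
#include <stdlib.h>

static int cmp_ulong(const void *a, const void *b)
{
  unsigned long x = *(const unsigned long *)a;
  unsigned long y = *(const unsigned long *)b;

  return (x > y) - (x < y);
}

/* Reverse the half-open range v[lo..hi). */
static void reverse(unsigned long *v, size_t lo, size_t hi)
{
  while (lo + 1 < hi) {
    unsigned long t = v[lo];

    v[lo++] = v[--hi];
    v[hi] = t;
  }
}

/* Reorder one batch in place into C-SCAN service order: blocks at
   or beyond the head position in ascending order, then wrap around
   and take the remaining blocks, again in ascending order. */
void cscan_order(unsigned long *blocks, size_t n, unsigned long head)
{
  size_t split = 0;

  if (n < 2)
    return;
  qsort(blocks, n, sizeof blocks[0], cmp_ulong);
  while (split < n && blocks[split] < head)
    split++;
  /* Rotate left by split using three reversals. */
  reverse(blocks, 0, split);
  reverse(blocks, split, n);
  reverse(blocks, 0, n);
}
```

A configurable system would pick a function like this per
physical device, alongside simple priority or FIFO ordering.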

That's about it in a nutshell.  It is far from a
detailed design but it gives you a good idea of
what needs to happen.

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel at OARcorp.com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
   Support Available             (256) 722-9985