Untar pending changes - Was: Cutting 4.11.0 Soon

Pavel Pisa ppisa4lists at pikron.com
Fri Nov 20 01:32:50 UTC 2015


Hello Chris,

thanks for clarification and test.

On Friday 20 of November 2015 00:51:44 Chris Johns wrote:
> On 20/11/2015 2:56 AM, Pavel Pisa wrote:
> > Hello Chris,
> >
> > Please, describe what do you prefer and if something should
> > be done for 4.11 or only master should be considered.
>
> I attach a script showing the issues I am attempting to describe. If you
> look at test 1 and 2 a file can be replaced by a directory and an empty
> directory can be replaced by a file but a directory that is not empty
> cannot be replaced by a file. There are a few other combinations and I
> am sure there are others I have not covered.
>
> I am wondering if the logic in the patch that does mkdir, checks the
> result then does a stat if required should be a stat and then some logic
> based on the results of the tests I have provided?
>

My patch solves only case where directory is "overwritten" by directory.
As you have showed in the test script there are more legitimate cases
(GNU tar on Linux seems to provide same results as your script expects).

It seems that usual/standardized tar behavior is to try unlink()
if there is a non-directory object in a way of created directory.

A look at the full weight tar projects sources
----------------------------------------------

Because of licenses, I will point to BSD licensed libarchive now.
That is code used by FreeBSD bsdtar version.

The source location of tar entry extract call

  https://github.com/libarchive/libarchive/blob/master/tar/read.c#L372

implementation of archive_read_extract2()

  https://github.com/libarchive/libarchive/blob/master/libarchive/archive_read_extract2.c#L83

for operations targetting real filesystem/disc results in a call to _archive_write_disk_header()

  https://github.com/libarchive/libarchive/blob/master/libarchive/archive_write_disk_posix.c#L449

actual object is prepared in restore_entry() which is there

  https://github.com/libarchive/libarchive/blob/master/libarchive/archive_write_disk_posix.c#L1837

It seems to use unconditional unlink() if mode is ARCHIVE_EXTRACT_UNLINK is set.
For other modes it directly calls create_filesystem_object() which for regular file does

  open(a->name, O_WRONLY | O_CREAT | O_EXCL | O_BINARY | O_CLOEXEC, mode)

for directories

  mkdir(a->name, mode);

If the first attempt fails with ENOTDIR or ENOENT, it tries to create required directory structure.
If error is EISDIR or EEXIST and mode is ARCHIVE_EXTRACT_NO_OVERWRITE, then reports error.
If the object in a way of regular file or other non-directory object is directory (EISDIR),
it tries rmdir() which should work on empty directory. If EEXIST is reported then logic
tries to find way to get rid of object. If original object is not directory it uses unlink.
If original object is a directory and extracted one is not, tries rmdir(). Then tries to create
object again.

If both, original and extracted objects are directory, then it skips deletion
and notes that mode has to be updated later.

Generally, use of libarchive directly in RTEMS would be most generic option.
But for small tatgets (LPC17xx, small SPARCs etc.) it is a code and memory killer
which would demand more memory only for untar that deices has at all.

So it would worth to use simpler and smaller code. It would worth to unite
three implementations on single one with pointer to function for read of
source and function/description how to deal with target. There seems to be
only one special case for now and it is extraction of regular file in IMFS.

But such rewrite is not 4.11 material at all.

If we consider rtems_filesystem_location_free() equivalent for both unlink() and rmdir(),
then directory to file switch case should be supported by actual rtems_tarfs_load().
The case of dir to dir is solved by patch already. Case of non dir to dir needs additional
unlink() and mkdir() attempt. This is reasonably simple and I can add it to actual
proposed code.

Best wishes,

              Pavel


More information about the devel mailing list