Untar pending changes - Was: Cutting 4.11.0 Soon

Chris Johns chrisj at rtems.org
Sun Nov 22 22:23:52 UTC 2015


On 20/11/2015 12:32 PM, Pavel Pisa wrote:
> Hello Chris,
> 
> thanks for clarification and test.
> 
> On Friday 20 of November 2015 00:51:44 Chris Johns wrote:
>> On 20/11/2015 2:56 AM, Pavel Pisa wrote:
>>> Hello Chris,
>>>
>>> Please, describe what do you prefer and if something should
>>> be done for 4.11 or only master should be considered.
>>
>> I attach a script showing the issues I am attempting to describe. If you
>> look at test 1 and 2 a file can be replaced by a directory and an empty
>> directory can be replaced by a file but a directory that is not empty
>> cannot be replaced by a file. There are a few other combinations and I
>> am sure there are others I have not covered.
>>
>> I am wondering if the logic in the patch that does mkdir, checks the
>> result then does a stat if required should be a stat and then some logic
>> based on the results of the tests I have provided?
>>
> 
> My patch solves only case where directory is "overwritten" by directory.
> As you have showed in the test script there are more legitimate cases
> (GNU tar on Linux seems to provide same results as your script expects).
> 

Yes. My intention is for us to review what is practical for our systems.

> It seems that usual/standardized tar behavior is to try unlink()
> if there is a non-directory object in a way of created directory.
> 
> A look at the full weight tar projects sources
> ----------------------------------------------
> 
> Because of licenses, I will point to BSD licensed libarchive now.
> That is code used by FreeBSD bsdtar version.
> 
> The source location of tar entry extract call
> 
>   https://github.com/libarchive/libarchive/blob/master/tar/read.c#L372
> 
> implementation of archive_read_extract2()
> 
>   https://github.com/libarchive/libarchive/blob/master/libarchive/archive_read_extract2.c#L83
> 
> for operations targetting real filesystem/disc results in a call to _archive_write_disk_header()
> 
>   https://github.com/libarchive/libarchive/blob/master/libarchive/archive_write_disk_posix.c#L449
> 
> actual object is prepared in restore_entry() which is there
> 
>   https://github.com/libarchive/libarchive/blob/master/libarchive/archive_write_disk_posix.c#L1837
> 
> It seems to use unconditional unlink() if mode is ARCHIVE_EXTRACT_UNLINK is set.
> For other modes it directly calls create_filesystem_object() which for regular file does
> 
>   open(a->name, O_WRONLY | O_CREAT | O_EXCL | O_BINARY | O_CLOEXEC, mode)
> 
> for directories
> 
>   mkdir(a->name, mode);
> 
> If the first attempt fails with ENOTDIR or ENOENT, it tries to create required directory structure.
> If error is EISDIR or EEXIST and mode is ARCHIVE_EXTRACT_NO_OVERWRITE, then reports error.
> If the object in a way of regular file or other non-directory object is directory (EISDIR),
> it tries rmdir() which should work on empty directory. If EEXIST is reported then logic
> tries to find way to get rid of object. If original object is not directory it uses unlink.
> If original object is a directory and extracted one is not, tries rmdir(). Then tries to create
> object again.
> 
> If both, original and extracted objects are directory, then it skips deletion
> and notes that mode has to be updated later.

Thanks for the summary.

> Generally, use of libarchive directly in RTEMS would be most generic option.
> But for small tatgets (LPC17xx, small SPARCs etc.) it is a code and memory killer
> which would demand more memory only for untar that deices has at all.

I agree libarchive is much to much for us. We need something lean and
small because tar files are often used to bootstrap a file systems.

> So it would worth to use simpler and smaller code. It would worth to unite
> three implementations on single one with pointer to function for read of
> source and function/description how to deal with target. There seems to be
> only one special case for now and it is extraction of regular file in IMFS.

Yes it would be good. A consistent result would make life simpler for
our users. Do we need a new ticket for this?

> 
> But such rewrite is not 4.11 material at all.
> 

I agree.

> If we consider rtems_filesystem_location_free() equivalent for both unlink() and rmdir(),
> then directory to file switch case should be supported by actual rtems_tarfs_load().
> The case of dir to dir is solved by patch already. Case of non dir to dir needs additional
> unlink() and mkdir() attempt. This is reasonably simple and I can add it to actual
> proposed code.

Excellent and thanks.

Chris


More information about the devel mailing list