[PATCH] untar: Make behavior similar to GNU or BSD tar

Christian MAUDERER christian.mauderer at embedded-brains.de
Tue Dec 7 08:52:30 UTC 2021


Hello Chris,

On 07.12.21 at 05:10, Chris Johns wrote:
> On 3/12/21 11:50 pm, Christian Mauderer wrote:
>> RTEMS untar implementation had problems with overwriting or integrating
>> archives into existing directory structures. This patch adapts the
>> behavior to mimic that of a GNU tar or BSD tar and extends the tar01
>> test to check for the behavior. That is:
>>
>> * If a directory structure exists, the files from the archive will be
>>    integrated. Existing files are overwritten.
> 
> What currently happens?

The untar fails if a directory that is in the archive already exists.
Note that this mostly applies if the archive contains directory entries
and not only files. For example, take two archives that look as follows:

   > tar tvf image-error.tar.gz
   drwxr-xr-x christian_m/domainusers 0 2021-11-26 15:31 l1/
   drwxr-xr-x christian_m/domainusers 0 2021-11-26 14:38 l1/l2/
   -rw-r--r-- christian_m/domainusers 12 2021-11-26 14:27 l1/l2/x.txt

   > tar tvf image-ok.tar.gz
   -rw-r--r-- christian_m/domainusers 12 2021-11-26 14:27 l1/l2/x.txt

The first image contains the directories l1 and l1/l2 and the file 
l1/l2/x.txt. The second contains only the l1/l2/x.txt.

With our current implementation, I would be able to extract the first
archive once, and a second try would fail because l1 and l1/l2
already exist. The second archive could be extracted multiple times and
would overwrite x.txt every time.
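
The difference between the two archives can be illustrated with a small
Python sketch. This is not the RTEMS code. extract_strict() is a
hypothetical model of the behavior described above, in which creating a
directory that already exists counts as an error. The file content and
the on-demand creation of parent directories are assumptions made for
the illustration.

```python
import os
import tempfile

# Entries of the two example archives as (kind, path, data) tuples.
# The 12-byte file content is made up for this sketch.
IMAGE_ERROR = [
    ("dir", "l1", None),
    ("dir", "l1/l2", None),
    ("file", "l1/l2/x.txt", b"hello world\n"),
]
IMAGE_OK = [("file", "l1/l2/x.txt", b"hello world\n")]


def extract_strict(entries, root):
    """Hypothetical model: creating an existing directory is an error."""
    for kind, path, data in entries:
        target = os.path.join(root, path)
        if kind == "dir":
            os.mkdir(target)  # raises FileExistsError if target exists
        else:
            # Assumption: missing parent directories are created on demand.
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as f:
                f.write(data)  # an existing file is simply overwritten


def try_extract(entries, root):
    """Return True if extraction succeeded, False if a directory existed."""
    try:
        extract_strict(entries, root)
        return True
    except FileExistsError:
        return False


def demo():
    """Extract each example archive twice into a fresh directory."""
    results = []
    for entries in (IMAGE_ERROR, IMAGE_OK):
        root = tempfile.mkdtemp()
        results.append(try_extract(entries, root))  # first extraction
        results.append(try_extract(entries, root))  # second extraction
    return results
```

Under these assumptions, demo() reports success then failure for the
archive with directory entries, and success twice for the file-only
archive.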

> 
>> * If a file exists and the archive contains a directory with the same
>>    name, the file is removed and a directory is created. In the above
>>    example: if l1/l2 is a file it will be overwritten with a new
>>    directory.
> 
> OK
> 
>> * If a directory exists and the archive contains a file with the same
>>    name, the directory will be replaced if it is empty. If it contains
>>    files, the result is an error.
>>
>> * An archive also can contain only a file without the parent
>>    directories. If in that case one of the parent directories exists as a
>>    file extracting the archive results in an error. In the example: if
>>    l1/l2 is a file and the archive doesn't contain the directories but
>>    only the file l1/l2/x.txt that would be an error.
>>
>> * In case of an error, it is possible that the archive has been
>>    partially extracted.
> 
> And what was there is not recoverable.
> 

Correct. Note that I basically just tested what GNU and BSD tar do to
get a reference for what the "expected behavior" is. See

    https://devel.rtems.org/ticket/4552

There would be a number of other reasonable behaviors that could be 
considered right too. But I think the default tar utilities are a good 
reference.
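
The rules listed in the patch description can be summarized in a short
Python sketch. It is only a model of the described behavior, not the
untar implementation and not a claim about how GNU or BSD tar work
internally. The names place_directory(), place_file() and
apply_entries() are hypothetical, and symbolic links are ignored.

```python
import os
import tempfile


def place_directory(path):
    """Handle a directory entry from the archive."""
    if os.path.isdir(path):
        return  # existing directory: integrate into it
    if os.path.lexists(path):
        os.unlink(path)  # an existing file with the same name is removed
    os.mkdir(path)


def place_file(path, data):
    """Handle a file entry from the archive; raises OSError on conflicts."""
    if os.path.isdir(path):
        os.rmdir(path)  # only succeeds for an empty directory
    parent = os.path.dirname(path)
    if parent:
        # Fails with OSError if a parent component exists as a file.
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)  # an existing file is overwritten


def apply_entries(entries, root):
    """Apply (kind, path, data) entries below root; return "ok" or "error"."""
    try:
        for kind, path, data in entries:
            target = os.path.join(root, path)
            if kind == "dir":
                place_directory(target)
            else:
                place_file(target, data)
        return "ok"
    except OSError:
        return "error"  # earlier entries stay extracted, there is no rollback


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    archive = [("dir", "l1", None), ("dir", "l1/l2", None),
               ("file", "l1/l2/x.txt", b"hello world\n")]
    print(apply_entries(archive, root))  # first extraction
    print(apply_entries(archive, root))  # second one integrates and overwrites
```

Returning "error" without undoing earlier entries mirrors the last point
of the list: after a failure the archive may be partially extracted.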

> Functionally this is not a big change and so I am left wondering why the
> original developer(s) did not do this?

I think the original behavior of the code in RTEMS was a bit different. 
I found a bug from 2019:

   https://devel.rtems.org/ticket/3823

The fix for that bug solved the original problem but changed the
behavior of tar a bit and introduced the new problem that tar can't
integrate data into existing directory structures.

> 
> I think the changes make sense and I do not think it will break any applications
> I know of that use this code.

Thanks for reviewing it. I'll push the patch soon.

Best regards

Christian

> 
> Chris
> 

-- 
--------------------------------------------
embedded brains GmbH
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email: christian.mauderer at embedded-brains.de
phone: +49-89-18 94 741 - 18
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

