[RTEMS Project] #4568: untar: problems with existing directories
RTEMS trac
trac at rtems.org
Thu Dec 9 07:16:26 UTC 2021
#4568: untar: problems with existing directories
---------------------------------+--------------------------------
Reporter: Christian Mauderer | Owner: Christian Mauderer
Type: defect | Status: assigned
Priority: normal | Milestone: Indefinite
Component: lib | Version: 6
Severity: normal | Keywords:
Blocked By: | Blocking:
---------------------------------+--------------------------------
Cloned from #4552:
----
Our current implementation of untar in cpukit/libmisc/untar/untar.c has
problems if a directory in the archive already exists. This is not a
problem if the archive contains only a file.
The problem exists on 5 and master.
Example: If I have a tar.gz file that contains the directories and file
l1/l2/x.txt and call Untar_FromGzChunk_Print twice, the first attempt will
print
{{{
untar: dir: l1
untar: dir: l1/l2
untar: file: l1/l2/x.txt (s:12,m:0644)
}}}
After that, the directory l1 already exists. So if I try to extract the
archive again, I get the following:
{{{
untar: dir: l1
untar: mkdir: l1: (17) File exists
}}}
My expectation would have been that the files are just integrated into an
existing directory structure. If a file exists, it should be overwritten.
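The expectation for directories can be sketched as follows. This is only an
illustration and not the code in untar.c: the helper name `ensure_directory`
is hypothetical, and the sketch assumes plain POSIX `mkdir()` and `stat()`.
It treats `EEXIST` as success only if the existing entry is a directory.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of untar.c): create a directory, but treat
 * an already existing directory as success.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int ensure_directory(const char *path, mode_t mode)
{
  struct stat sb;

  assert(path != NULL);

  if (mkdir(path, mode) == 0) {
    return 0;
  }

  if (errno != EEXIST) {
    /* A real failure, for example a missing parent directory. */
    return -1;
  }

  /* Something with this name exists; accept it only if it is a directory. */
  if (stat(path, &sb) != 0) {
    return -1;
  }

  if (S_ISDIR(sb.st_mode)) {
    return 0;
  }

  errno = EEXIST;
  return -1;
}
```

With such a helper, calling it twice for `l1` would succeed both times
instead of failing with "(17) File exists" on the second attempt.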
We have multiple references for the expected behavior: GNU or BSD `tar`, or
POSIX `pax`. In my experience `tar` is the better-known tool, so my
suggestion is to use the default behavior of `tar` as the reference.
=== GNU or BSD `tar`
I tested the default behavior of GNU `tar` and BSD `tar`. It seems to be
the same for both:
- If a directory structure exists, the files from the archive will be
integrated. Existing files are overwritten.
- If a file exists and the archive contains a directory with the same
name, the file is removed and a directory is created. In the above
example: if `l1/l2` is a file it will be overwritten with a new directory.
- If a directory exists and the archive contains a file with the same
name, the directory will be replaced if it is empty. If it contains files,
the result is an error.
- An archive can also contain only a file without the parent directories.
If, in that case, one of the parent directories exists as a file, extracting
the archive results in an error. In the example: if `l1/l2` is a file and
the archive doesn't contain the directories but only the file
`l1/l2/x.txt`, that would be an error.
In case of an error, it is possible that the archive has been partially
extracted.
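The rules above could be expressed roughly like this. It is a minimal sketch
under the assumption of a POSIX file system API (`lstat()`, `rmdir()`,
`unlink()`); the function name `make_room_for_entry` is hypothetical and this
is not how GNU `tar`, BSD `tar` or untar.c are implemented. It would be called
before an archive entry is created at `path`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: remove whatever is in the way of an archive entry,
 * following the default tar behavior described above.
 * Returns 0 if the entry can now be created, -1 with errno set otherwise.
 */
static int make_room_for_entry(const char *path, bool entry_is_dir)
{
  struct stat sb;

  assert(path != NULL);

  if (lstat(path, &sb) != 0) {
    /*
     * ENOENT: nothing is in the way. Any other error is reported, for
     * example ENOTDIR if a parent directory exists as a file.
     */
    return (errno == ENOENT) ? 0 : -1;
  }

  if (S_ISDIR(sb.st_mode)) {
    if (entry_is_dir) {
      /* Existing directory: integrate the archive content into it. */
      return 0;
    }

    /* A file replaces a directory only if the directory is empty. */
    return rmdir(path);
  }

  /* Existing file: remove it so that a file or directory can replace it. */
  return unlink(path);
}
```

The sketch does not undo earlier steps on failure, so a partially extracted
archive remains possible, as noted above.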
Note: GNU `tar` has options to change the behavior (like
--recursive-unlink). I'm sure there are similar options in BSD `tar`. From
my point of view we should adapt to the default behavior, so I ignored these
options.
=== The POSIX `pax` utility
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
Default behavior is described as follows:
If an attempt is made to extract a directory when the directory already
exists, this shall not be considered an error. If an attempt is made to
extract a FIFO when the FIFO already exists, this shall not be considered
an error.
From some quick tests, `pax` behaves similarly to `tar`. The only
difference I noticed is that empty directories are not overwritten with
files from the archive.
--
Ticket URL: <http://devel.rtems.org/ticket/4568>
RTEMS Project <http://www.rtems.org/>