Closed
Description
I tried opening:
- each of the tar files used to test Golang's tar package (found here, with details about each in the tests here).
- each of the tar files used to test node-tar, found here.
- each of the tar files used to test libarchive, found here. Note that I had to uudecode these.
Note that all of the above have permissive licenses, so it may be possible to borrow these tars for our test assets.
I used the test code below to open each one, ignored those that opened successfully, and, for those that failed, checked whether other tools could open them. The interesting cases are those where other tools (particularly GNU tar) can open the file but we cannot. Note: I mostly didn't extract the entries, just checked that they could be listed. In some cases, the tar can be listed but extraction will fail.
test code I used
// See https://aka.ms/new-console-template for more information
using System.Formats.Tar;
using Xunit;
public static class C
{
    public static async Task Main()
    {
        List<Task> tasks = new();
        foreach (string path in Directory.EnumerateFiles(@"C:\git\go\src\archive\tar\testdata", "*.tar"))
        {
            // Probe each archive on its own task so one failure does not stop the others.
            tasks.Add(Task.Run(async () =>
            {
                TarEntry? entry = null;
                try
                {
                    //Console.WriteLine($"{path} opening...");
                    using FileStream fs = new(path, FileMode.Open);
                    using TarReader reader = new(fs, leaveOpen: false);
                    while ((entry = await reader.GetNextEntryAsync()) != null)
                    {
                        Assert.NotEmpty(entry.Name);
                        Assert.True(Enum.IsDefined(entry.EntryType));
                        Assert.True(Enum.IsDefined(entry.Format));
                        if (entry.EntryType == TarEntryType.Directory)
                            continue;
                        // Drain the entry's data to check that the payload is readable.
                        Stream? ds = entry.DataStream;
                        if (ds != null && ds.Length > 0)
                        {
                            using var ms = new MemoryStream();
                            await ds.CopyToAsync(ms);
                        }
                    }
                }
                catch (Exception ex) //when (!(ex is FormatException))
                {
                    // entry holds the last entry that was read, so the failure is at or after it.
                    Console.WriteLine($"{path} opening {entry?.Name} threw {ex.Message}");
                }
            }));
        }
        await Task.WhenAll(tasks);
    }
}
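Since most of the comparison only needed to know whether an archive could be listed, a listing-only check may be easier to reason about. The following is a minimal sketch, not the code used to produce the table below: `TarProbe` and `TryList` are hypothetical names, and it assumes the synchronous `TarReader.GetNextEntry(copyData: false)` call is enough to walk the headers.

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;

public static class TarProbe
{
    // Hypothetical helper: walks the entries of a tar stream without copying entry data.
    // Returns true plus the entry names on success, or false plus the exception on failure.
    public static bool TryList(Stream archive, out List<string> names, out Exception? error)
    {
        names = new List<string>();
        error = null;
        try
        {
            // leaveOpen: true so the caller keeps ownership of the stream.
            using TarReader reader = new(archive, leaveOpen: true);
            TarEntry? entry;
            while ((entry = reader.GetNextEntry(copyData: false)) != null)
            {
                names.Add(entry.Name);
            }
            return true;
        }
        catch (Exception ex)
        {
            // Broad catch on purpose: the goal is to record any failure, not to handle it.
            error = ex;
            return false;
        }
    }
}
```

A caller could pass `File.OpenRead(path)` for each file and log `error?.Message` when the method returns false; names collected before a failure remain in `names`, which shows how far the reader got.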
source | file | issue | gnu tar | 7z | golang | .NET | .NET Exception |
---|---|---|---|---|---|---|---|
golang | gnu-multi-hdrs.tar | duplicate headers | reads one | reads one w/warning | reads one | ERROR | A metadata entry of type 'LongPath' was unexpectedly found after a metadata entry of type 'LongPath'. |
golang | gnu-incremental.tar | incremental format | reads ok | reads ok | | ERROR | Unable to read beyond the end of the stream. |
golang | invalid-go17.tar | ?? | reads ok | reads ok | reads ok | ERROR | Could not find any recognizable digits. |
golang | hdr-only.tar | just header | reads with errors | reads ok | reads ok | ERROR | Additional non-parsable characters are at the end of the string. |
golang | nil-uid.tar | zero uid | reads ok | reads w/warnings | reads ok | ERROR | Unable to read beyond the end of the stream. |
golang | pax-multi-hdrs.tar | 2 headers | reads ok | reads w/warnings | reads ok | ERROR | A metadata entry of type 'ExtendedAttributes' was unexpectedly found after a metadata entry of type 'ExtendedAttributes'. |
golang | pax-bad-mtime-file.tar | bad modified time | reads ok | reads w/warnings | | ERROR | Unable to read beyond the end of the stream. |
golang | pax-pos-size-file.tar | ? | reads ok | reads w/warnings | reads ok | ERROR | Unable to read beyond the end of the stream. |
golang | v7.tar | v7 | reads ok | reads ok | reads ok | ERROR | Could not find any recognizable digits. |
golang | sparse-formats.tar | something about sparseness | reads ok | reads ok | | ERROR | Additional non-parsable characters are at the end of the string. |
golang | ustar-file-reg.tar | non-zero device numbers. | reads ok | reads ok | | ERROR | Unable to read beyond the end of the stream. |
golang | writer-big.tar | truncated huge | ERROR | reads ok | | ERROR | Could not find any recognizable digits. |
golang | pax-path-hdr.tar | ? | reads empty | ERROR | reads header | ERROR | Unable to read beyond the end of the stream. |
golang | writer-big-long.tar | truncated huge | ERROR | reads w/ unexpected end of data | reads ok | ERROR | Unable to read beyond the end of the stream. |
mine | huge.tar | dd if=/dev/zero bs=1G count=16 > huge.tar | reads | | | ERROR | Value was either too large or too small for a UInt32 |
golang | issue10968.tar | garbled header | ERROR | | | ERROR (but OK) | Could not find any recognizable digits. |
golang | issue11169.tar | ?? | ERROR | | | ERROR (but OK) | Additional non-parsable characters are at the end of the string. |
golang | neg-size.tar | negative size | ERROR | refuses | ERROR | ERROR (but OK) | Could not find any recognizable digits. |
golang | pax-bad-hdr-file.tar | bad header | reads with errors | reads ok | ERROR | ERROR (but OK) | Unable to read beyond the end of the stream. |
node | long-pax.tar | 120 byte filename (pax limit 100) | reads headers | reads w/ unexpected end of data | | ERROR | 120-byte-filename-cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc threw Unable to read beyond the end of the stream. |
node | next-file-has-long.tar | link to 170 byte name in GNU | | | | ERROR | Entry 'NextFileHasLongPath' was expected to be in the GNU format, but did not have the expected version data. |
node | path-missing.tar | empty name | "Substituting `.' for empty member name" (but not clear this is useful..) | silently uses tar file name | | ERROR on extraction | Cannot create 'c:\tar' because a file or directory with the same name already exists (NOTE -- we should probably fix to fail earlier, in GetDestinationAndLinkPaths()) |
node | links-strip.tar | ?symlink and hardlinks | reads ok | reads w/ unexpected end of data | | ERROR | Unable to read beyond the end of the stream. |
mine | empty.tar | 0 bytes | reads OK | reads ok | | OK | |
libarchive | test_compat_gtar_2.tar | huge gid | reads OK | reads ok | | ERROR | Could not find any recognizable digits. |
libarchive | test_compat_perl_archive_tar.tar | ? | reads OK | reads ok | | ERROR | Could not find any recognizable digits. |
libarchive | test_compat_gtar_1.tar | 200 byte filenames and symlink? | reads OK | reads ok | | ERROR | Could not find any recognizable digits. |
libarchive | test_compat_plexus_archiver_tar.tar | | reads OK w/tar: A lone zero block at 3 | reads w/ There are some data after the end of the payload data | | ERROR | Could not find any recognizable digits. |
libarchive | test_compat_solaris_tar_acl.tar | | reads OK w/Unknown file type ‘A’ | reads ok | | OK | (no exception, but unexpected TarEntryType 65 = 'A' .. A custom extension) |
libarchive | test_compat_tar_hardlink_1.tar | | reads OK | reads w/ unexpected end of data | | ERROR | Could not find any recognizable digits. |
libarchive | test_read_format_gtar_sparse_1_17_posix00.tar | | reads OK | reads ok | | ERROR | The entry './PaxHeaders.38659/sparse' has a duplicate extended attribute. |
libarchive | test_read_format_tar_invalid_pax_size.tar | | ERRORS | ERROR | | ERROR | Could not find any recognizable digits. |
Possibly some of these are expected limitations, but for the others we should add checkboxes and work through and fix them.
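One way to turn the table into a work list might be to group the failing archives by exception message, since a handful of messages account for most rows and may share a root cause. This is only a sketch under that assumption: `FailureTriage` and `GroupByMessage` are hypothetical names, and the input is assumed to be (file, message) pairs collected from a run like the one above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FailureTriage
{
    // Hypothetical helper: buckets failing archives by exception message,
    // largest bucket first, so failures with the same message are reviewed together.
    public static List<(string Message, List<string> Files)> GroupByMessage(
        IEnumerable<(string File, string Message)> failures)
    {
        return failures
            .GroupBy(f => f.Message)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key, StringComparer.Ordinal) // stable order for equal-sized buckets
            .Select(g => (g.Key, g.Select(f => f.File).ToList()))
            .ToList();
    }
}
```

Each resulting bucket could then become one checkbox, with the file list serving as the set of regression assets for that fix.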