Chilkat.Tar Class Overview

Chilkat.Tar creates, verifies, lists, and extracts TAR archives, including compressed TAR formats such as .tar.gz, .tgz, .tar.bz2, and .tar.Z. It supports adding files and directory trees, controlling paths and permissions, filtering included or extracted files, extracting from memory, extracting selected files to memory or BinData, producing XML listings, and creating Debian .deb package archives from control and data tarballs.

What the Class Is Used For

Use Chilkat.Tar when an application needs to package files into TAR archives, create compressed TAR files, inspect archive contents, safely extract archives, or selectively extract matching files. The class can work with local TAR files, compressed TAR files, in-memory TAR bytes, and individual matching archive entries.

Create TAR Archives: Add files and directory trees, then write plain TAR, TAR.GZ, or TAR.BZ2 output.
Extract TAR Archives: Untar plain or compressed archives while controlling destination paths, matching patterns, and maximum extraction count.
List and Verify: Generate an XML listing with ListXml or verify an archive by scanning its TAR headers.
Work In Memory: Extract from in-memory TAR bytes or extract the first matching file directly to memory or BinData.

Typical Workflow: Create a TAR Archive

  1. Create a Tar object.
  2. Optionally set archive-writing properties such as WriteFormat, DirPrefix, FileMode, DirMode, ScriptFileMode, UserId, UserName, GroupId, and GroupName.
  3. Optionally configure filters with MustMatch, MustNotMatch, and MatchCaseSensitive.
  4. Add files and directory trees with AddFile, AddFile2, AddDirRoot, or AddDirRoot2.
  5. Write the archive with WriteTar, WriteTarGz, or WriteTarBz2.
  6. Call ClearDirRootsAndFiles before reusing the object for a new archive input set.
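
The steps above can be sketched as a small helper. This is a minimal sketch, not Chilkat's own sample code: the function name write_tar_gz is hypothetical, and it assumes a binding in which properties are plain attributes and AddDirRoot and WriteTarGz return a Boolean success value (for example, an object such as chilkat2.Tar() in Python). Only member names listed in this overview are used.

```python
def write_tar_gz(tar, dir_roots, out_path, dir_prefix=""):
    """Queue directory trees on a Chilkat Tar-like object and write a .tar.gz.

    `tar` is assumed to expose the members named in this overview as
    attributes and Boolean-returning methods (an assumption of this sketch).
    """
    tar.ClearDirRootsAndFiles()      # start from an empty input set
    tar.WriteFormat = "gnu"          # documented default, set explicitly for clarity
    if dir_prefix:
        tar.DirPrefix = dir_prefix   # e.g. "subdir1" prepends "subdir1/"
    for root in dir_roots:
        if not tar.AddDirRoot(root):
            raise RuntimeError(tar.LastErrorText)
    if not tar.WriteTarGz(out_path):
        raise RuntimeError(tar.LastErrorText)
```

Clearing the input set at the start of the helper, rather than at the end, means a failed earlier run cannot leak queued roots into the next archive.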

Typical Workflow: Extract a TAR Archive

  1. Create a Tar object.
  2. Set UntarFromDir to the destination directory.
  3. Leave NoAbsolutePaths enabled unless absolute extraction paths are intentionally required.
  4. Optionally set MustMatch, MustNotMatch, UntarDiscardPaths, or UntarMaxCount.
  5. For listing-only behavior, set CaptureXmlListing and SuppressOutput.
  6. Extract with Untar, UntarGz, UntarBz2, or UntarZ.
  7. Check LastErrorText after failures or unexpected behavior.
Safe extraction default: NoAbsolutePaths defaults to true, which removes leading / or \ characters from absolute paths during extraction.
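
The extraction workflow can be sketched the same way. The helper name untar_into is hypothetical, and the sketch assumes an attribute-style binding in which Untar returns the extracted count or -1, as described in this overview.

```python
def untar_into(tar, tar_path, dest_dir, must_match=None, max_count=0):
    """Extract a plain TAR file under dest_dir and return the extracted count.

    `tar` is assumed to be a Chilkat Tar-like object with attribute-style
    properties (an assumption of this sketch).
    """
    tar.UntarFromDir = dest_dir      # destination root directory
    tar.NoAbsolutePaths = True       # keep the safe default explicit
    if must_match:
        tar.MustMatch = must_match   # e.g. "*.txt"
    tar.UntarMaxCount = max_count    # 0 means no maximum
    count = tar.Untar(tar_path)
    if count < 0:                    # -1 signals failure
        raise RuntimeError(tar.LastErrorText)
    return count
```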

Core Concepts

Concept | Meaning | Important Members
Input Set | Files and directory trees queued for the next archive-writing operation. | AddFile, AddFile2, AddDirRoot, AddDirRoot2, ClearDirRootsAndFiles
Path in TAR | The internal path stored for each entry in the TAR archive. | DirPrefix, AddDirRoot2, AddFile2
Write Format | TAR header format used when writing archives. | WriteFormat
Permissions Metadata | File, directory, and script permissions stored in TAR headers. | FileMode, DirMode, ScriptFileMode
Filtering | Include or extract only matching entries, or skip entries that match a pattern. | MustMatch, MustNotMatch, MatchCaseSensitive
XML Listing | XML description of archive contents, either from ListXml or captured during extraction. | ListXml, CaptureXmlListing, XmlListing

Creating TAR Archives

Method | Adds | Path Behavior
AddFile | A local file. | Adds the file to the next WriteTar* call.
AddFile2 | A local file with an explicit path inside the TAR. | The DirPrefix property does not apply because the path-in-tar is directly specified.
AddDirRoot | A directory tree root. | Include one or more directory trees by calling this method multiple times before writing the TAR.
AddDirRoot2 | A directory tree root with a root prefix. | The rootPrefix is added to paths in the TAR. It should not end with a forward slash. If DirPrefix is also set, it is added first.
ClearDirRootsAndFiles | Nothing new; clears the queued input set. | Removes all files and directory roots previously added with AddFile* and AddDirRoot*.
Multiple roots: A TAR archive can contain multiple directory trees. Call AddDirRoot or AddDirRoot2 multiple times, then call a single WriteTar* method.
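
To make the prefix ordering concrete, the following sketch models how a stored path could be composed when both DirPrefix and an AddDirRoot2 root prefix are in effect. It is an illustration only: the function path_in_tar is hypothetical, and the exact joining rules inside Chilkat are assumed from the description above (DirPrefix first, then the root prefix, then the path relative to the directory root).

```python
def path_in_tar(relative_path, dir_prefix="", root_prefix=""):
    """Model the stored path as DirPrefix / rootPrefix / relative path.

    Stripping stray slashes is a defensive choice of this sketch, not
    documented library behavior.
    """
    parts = [p.strip("/") for p in (dir_prefix, root_prefix, relative_path)]
    return "/".join(p for p in parts if p)
```

For a file added with AddFile2, neither prefix would be applied, because the path-in-tar is given explicitly.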

Writing TAR Output

Method | Output Type | Use When
WriteTar | Plain .tar | Create an uncompressed TAR archive from the queued files and directory roots.
WriteTarGz | .tar.gz / .tgz | Create a gzip-compressed TAR archive.
WriteTarBz2 | .tar.bz2 | Create a bzip2-compressed TAR archive.

Archive Writing Properties

Property | Purpose | Default / Guidance
WriteFormat | TAR format used when writing an archive. | Valid values are gnu, pax, and ustar. Default is gnu.
DirPrefix | Prefix added to each file path within the TAR archive. | For example, subdir1 causes subdir1/ to be prepended. Does not apply to files added with AddFile2.
FileMode | Permission mode stored in TAR headers for file entries. | Default is octal 0644.
DirMode | Permission mode stored in TAR headers for directory entries. | Default is octal 0755.
ScriptFileMode | Permission mode stored for shell script files. | Applies to .sh, .csh, .bash, and .bsh files. Default is octal 0755.
UserId / GroupId | Default numerical UID and GID stored in TAR headers. | Both default to 1000.
UserName / GroupName | Default user and group names stored in TAR headers. | Defaults to the logged-on username of the application’s process.
Filename encoding: The WriteTar* methods always use utf-8 when storing filenames within the TAR archive.
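
Because the mode defaults are given in octal, it helps to see what those values look like in a TAR header. The sketch below does not use Chilkat at all; it swaps in Python's standard tarfile module to write a one-entry archive in each of the gnu, pax, and ustar formats and read the header back. The helper names and the sample entry are hypothetical.

```python
import io
import tarfile

# Map the format names used in this overview to tarfile's constants.
FORMATS = {
    "gnu": tarfile.GNU_FORMAT,
    "pax": tarfile.PAX_FORMAT,
    "ustar": tarfile.USTAR_FORMAT,
}

def make_tar_bytes(write_format="gnu", file_mode=0o644):
    """Build a one-file TAR in memory with explicit header metadata."""
    data = b"hello\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=FORMATS[write_format]) as tf:
        info = tarfile.TarInfo(name="subdir1/hello.txt")
        info.size = len(data)
        info.mode = file_mode        # octal literal: 0o644 is decimal 420
        info.uid = info.gid = 1000
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_first_header(tar_bytes):
    """Return (name, mode as octal string, uid) of the first entry."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tf:
        member = tf.getmembers()[0]
        return member.name, oct(member.mode), member.uid
```

In languages where the mode properties are plain integers, writing 644 in decimal instead of an octal literal would store a different permission set, so an octal literal is the safer habit.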

Extracting TAR Archives

Method | Input Type | Return Value
Untar | Plain TAR file. | Number of files and directories extracted, or -1 for failure.
UntarGz | .tar.gz or .tar.gzip | Boolean success/failure.
UntarBz2 | .tar.bz2 or .tar.bzip2 | Boolean success/failure.
UntarZ | .tar.Z | Boolean success/failure.
UntarFromMemory | In-memory TAR bytes. | Number of files and directories extracted, or -1 for failure.
Destination directory: Untar methods extract to the directory specified by UntarFromDir. The default is ., meaning the current working directory.
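
Since Untar returns a count while the compressed variants return a Boolean, calling code may want to normalize the result. A minimal sketch, assuming an attribute-style binding with the return types listed above; the helper name untar_any and the extension-based dispatch are choices of this sketch, not part of the class.

```python
def untar_any(tar, archive_path):
    """Pick an untar method by file extension and return True on success.

    `tar` is assumed to be a Chilkat Tar-like object (an assumption of
    this sketch).
    """
    lower = archive_path.lower()
    if lower.endswith((".tar.gz", ".tgz")):
        return bool(tar.UntarGz(archive_path))
    if lower.endswith(".tar.bz2"):
        return bool(tar.UntarBz2(archive_path))
    if lower.endswith(".tar.z"):
        return bool(tar.UntarZ(archive_path))
    # Plain TAR: a count of 0 or more is success, -1 is failure.
    return tar.Untar(archive_path) >= 0
```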

Extraction Control Properties

Property | Purpose | Guidance
UntarFromDir | Destination root directory for extracted files. | If UntarDiscardPaths is false, the archive directory tree is recreated under this directory.
UntarDiscardPaths | Discards all path information during extraction. | When true, all files are extracted into a single directory. Default is false.
UntarMaxCount | Maximum number of files to extract. | Default is 0, meaning no maximum. Useful with matching patterns to extract only the first desired file.
NoAbsolutePaths | Converts absolute paths to relative paths during extraction. | Default is true. Helps avoid extracting files into system directories such as C:\Windows\system32.
Charset | Character encoding used when interpreting filenames during untar operations. | Default is utf-8 and is typically not changed.
UntarDebugLog | Logs information about each extracted file or directory. | Similar to verbose logging. Output appears in LastErrorText, LastErrorXml, or LastErrorHtml.
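
The interaction between UntarFromDir, UntarDiscardPaths, and NoAbsolutePaths can be modeled with a small pure function. This is a conceptual sketch, not the library's implementation: the function destination_path is hypothetical, backslashes are normalized to forward slashes as a simplifying assumption, and only the leading-separator behavior described in this overview is modeled.

```python
import posixpath

def destination_path(entry_path, untar_from_dir=".",
                     discard_paths=False, no_absolute_paths=True):
    """Model where an archive entry would land on extraction."""
    path = entry_path.replace("\\", "/")   # assumption: treat both separators alike
    if discard_paths:
        path = posixpath.basename(path)    # keep only the file name
    elif no_absolute_paths:
        path = path.lstrip("/")            # drop leading separators
    return posixpath.join(untar_from_dir, path)
```

With no_absolute_paths set to False, posixpath.join discards the destination directory whenever the entry path is absolute, which is the kind of outcome the default setting is meant to prevent.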

Filtering Files During Create or Extract

Property | Behavior | Example
MustMatch | File paths must match this pattern to be included when creating or extracted when untarring. | *.txt includes or extracts only .txt files.
MustNotMatch | File paths matching this pattern are skipped when creating or extracting. | *.obj skips object files.
MatchCaseSensitive | Controls whether MustMatch and MustNotMatch matching is case-sensitive. | Default is false.
UntarCaseSensitive | Deprecated property for case-sensitive matching. | Use MatchCaseSensitive instead.
Pattern syntax: Matching patterns may include zero or more asterisks. Each * represents zero or more characters.
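
The pattern rule can be modeled in a few lines. The sketch below is an approximation for reasoning about filters, not Chilkat's matcher: the function names are hypothetical, an empty pattern is treated as "no filter", and the pattern is assumed to be compared against the whole path with * allowed to span directory separators.

```python
import re

def wildcard_match(pattern, path, case_sensitive=False):
    """Match a pattern in which each * stands for zero or more characters."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    flags = re.DOTALL if case_sensitive else re.DOTALL | re.IGNORECASE
    return re.fullmatch(regex, path, flags) is not None

def is_selected(path, must_match="", must_not_match="", case_sensitive=False):
    """Apply MustMatch-style and MustNotMatch-style filters to one path."""
    if must_match and not wildcard_match(must_match, path, case_sensitive):
        return False
    if must_not_match and wildcard_match(must_not_match, path, case_sensitive):
        return False
    return True
```

A custom matcher is used here instead of fnmatch because fnmatch also gives special meaning to ? and [ ], which the pattern syntax described above does not mention.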

Listing, Verifying, and Inspecting Without Extracting

Member | Purpose | Important Details
ListXml | Scans a TAR archive and returns XML describing the files and directories found within the archive. | Use when the archive should be inspected without extracting.
VerifyTar | Verifies that a TAR archive is valid. | Opens the archive and scans the entire file by walking the TAR headers.
CaptureXmlListing | Captures an XML listing during untar operations. | The captured XML has the same format as ListXml.
XmlListing | Holds the XML listing captured by the last untar operation. | Populated only when CaptureXmlListing was true.
SuppressOutput | Prevents untar methods from producing output. | Useful with CaptureXmlListing to obtain the contents of a TAR archive without extracting.
List-only technique: Set CaptureXmlListing = true and SuppressOutput = true, then call an untar method to capture the archive listing without extracting files.
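
The list-only technique might look like the following. This is a sketch under assumptions: the helper name capture_listing is hypothetical, the binding is assumed to expose properties as attributes, and resetting both properties to False afterwards assumes that False is their default.

```python
def capture_listing(tar, tar_path):
    """Return the XML listing of a plain TAR file without writing any files.

    `tar` is assumed to be a Chilkat Tar-like object (an assumption of
    this sketch).
    """
    tar.CaptureXmlListing = True
    tar.SuppressOutput = True
    try:
        if tar.Untar(tar_path) < 0:
            raise RuntimeError(tar.LastErrorText)
        return tar.XmlListing        # read before the flags are reset
    finally:
        # Restore normal extraction behavior for later calls.
        tar.CaptureXmlListing = False
        tar.SuppressOutput = False
```

Resetting the flags in a finally block keeps a later Untar call on the same object from silently extracting nothing.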

Memory and Single-File Extraction Methods

Method | Input | Output
UntarFirstMatchingToBd | TAR file path and match pattern. | Extracts the first matching file into a supplied BinData object.
UntarFirstMatchingToMemory | In-memory TAR bytes and match pattern. | Extracts and returns the first matching file as bytes.
UntarFromMemory | In-memory TAR bytes. | Extracts archive contents to the local filesystem under UntarFromDir.
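
For readers who want to see what "first matching file to memory" means in practice, here is an analogue written with Python's standard tarfile module rather than Chilkat. It is a stand-in technique, so names such as first_matching_bytes are hypothetical, and the case-insensitive, asterisk-only matching is an assumption carried over from the filtering rules described in this overview.

```python
import io
import re
import tarfile

def first_matching_bytes(tar_bytes, pattern):
    """Return the contents of the first regular file whose path matches.

    Each * in `pattern` stands for zero or more characters. Returns None
    when nothing matches.
    """
    regex = re.compile(
        ".*".join(re.escape(piece) for piece in pattern.split("*")),
        re.IGNORECASE | re.DOTALL,
    )
    # "r:*" lets tarfile detect plain, gzip, or bzip2 input transparently.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tf:
        for member in tf:
            if member.isfile() and regex.fullmatch(member.name):
                return tf.extractfile(member).read()
    return None
```

Nothing is written to disk, and iteration stops at the first match, which is the same trade-off the Chilkat single-file methods are described as offering.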

Directory Roots and Archive Input Inspection

Member | Purpose | Guidance
NumDirRoots | Number of directory roots previously added with AddDirRoot or AddDirRoot2. | Use to confirm how many directory trees are queued for the next WriteTar* call.
GetDirRoot | Returns the Nth directory root value. | Indexing begins at 0.
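
A short sketch of inspecting the queued roots before writing, assuming an attribute-style binding where NumDirRoots is an integer property and GetDirRoot takes a zero-based index; the helper name queued_dir_roots is hypothetical.

```python
def queued_dir_roots(tar):
    """Return the directory roots currently queued on a Tar-like object."""
    return [tar.GetDirRoot(i) for i in range(tar.NumDirRoots)]
```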

Debian Package Helper

Method | Inputs | Output
CreateDeb | control.tar.gz path and data.tar.gz path. | Creates a Debian binary package archive at the specified .deb output path.

Method Summary by Category

Category | Methods | Purpose
Add archive input | AddFile, AddFile2, AddDirRoot, AddDirRoot2, ClearDirRootsAndFiles | Queue files and directory trees for the next archive-writing operation.
Write archives | WriteTar, WriteTarGz, WriteTarBz2 | Create plain, gzip-compressed, or bzip2-compressed TAR archives.
Extract archives | Untar, UntarGz, UntarBz2, UntarZ, UntarFromMemory | Extract plain, compressed, or in-memory TAR archives to the filesystem.
Extract selected content | UntarFirstMatchingToBd, UntarFirstMatchingToMemory | Extract the first archive entry matching a pattern into memory.
Inspect and verify | ListXml, VerifyTar, GetDirRoot | List archive contents, validate TAR headers, or inspect queued directory roots.
Package helper | CreateDeb | Create a Debian .deb package archive from control and data tarballs.
Async support | LoadTaskCaller | Support asynchronous task workflows.

Diagnostics and Troubleshooting

Problem Area | Member | What to Check
Archive does not contain expected files | AddFile*, AddDirRoot*, MustMatch, MustNotMatch | Confirm the input set was added before writing and that filters are not excluding files.
Path inside TAR is not as expected | DirPrefix, AddDirRoot2, AddFile2 | Remember that AddFile2 explicitly controls the path-in-tar and is not affected by DirPrefix.
Extraction writes files into the wrong place | UntarFromDir, UntarDiscardPaths, NoAbsolutePaths | Check the destination directory, whether paths are being discarded, and whether absolute paths are being made relative.
Only one file should be extracted | MustMatch, UntarMaxCount, UntarFirstMatchingToBd, UntarFirstMatchingToMemory | Use a specific match pattern and limit extraction count, or extract the first matching file directly to memory.
Need archive contents without extracting files | ListXml, CaptureXmlListing, SuppressOutput | Use ListXml, or capture an XML listing during an untar call while suppressing output.
Filename characters are interpreted incorrectly | Charset | The default is utf-8. Change only when extracting archives known to use a different filename encoding.
Need detailed extraction logging | UntarDebugLog | Enable to log information about each extracted file or directory to diagnostic output.
Need operation details after failure | LastErrorText | Check diagnostic text after failed or unexpected add, write, list, verify, untar, memory-extract, or package-creation operations.

Common Pitfalls

Pitfall | Better Approach
Calling WriteTar* before adding files or roots. | Call AddFile, AddFile2, AddDirRoot, or AddDirRoot2 first.
Expecting DirPrefix to affect AddFile2. | AddFile2 directly specifies the path within the TAR, so DirPrefix does not apply.
Adding an AddDirRoot2 root prefix with a trailing slash. | Use a prefix such as abc/123, not abc/123/.
Disabling NoAbsolutePaths without a strong reason. | Keep the default enabled to avoid extracting archive entries into unexpected absolute paths.
Using UntarDiscardPaths and expecting directory trees to be recreated. | Leave UntarDiscardPaths false when the archive’s directory structure should be preserved.
Expecting UntarCaseSensitive to be the primary setting. | Use MatchCaseSensitive; the old property is deprecated.
Expecting the Untar* methods to return file contents. | TAR extraction methods write files or return counts/booleans; use UntarFirstMatchingToBd or UntarFirstMatchingToMemory for in-memory extraction.

Best Practices

Recommendation | Reason
Use VerifyTar before extracting untrusted archives. | It scans the TAR headers and helps identify invalid archives before extraction.
Leave NoAbsolutePaths enabled for extraction. | It helps prevent archive entries from writing into important system locations.
Use MustMatch and MustNotMatch to limit archive content. | The same filtering model works for both creating and extracting archives.
Use AddFile2 when the exact path-in-tar matters. | It explicitly controls the stored archive path for that file.
Use CaptureXmlListing and SuppressOutput for listing-only extraction workflows. | This allows archive contents to be inspected without writing files.
Use UntarFirstMatchingToBd or UntarFirstMatchingToMemory for single-file access. | These methods avoid extracting the entire archive when only one matching file is needed.
Call ClearDirRootsAndFiles before reusing the object for a new TAR. | It prevents previously queued files or directory roots from being included accidentally.
Check LastErrorText after failures. | It provides useful diagnostic detail for writing, extracting, listing, verifying, filtering, path handling, and compressed archive operations.

Summary

Chilkat.Tar is the Chilkat class for creating, reading, listing, verifying, and extracting TAR archives. It supports plain TAR and common compressed TAR formats, file and directory-tree inputs, path-prefix control, permissions metadata, match-based filtering, safe extraction defaults, XML archive listings, memory-based extraction, and Debian package creation.

The most important practical guidance is to add files or directory roots before writing, use AddFile2 when the internal archive path must be explicit, keep NoAbsolutePaths enabled when extracting, use match patterns to limit what is written or extracted, and inspect LastErrorText when an operation fails.