# node-tar
Fast and full-featured Tar for Node.js
The API is designed to mimic the behavior of `tar(1)` on unix systems.
If you are familiar with how tar works, most of this will hopefully be
straightforward for you. If not, then hopefully this module can teach
you useful unix skills that may come in handy someday :)
## Background
A "tar file" or "tarball" is an archive of file system entries
(directories, files, links, etc.) The name comes from "tape archive".
If you run `man tar` on almost any Unix command line, you'll learn
quite a bit about what it can do, and its history.
Tar has 5 main top-level commands:
* `c` Create an archive
* `r` Replace entries within an archive
* `u` Update entries within an archive (ie, replace if they're newer)
* `t` List out the contents of an archive
* `x` Extract an archive to disk
The other flags and options modify how this top level function works.
## High-Level API
These 5 functions are the high-level API. All of them have a
single-character name (for unix nerds familiar with `tar(1)`) as well
as a long name (for everyone else).
All the high-level functions take the following arguments, all three
of which are optional and may be omitted.
1. `options` - An optional object specifying various options
2. `paths` - An array of paths to add or extract
3. `callback` - Called when the command is completed, if async. (If
sync or no file specified, providing a callback throws a
`TypeError`.)
If the command is sync (ie, if `options.sync=true`), then the
callback is not allowed, since the action will be completed immediately.
If a `file` argument is specified, and the command is async, then a
`Promise` is returned. In this case, if async, a callback may be
provided which is called when the command is completed.
If a `file` option is not specified, then a stream is returned. For
`create`, this is a readable stream of the generated archive. For
`list` and `extract` this is a writable stream that an archive should
be written into. If a file is not specified, then a callback is not
allowed, because you're already getting a stream to work with.
`replace` and `update` only work on existing archives, and so require
a `file` argument.
Sync commands without a file argument return a stream that acts on its
input immediately in the same tick. For readable streams, this means
that all of the data is immediately available by calling
`stream.read()`. For writable streams, it will be acted upon as soon
as it is provided, but this can be at any time.
### Warnings and Errors
Tar emits warnings and errors for recoverable and unrecoverable situations,
respectively. In many cases, a warning only affects a single entry in an
archive, or is simply informing you that it's modifying an entry to comply
with the settings provided.
Unrecoverable warnings will always raise an error (ie, emit `'error'` on
streaming actions, throw for non-streaming sync actions, reject the
returned Promise for non-streaming async operations, or call a provided
callback with an `Error` as the first argument). Recoverable errors will
raise an error only if `strict: true` is set in the options.
Respond to (recoverable) warnings by listening to the `warn` event.
Handlers receive 3 arguments:
- `code` String. One of the error codes below. This may not match
`data.code`, which preserves the original error code from fs and zlib.
- `message` String. More details about the error.
- `data` Metadata about the error. An `Error` object for errors raised by
fs and zlib. All fields are attached to errors raisd by tar. Typically
contains the following fields, as relevant:
- `tarCode` The tar error code.
- `code` Either the tar error code, or the error code set by the
underlying system.
- `file` The archive file being read or written.
- `cwd` Working directory for creation and extraction operations.
- `entry` The entry object (if it could be created) for `TAR_ENTRY_INFO`,
`TAR_ENTRY_INVALID`, and `TAR_ENTRY_ERROR` warnings.
- `header` The header object (if it could be created, and the entry could
not be created) for `TAR_ENTRY_INFO` and `TAR_ENTRY_INVALID` warnings.
- `recoverable` Boolean. If `false`, then the warning will emit an
`error`, even in non-strict mode.
#### Error Codes
* `TAR_ENTRY_INFO` An informative error indicating that an entry is being
modified, but otherwise processed normally. For example, removing `/` or
`C:\` from absolute paths if `preservePaths` is not set.
* `TAR_ENTRY_INVALID` An indication that a given entry is not a valid tar
archive entry, and will be skipped. This occurs when:
- a checksum fails,
- a `linkpath` is missing for a link type, or
- a `linkpath` is provided for a non-link type.
If every entry in a parsed archive raises an `TAR_ENTRY_INVALID` error,
then the archive is presumed to be unrecoverably broken, and
`TAR_BAD_ARCHIVE` will be raised.
* `TAR_ENTRY_ERROR` The entry appears to be a valid tar archive entry, but
encountered an error which prevented it from being unpacked. This occurs
when:
- an unrecoverable fs error happens during unpacking,
- an entry has `..` in the path and `preservePaths` is not set, or
- an entry is extracting through a symbolic link, when `preservePaths` is
not set.
* `TAR_ENTRY_UNSUPPORTED` An indication that a given entry is
a valid archive entry, but of a type that is unsupported, and so will be
skipped in archive creation or extracting.
* `TAR_ABORT` When parsing gzipped-encoded archives, the parser will
abort the parse process raise a warning for any zlib errors encountered.
Aborts are considered unrecoverable for both parsing and unpacking.
* `TAR_BAD_ARCHIVE` The archive file is totally hosed. This can happen for
a number of reasons, and always occurs at the end of a parse or extract:
- An entry body was truncated before seeing the full number of bytes.
- The archive contained only invalid entries, indicating that it is
likely not an archive, or at least, not an archive this library can
parse.
`TAR_BAD_ARCHIVE` is considered informative for parse operations, but
unrecoverable for extraction. Note that, if encountered at the end of an
extraction, tar WILL still have extracted as much it could from the
archive, so there may be some garbage files to clean up.
Errors that occur deeper in the system (ie, either the filesystem or zlib)
will have their error codes left intact, and a `tarCode` matching one of
the above will be added to the warning metadata or the raised error object.
Errors generated by tar will have one of the above codes set as the
`error.code` field as well, but since errors originating in zlib or fs will
have their original codes, it's better to read `error.tarCode` if you wish
to see how tar is handling the issue.
### Examples
The API mimics the `tar(1)` command line functionality, with aliases
for more human-readable option and function names. The goal is that
if you know how to use `tar(1)` in Unix, then you know how to use
`require('tar')` in JavaScript.
To replicate `tar czf my-tarball.tgz files and folders`, you'd do:
```js
tar.c(
{
gzip: <true|gzip options>,
file: 'my-tarball.tgz'
},
['some', 'files', 'and', 'folders']
).then(_ => { .. tarball has been created .. })
```
To replicate `tar cz files and folders > my-tarball.tgz`, you'd do:
```js
tar.c( // or tar.create
{
gzip: <true|gzip options>
},
['some', 'files', 'and', 'folders']
).pipe(fs.createWriteStream('my-tarball.tgz'))
```
To replicate `tar xf my-tarball.tgz` you'd do:
```js
tar.x( // or tar.extract(
{
file: 'my-tarball.tgz'
}
).then(_=> { .. tarball has been dumped in cwd .. })
```
To replicate `cat my-tarball.tgz | tar x -C some-dir --strip=1`:
```js
fs.createReadStream('my-tarball.tgz').pipe(
tar.x({
strip: 1,
C: 'some-dir' // alias for cwd:'some-dir', also ok
})
)
```
To replicate `tar tf my-tarball.tgz`, do thi