# yauzl
[![Build Status](https://travis-ci.org/thejoshwolfe/yauzl.svg?branch=master)](https://travis-ci.org/thejoshwolfe/yauzl)
[![Coverage Status](https://img.shields.io/coveralls/thejoshwolfe/yauzl.svg)](https://coveralls.io/r/thejoshwolfe/yauzl)
yet another unzip library for node. For zipping, see
[yazl](https://github.com/thejoshwolfe/yazl).
Design principles:
* Follow the spec.
Don't scan for local file headers.
Read the central directory for file metadata.
(see [No Streaming Unzip API](#no-streaming-unzip-api)).
* Don't block the JavaScript thread.
Use and provide async APIs.
* Keep memory usage under control.
Don't attempt to buffer entire files in RAM at once.
* Never crash (if used properly).
Don't let malformed zip files bring down client applications who are trying to catch errors.
* Catch unsafe file names.
See `validateFileName()`.
## Usage
```js
var yauzl = require("yauzl");
yauzl.open("path/to/file.zip", {lazyEntries: true}, function(err, zipfile) {
if (err) throw err;
zipfile.readEntry();
zipfile.on("entry", function(entry) {
if (/\/$/.test(entry.fileName)) {
// Directory file names end with '/'.
// Note that entires for directories themselves are optional.
// An entry's fileName implicitly requires its parent directories to exist.
zipfile.readEntry();
} else {
// file entry
zipfile.openReadStream(entry, function(err, readStream) {
if (err) throw err;
readStream.on("end", function() {
zipfile.readEntry();
});
readStream.pipe(somewhere);
});
}
});
});
```
See also `examples/` for more usage examples.
## API
The default for every optional `callback` parameter is:
```js
function defaultCallback(err) {
if (err) throw err;
}
```
### open(path, [options], [callback])
Calls `fs.open(path, "r")` and reads the `fd` effectively the same as `fromFd()` would.
`options` may be omitted or `null`. The defaults are `{autoClose: true, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}`.
`autoClose` is effectively equivalent to:
```js
zipfile.once("end", function() {
zipfile.close();
});
```
`lazyEntries` indicates that entries should be read only when `readEntry()` is called.
If `lazyEntries` is `false`, `entry` events will be emitted as fast as possible to allow `pipe()`ing
file data from all entries in parallel.
This is not recommended, as it can lead to out of control memory usage for zip files with many entries.
See [issue #22](https://github.com/thejoshwolfe/yauzl/issues/22).
If `lazyEntries` is `true`, an `entry` or `end` event will be emitted in response to each call to `readEntry()`.
This allows processing of one entry at a time, and will keep memory usage under control for zip files with many entries.
`decodeStrings` is the default and causes yauzl to decode strings with `CP437` or `UTF-8` as required by the spec.
The exact effects of turning this option off are:
* `zipfile.comment`, `entry.fileName`, and `entry.fileComment` will be `Buffer` objects instead of `String`s.
* Any Info-ZIP Unicode Path Extra Field will be ignored. See `extraFields`.
* Automatic file name validation will not be performed. See `validateFileName()`.
`validateEntrySizes` is the default and ensures that an entry's reported uncompressed size matches its actual uncompressed size.
This check happens as early as possible, which is either before emitting each `"entry"` event (for entries with no compression),
or during the `readStream` piping after calling `openReadStream()`.
See `openReadStream()` for more information on defending against zip bomb attacks.
When `strictFileNames` is `false` (the default) and `decodeStrings` is `true`,
all backslash (`\`) characters in each `entry.fileName` are replaced with forward slashes (`/`).
The spec forbids file names with backslashes,
but Microsoft's `System.IO.Compression.ZipFile` class in .NET versions 4.5.0 until 4.6.1
creates non-conformant zipfiles with backslashes in file names.
`strictFileNames` is `false` by default so that clients can read these
non-conformant zipfiles without knowing about this Microsoft-specific bug.
When `strictFileNames` is `true` and `decodeStrings` is `true`,
entries with backslashes in their file names will result in an error. See `validateFileName()`.
When `decodeStrings` is `false`, `strictFileNames` has no effect.
The `callback` is given the arguments `(err, zipfile)`.
An `err` is provided if the End of Central Directory Record cannot be found, or if its metadata appears malformed.
This kind of error usually indicates that this is not a zip file.
Otherwise, `zipfile` is an instance of `ZipFile`.
### fromFd(fd, [options], [callback])
Reads from the fd, which is presumed to be an open .zip file.
Note that random access is required by the zip file specification,
so the fd cannot be an open socket or any other fd that does not support random access.
`options` may be omitted or `null`. The defaults are `{autoClose: false, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}`.
See `open()` for the meaning of the options and callback.
### fromBuffer(buffer, [options], [callback])
Like `fromFd()`, but reads from a RAM buffer instead of an open file.
`buffer` is a `Buffer`.
If a `ZipFile` is acquired from this method,
it will never emit the `close` event,
and calling `close()` is not necessary.
`options` may be omitted or `null`. The defaults are `{lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}`.
See `open()` for the meaning of the options and callback.
The `autoClose` option is ignored for this method.
### fromRandomAccessReader(reader, totalSize, [options], [callback])
This method of reading a zip file allows clients to implement their own back-end file system.
For example, a client might translate read calls into network requests.
The `reader` parameter must be of a type that is a subclass of
[RandomAccessReader](#class-randomaccessreader) that implements the required methods.
The `totalSize` is a Number and indicates the total file size of the zip file.
`options` may be omitted or `null`. The defaults are `{autoClose: true, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}`.
See `open()` for the meaning of the options and callback.
### dosDateTimeToDate(date, time)
Converts MS-DOS `date` and `time` data into a JavaScript `Date` object.
Each parameter is a `Number` treated as an unsigned 16-bit integer.
Note that this format does not support timezones,
so the returned object will use the local timezone.
### validateFileName(fileName)
Returns `null` or a `String` error message depending on the validity of `fileName`.
If `fileName` starts with `"/"` or `/[A-Za-z]:\//` or if it contains `".."` path segments or `"\\"`,
this function returns an error message appropriate for use like this:
```js
var errorMessage = yauzl.validateFileName(fileName);
if (errorMessage != null) throw new Error(errorMessage);
```
This function is automatically run for each entry, as long as `decodeStrings` is `true`.
See `open()`, `strictFileNames`, and `Event: "entry"` for more information.
### Class: ZipFile
The constructor for the class is not part of the public API.
Use `open()`, `fromFd()`, `fromBuffer()`, or `fromRandomAccessReader()` instead.
#### Event: "entry"
Callback gets `(entry)`, which is an `Entry`.
See `open()` and `readEntry()` for when this event is emitted.
If `decodeStrings` is `true`, entries emitted via this event have already passed file name validation.
See `validateFileName()` and `open()` for more information.
If `validateEntrySizes` is `true` and this entry's `compressionMethod` is `0` (stored without compression),
this entry has already passed entry size validation.
See `open()` for more information.
#### Event: "end"
Emitted after the last `entry` event has been