跳到主要内容

二进制数据

This page is intended as an introduction to working with binary data in JavaScript. Bun implements a number of data types and utilities for working with binary data, most of which are Web-standard. Any Bun-specific APIs will be noted as such.

Below is a quick "cheat sheet" that doubles as a table of contents. Click an item in the left column to jump to that section.

ArrayBuffer and views

Until 2009, there was no language-native way to store and manipulate binary data in JavaScript. ECMAScript v5 introduced a range of new mechanisms for this. The most fundamental building block is ArrayBuffer, a simple data structure that represents a sequence of bytes in memory.

// this buffer can store 8 bytes
const buf = new ArrayBuffer(8);

Despite the name, it isn't an array and supports none of the array methods and operators one might expect. In fact, there is no way to directly read or write values from an ArrayBuffer. There's very little you can do with one except check its size and create "slices" from it.

const buf = new ArrayBuffer(8);
buf.byteLength; // => 8

const slice = buf.slice(0, 4); // returns new ArrayBuffer
slice.byteLength; // => 4

To do anything interesting we need a construct known as a "view". A view is a class that wraps an ArrayBuffer instance and lets you read and manipulate the underlying data. There are two types of views: typed arrays and DataView.

DataView

The DataView class is a lower-level interface for reading and manipulating the data in an ArrayBuffer.

Below we create a new DataView and set the first byte to 3.

const buf = new ArrayBuffer(4);
// [0b00000000, 0b00000000, 0b00000000, 0b00000000]

const dv = new DataView(buf);
dv.setUint8(0, 3); // write value 3 at byte offset 0
dv.getUint8(0); // => 3
// [0b00000011, 0b00000000, 0b00000000, 0b00000000]

Now let's write a Uint16 at byte offset 1. This requires two bytes. We're using the value 513, which is 2 * 256 + 1; in bytes, that's 00000010 00000001.

dv.setUint16(1, 513);
// [0b00000011, 0b00000010, 0b00000001, 0b00000000]

console.log(dv.getUint16(1)); // => 513

We've now assigned a value to the first three bytes in our underlying ArrayBuffer. Even though the second and third bytes were created using setUint16(), we can still read each of its component bytes using getUint8().

console.log(dv.getUint8(1)); // => 2
console.log(dv.getUint8(2)); // => 1

Attempting to write a value that requires more space than is available in the underlying ArrayBuffer will cause an error. Below we attempt to write a Float64 (which requires 8 bytes) at byte offset 0, but there are only four total bytes in the buffer.

dv.setFloat64(0, 3.1415);
// ^ RangeError: Out of bounds access

The following methods are available on DataView:

TypedArray

Typed arrays are a family of classes that provide an Array-like interface for interacting with data in an ArrayBuffer. Whereas a DataView lets you write numbers of varying size at a particular offset, a TypedArray interprets the underlying bytes as an array of numbers, each of a fixed size.

备注

Note — It's common to refer to this family of classes collectively by their shared superclass TypedArray. This class as internal to JavaScript; you can't directly create instances of it, and TypedArray is not defined in the global scope. Think of it as an interface or an abstract class.

const buffer = new ArrayBuffer(3);
const arr = new Uint8Array(buffer);

// contents are initialized to zero
console.log(arr); // Uint8Array(3) [0, 0, 0]

// assign values like an array
arr[0] = 0;
arr[1] = 10;
arr[2] = 255;
arr[3] = 255; // no-op, out of bounds

While an ArrayBuffer is a generic sequence of bytes, these typed array classes interpret the bytes as an array of numbers of a given byte size. The top row contains the raw bytes, and the later rows contain how these bytes will be interpreted when viewed using different typed array classes.

The following classes are typed arrays, along with a description of how they interpret the bytes in an ArrayBuffer:

The table below demonstrates how the bytes in an ArrayBuffer are interpreted when viewed using different typed array classes.

To create a typed array from a pre-defined ArrayBuffer:

// create typed array from ArrayBuffer
const buf = new ArrayBuffer(10);
const arr = new Uint8Array(buf);

arr[0] = 30;
arr[1] = 60;

// all elements are initialized to zero
console.log(arr); // => Uint8Array(10) [ 30, 60, 0, 0, 0, 0, 0, 0, 0, 0 ];

If we tried to instantiate a Uint32Array from this same ArrayBuffer, we'd get an error.

const buf = new ArrayBuffer(10);
const arr = new Uint32Array(buf);
// ^ RangeError: ArrayBuffer length minus the byteOffset
// is not a multiple of the element size

A Uint32 value requires four bytes (16 bits). Because the ArrayBuffer is 10 bytes long, there's no way to cleanly divide its contents into 4-byte chunks.

To fix this, we can create a typed array over a particular "slice" of an ArrayBuffer. The Uint16Array below only "views" the first 8 bytes of the underlying ArrayBuffer. To achieve these, we specify a byteOffset of 0 and a length of 2, which indicates the number of Uint32 numbers we want our array to hold.

// create typed array from ArrayBuffer slice
const buf = new ArrayBuffer(10);
const arr = new Uint32Array(buf, 0, 2);

/*
buf _ _ _ _ _ _ _ _ _ _ 10 bytes
arr [_______,_______] 2 4-byte elements
*/

arr.byteOffset; // 0
arr.length; // 2

You don't need to explicitly create an ArrayBuffer instance; you can instead directly specify a length in the typed array constructor:

const arr2 = new Uint8Array(5);

// all elements are initialized to zero
// => Uint8Array(5) [0, 0, 0, 0, 0]

Typed arrays can also be instantiated directly from an array of numbers, or another typed array:

// from an array of numbers
const arr1 = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);
arr1[0]; // => 0;
arr1[7]; // => 7;

// from another typed array
const arr2 = new Uint8Array(arr);

Broadly speaking, typed arrays provide the same methods as regular arrays, with a few exceptions. For example, push and pop are not available on typed arrays, because they would require resizing the underlying ArrayBuffer.

const arr = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);

// supports common array methods
arr.filter(n => n > 128); // Uint8Array(1) [255]
arr.map(n => n * 2); // Uint8Array(8) [0, 2, 4, 6, 8, 10, 12, 14]
arr.reduce((acc, n) => acc + n, 0); // 28
arr.forEach(n => console.log(n)); // 0 1 2 3 4 5 6 7
arr.every(n => n < 10); // true
arr.find(n => n > 5); // 6
arr.includes(5); // true
arr.indexOf(5); // 5

Refer to the MDN documentation for more information on the properties and methods of typed arrays.

Uint8Array

It's worth specifically highlighting Uint8Array, as it represents a classic "byte array"—a sequence of 8-bit unsigned integers between 0 and 255. This is the most common typed array you'll encounter in JavaScript.

It is the return value of TextEncoder#encode, and the input type of TextDecoder#decode, two utility classes designed to translate strings and various binary encodings, most notably "utf-8".

const encoder = new TextEncoder();
const bytes = encoder.encode("hello world");
// => Uint8Array(11) [ 104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100 ]

const decoder = new TextDecoder();
const text = decoder.decode(bytes);
// => hello world

Buffer

Bun implements Buffer, a Node.js API for working with binary data that pre-dates the introduction of typed arrays in the JavaScript spec. It has since been re-implemented as a subclass of Uint8Array. It provides a wide range of methods, including several Array-like and DataView-like methods.

const buf = Buffer.from("hello world");
// => Buffer(11) [ 104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100 ]

buf.length; // => 11
buf[0]; // => 104, ascii for 'h'
buf.writeUInt8(72, 0); // => ascii for 'H'

console.log(buf.toString());
// => Hello world

For complete documentation, refer to the Node.js documentation.

Blob

Blob is a Web API commonly used for representing files. Blob was initially implemented in browsers (unlike ArrayBuffer which is part of JavaScript itself), but it is now supported in Node and Bun.

It isn't common to directly create Blob instances. More often, you'll receive instances of Blob from an external source (like an <input type="file"> element in the browser) or library. That said, it is possible to create a Blob from one or more string or binary "blob parts".

const blob = new Blob(["<html>Hello</html>"], {
type: "text/html",
});

blob.type; // => text/html
blob.size; // => 19

These parts can be string, ArrayBuffer, TypedArray, DataView, or other Blob instances. The blob parts are concatenated together in the order they are provided.

const blob = new Blob([
"<html>",
new Blob(["<body>"]),
new Uint8Array([104, 101, 108, 108, 111]), // "hello" in binary
"</body></html>",
]);

The contents of a Blob can be asynchronously read in various formats.

await blob.text(); // => <html><body>hello</body></html>
await blob.arrayBuffer(); // => ArrayBuffer (copies contents)
await blob.stream(); // => ReadableStream

BunFile

BunFile is a subclass of Blob used to represent a lazily-loaded file on disk. Like File, it adds a name and lastModified property. Unlike File, it does not require the file to be loaded into memory.

const file = Bun.file("index.txt");
// => BunFile

File

备注

Browser only. Experimental support in Node.js 20.

File is a subclass of Blob that adds a name and lastModified property. It's commonly used in the browser to represent files uploaded via a <input type="file"> element. Node.js and Bun implement File.

// on browser!
// <input type="file" id="file" />

const files = document.getElementById("file").files;
// => File[]
const file = new File(["<html>Hello</html>"], "index.html", {
type: "text/html",
});

Refer to the MDN documentation for complete docs information.

Streams

Streams are an important abstraction for working with binary data without loading it all into memory at once. They are commonly used for reading and writing files, sending and receiving network requests, and processing large amounts of data.

Bun implements the Web APIs ReadableStream and WritableStream.

备注

Bun also implements the node:stream module, including Readable, Writable, and Duplex. For complete documentation, refer to the Node.js docs.

To create a simple readable stream:

const stream = new ReadableStream({
start(controller) {
controller.enqueue("hello");
controller.enqueue("world");
controller.close();
},
});

The contents of this stream can be read chunk-by-chunk with for await syntax.

for await (const chunk of stream) {
console.log(chunk);
// => "hello"
// => "world"
}

For a more complete discussion of streams in Bun, see API > Streams.

Conversion

Converting from one binary format to another is a common task. This section is intended as a reference.

From ArrayBuffer

Since ArrayBuffer stores the data that underlies other binary structures like TypedArray, the snippets below are not converting from ArrayBuffer to another format. Instead, they are creating a new instance using the data stored underlying data.

To TypedArray

new Uint8Array(buf);

To DataView

new DataView(buf);

To Buffer

// create Buffer over entire ArrayBuffer
Buffer.from(buf);

// create Buffer over a slice of the ArrayBuffer
Buffer.from(buf, 0, 10);

To string

new TextDecoder().decode(buf);

To number[]

Array.from(new Uint8Array(buf));

To Blob

new Blob([buf], { type: "text/plain" });

To ReadableStream

The following snippet creates a ReadableStream and enqueues the entire ArrayBuffer as a single chunk.

new ReadableStream({
start(controller) {
controller.enqueue(buf);
controller.close();
},
});
With chunking

To stream the ArrayBuffer in chunks, use a Uint8Array view and enqueue each chunk.

const view = new Uint8Array(buf);
const chunkSize = 1024;

new ReadableStream({
start(controller) {
for (let i = 0; i < view.length; i += chunkSize) {
controller.enqueue(view.slice(i, i + chunkSize));
}
controller.close();
},
});

From TypedArray

To ArrayBuffer

This retrieves the underlying ArrayBuffer. Note that a TypedArray can be a view of a slice of the underlying buffer, so the sizes may differ.

arr.buffer;

To DataView

To creates a DataView over the same byte range as the TypedArray.

new DataView(arr.buffer, arr.byteOffset, arr.byteLength);

To Buffer

Buffer.from(arr);

To string

new TextDecoder().decode(arr);

To number[]

Array.from(arr);

To Blob

new Blob([arr.buffer], { type: "text/plain" });

To ReadableStream

new ReadableStream({
start(controller) {
controller.enqueue(arr);
controller.close();
},
});
With chunking

To stream the ArrayBuffer in chunks, split the TypedArray into chunks and enqueue each one individually.

new ReadableStream({
start(controller) {
for (let i = 0; i < arr.length; i += chunkSize) {
controller.enqueue(arr.slice(i, i + chunkSize));
}
controller.close();
},
});

From DataView

To ArrayBuffer

view.buffer;

To TypedArray

Only works if the byteLength of the DataView is a multiple of the BYTES_PER_ELEMENT of the TypedArray subclass.

new Uint8Array(view.buffer, view.byteOffset, view.byteLength);
new Uint16Array(view.buffer, view.byteOffset, view.byteLength / 2);
new Uint32Array(view.buffer, view.byteOffset, view.byteLength / 4);
// etc...

To Buffer

Buffer.from(view.buffer, view.byteOffset, view.byteLength);

To string

new TextDecoder().decode(view);

To number[]

Array.from(view);

To Blob

new Blob([view.buffer], { type: "text/plain" });

To ReadableStream

new ReadableStream({
start(controller) {
controller.enqueue(view.buffer);
controller.close();
},
});
With chunking

To stream the ArrayBuffer in chunks, split the DataView into chunks and enqueue each one individually.

new ReadableStream({
start(controller) {
for (let i = 0; i < view.byteLength; i += chunkSize) {
controller.enqueue(view.buffer.slice(i, i + chunkSize));
}
controller.close();
},
});

From Buffer

To ArrayBuffer

buf.buffer;

To TypedArray

new Uint8Array(buf);

To DataView

new DataView(buf.buffer, buf.byteOffset, buf.byteLength);

To string

buf.toString();

To number[]

Array.from(buf);

To Blob

new Blob([buf], { type: "text/plain" });

To ReadableStream

new ReadableStream({
start(controller) {
controller.enqueue(buf);
controller.close();
},
});
With chunking

To stream the ArrayBuffer in chunks, split the Buffer into chunks and enqueue each one individually.

new ReadableStream({
start(controller) {
for (let i = 0; i < buf.length; i += chunkSize) {
controller.enqueue(buf.slice(i, i + chunkSize));
}
controller.close();
},
});

From Blob

To ArrayBuffer

The Blob class provides a convenience method for this purpose.

await blob.arrayBuffer();

To TypedArray

new Uint8Array(await blob.arrayBuffer());

To DataView

new DataView(await blob.arrayBuffer());

To Buffer

Buffer.from(await blob.arrayBuffer());

To string

await blob.text();

To number[]

Array.from(new Uint8Array(await blob.arrayBuffer()));

To ReadableStream

blob.stream();

From ReadableStream

It's common to use Response as a convenient intermediate representation to make it easier to convert ReadableStream to other formats.

stream; // ReadableStream

const buffer = new Response(stream).arrayBuffer();

However this approach is verbose and adds overhead that slows down overall performance unnecessarily. Bun implements a set of optimized convenience functions for converting ReadableStream various binary formats.

To ArrayBuffer

// with Response
new Response(stream).arrayBuffer();

// with Bun function
Bun.readableStreamToArrayBuffer(stream);

To Uint8Array

// with Response
new Response(stream).bytes();

// with Bun function
Bun.readableStreamToBytes(stream);

To TypedArray

// with Response
const buf = await new Response(stream).arrayBuffer();
new Int8Array(buf);

// with Bun function
new Int8Array(Bun.readableStreamToArrayBuffer(stream));

To DataView

// with Response
const buf = await new Response(stream).arrayBuffer();
new DataView(buf);

// with Bun function
new DataView(Bun.readableStreamToArrayBuffer(stream));

To Buffer

// with Response
const buf = await new Response(stream).arrayBuffer();
Buffer.from(buf);

// with Bun function
Buffer.from(Bun.readableStreamToArrayBuffer(stream));

To string

// with Response
new Response(stream).text();

// with Bun function
await Bun.readableStreamToText(stream);

To number[]

// with Response
const buf = await new Response(stream).arrayBuffer();
Array.from(new Uint8Array(buf));

// with Bun function
Array.from(new Uint8Array(Bun.readableStreamToArrayBuffer(stream)));

Bun provides a utility for resolving a ReadableStream to an array of its chunks. Each chunk may be a string, typed array, or ArrayBuffer.

// with Bun function
Bun.readableStreamToArray(stream);

To Blob

new Response(stream).blob();

To ReadableStream

To split a ReadableStream into two streams that can be consumed independently:

const [a, b] = stream.tee();