Description
Motivation
deck.gl has multiple feature areas that would benefit from a shared & standardized computation module. Example use cases:
- To consume data input in memory-efficient formats such as Arrow and Parquet
- Attribute animation/transition (currently implemented with ad-hoc transforms)
- Data aggregation (currently implemented with ad-hoc transforms)
- To perform one-time data transforms such as 64-bit split and coordinate system conversion (currently implemented on the CPU)
Proposal
Create a new module @luma.gl/gpgpu.
The proposed syntax is strongly inspired by tensorflow.js, especially the functions in Creation, Slicing and Joining, Arithmetic, Basic Math & Reduction.
Example API
Creation: returns a wrapper of a GPU buffer
// A column with stride of 1, example: Int32Array [property0, property1, property2, ...]
gpu.array1d(data: TypedArray): GPUAttribute
// A column with stride > 1, example: Float64Array [x0, y0, x1, y1, x2, y2, ...]
gpu.array2d(data: TypedArray, shape: [number, number]): GPUAttribute
// Constant scalar|vec2|vec3|vec4
gpu.constant(value: number | NumericArray): GPUAttribute
Reshape: joining, slicing and/or rearranging GPU buffers
// Interleaving multiple columns, example: [x0, x1, x2, ...] + [y0, y1, y2, ...] -> [x0, y0, x1, y1, x2, y2, ...]
gpu.stack(values: GPUAttribute[]): GPUAttribute
// N-D slicing of an interleaved buffer, example: [a0, b0, c0, a1, b1, c1, a2, b2, c2, ...] -> [a0, c0, a1, c1, a2, c2, ...]
gpu.slice(value: GPUAttribute, begin: number[], size: number[]): GPUAttribute
Transform: math operations on GPU buffers
// element-wise add
gpu.add(value: GPUAttribute, operand: number | GPUAttribute): GPUAttribute
// min value across dimensions
gpu.min(value: GPUAttribute): number
// map each 64-bit float element to two 32-bit floats, as in highPart = Math.fround(x) and lowPart = x - highPart
gpu.fp64Split(value: GPUAttribute): [highPart: GPUAttribute, lowPart: GPUAttribute]
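For illustration, a hedged end-to-end sketch that chains the calls above (every function here is part of the proposed, not-yet-existing API, so names and signatures are tentative):

```ts
import {gpu} from '@luma.gl/gpgpu';

// Wrap two plain columns as GPU-backed attributes
const x = gpu.array1d(new Float64Array([0, 1, 2, 3]));
const y = gpu.array1d(new Float64Array([10, 11, 12, 13]));

// Interleave into [x0, y0, x1, y1, ...], then shift every element by 1
const positions = gpu.stack([x, y]);
const shifted = gpu.add(positions, 1);

// Split 64-bit floats into high/low 32-bit parts for precision-safe rendering
const [highPart, lowPart] = gpu.fp64Split(shifted);

// Reductions read a single value back to the CPU
const minValue = gpu.min(shifted);
```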
Interface with loaders.gl
There is no direct dependency on loaders.gl, but the module can be "loaders friendly" by accepting a Table-shaped input:
gpu.array1d(table: Table, columnNameOrIndex: string | number): GPUAttribute
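For example, a hypothetical usage with a loaded Arrow table could look like this (load and ArrowLoader are existing loaders.gl exports; the gpu.array1d overload is the one proposed above):

```ts
import {load} from '@loaders.gl/core';
import {ArrowLoader} from '@loaders.gl/arrow';
import {gpu} from '@luma.gl/gpgpu';

// Load an Arrow file into a table, then wrap one of its columns as a
// GPU attribute without first copying it into a plain JS array
const table = await load('points.arrow', ArrowLoader);
const radius = gpu.array1d(table, 'radius_value');
```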
Interface with deck.gl
deck.gl could add support for accessors that return a @luma.gl/gpgpu GPUAttribute object. If such an accessor is provided, instead of filling an attribute's value array on the CPU, the underlying GPU buffer is directly transferred.
Sample layer with JSON input:
import {ScatterplotLayer} from '@deck.gl/layers';
import {extent} from 'd3-array';
import {scaleLog} from 'd3-scale';
import memoize from 'memoize';
const getColorScaleMemoized = memoize(
  data => scaleLog()
    .domain(extent(data, d => d.color_value))
    .range([[0, 200, 255, 255], [255, 180, 0, 255]])
);
const layer = new ScatterplotLayer({
data: 'points.json',
getPosition: d => [d.x, d.y, d.z],
getRadius: d => Math.max(Math.min(d.radius_value * 10, 100), 1),
getFillColor: (d, {data}) => getColorScaleMemoized(data)(d.color_value)
});
Equivalent layer with Arrow input (option A):
import {ScatterplotLayer} from '@deck.gl/layers';
import {gpu} from '@luma.gl/gpgpu';
import {ArrowLoader} from '@loaders.gl/arrow';
import type {Table, ArrowTableBatch} from '@loaders.gl/schema';
const layer = new ScatterplotLayer({
data: 'points.arrow',
loaders: [ArrowLoader],
getPosition: (_, {data}: {data: Table | ArrowTableBatch}) => {
const x = gpu.array1d(data, 'x');
const y = gpu.array1d(data, 'y');
const z = gpu.constant(0);
return gpu.stack([x, y, z]);
},
getRadius: (_, {data}: {data: Table | ArrowTableBatch}) => {
const value = gpu.array1d(data, 'radius_value');
return value.mul(10).clipByValue(1, 100);
},
getFillColor: (_, {data}: {data: Table | ArrowTableBatch}) => {
const value = gpu.array1d(data, 'color_value');
return value.scaleLog([[0, 200, 255, 255], [255, 180, 0, 255]]);
}
});
Equivalent declarative layer with Arrow input (option B):
{
"type": "ScatterplotLayer",
"data": "points.arrow",
"getPosition": ["x", "y", "z"],
"getRadius": {
"source": "radius_value",
"transform": [
{"func": "mul", "args": [10]},
{"func": "clipByValue", "args": [1, 100]}
]
},
"getFillColor": {
"source": "color_value",
"transform": [
{"func": "scaleLog", "args": [[0, 200, 255, 255], [255, 180, 0, 255]]}
]
}
}
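As a sketch of how option B could map onto option A, a JSON converter might resolve each declarative accessor into chained calls on the proposed API. The spec types and resolveAccessor below are illustrative only:

```ts
import {gpu} from '@luma.gl/gpgpu';
import type {GPUAttribute} from '@luma.gl/gpgpu';
import type {Table} from '@loaders.gl/schema';

// Hypothetical shape of the declarative accessor spec shown above
type TransformStep = {func: string; args: unknown[]};
type AccessorSpec = {source: string; transform?: TransformStep[]};

// One possible way to resolve a spec into chained GPUAttribute method calls
function resolveAccessor(spec: AccessorSpec, data: Table): GPUAttribute {
  let value = gpu.array1d(data, spec.source);
  for (const step of spec.transform ?? []) {
    // e.g. {"func": "mul", "args": [10]} becomes value.mul(10)
    value = (value as any)[step.func](...step.args);
  }
  return value;
}
```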
Implementation Considerations
- It might be appropriate to move the BufferTransform and TextureTransform classes from the engine module to this new module.
- The module will contain multiple "backends" for WebGL2 and WebGPU. Dynamic import can be used to reduce runtime footprint.
- Actual GPU resources (shaders/buffers) will need to be lazily allocated/written when the buffer is accessed, as sketched below. This allows a) the JS wrapper to be created without waiting for an available device; b) batching calculations for performance, instead of running one render pass for each JS function; c) the buffer to be created on the same device where it will be used for rendering:
gpuAttribute.getBuffer(device: Device): Buffer;
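A minimal sketch of this lazy pattern, assuming a hypothetical internal structure for GPUAttribute (the compute callback and per-device cache are illustrative; Device and Buffer are the existing @luma.gl/core types):

```ts
import type {Device, Buffer} from '@luma.gl/core';

// Illustrative only: all GPU work is deferred until a buffer is requested
class GPUAttribute {
  private cache = new Map<Device, Buffer>();

  // `compute` encapsulates the (possibly batched) operations producing this attribute
  constructor(private compute: (device: Device) => Buffer) {}

  getBuffer(device: Device): Buffer {
    // Allocate and write on first access, on the device that will render the result
    let buffer = this.cache.get(device);
    if (!buffer) {
      buffer = this.compute(device);
      this.cache.set(device, buffer);
    }
    return buffer;
  }
}
```

With this shape, chained operations can be fused inside compute into a single transform pass instead of one pass per JS call.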
- Release of resources that are no longer needed. Consider the following case:
getPosition: (_, {data}: {data: Table | ArrowTableBatch}) => {
  const x = gpu.array1d(data, 'x'); // intermediate buffer that will not be needed after evaluation
  const y = gpu.array1d(data, 'y'); // intermediate buffer that will not be needed after evaluation
  const z = gpu.constant(0);
  return gpu.stack([x, y, z]); // output buffer that will be used for render
}
We could have something similar to tf.tidy(fn), which cleans up all intermediate tensors allocated by fn except those returned by fn. Alternatively, we could consider using FinalizationRegistry to clean up intermediate buffers, though the application would have less control over when the cleanup happens (e.g. the standard deck.gl Layer tests will fail due to unreleased WebGL resources).
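A minimal sketch of such a tidy-style scope, under the assumption that the module tracks allocations and that GPUAttribute exposes a delete() method (both are assumptions, not decided API):

```ts
import type {GPUAttribute} from '@luma.gl/gpgpu'; // proposed module

// Illustrative only: assumes every gpu.* creation call pushes the new attribute
// onto `allocationScope` before returning it
const allocationScope: GPUAttribute[] = [];

export function tidy<T extends GPUAttribute>(fn: () => T): T {
  const start = allocationScope.length;
  const result = fn();
  // Release everything allocated inside fn, except the attribute it returned
  for (const attr of allocationScope.splice(start)) {
    if (attr !== result) attr.delete();
  }
  return result;
}
```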
Discussion
- Do we want to use an existing external library instead of rolling our own?
First of all, I have not conducted an extensive investigation of existing offerings, so additional comments on this are much welcomed. Based on my own experience, the main pain point (with a long maintenance tail) is context sharing, which is required for deck.gl to reuse the output GPU buffer without reading it back to the CPU.
- tensorflow.js: a proof-of-concept is available here. It is very mature, with a large user base, cross-platform availability, and a variety of backend implementations (WebGL, WebGPU, WebAssembly). The library itself is fairly heavyweight (> 1 MB minified) with extra machine-learning functionality, though the size could likely be reduced if we redistribute a tree-shaken bundle. Forcing it to use an external WebGL context is painful because the context state handoff is not clean.
- gpu.js: the ability to write JavaScript functions that get translated to shader code is very appealing. However, the library has not been updated for 2 years, and I doubt there will be WebGPU support.
- TBD