A "lazy" / "meta" implementation of the array api? #777

adonath · 2024-01-11T02:18:36Z

adonath
Jan 11, 2024

In addition to the already available implementations of the array api I think it could be interesting to have a lazy / meta implementation of the standard. What I mean is a small, minimal dependency, standalone library, compatible with the array api, that provides inference of the shape and dtype of resulting arrays, without ever initializing the data and executing any flops.

PyTorch already has something like this with the "meta" device. For example:

import torch

data = torch.ones((1000, 1000, 100), device="meta")
kernel = torch.ones((100, 10), device="meta")

result = torch.matmul(data, kernel)
print(result.shape)
print(result.dtype)

However this misses for example the device handling, as the device is constrained to "meta". I presume that dask must have something very similar. Jax also must have something very similar for the jitted computations. However I think it is only exposed to users with a different API via jax.eval_shape() and not via an "meta" array object.

Similarly to the torch example one would use a hypothetical library lazy_array_api:

import lazy_array_api as xp

data = xp.ones((1000, 1000, 100), device="cpu")
kernel = xp.ones((100, 10), device="cpu")

result = xp.matmul(data, kernel)
print(result.shape)
print(result.dtype)

The use case I have in mind is mostly debugging, validation and testing of computational intense algorithms ("dry runs"). For now I just wanted to share the idea and bring it up for discussion.

rgommers · 2024-01-11T22:06:18Z

rgommers
Jan 11, 2024
Maintainer

This would be super useful indeed. It's not a small amount of work I suspect. For indexing there is https://github.com/Quansight-Labs/ndindex/, which basically implements this "meta" idea. That's probably one of the most hairy parts to do, and a good start. But correctly doing all shape calculations for all functions in the API is also a large job. Perhaps others know of reusable functionality elsewhere for this? For PyTorch I believe it's too much baked into the library to be able to reuse it standalone.

0 replies

adonath · 2024-01-12T17:45:58Z

adonath
Jan 12, 2024
Author

Thanks a lot @rgommers for the response!

It's not a small amount of work I suspect.

I only partly agree, because there is nothing particular difficult about it. The expected behavior is well defined, the API is already defined, so there is no tricky code to be figured out. It is just a matter of implementing the already defined behavior "dilligently".

For indexing there is https://github.com/Quansight-Labs/ndindex/, which basically implements this "meta" idea.

This is indeed a great start, I was not aware of this project.

But correctly doing all shape calculations for all functions in the API is also a large job.

I think the effort could actually be limited, because looking at https://github.com/numpy/numpy/tree/main/numpy/array_api, the files already pre-group the api into operations with the same behavior in terms of shape computation, i.e. element wise, indexing, searching, statistical, etc. For each group the behavior only needs to defined once, the rest is filling in boiler plate code. In addition there is broadcasting and indexing, which always applies. I'm less sure about the dtype promotion, but this must have been coded somewhere already as well.

Perhaps others know of reusable functionality elsewhere for this? For PyTorch I believe it's too much baked into the library to be able to reuse it standalone.

I agree PyTorch is already to large of a dependency. From a quick search I only found https://github.com/NeuralEnsemble/lazyarray, which seems to be un-maintained. It also has a different approach of building a graph and then delay the evaluation.

0 replies

adonath · 2024-01-22T16:51:01Z

adonath
Jan 22, 2024
Author

I'd like to get a better idea of the actual implementation effort and just share some more thoughts on this idea.

The main use case for this would be shape inference and dtype inference. I guess it makes sense to really consider it as an implementation of the standard, with a corresponding array object, operators and functions. Shapes and dtypes are directly computed, there will be no graph like handling / delayed execution. This is out of scope.
Data dependent methods, such as unique cannot really be supported. Only if one allows "un-initialized" shapes (see below)
Sometimes it might useful to work with "un-initialized shapes". For example the length of a "batch axes" might not be known and often one would not care about it either. As "batches" are treated independent (None, 2, 3, 4) could be handled as a valid shape. After specific operations, such as the mean along the batch axis, the shape becomes known.
However it is not a valid shape in the Array-API.
Typically users / developers would change from the lazy implementation to one supporting data. This might be relevant because, they would like to work with dtypes and device supported by the actual implementation. So having some backend specific device validation even for lazy implementation would be good.
In general I guess one would use https://github.com/numpy/numpy/tree/main/numpy/array_api as a starting point, delete all actual functionality and use it as an API template. The type promotion table and return type handling could be kept.
ndindex is great! It also already provides https://quansight-labs.github.io/ndindex/api.html#ndindex.broadcast_shapes and does the indexing. So it will definitely be a dependency.
ndindex has Numpy and Cython as an optional dependency. I think the situation will be exactly the same for such a lazy implementation. I don't think anything else would be needed.

0 replies

lucascolley · 2024-01-22T17:02:36Z

lucascolley
Jan 22, 2024

Sometimes it might useful to work with "un-initialized shapes". For example the length of a "batch axes" might not be known and often one would not care about it either. As "batches" are treated independent (None, 2, 3, 4) could be handled as a valid shape. After specific operations, such as the mean along the batch axis, the shape becomes known.
However it is not a valid shape in the Array-API.

Is this not a valid shape? The spec for shape says

out (Tuple[Optional[int], …]) – array dimensions. An array dimension must be None if and only if a dimension is unknown.

0 replies

adonath · 2024-01-22T18:12:26Z

adonath
Jan 22, 2024
Author

Indeed this is already part of the spec. I missed that before!

0 replies

adonath · 2024-02-05T21:35:52Z

adonath
Feb 5, 2024
Author

Actually I'd be interested in starting a repo and playing around with this a bit. Any preference for a name @rgommers or @lucascolley? What about ndshape or xpshape for example?

0 replies

TomNicholas · 2024-07-25T22:21:03Z

TomNicholas
Jul 25, 2024

@adonath Xarray has an internal version of exactly this, which actually gets used by default every time you use xarray to open data from disk without dask installed. We have had an open issue about the idea of lifting this functionality out into a separate library for years: pydata/xarray#5081

Our lazy indexing classes aren't very well publicised but there are some docs here, and the implementation is all in this file . The key object is our LazilyIndexedArray.

The one (major) limitation is that our implementation only supports indexing, not its dual concatenation, so would eagerly compute if you tried to call np.concatenate on two LazilyIndexedArrays (see pydata/xarray#4628)

What I mean is a small, minimal dependency, standalone library, compatible with the array api, that provides inference of the shape and dtype of resulting arrays, without ever initializing the data and executing any flops.

This would be amazing! If there is interest from other parties in creating this package I think xarray would be very interested to collaborate & could have a lot to offer.

Any preference for a name?

larray? larry? 😁

2 replies

shoyer Jul 26, 2024

Xarray's lazy indexing is a little different from this idea as I understand -- it supports actually computing values, too, when arrays are explicitly converted into NumPy.

saulshanabrook Aug 2, 2024

What I mean is a small, minimal dependency, standalone library, compatible with the array api, that provides inference of the shape and dtype of resulting arrays, without ever initializing the data and executing any flops.

I would also be interested in collaborating on this! I know it's a bit of an experimental dependency, but I would be curious about trying to implement this on top of egglog. It could allow a modular way to add support for actual computation or other types of analysis on top of the same infrastructure as well.

TomNicholas · 2024-07-25T23:07:48Z

TomNicholas
Jul 25, 2024

(I got an email saying @dcherian posted this comment which now I can't find)

https://autoray.readthedocs.io/en/latest/lazy_computation.html seems relevant.

--- (my reply)

Investigating the computational graph, including cost and memory usage,

of a calculation ahead of time. This is also interesting because it's the same basic insight that Cubed is based around. i.e. that if you know the shape and dtype before evaluation, you also know the size, and hence the memory usage. https://github.com/cubed-dev/cubed

0 replies

cbourjau · 2024-07-26T10:32:32Z

cbourjau
Jul 26, 2024
Collaborator

We recently open-sourced our ONNX-backed lazy implementation of the array API, ndonnx. Being ONNX-backed means that it is 100% lazy yet provides full type and shape inference for every operation.

0 replies

jakevdp · 2024-09-05T19:05:12Z

jakevdp
Sep 5, 2024
Collaborator

Hey - I just saw this. For what it's worth, JAX is built around the concept of abstract evaluation, which I think is what is being described here: a way to compute output shapes and dtypes without doing any computation. JAX v0.4.32 will also include Array API support in its default namespace.

So if you'd like a way to do this kind of lazy/abstract evaluation over array API implementations using an existing package, you can use jax.eval_shape:

In [1]: import jax

In [2]: def f(x):
   ...:     xp = x.__array_namespace__()  # built-in in JAX v0.4.32 or newer
   ...:     return xp.outer(x, x[:-1])
   ...: 

In [3]: x = jax.numpy.arange(4)

In [4]: jax.eval_shape(f, x)
Out[4]: ShapeDtypeStruct(shape=(4, 3), dtype=int32)

0 replies

adonath · 2024-09-06T14:45:06Z

adonath
Sep 6, 2024
Author

Thanks everyone for the numerous and diverse comments! The initial motivation for this issue was to potentially initiate work on a dedicated package, however it became obvious that there have already been multiple efforts from the community to implement similar functionality. There are many mentioned in this discussion thread, here is a short summary:

xarray has the the LazilyIndexedArrays
There is autoray.lazy
Jax has abstract arrays and evaluations and jax.eval_shape
PyTorch has a meta device
There is ndonnx for ONNX backed operations
There is also mlx, which is lazily evaluated by default.

It seems all options are more or less compatible with the array-api. If not yet, they will be soon. I think at this point there is no strong motivation for another implementation, as potential users can just make their choice based on the existing options. Of course there is the question of unification / consolidating efforts, but the functionality is typically bound to the actual array implementation. Especially with regards to device handling and additional functionality, not support by the array API. The abstract evaluation step, is also typically the step where a computational graph is built for compilation, which is again tightly coupled to the actual implementation of the array API. So It does not make sense for existing packages to change to an independent implementation. I think the only remaining motivation for an independent package might be the minimal dependency / stand-alone approach.

I would keep the discussion open for now and let others comment, as they might add new options and this discussion and for now it becomes more of an entry point for users.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A "lazy" / "meta" implementation of the array api? #777

{{title}}

Replies: 11 comments 2 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

A "lazy" / "meta" implementation of the array api? #777

adonath Jan 11, 2024

Replies: 11 comments · 2 replies

rgommers Jan 11, 2024 Maintainer

adonath Jan 12, 2024 Author

adonath Jan 22, 2024 Author

lucascolley Jan 22, 2024

adonath Jan 22, 2024 Author

adonath Feb 5, 2024 Author

TomNicholas Jul 25, 2024

shoyer Jul 26, 2024

saulshanabrook Aug 2, 2024

TomNicholas Jul 25, 2024

cbourjau Jul 26, 2024 Collaborator

jakevdp Sep 5, 2024 Collaborator

adonath Sep 6, 2024 Author

adonath
Jan 11, 2024

Replies: 11 comments 2 replies

rgommers
Jan 11, 2024
Maintainer

adonath
Jan 12, 2024
Author

adonath
Jan 22, 2024
Author

lucascolley
Jan 22, 2024

adonath
Jan 22, 2024
Author

adonath
Feb 5, 2024
Author

TomNicholas
Jul 25, 2024

TomNicholas
Jul 25, 2024

cbourjau
Jul 26, 2024
Collaborator

jakevdp
Sep 5, 2024
Collaborator

adonath
Sep 6, 2024
Author