Hi all,
So there're some great things about PyTorch, but one of the not-great things is that it uses a mostly-but-not-fully different API from the one used by numpy, theano, and tensorflow. I find myself having to consult a lookup table when I want to run familiar commands.
Are there any plans to make a numpytorch, where the API matches numpy's as closely as possible?
An advantage would be that many functions written for numpy could also be used by pytorch. It would also be nice to be able to call matplotlib functions (plot, imshow, etc) directly with torch variables without having to do any hacking of matplotlib.
cc @ezyang @gchanan @zou3519 @mruberry @rgommers @heitorschueroff
This is something that'd be useful, but we can't commit yet to fully supporting such a package. We're first sorting out not having differences between Variable and Tensor, and then we'd get to torch.np, which then has to possibly expose a Tensor and a Variable class maybe? Have to think this through.
@petered do you have a more concrete design proposal? do you see yourself using it just for Tensor, or also for Variable?
So I'm a bit new to PyTorch and haven't fully wrapped my head around why Variable and Tensor are two different things (from the user's perspective anyway), though I see why the concepts should be separated in code.
do you see yourself using it just for Tensor, or also for Variable?
I would want this to work with autograd, if that's what you mean, so Variable.
More concretely, it would be great to work towards a torch.np package, where:
- The Tensor API matches numpy's exactly (.shape instead of .size(), etc).
- There is only one kind of Tensor (instead of torch.FloatTensor, torch.cuda.FloatTensor, etc.), which has a dtype property and a flag indicating which device it should live on, e.g. arr = torch.np.array([1,2,3], dtype='float32', device='gpu0', requires_grad=False).
- There is something like with enable_autograd(False): ..., which would effectively make functions return Tensors, not Variables (see the sketch below).
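For reference (not part of the original comment), here is a minimal sketch of how most of this wish list maps onto the plain torch API in PyTorch 0.4+, where Tensor and Variable were merged; torch.set_grad_enabled stands in for the proposed enable_autograd, and device='cpu' stands in for the 'gpu0' string:
>>> import torch
>>> arr = torch.tensor([1, 2, 3], dtype=torch.float32, device='cpu', requires_grad=False)
>>> arr.shape
torch.Size([3])
>>> with torch.set_grad_enabled(False):  # analogue of the proposed enable_autograd(False)
...     out = (arr * arr).sum()
...
>>> out.requires_grad
False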
This is, IMHO, the biggest stopper in teaching (otherwise close to perfect) pytorch as a first deep learning framework.
Really looking forward to the day pytorch will match numpy API.
Really looking forward to the day pytorch will match numpy API. +1
As mentioned in pytorch/tutorials#197, some numpy-like syntax is not recommended, such as numpy.asarray.
Note that NumPy changes planned in NEP-18 could be helpful here: http://www.numpy.org/neps/nep-0018-array-function-protocol.html
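For context, a minimal sketch of the NEP-18 protocol (requires NumPy >= 1.17); WrappedArray is a toy illustration, not a real torch or NumPy type, but it shows how NumPy API calls could be routed to other backends without a separate numpy-compatible namespace:
import numpy as np

class WrappedArray:
    """Toy type showing how NEP-18 lets NumPy functions dispatch to a wrapper."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook instead of coercing WrappedArray to ndarray;
        # a real implementation would dispatch to torch operations here.
        if func is np.sum:
            unwrapped = [x.data if isinstance(x, WrappedArray) else x for x in args]
            return np.sum(*unwrapped, **kwargs)
        return NotImplemented  # unsupported functions then raise TypeError

print(np.sum(WrappedArray([1, 2, 3])))  # 6, dispatched via __array_function__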
Is there still a plan to provide numpy-like api? It's really annoying to write two separate functions for numpy and pytorch.
I think there is a plan, @umanwizard might be working on it.
I opened a gist with some thoughts on this, which we have been discussing. But it seems gist is not the best format for open discussion, so I am reposting it here. @gchanan , can you paste your comments from the gist into this issue?
The goal of this project is to create an alternative Python API for PyTorch which is as similar as possible to NumPy, while retaining distinguishing features like simple GPU support and gradient computation, as illustrated below.
Basic Example:
>>> import torch.numpy as np
>>> a = np.arange(10)
>>> a.sum()
tensor(45)
Goals:
- Users should be able to seamlessly convert between a torch.Tensor and a torch.numpy.ndarray and use both at will.
- Code written against NumPy should, as far as possible, run unchanged against the torch.numpy API.
- Any functions not existing in NumPy but necessary for interaction with PyTorch features should be clearly marked by underscores, signifying immediately to readers that they are PyTorch-specific extensions (e.g. ndarray._cuda, ndarray._backward, etc).

The torch.numpy.ndarray class is implemented similarly to torch.Tensor: it is a Python object wrapping a Python extension API object (torch._np_compat._ndarray_base) which ultimately wraps a Variable. This approach was chosen for two major reasons:
- Using a distinct type (rather than reusing torch.Tensor) allows us to implement exactly the set of methods we want on the torch.numpy.ndarray type.
- Wrapping a Variable in the same way as torch.Tensor enables existing Python binding codegen to be leveraged with minimal changes.

Tensors and ndarrays can be freely converted one to another:
>>> import torch, torch.numpy as np
>>> a = torch.randn(10)._np_compat()
>>> b = torch.randn(10)._np_compat()
>>> c = np.hypot(a, b)
>>> type(c)
<class 'torch.numpy.ndarray'>
>>> t = c._torch()
>>> type(t)
<class 'torch.Tensor'>
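As a rough mental model only (the actual bindings are generated in C++, and NdarrayCompat is a made-up name), the wrapping and free conversion could be pictured like this in pure Python:
import torch

class NdarrayCompat:
    """Toy stand-in for torch.numpy.ndarray, wrapping a Tensor instead of the internal Variable."""
    def __init__(self, tensor):
        self._t = tensor                       # wrapped object; no data copied

    def _torch(self):
        return self._t                         # unwrap back to torch.Tensor for free

    def sum(self, axis=None, keepdims=False):  # NumPy-style keywords
        if axis is None:
            return NdarrayCompat(self._t.sum())
        return NdarrayCompat(self._t.sum(dim=axis, keepdim=keepdims))

a = NdarrayCompat(torch.randn(10))
print(type(a.sum()._torch()))                  # <class 'torch.Tensor'>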
Binding Generation
Code generation logic has been extended to allow NumPy API bindings to be defined in native_functions.yaml similarly to any other native function bindings. For example:
- func: sum(np.ndarray a) -> np.ndarray
  variants: function, method
  np_compat: True
causes a new signature to be added to the argument parser in the generated binding function THPVariable_sum, the same function that already handles torch.Tensor.sum.
In order to distinguish between the two cases, we make the generated binding function accept a template parameter <bool compat>, controlling whether it should parse arguments according to the NumPy compatibility API. Then, in the list of methods for the python extension objects backing torch.Tensor and torch.numpy.ndarray, we would add THPVariable_sum<false> and THPVariable_sum<true>, respectively.
Other than the bindings, this declaration of sum does not cause any code to be generated. The actual functionality will be implemented by the existing at::native::sum after appropriately translating arguments, as described below.
The argument parsing logic is extended to support the new compatibility mode. Parsers are now initialized with two separate lists of signatures: one for the traditional API and one for the new one. When invoked in the old mode, the new-API signatures are ignored, and everything works the same as always.
When invoked in the new compatibility mode, the argument parsing works in two steps. First, the arguments are parsed against the compatibility signatures. If a match is found, the argument names are translated into PyTorch equivalents (e.g., a is replaced by input, keepdims by keepdim, and so on), and argument types are converted if necessary (e.g., any ndarray is unwrapped and re-wrapped as a torch.Tensor). This new set of arguments is then matched against the PyTorch API set of signatures, and dispatched as appropriate.
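A rough pure-Python analogue of this two-step translation (the real logic lives in the C++ argument parser; the axis-to-dim mapping is assumed from standard PyTorch naming):
import torch

_NAME_TRANSLATIONS = {"a": "input", "keepdims": "keepdim", "axis": "dim"}

def np_compat_call(torch_fn, *args, **kwargs):
    # Step 1: rename NumPy-style keywords to their PyTorch equivalents.
    translated = {_NAME_TRANSLATIONS.get(k, k): v for k, v in kwargs.items()}
    # Step 2: dispatch against the regular PyTorch signature.
    return torch_fn(*args, **translated)

t = torch.arange(10.0)
print(np_compat_call(torch.sum, t, axis=0, keepdims=True))  # tensor([45.])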
A set of common argument name translations (for now: a, keepdims, and axis) is provided by default. It is also possible to add custom translations for a particular binding. The following example causes shape to be replaced by size.
- func: ones(int[1] shape, np.dtype dtype=float) -> np.ndarray
  variants: function
  np_compat: True
  additional_translations:
    shape: size
Obviously, if a function is supported by NumPy and not by PyTorch, we need to actually implement it, not just rely on argument translation magic.
The most straightforward way to do this is to create a PyTorch binding, mark it as hidden, and then define a NumPy compatibility binding depending on it. For example:
- func: hypot(Tensor input, Tensor b) -> Tensor
  variants: function
  hidden: True
  dispatch:
    CPU: hypot
    CUDA: hypot

- func: hypot(np.ndarray a, np.ndarray b) -> np.ndarray
  variants: function
  np_compat: True
The required underlying function at::native::hypot(Tensor const& input, Tensor const& b) can then be implemented as usual, and torch.numpy.hypot(a, b) will return the equivalent of sqrt(a*a + b*b), as expected.
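A quick sanity check of that identity using plain torch ops (torch.numpy.hypot itself is part of the proposal, not something that ships today):
>>> import torch
>>> a = torch.tensor([3.0, 5.0])
>>> b = torch.tensor([4.0, 12.0])
>>> torch.sqrt(a * a + b * b)
tensor([ 5., 13.])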
A torch.numpy.ndarray can be created on the GPU in either of two ways: by creating it as usual in PyTorch and converting it to an ndarray using torch.Tensor._np_compat (which just involves wrapping and unwrapping some objects, not copying any data), or by calling _cuda on an existing torch.numpy.ndarray. The ndarray can then be used as usual:
>>> import torch.numpy as np
>>> cpu = np.arange(10)
>>> cpu.sum()
tensor(45)
>>> gpu = cpu._cuda()
>>> gpu.sum()
tensor(45, device='cuda:0')
Differentiation support
Not yet implemented.
Since a torch.numpy.ndarray wraps a Variable, in principle tracking gradients should be straightforward. However, currently, much of the logic for backward is implemented in Python, on the torch.Tensor type and in the torch.autograd package, and so is not available to torch.numpy.ndarray. In order to make this work, we can refactor the relevant code so that it is shared between both types.

Keeping with the convention of prefixing API extensions that don't exist in NumPy with underscores, this functionality would be accessed via functions like ndarray._backward, ndarray._grad, ndarray._requires_grad, and so on. None of these are implemented yet in my proof of concept.
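For illustration only (none of the underscore names exist), here is what each proposed hook corresponds to in plain torch today, where the same autograd machinery is already exposed on Tensor:
>>> import torch
>>> x = torch.arange(3.0, requires_grad=True)   # proposal: ndarray._requires_grad
>>> y = (x * x).sum()
>>> y.backward()                                # proposal: ndarray._backward()
>>> x.grad                                      # proposal: ndarray._grad
tensor([0., 2., 4.])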
NumPy supports a very rich notion of dtype allowing complex structures, whereas PyTorch tensors are made of scalars: float, double, and so on.
Unless we decide that it's worth making a fundamental refactor of PyTorch in order to support this, it is out of scope.
Some work has already been done on designing and implementing NumPy-like type promotion in PyTorch: pytorch/pytorch#5795 and pytorch/pytorch#9515. Now that we are implementing this NumPy compatibility layer, the importance of that project increases.
Implementing this type promotion feature would involve:
- Completing the type promotion design and implementation referenced above.
- Adding an option to use exactly the same rules as NumPy, in cases where the design requires slightly different behavior in PyTorch. This option would be used for the NumPy compatibility API.
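One concrete divergence (an illustration added here, not from the proposal) between NumPy's promotion rules and the rules current PyTorch releases adopted, showing why a NumPy-rules option matters:
>>> import numpy as np, torch
>>> (np.ones(2, dtype=np.int64) + np.ones(2, dtype=np.float32)).dtype
dtype('float64')
>>> (torch.ones(2, dtype=torch.int64) + torch.ones(2, dtype=torch.float32)).dtype
torch.float32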
NumPy functionality is spread throughout many different packages, to a much greater extent than PyTorch. For example, while PyTorch has torch.randn, NumPy has numpy.random.randn.
We can specify these with an option in the YAML defining the binding:
- func: randn(int[] size)
  np_compat: True
  package: 'random'
Implementing this is straightforward: we will define a set of different packages (torch.numpy.random, torch.numpy.matlib, torch.numpy.linalg, and so on), and when generating the bindings, we will add each one to the list of functions for the appropriate module.
NumPy ufuncs have a few parameters with no real PyTorch equivalent.
- where: This is used for masking. Masking support in PyTorch has been discussed for a while. If we decide to implement it, we can then re-use the same implementation in the NumPy API.
- order: NumPy allows passing an "order" parameter determining the layout of function outputs. We can implement this by transforming calls like a.foo(order='K') into a call to the corresponding foo_out, passing a tensor with the correct strides.

Discussed this with Soumith, Greg, and Sam -- we decided we would take a slightly simpler approach involving only one tensor type.
Just to provide some context here, the concern was 1) having multiple APIs that "live forever" and 2) downstream components have to worry about the tensor mode, which doesn't seem necessary.
Basic Example:
Doesn't this example already work? (besides the torch.numpy part). Maybe you can use an example that doesn't work today?
Users should be able to seamlessly convert between a torch.Tensor and a torch.numpy.ndarray and use both at will.
are you suggesting you'll be able to interoperate numpy arrays and pytorch tensors without having to do conversion?
Any functions not existing in NumPy but necessary for interaction with PyTorch features should be clearly marked by underscores, signifying immediately to readers that they are PyTorch-specific extensions (e.g. ndarray._cuda, ndarray._backward, etc).
What's the rationale behind this?
Binding Generation
It would be nice to rewrite this section a bit in the "new world" where there is only a single tensor type.
The most straightforward way to do this is to create a PyTorch binding, mark it as hidden, and then define a NumPy compatibility binding depending on it. For example:
does this still apply if we only have 1 tensor type? What changes?
Differentiation support
If we force the user to convert to pytorch tensors (and don't support arbitrary interop), don't we get differentiation for free?
Adding an option to use exactly the same rules as NumPy, in cases where the design requires slightly different behavior in PyTorch. This option would be used for the NumPy compatibility API.
If we don't have a separate tensor type, what changes here?
package: 'random'
we already have a concept for this, python_module.
order
Do all of the numpy functions that we want to implement that have an order parameter have an out variant? Also CC @VitalyFedyunin who is working on order now.
Doesn't this example already work? (besides the torch.numpy part). Maybe you can use an example that doesn't work today?
Better example showing param translation:
>>> import torch.numpy as np
>>> arr = np.arange(10)
>>> np.sum(a=arr, axis=0)
tensor(45)
are you suggesting you'll be able to interoperate numpy arrays and pytorch tensors without having to do conversion?
No, I was suggesting we would have functions that let you convert between the types for "free" (i.e., with only some overhead of unwrapping/re-wrapping Python objects, no copying data)
But, this is irrelevant if we decide to go with only one tensor type.
What's the rationale behind this?
Making it obvious at a glance that the code will not work in stock Numpy.
It would be nice to rewrite this section a bit in the "new world" where there is only a single tensor type.
Yes, it will become significantly simpler.
If we don't have a separate tensor type, what changes here?
We will need to think about the exact API here (in particular, whether it makes sense to do it the "numpy way" by default, especially in cases where it would be much less performant.)
I think we definitely need to provide user control of this somehow, as I can easily imagine anything we choose causing problems for some subset of users.
we already have a concept for this, python_module.
great, thanks for the pointer
Do all of the numpy functions that we want to implement that have an order parameter have an out variant?
All numpy "Universal functions" have both order and out parameters (and a variety of other common parameters): https://docs.scipy.org/doc/numpy/reference/ufuncs.html#optional-keyword-arguments
Just to provide some context here, the concern was 1) having multiple APIs that "live forever" and 2) downstream components have to worry about the tensor mode, which doesn't seem necessary.
How difficult would it be to get the list of methods on ndarray which we literally cannot support without introducing an ndarray type? I feel that this would help people make a more informed decision about the cost of an ndarray type.
@umanwizard This project sounds great... how can we try it?
@peteroconnor-bc We made some changes to improve NP compatibility, but are still far from full compatibility. Sorry, but I am not working at Facebook anymore so I'm not sure what the current status is. @gchanan or @jeffreyksmithjr can let you know whether work is still going on to improve NumPy compatibility.
We don't have anyone whose primary focus is on numpy compatibility, but we gladly accept PRs in this area :).
While numpy-equivalent functions are gradually being added through pull requests, could someone point to a third-party package, if any, that lets users select the backend (either numpy or pytorch) for a numpy-like API?
Thank you.
@dizcza Not exactly what you are asking for but take a look at https://eagerpy.jonasrauber.de/