Pytorch: PyTorch with numpy syntax?

Created on 28 Jul 2017 · 18 comments · Source: pytorch/pytorch

Hi all,

So there are some great things about PyTorch, but one of the not-great things is that it uses an API that is mostly, but not fully, different from the one shared by numpy, theano, and tensorflow. I find myself having to consult a lookup table when I want to run familiar commands.

Are there any plans to make a numpytorch, where the API matches numpy's as closely as possible?

An advantage would be that many functions written for numpy could also be used by pytorch. It would also be nice to be able to call matplotlib functions (plot, imshow, etc) directly with torch variables without having to do any hacking of matplotlib.

cc @ezyang @gchanan @zou3519 @mruberry @rgommers @heitorschueroff

Labels: high priority, numpy, ux, triaged

Most helpful comment

The Tensor API matches numpy's exactly

This is, IMHO, the biggest stopper in teaching (otherwise close to perfect) pytorch as a first deep learning framework.
Really looking forward to the day pytorch will match numpy API.

All 18 comments

This is something that'd be useful, but we can't commit yet to fully supporting such a package. We're first sorting out removing the differences between Variable and Tensor, and then we'd get to torch.np, which might then have to expose both a Tensor and a Variable class? Have to think this through.

@petered do you have a more concrete design proposal? Do you see yourself using it just for Tensor, or also for Variable?

So I'm a bit new to PyTorch and haven't fully wrapped my head around why Variable and Tensor are two different things (from the user's perspective anyway), though I see why the concepts should be separated in code.

do you see yourself using it just for Tensor, or also for Variable?

I would want this to work with autograd, if that's what you mean, so Variable.

More concretely, it would be great to work towards a torch.np package, where:

  • The Tensor API matches numpy's exactly (e.g. .shape instead of .size(), etc.). There is only one kind of Tensor (instead of torch.FloatTensor, torch.cuda.FloatTensor, etc.), which has a dtype property and a flag indicating which device it should live on, e.g. arr = torch.np.array([1,2,3], dtype='float32', device='gpu0', requires_grad=False).
  • Users do not need to think about or distinguish between Variables and Tensors. Everything is a Variable. If desired, autograd could be disabled with a context manager that sets a global flag, with enable_autograd(False): ..., which would effectively make functions return Tensors, not Variables (a sketch of the closest existing mechanism follows this list).
  • Every function in numpy is implemented. Every function can accept a list/tuple/np.ndarray/Tensor/Variable (if Tensor and Variable are still distinct). Internally, inputs are promoted to Variables before use.
  • Calls to var.expand are made automatically (for array broadcasting).
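
A minimal sketch of the closest mechanism that exists in PyTorch today (torch.no_grad() / torch.set_grad_enabled(False)); the enable_autograd(False) name proposed above is hypothetical:

import torch

x = torch.ones(3, requires_grad=True)

# Gradient tracking is disabled for everything computed inside the block,
# roughly what the proposed enable_autograd(False) would do globally.
with torch.no_grad():
    y = x * 2              # y.requires_grad is False; no graph is recorded

z = x * 2                  # outside the block, tracking is back on
print(y.requires_grad, z.requires_grad)  # False True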

The Tensor API matches numpy's exactly

This is, IMHO, the biggest stopper in teaching (otherwise close to perfect) pytorch as a first deep learning framework.
Really looking forward to the day pytorch will match numpy API.

Really looking forward to the day pytorch will match numpy API. +1

As mentioned in pytorch/tutorials#197, some numpy-like syntax is not recommended, such as numpy.asarray.

Note that NumPy changes planned in NEP-18 could be helpful here: http://www.numpy.org/neps/nep-0018-array-function-protocol.html
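
For reference, a toy sketch of the NEP-18 __array_function__ protocol (assuming NumPy 1.17+, where it is enabled by default); MyArray below is purely illustrative and not part of PyTorch:

import numpy as np

class MyArray:
    # Toy duck array that intercepts NumPy functions via NEP-18.
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Unwrap MyArray arguments, call the real NumPy function,
        # and re-wrap the result.
        unwrapped = [a.data if isinstance(a, MyArray) else a for a in args]
        return MyArray(func(*unwrapped, **kwargs))

a = MyArray([1, 2, 3])
print(np.sum(a).data)   # dispatched to MyArray.__array_function__ -> 6
print(np.mean(a).data)  # -> 2.0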

Is there still a plan to provide a numpy-like API? It's really annoying to write two separate functions for numpy and pytorch.

I think there is a plan, @umanwizard might be working on it.

I opened a gist with some thoughts on this, which we have been discussing. But it seems gist is not the best format for open discussion, so I am reposting it here. @gchanan, can you paste your comments from the gist into this issue?

Summary

The goal of this project is to create an alternative Python API for PyTorch which is as similar as possible to NumPy, while retaining distinguishing features like simple GPU support and gradient computation, as illustrated below.

Basic Example:

>>> import torch.numpy as np
>>> a = np.arange(10)
>>> a.sum()
tensor(45)

Goals

  • The project should, when complete, implement (at least) a large subset of the NumPy ndarray API.
  • Implementing NumPy functions should not involve duplicating code if substantially similar PyTorch functions exist.
  • Users should be able to seamlessly convert between a torch.Tensor and a torch.numpy.ndarray and use both at will.
  • The new functionality must cause zero performance overhead for users of the traditional PyTorch API, and negligible (ideally zero) overhead for using the torch.numpy API.
  • Any functions not existing in NumPy but necessary for interaction with PyTorch features should be clearly marked by underscores, signifying immediately to readers that they are PyTorch-specific extensions (e.g. ndarray._cuda, ndarray._backward, etc).
  • PyTorch developers should be able to extend the API to add new NumPy functions in an easy and intuitive way.

Data Model

The torch.numpy.ndarray class is implemented similarly to torch.Tensor: it is a python object wrapping a python extension API object (torch._np_compat._ndarray_base) which ultimately wraps a Variable. This approach was chosen for two major reasons:

  • Flexibility: implementing a new type (rather than inheriting from torch.Tensor) allows us to implement exactly the set of methods we want on the torch.numpy.ndarray type.
  • Ease of implementation: A similar implementation to torch.Tensor enables existing Python binding codegen to be leveraged with minimal changes.

Tensors and ndarrays can be freely converted to one another:

>>> import torch, torch.numpy as np
>>> a = torch.randn(10)._np_compat()
>>> b = torch.randn(10)._np_compat()
>>> c = np.hypot(a, b)
>>> type(c)
<class 'torch.numpy.ndarray'>
>>> t = c._torch()
>>> type(t)
<class 'torch.Tensor'>

Binding Generation

Code generation logic has been extended to allow NumPy API bindings to be defined in native_functions.yaml similarly to any other native function bindings. For example:

- func: sum(np.ndarray a) -> np.ndarray
  variants: function, method
  np_compat: True

causes a new signature to be added to the argument parser in the generated binding function THPVariable_sum, the same function that already handles torch.Tensor.sum.

In order to distinguish between the two cases, we make the generated binding function accept a template parameter <bool compat>, controlling whether it should parse arguments according to the NumPy compatibility API. Then in the list of methods for the python extension objects backing torch.Tensor and torch.numpy.ndarray, we would add THPVariable_sum<false> and THPVariable_sum<true>, respectively.

Other than the bindings, this declaration of sum does not cause any code to be generated. The actual functionality will be implemented by the existing at::native::sum after appropriately translating arguments, as described below.

Argument parsing and translation

The argument parsing logic is extended to support the new compatibility mode. Parsers are now initialized with two separate lists of signatures: one for the traditional API and one for the new one. When invoked in the old mode, the new-API signatures are ignored, and everything works the same as always.

When invoked in the new compatibility mode, the argument parsing works in two steps. First, the arguments are parsed against the compatibility signatures. If a match is found, the argument names are translated into PyTorch equivalents (e.g., a is replaced by input, keepdims by keepdim, and so on), and argument types are converted if necessary (e.g., any ndarray is unwrapped and re-wrapped as a torch.Tensor). This new set of arguments is then matched against the PyTorch API set of signatures, and dispatched as appropriate.

A set of common argument name translations (for now: a, keepdims, and axis) is provided by default. It is also possible to add custom translations for a particular binding. The following example causes shape to be replaced by size.

- func: ones(int[1] shape, np.dtype dtype=float) -> np.ndarray
  variants: function
  np_compat: True
  additional_translations:
    shape: size
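
A rough Python-level picture of what that translation step does; translate_kwargs and the tables below are illustrative stand-ins for the generated parser, not actual PyTorch internals:

import torch

# Default translations plus per-function extras, mirroring the YAML above.
DEFAULT_TRANSLATIONS = {"a": "input", "keepdims": "keepdim", "axis": "dim"}
EXTRA_TRANSLATIONS = {"ones": {"shape": "size"}}

def translate_kwargs(func_name, kwargs):
    table = {**DEFAULT_TRANSLATIONS, **EXTRA_TRANSLATIONS.get(func_name, {})}
    return {table.get(name, name): value for name, value in kwargs.items()}

# np.sum(a=t, axis=0, keepdims=True) would be re-dispatched roughly as:
t = torch.arange(10.0).reshape(2, 5)
print(torch.sum(**translate_kwargs("sum", {"a": t, "axis": 0, "keepdims": True})))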

Adding new functions

Obviously, if a function is supported by NumPy and not by PyTorch, we need to actually implement it, not just rely on argument translation magic.

The most straightforward way to do this is to create a PyTorch binding, mark it as hidden, and then define a NumPy compatibility binding depending on it. For example:

- func: hypot(Tensor input, Tensor b) -> Tensor
  variants: function
  hidden: True
  dispatch:
    CPU: hypot
    CUDA: hypot

- func: hypot(np.ndarray a, np.ndarray b) -> np.ndarray
  variants: function
  np_compat: True

The required underlying function at::native::hypot(Tensor const& input, Tensor const& b) can then be implemented as usual, and torch.numpy.hypot(a, b) will return the equivalent of sqrt(a*a + b*b), as expected.

CUDA support

A torch.numpy.ndarray can be created on the GPU in either of two ways: either by creating it as usual in PyTorch and converting it to an ndarray using torch.Tensor._np_compat (which just involves wrapping and unwrapping some objects, not copying any data), or by calling _cuda on an existing torch.numpy.ndarray. The ndarray can then be used as usual:

>>> import torch.numpy as np
>>> cpu = np.arange(10)
>>> cpu.sum()
tensor(45)
>>> gpu = cpu._cuda()
>>> gpu.sum()
tensor(45, device='cuda:0')

Differentiation support

Not yet implemented.

Since a torch.numpy.ndarray wraps a variable, in principle tracking gradients should be straightforward.

However, currently, much of the logic for backward is implemented in python, on the torch.Tensor type and in the torch.autograd package, and so is not available to torch.numpy.ndarray. In order to make this work, we can refactor the relevant code in order to share it between both types.

Keeping with the convention of prefixing API extensions that don't exist in NumPy with underscores, this functionality would be accessed via functions like ndarray._backward, ndarray._grad, ndarray._requires_grad, and so on.
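
For reference, a minimal sketch of the existing torch.Tensor autograd calls that this refactor would share (the underscore-prefixed ndarray methods themselves are not implemented yet):

import torch

x = torch.randn(3, requires_grad=True)
y = (x * x).sum()
y.backward()                          # populates x.grad with dy/dx = 2*x
print(torch.allclose(x.grad, 2 * x))  # True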

NumPy concepts not existing in PyTorch

None of these are implemented yet in my proof of concept.

dtypes

NumPy supports a very rich notion of dtype allowing complex structures, whereas PyTorch tensors are made of scalars: float, double, and so on.

Unless we decide that it's worth making a fundamental refactor of PyTorch in order to support this, it is out of scope.

Type promotion

Some work has already been done on designing and implementing NumPy-like type promotion in PyTorch: pytorch/pytorch#5795 and pytorch/pytorch#9515. Now that we are implementing this NumPy compatibility layer, the importance of that project increases.

Implementing this type promotion feature would involve:

  1. Finalizing the design elaborated by Sam and Tugrul, which appears mostly complete
  2. Implementing it in code
  3. Adding an option to use exactly the same rules as NumPy, in cases where the design requires slightly different behavior in PyTorch; this option would be used for the NumPy compatibility API (see the example after this list)
  4. Providing options in the NumPy API to control this behavior (in particular, whether to use different type promotion rules on CUDA, where differences in performance can depend on data type width to an extreme extent)
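
As a concrete illustration of the kind of difference item 3 refers to (exact behavior depends on the NumPy and PyTorch versions in use):

import numpy as np
import torch

# NumPy promotes an int32 array combined with a Python float to float64;
# PyTorch's promotion rules instead go to the default floating dtype, float32.
print((np.arange(5, dtype=np.int32) * 1.5).dtype)        # float64
print((torch.arange(5, dtype=torch.int32) * 1.5).dtype)  # torch.float32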

Multiple packages

NumPy functionality is spread throughout many different packages, to a much greater extent than PyTorch. For example, while PyTorch has torch.randn, NumPy has numpy.random.randn.

We can specify these with an option in the YAML defining the binding:

- func: randn(int[] size)
  np_compat: True
  package: 'random'

Implementing this is straightforward: we will define a set of different packages (torch.numpy.random, torch.numpy.matlib, torch.numpy.linalg, and so on), and when generating the bindings, we will add each one to the list of functions for the appropriate module.

Common NumPy parameters

NumPy ufuncs have a few parameters with no real PyTorch equivalent.

  • where: This is used for masking. Masking support in PyTorch has been discussed for a while. If we decide to implement it, we can then re-use the same implementation in the NumPy API.
  • order: NumPy allows passing an "order" parameter determining the layout of function outputs. We can implement this by transforming calls like a.foo(order='K') into a call to the corresponding foo_out, passing a tensor with the correct strides (a sketch follows this list).
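
A minimal sketch of that transformation, using torch.add as a stand-in for a function with an out= variant; the translation itself is hypothetical:

import torch

a = torch.randn(3, 4)
b = torch.randn(3, 4)

# Rough translation of a call like np.add(a, b, order='F'):
# allocate an output tensor with Fortran (column-major) strides,
# then dispatch to the existing out= variant.
out = torch.empty(a.shape[::-1], dtype=a.dtype).t()  # column-major layout
torch.add(a, b, out=out)
print(out.stride())  # (1, 3): column-major strides for a 3x4 result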

Discussed this with Soumith, Greg, and Sam -- we decided we would take a slightly simpler approach involving only one tensor type.

Just to provide some context here, the concerns were 1) having multiple APIs that "live forever" and 2) downstream components having to worry about the tensor mode, which doesn't seem necessary.

Basic Example:

Doesn't this example already work? (besides the torch.numpy part). Maybe you can use an example that doesn't work today?

Users should be able to seamlessly convert between a torch.Tensor and a torch.numpy.ndarray and use both at will.

Are you suggesting you'll be able to interoperate between numpy arrays and pytorch tensors without having to do conversion?

Any functions not existing in NumPy but necessary for interaction with PyTorch features should be clearly marked by underscores, signifying immediately to readers that they are PyTorch-specific extensions (e.g. ndarray._cuda, ndarray._backward, etc).

What's the rationale behind this?

Binding Generation

It would be nice to rewrite this section a bit in the "new world" where there is only a single tensor type.

The most straightforward way to do this is to create a PyTorch binding, mark it as hidden, and then define a NumPy compatibility binding depending on it. For example:

Does this still apply if we only have 1 tensor type? What changes?

Differentiation support

If we force the user to convert to pytorch tensors (and don't support arbitrary interop), don't we get differentiation for free?

Adding an option to use exactly the same rules as NumPy, in cases where the design requires slightly different behavior in PyTorch. This option would be used for the NumPy compatibility API.

If we don't have a separate tensor type, what changes here?

package: 'random'

We already have a concept for this, python_module.

order

Do all of the numpy functions that we want to implement that have an order parameter have an out variant? Also CC @VitalyFedyunin who is working on order now.

Doesn't this example already work? (besides the torch.numpy part). Maybe you can use an example that doesn't work today?

A better example showing parameter translation:

>>> import torch.numpy as np
>>> arr = np.arange(10)
>>> np.sum(a=arr, axis=0)
tensor(45)

Are you suggesting you'll be able to interoperate between numpy arrays and pytorch tensors without having to do conversion?

No, I was suggesting we would have functions that let you convert between the types for "free" (i.e., with only some overhead of unwrapping/re-wrapping Python objects, no copying data)

But, this is irrelevant if we decide to go with only one tensor type.

What's the rationale behind this?

Making it obvious at a glance that the code will not work in stock NumPy.

It would be nice to rewrite this section a bit in the "new world" where there is only a single tensor type.

Yes, it will become significantly simpler.

If we don't have a separate tensor type, what changes here?

We will need to think about the exact API here (in particular, whether it makes sense to do it the "numpy way" by default, especially in cases where it would be much less performant.)

I think we definitely need to provide user control of this somehow, as I can easily imagine anything we choose causing problems for some subset of users.

We already have a concept for this, python_module.

Great, thanks for the pointer.

Do all of the numpy functions that we want to implement that have an order parameter have an out variant?

All numpy "Universal functions" have both order and out parameters (and a variety of other common parameters): https://docs.scipy.org/doc/numpy/reference/ufuncs.html#optional-keyword-arguments
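
For example (plain NumPy, just to show the keywords in question):

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.ones_like(a)

out = np.zeros_like(a)
np.add(a, b, out=out, where=a > 2)  # only positions where a > 2 are written
c = np.multiply(a, b, order='F')    # output allocated in Fortran order
print(c.flags['F_CONTIGUOUS'])      # True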

Just to provide some context here, the concern was 1) having multiple APIs that "live forever" and 2) downstream components have to worry about the tensor mode, which doesn't seem necessary.

How difficult would it be to get the list of methods on ndarray which we literally cannot support without introducing an ndarray type? I feel that this would help people make a more informed decision about the cost of an ndarray type.

@umanwizard This project sounds great... how can we try it?

@peteroconnor-bc We made some changes to improve NP compatibility, but are still far from full compatibility. Sorry, but I am not working at Facebook anymore so I'm not sure what the current status is. @gchanan or @jeffreyksmithjr can let you know whether work is still going on to improve NumPy compatibility.

We don't have anyone whose primary focus is on numpy compatibility, but we gladly accept PRs in this area :).

While numpy-equivalent functions are gradually being added through pull requests, could someone point to a third-party package, if any, that allows users to select the backend (either numpy or pytorch) for a numpy-like API?
Thank you.

@dizcza Not exactly what you are asking for but take a look at https://eagerpy.jonasrauber.de/
