From the code in transforms.py:

> Converts a PIL.Image (RGB) or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
However, for an ndarray, only torch.from_numpy(pic) is called, without transposing the tensor to C x H x W. Is this normal?
Also, assuming the range [0, 255] for PIL images seems reasonable, but ndarrays might have different value ranges, especially for target images. Would it be worth having a normalize parameter that lets you choose whether to divide the values by 255 or not?
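For illustration, such a parameter could look like this hypothetical custom transform (NdarrayToTensor and its normalize flag are invented names, just a sketch, not torchvision API):

```
import numpy as np
import torch

class NdarrayToTensor:
    # Sketch: ToTensor for H x W x C ndarrays, with an optional normalize flag.
    def __init__(self, normalize=True):
        self.normalize = normalize  # hypothetical parameter

    def __call__(self, pic):
        # transpose H x W x C -> C x H x W, then make contiguous for from_numpy
        img = torch.from_numpy(np.ascontiguousarray(pic.transpose((2, 0, 1))))
        return img.float().div(255) if self.normalize else img.float()
```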
I ran into this issue today too. I think a normalize parameter would be nice, as sometimes the image-loading function used before this will already have put the data into [0, 1].
@ClementPinard As a quick workaround you can call ToPILImage first in a Compose, i.e.
```
transform = Compose([ToPILImage(), ToTensor()])
```
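For example (the shapes here are illustrative):

```
import numpy as np
from torchvision.transforms import Compose, ToPILImage, ToTensor

transform = Compose([ToPILImage(), ToTensor()])
npimg = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)  # H x W x C, [0, 255]
tensor = transform(npimg)                                       # 3 x 32 x 32, float in [0.0, 1.0]
```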
To change a numpy HxWxC array to CxHxW, and get the same layout you would from calling ToPILImage() and then ToTensor(), do

```
npimg = np.transpose(npimg, (2, 0, 1))
```

Doing an npimg.reshape() will not produce the same result, since reshape reinterprets the flat buffer instead of moving axes.
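A quick demonstration of why the two differ:

```
import numpy as np

npimg = np.arange(12).reshape(2, 2, 3)           # H x W x C
by_transpose = np.transpose(npimg, (2, 0, 1))    # moves the channel axis; pixels stay intact
by_reshape = npimg.reshape(3, 2, 2)              # reinterprets the buffer; scrambles pixels
print(np.array_equal(by_transpose, by_reshape))  # False
```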
fixed in master
@alykhantejani +1 on the normalize parameter. Even better would be for ToTensor not to normalize at all; isn't that what the Normalize transform is for? I got burned chaining a Normalize transform with a ToTensor transform.
As @alykhantejani said, another way to get the same effect is to convert with numpy. Assuming your image is in OpenCV format (H x W x C, BGR, uint8):

```
img = cv_img[:, :, ::-1].transpose((2, 0, 1)).copy()  # C x H x W, RGB order, [0, 255]
img_tensor = torch.from_numpy(img).float().div(255)   # C x H x W FloatTensor, [0.0, 1.0]
img_tensor = img_tensor.unsqueeze(0)                  # add a batch dim: 1 x C x H x W
```

The .copy() matters: torch.from_numpy rejects arrays with negative strides, which the [:, :, ::-1] BGR-to-RGB flip produces.
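As a quick sanity check, with the fix in master this manual path should match ToTensor applied to the RGB array directly (random data stands in for a real cv2.imread result):

```
import numpy as np
import torch
from torchvision.transforms import ToTensor

cv_img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)  # stand-in BGR image
rgb = cv_img[:, :, ::-1].copy()                                # BGR -> RGB
manual = torch.from_numpy(rgb.transpose((2, 0, 1)).copy()).float().div(255)
print(torch.equal(manual, ToTensor()(rgb)))                    # True
```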