I suggest adding an IO class for reading images from a list, to support custom image data input. The list looks like this:
img1 label1
img2 label2
...
which is like the list format Caffe uses to build its LMDBs.
I implemented it by referencing torchvision/datasets/folder.py:
import torch.utils.data as data
from PIL import Image
import os
import os.path


def default_loader(path):
    return Image.open(path).convert('RGB')


def default_flist_reader(flist):
    """
    flist format: impath label\nimpath label\n ... (same as Caffe's filelist)
    """
    imlist = []
    with open(flist, 'r') as rf:
        for line in rf.readlines():
            impath, imlabel = line.strip().split()
            imlist.append((impath, int(imlabel)))
    return imlist


class ImageFilelist(data.Dataset):
    def __init__(self, root, flist, transform=None, target_transform=None,
                 flist_reader=default_flist_reader, loader=default_loader):
        self.root = root
        self.imlist = flist_reader(flist)
        self.transform = transform
        self.target_transform = target_transform
        self.loader = loader

    def __getitem__(self, index):
        # paths in the list file are relative to root
        impath, target = self.imlist[index]
        img = self.loader(os.path.join(self.root, impath))
        if self.transform is not None:
            img = self.transform(img)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return img, target

    def __len__(self):
        return len(self.imlist)
Usage is the same as with the ImageFolder class:

train_loader = torch.utils.data.DataLoader(
    ImageFilelist(root="../place365_challenge/data_256/",
                  flist="../place365_challenge/places365_train_challenge.txt",
                  transform=transforms.Compose([
                      transforms.RandomSizedCrop(224),
                      transforms.RandomHorizontalFlip(),
                      transforms.ToTensor(), normalize,
                  ])),
    batch_size=64, shuffle=True,
    num_workers=4, pin_memory=True)

val_loader = torch.utils.data.DataLoader(
    ImageFilelist(root="../place365_challenge/val_256/",
                  flist="../place365_challenge/places365_val.txt",
                  transform=transforms.Compose([
                      transforms.Scale(256),
                      transforms.CenterCrop(224),
                      transforms.ToTensor(), normalize,
                  ])),
    batch_size=16, shuffle=False,
    num_workers=1, pin_memory=True)
Agree with this, but the title is misleading. It would be better to call it "load image dataset from list files".
BTW, I think it would be helpful if you make it a pull request.
Make it a pull request.
I believe that using rich Python libraries, one can leverage the iterator of the dataset class to do most of these things with ease. Passing a text file and reading from it again seems a bit roundabout to me. It is fine for Caffe because the API is in C++ and the data loaders are not exposed as they are in PyTorch.
I agree with @yannadani, if you have a dataset text file it's very easy to write a dataset class to parse it. For example, one could want to use pandas to parse arbitrary csv files (which could have the space as a separator), and many input and target labels per example.
Do you think there would be value in adding a generic dataset for CSV files that tries to handle an arbitrary number of fields of different types? That seems like overkill, given how easy it is to write your own dataset.
Let me know what you think.
I am using this, and oftentimes the data loading speed is very slow (inconsistently: some images take 0.001 seconds while others take 10 seconds). When the number of workers is N, every N-th batch takes 10 or more seconds while the other batches take less time. Any ideas?
@fmassa I believe the question would be how generic it can be. In this case the dataset would be limited to CSV files, and there might be use cases where some data (or paths to data) are not present in the CSV, for example annotations in a .mat or .xml file. Unless more people use CSV, it might just be overkill.
Also, the CSV might contain several columns, and you might only be interested in a subset of those. While it is possible to write a somewhat generic dataset, the interface might get clumsy, and one might get tempted to extend it to handle specific use cases, making something that was supposed to be easy complicated.
To close this issue, I'll post a snippet of how one can go about writing their own dataset for CSV-like files:
import pandas as pd


class PandasDataset(object):
    def __init__(self, path_to_csv_file, input_name, target_name):
        self.dataset = pd.read_csv(path_to_csv_file)
        self.input_name = input_name
        self.target_name = target_name
        # add transforms as well

    def __getitem__(self, idx):
        item = self.dataset.iloc[idx]
        # add transforms
        return item[self.input_name], item[self.target_name]

    def __len__(self):
        return len(self.dataset)
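For illustration, a hypothetical usage of the snippet above, assuming a file annotations.csv with columns image_path and label (both names are made up here):

# annotations.csv and its column names are hypothetical examples
dataset = PandasDataset('annotations.csv',
                        input_name='image_path', target_name='label')
path, label = dataset[0]  # one (input, target) pair from the first row

In practice, __getitem__ would also load the image from the path and apply the transforms before returning.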
I'm working with datasets (like in the face poses tutorial) where the labels exist in a file alongside the images and it would be useful to have a simple ImageFolder-like abstraction which just says "treat these columns as our labels."
I'd imagine that if one column is given, the data uses a simple regression or classification label, and if multiple columns are given, the output is a numpy array / torch tensor which needs to be reshaped or post-processed.
It looks like this thread is working towards that, but the issue is closed -- is this abstraction too trivial or too uncommon to go into torchvision?
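For what it's worth, here is a minimal sketch of the abstraction I have in mind, assuming a CSV with one image-path column and one or more label columns; the class name and all column names are hypothetical, not anything in torchvision:

import pandas as pd
import torch
from torch.utils.data import Dataset
from PIL import Image


class CsvImageDataset(Dataset):
    """Hypothetical class: treat the given columns as the labels."""

    def __init__(self, csv_file, image_column, label_columns, transform=None):
        self.frame = pd.read_csv(csv_file)
        self.image_column = image_column
        self.label_columns = label_columns  # list of column names used as labels
        self.transform = transform

    def __getitem__(self, idx):
        row = self.frame.iloc[idx]
        img = Image.open(row[self.image_column]).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        if len(self.label_columns) == 1:
            # one column: a plain scalar label (classification or regression)
            target = row[self.label_columns[0]]
        else:
            # several columns: pack them into a float tensor for post-processing
            target = torch.tensor(row[self.label_columns].values.astype('float32'))
        return img, target

    def __len__(self):
        return len(self.frame)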
I am using this, and oftentimes the data loading speed is very slow (inconsistently: some images take 0.001 seconds while others take 10 seconds). When the number of workers is N, every N-th batch takes 10 or more seconds while the other batches take less time. Any ideas?
Yes, I am also facing this problem. Do you have any idea how to solve it?
If you have solved it, please share with us. Many thanks.
Unfortunately, I have not...
@PantherYan this happens because of the way data loading is done.
Your pre-processing / loading is very slow, so I see two possibilities:
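One general way to confirm that loading is the bottleneck is to time how long each iteration waits for a batch, in the spirit of the data-time meter in the PyTorch imagenet example; a minimal sketch, with train_loader standing in for your own loader:

import time

end = time.time()
for i, (input, target) in enumerate(train_loader):
    data_time = time.time() - end  # time spent blocked waiting on the loader
    # ... the forward / backward pass would go here ...
    if data_time > 1.0:
        print('batch %d waited %.2fs for data' % (i, data_time))
    end = time.time()

If the long waits land on every N-th batch with N workers, the workers are not producing batches as fast as they are consumed, which points at per-image loading / pre-processing cost (or slow storage) rather than the training loop itself.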