About image data input, the documentation says the following:
Create Dataset Using RecordIO
RecordIO implements a file format for a sequence of records. We recommend storing images as records and packing them together. The benefits are:
Images can be stored in a compressed format, e.g. JPEG, because records can have different sizes. A compressed format greatly reduces the dataset size on disk.
Packing data together allows continuous reading from disk.
RecordIO has a simple way of partitioning, which makes it easier to use in a distributed setting. An example of this will be provided later.
We provide the “im2rec tool” so you can create an Image RecordIO dataset yourself. Here’s the walkthrough:
0. Before you start
Make sure you have downloaded the data. You don’t need to resize the images yourself; im2rec can currently resize them automatically. Check the usage message of im2rec for details.
1. Make the image list
After you get the data, you first need to make an image list file. The format is
integer_image_index \t label_index \t path_to_image
In general, the program takes a list of the names of all images, shuffles them, then separates them into a training file name list and a testing file name list. Write the lists out in the format above.
A sample file is provided here:
895099 464 n04467665_17283.JPEG
10025081 412 ILSVRC2010_val_00025082.JPEG
74181 789 n01915811_2739.JPEG
10035553 859 ILSVRC2010_val_00035554.JPEG
10048727 929 ILSVRC2010_val_00048728.JPEG
94028 924 n01980166_4956.JPEG
1080682 650 n11807979_571.JPEG
972457 633 n07723039_1627.JPEG
7534 11 n01630670_4486.JPEG
1191261 249 n12407079_5106.JPEG
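To illustrate the list format described above, here is a minimal sketch that shuffles (path, label) pairs, splits them into training and testing lists, and formats each entry as a tab-separated line. The function names `make_image_lists` and `write_list`, the default `train_ratio`, and the use of sequential indices are assumptions made for illustration; this is not how im2rec itself is implemented.

```python
import random


def make_image_lists(image_labels, train_ratio=0.8, seed=0):
    """Shuffle (path, label) pairs and split them into train/test list lines.

    Each returned line has three tab-separated fields:
    integer_image_index, label_index, path_to_image.
    """
    items = list(image_labels)
    # A fixed seed keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(items)
    # Hypothetical choice: use the shuffled position as the image index.
    lines = [f"{idx}\t{label}\t{path}" for idx, (path, label) in enumerate(items)]
    n_train = int(len(lines) * train_ratio)
    return lines[:n_train], lines[n_train:]


def write_list(lines, filename):
    """Write one list entry per line to `filename`."""
    with open(filename, "w") as f:
        for line in lines:
            f.write(line + "\n")
```

For example, `train, test = make_image_lists([("a.JPEG", 0), ("b.JPEG", 1)])` followed by `write_list(train, "train.lst")` would produce a list file in the format above; the file name `train.lst` is only a placeholder.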
Can you give some examples of the “im2rec tool”?
Thanks!
To create the lists:
python im2rec.py --list --recursive --num-thread 4 --chunks 4 --train-ratio 0.6 --test-ratio 0.2 prefix root
To build the .rec files:
python im2rec.py --num-thread 2 --resize 50 --quality 80 --saving-folder SV prefix root
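Before building the .rec files, it may help to sanity-check the generated list. The sketch below parses one list line, assuming the three tab-separated fields from the format quoted in the question and a single integer label per image. `parse_list_line` is a hypothetical helper, not part of im2rec.

```python
def parse_list_line(line):
    """Parse 'index<TAB>label<TAB>path' into (int, int, str).

    Raises ValueError if the line does not have exactly three
    tab-separated fields or if the first two are not integers.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(parts)}")
    index, label, path = parts
    return int(index), int(label), path


def check_list_file(filename):
    """Return the number of valid entries; raise on the first bad line."""
    count = 0
    with open(filename) as f:
        for line in f:
            if line.strip():  # skip blank lines
                parse_list_line(line)
                count += 1
    return count
```

Calling `check_list_file` on a list file raises an error at the first malformed entry, which is cheaper to discover than a failure partway through packing.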
Thanks very much for your help!
I think you need the prefix and root params.