Turicreate: Large number of classes/large labels cause too much RAM usage with image classification

Created on 1 Apr 2018 · 14 Comments · Source: apple/turicreate

Hi guys,

I finally found a solution to my problem. Turi Create is now working great, but there is one remaining issue: if you name a folder with a slash, such as "example/human", and refer to it in Python as 'example:human', Turi Create has trouble exporting the model. Instead of taking several seconds for a set of 200 pictures and 21 classes, it hangs for 2 hours.

I don't know what the problem might be, but please try to fix it. Have a good Easter 👍

bug

PietroMessineo

Most helpful comment

After investigating with the dataset provided privately by @PietroMessineo, I have determined the underlying issue here is RAM usage. On my 8 GB RAM machine, I see this dataset using up to 30 GB of RAM (including 22 GB of swap), which effectively hangs the process and makes the machine unresponsive since swap is so slow (and we are so heavily relying on it). This seems like a bug - we should attempt to stay within a reasonable amount of RAM usage (relative to total system RAM) even on a large training set.

znation on 5 Apr 2018

👍2
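
The RAM diagnosis above can be checked from inside a training script by logging the process's peak memory after each step. The following is a minimal sketch using only the Python standard library, not something posted in the thread; the helper names `peak_rss_mib` and `report` are hypothetical, and the `bytearray` allocation is a stand-in for a real training step. The `resource` module is available on macOS and Linux only.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / float(divisor)


def report(stage):
    """Print the peak memory seen so far, tagged with a stage name."""
    mib = peak_rss_mib()
    print("%-12s peak RSS: %.1f MiB" % (stage, mib))
    return mib


# Hypothetical usage: in a real script, call report() after steps such as
# tc.image_classifier.create(...) and model.save(...).
baseline = report("startup")
buffer = bytearray(50 * 1024 * 1024)  # stand-in for a memory-hungry step
after = report("allocation")
```

Peak resident memory is bounded by physical RAM, so it will not show the swapped portion of the 30 GB reported above. A value close to total system RAM after a given step is still a reasonable hint that the step is pushing the machine into swap.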

All 14 comments

@PietroMessineo Can you please provide some Python code that will reproduce this issue? (Even if you are unable to provide the data, we can probably recreate similar enough data, given a description of the columns + types). Thanks!

znation on 3 Apr 2018

Hi @znation,

I'm actually using three different Python scripts, but I encounter the problem during the export. The code is below:

import turicreate as tc

# Load the data
data = tc.SFrame('alfa.sframe')

# Make a train-test split
train_data, test_data = data.random_split(0.8)

# Limit the file I/O cache to 2 GiB
tc.config.set_runtime_config('TURI_FILEIO_MAXIMUM_CACHE_CAPACITY', 2*1024*1024*1024)

# Create a model
model = tc.image_classifier.create(train_data, target='label', model='squeezenet_v1.1', max_iterations=50)

# Save predictions to an SFrame (class and corresponding class-probabilities)
predictions = model.classify(test_data)

# Evaluate the model and save the results into a dictionary
results = model.evaluate(test_data)
print("Accuracy         : %s" % results['accuracy'])
print("Confusion Matrix : \n%s" % results['confusion_matrix'])

# Save the model for later usage in Turi Create
model.save('Alfa2.model')

PietroMessineo on 3 Apr 2018

@PietroMessineo How does the folder with a slash in the name relate to the Python code you provided? I don't see where that code uses a folder with a slash in the name.

znation on 3 Apr 2018

Hi @znation, on macOS the slash is recognized as a colon, so the code is as follows:

import turicreate as tc

# Load data
image_data = tc.image_analysis.load_images('alfaset', with_path=True)

labels = ['A:a', 'B:b', 'C:c', 'D:d', 'E:e', 'F:f', 'G:g', 'H:h', 'I:i', 'J:j', 'K:k', 'L:l', 'M:m', 'N:n', 'O:o', 'P:p', 'Q:q', 'R:r', 'S:s', 'T:t', 'U:u', 'V:v', 'W:w', 'X:x', 'Y:y', 'Z:z']

def get_label(path, labels=labels):
    for label in labels:
        if label in path:
            return label

image_data['label'] = image_data['path'].apply(get_label)

# Save data
image_data.save('alfa.sframe')

# Explore
image_data.explore()

The problem is that when I launch the code attached in my previous comment, it stalls for far too long before the export, but when I rename the folder to "aa" instead of "a:A", the export is 100 times faster.

PietroMessineo on 3 Apr 2018
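
To rule the label names in or out as a cause, one option is to normalize them before they reach the `label` column. The sketch below is an illustration, not code from the thread: `sanitize` and `RAW_LABELS` are hypothetical names, the shortened label list is a stand-in for the full one, and this `get_label` variant matches whole path components instead of substrings.

```python
import re

# Shortened, illustrative label list (the thread uses 'A:a' through 'Z:z').
RAW_LABELS = ['A:a', 'B:b', 'C:c']


def sanitize(label):
    """Replace ':' and '/' (and any other unusual character) with '_'."""
    return re.sub(r'[^A-Za-z0-9_-]', '_', label)


def get_label(path, labels=RAW_LABELS):
    """Return the sanitized label found as a path component, or None."""
    parts = re.split(r'[/\\]', path)
    for label in labels:
        if label in parts:
            return sanitize(label)
    return None


print(get_label('alfaset/A:a/img_001.jpg'))
print(get_label('alfaset/unknown/img.jpg'))
```

This function can be passed to `image_data['path'].apply(...)` in the same way as the original `get_label`. Returning `None` explicitly makes unmatched images easy to spot and drop before training.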

@PietroMessineo With the above code (both data preparation, and training and model saving) with 83 images and 5 classes (my own dataset), I get the following performance results:

# With slashes in filenames/labels
python repro.py  260.04s user 18.68s system 131% cpu 3:32.28 total

# With underscores instead of slashes
python repro_noslash.py  259.83s user 18.08s system 133% cpu 3:28.50 total

I don't see a performance difference between the two cases. Are you able to reproduce this consistently (even with the same data) with slashes vs. any other character, or is it possible something else changed as well that changed the performance drastically?

znation on 4 Apr 2018
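
The totals above cover each script from start to finish, so they cannot show which step dominates. A minimal sketch of per-stage timing follows, assuming nothing beyond the Python 3 standard library; the `timed` helper is hypothetical, and the two workloads are stand-ins for the real training and save calls.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(stage):
    """Record the wall-clock duration of a named stage in `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start
        print("%-10s %.2fs" % (stage, timings[stage]))


# Stand-in workloads; in the real script these blocks would wrap calls such
# as tc.image_classifier.create(...) and model.save(...).
with timed("train"):
    sum(i * i for i in range(100000))

with timed("save"):
    time.sleep(0.05)
```

Comparing the `timings` entries between the two label variants would show whether any slowdown is confined to the export step, as reported, or spread across training as well.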

Hi @znation, that's pretty strange. I tried it multiple times... would you be interested in trying my dataset with a small number of images? Say, 10 per class.

PietroMessineo on 4 Apr 2018

@PietroMessineo Yes, I'd like to try your dataset to see if I can repro with that, if that's ok with you. You can either share it on this thread (if GitHub will allow a file that large), or upload elsewhere and e-mail me a link (email is in my profile). Thanks!

znation on 4 Apr 2018

👍1

@znation Thank you! Check your email 👍

PietroMessineo on 4 Apr 2018

znation on 5 Apr 2018

👍2

@PietroMessineo We finally identified the issue. If you need a temporary workaround, please revert to turicreate==4.1