Facenet: ValueError: Cannot have number of splits n_splits=10 greater than the number of samples: 0.

Created on 22 Oct 2017 · 13 Comments · Source: davidsandberg/facenet

I ran into trouble when executing the command below:

python src/validate_on_lfw.py ~/datasets/lfw/lfw_mtcnnpy_160 ~/models/facenet/20170512-110547

I had already run align_dataset_mtcnn.py successfully.

The error output is as follows:

Runnning forward pass on LFW images
Traceback (most recent call last):
  File "src/validate_on_lfw.py", line 113, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/validate_on_lfw.py", line 83, in main
    actual_issame, nrof_folds=args.lfw_nrof_folds)
  File "/Users/andycooper/facenet/src/lfw.py", line 40, in evaluate
    np.asarray(actual_issame), nrof_folds=nrof_folds)
  File "/Users/andycooper/facenet/src/facenet.py", line 426, in calculate_roc
    for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
  File "/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_split.py", line 330, in split
    n_samples))
ValueError: Cannot have number of splits n_splits=10 greater than the number of samples: 0.

Has anyone else run into the same problem? Please help.

lidgik

Most helpful comment

It turns out that align_dataset_mtcnn.py appends _0 to the end of every file name, so all my files are named like this:
datasets/lfw/lfw_mtcnnpy_160/Slobodan_Milosevic/Slobodan_Milosevic_0002_0.png
when they should have been named like this:
datasets/lfw/lfw_mtcnnpy_160/Slobodan_Milosevic/Slobodan_Milosevic_0002.png

If you don't want to fix the naming in align_dataset_mtcnn.py, a quick and dirty fix is to change lfw.py so that it looks for the _0 suffix:

        # Same person: pair is (name, index1, index2)
        if len(pair) == 3:
            path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'_0.'+file_ext)
            path1 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[2])+'_0.'+file_ext)
            issame = True
        # Different people: pair is (name1, index1, name2, index2)
        elif len(pair) == 4:
            path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'_0.'+file_ext)
            path1 = os.path.join(lfw_dir, pair[2], pair[2] + '_' + '%04d' % int(pair[3])+'_0.'+file_ext)
            issame = False

mingrui on 22 Oct 2017

👍6
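
As an alternative to editing lfw.py, the already-aligned files could be renamed in place so that the extra _0 is dropped. The following is only a minimal sketch, assuming the aligned images follow the `<name>_NNNN_0.<ext>` pattern shown above; `strip_zero_suffix` and its parameters are hypothetical names, not part of facenet. Files with other suffixes (for example `_1`) are left untouched.

```python
import re
from pathlib import Path

# Matches stems such as "Slobodan_Milosevic_0002_0": a 4-digit image index
# followed by the extra "_0" described in the comment above.
_ZERO_SUFFIX = re.compile(r"^(?P<stem>.+_\d{4})_0$")


def strip_zero_suffix(aligned_dir, ext="png", dry_run=True):
    """Rename <name>_NNNN_0.<ext> to <name>_NNNN.<ext>.

    Returns the list of (old_path, new_path) pairs. With dry_run=True
    (the default) nothing is renamed, so the plan can be inspected first.
    """
    renames = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(aligned_dir).expanduser().rglob("*." + ext)):
        match = _ZERO_SUFFIX.match(path.stem)
        if match is None:
            continue  # name is already in the expected form
        target = path.with_name(match.group("stem") + path.suffix)
        if target.exists():
            continue  # never overwrite an existing file
        renames.append((path, target))
        if not dry_run:
            path.rename(target)
    return renames
```

Running it once with the default `dry_run=True` and printing the result shows what would change; calling it again with `dry_run=False` performs the renames.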

All 13 comments

I have the same problem, even after re-running the alignment script as suggested here: https://github.com/davidsandberg/facenet/issues/289

mingrui on 22 Oct 2017

Thank you! It really works. The problem is solved, just as you said.

lidgik on 23 Oct 2017

Hi @mingrui,
I ran into the same problem. Could you paste your edited version?

Thanks
Carl Chen

ttjslbz on 23 Oct 2017

@mingrui Hi, I still have the same problem, even after modifying the script (lfw.py) as suggested here.

thanks

AppleCoffee on 24 Oct 2017

@mingrui
I changed the code and ran it. The previous error is gone, but a new error occurs:
ZeroDivisionError: float division by zero

Traceback (most recent call last):
  File "src/validate_on_lfw.py", line 113, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/validate_on_lfw.py", line 83, in main
    actual_issame, nrof_folds=args.lfw_nrof_folds)
  File "/home/zhen/Projects/FaceNet/facenet-master/src/lfw.py", line 43, in evaluate
    np.asarray(actual_issame), 1e-3, nrof_folds=nrof_folds)
  File "/home/zhen/Projects/FaceNet/facenet-master/src/facenet.py", line 474, in calculate_val
    _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set])
  File "/home/zhen/Projects/FaceNet/facenet-master/src/facenet.py", line 496, in calculate_val_far
    far = float(false_accept) / float(n_diff)
ZeroDivisionError: float division by zero

Update: my fault. I had missed a line in the code (the last line below, issame = False):

        elif len(pair) == 4:
            path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'_0.'+file_ext)
            path1 = os.path.join(lfw_dir, pair[2], pair[2] + '_' + '%04d' % int(pair[3])+'_0.'+file_ext)
            issame = False

Now it works.

xmuszq on 2 Nov 2017

This patch fixes the file naming in align_dataset_mtcnn.py:

diff --git a/src/align/align_dataset_mtcnn.py b/src/align/align_dataset_mtcnn.py
index a7aaf80..6665dd0 100644
--- a/src/align/align_dataset_mtcnn.py
+++ b/src/align/align_dataset_mtcnn.py
@@ -123,7 +123,7 @@ def main(args):
                                 cropped = img[bb[1]:bb[3],bb[0]:bb[2],:]
                                 scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear')
                                 nrof_successfully_aligned += 1
-                                output_filename_n = "{}_{}.{}".format(output_filename.split('.')[0], i, output_filename.split('.')[-1])
+                                output_filename_n = "{}.{}".format(output_filename.split('.')[0], output_filename.split('.')[-1])
                                 misc.imsave(output_filename_n, scaled)
                                 text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3]))
                         else:

patrickhwood on 5 Nov 2017

This has been fixed in 4c33d4d64d6908ccf5cfb6616b6dd49776c6d266.

davidsandberg on 11 Nov 2017

According to:
https://q-a-assistant.info/computer-internet-technology/valueerror-cannot-have-number-of-splits-n-splits-10-greater-than-the-number-of-samples-0/1299682

The cause could be a png/jpg mismatch. If validate_on_lfw.py expects png and your dataset is jpg, it will fail to find the images and raise this error.
Adding the parameter "--lfw_file_ext jpg" can solve the issue.

chaotaklon on 8 Jan 2018
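
A quick way to see which extension the aligned dataset actually uses is to count file suffixes under the dataset directory. This is just a small sketch; `count_extensions` is a hypothetical helper and the directory argument is whatever path is passed to validate_on_lfw.py.

```python
from collections import Counter
from pathlib import Path


def count_extensions(dataset_dir):
    """Count files under dataset_dir by lower-cased extension (without the dot)."""
    root = Path(dataset_dir).expanduser()
    return Counter(
        path.suffix.lower().lstrip(".")
        for path in root.rglob("*")
        if path.is_file()
    )
```

If the result shows only jpg files while the script is looking for png (or the other way round), the extension is the likely culprit.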

Hi,
Even after pulling the latest code, I am getting the same issue.

Saving metagraph
Metagraph saved in 34.60 seconds
Runnning forward pass on LFW images
Traceback (most recent call last):
  File "src/train_softmax.py", line 446, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/train_softmax.py", line 227, in main
    embeddings, label_batch, lfw_paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, log_dir, step, summary_writer)
  File "src/train_softmax.py", line 326, in evaluate
    _, _, accuracy, val, val_std, far = lfw.evaluate(emb_array, actual_issame, nrof_folds=nrof_folds)
  File "C:\dev\DataScience\facenet\src\lfw.py", line 40, in evaluate
    np.asarray(actual_issame), nrof_folds=nrof_folds)
  File "C:\dev\DataScience\facenet\src\facenet.py", line 431, in calculate_roc
    for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
  File "C:\Users\gadginir\AppData\Local\Continuum\anaconda3\envs\tfdeeplearning\lib\site-packages\sklearn\model_selection\_split.py", line 330, in split
    n_samples))
ValueError: Cannot have number of splits n_splits=10 greater than the number of samples: 0.

I checked the file names and extensions, and both are correct. The names don't contain _0, and the files have the .png extension after cropping.

How can I fix this?

gadginir on 19 Jan 2018

Whether the cause is _0 or jpg/png, if the program fails to read the dataset, the image list will have length zero and trigger this error. Print the dataset variable at each step and check where it becomes an empty list or None.

chaotaklon on 19 Jan 2018
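
One way to do that check outside the training script is to rebuild the expected image paths and list the ones that do not exist. The sketch below mirrors the path construction from the lfw.py snippet quoted earlier in this thread (without the _0 workaround); `expected_paths` and `report_missing` are hypothetical helpers, and `pairs` is assumed to be a list of whitespace-split lines from the pairs file.

```python
import os


def expected_paths(lfw_dir, pair, file_ext="png"):
    """Return the two image paths that one pair entry refers to."""
    def image_path(name, index):
        return os.path.join(lfw_dir, name, "%s_%04d.%s" % (name, int(index), file_ext))

    if len(pair) == 3:  # same person: name, index1, index2
        return image_path(pair[0], pair[1]), image_path(pair[0], pair[2])
    if len(pair) == 4:  # different people: name1, index1, name2, index2
        return image_path(pair[0], pair[1]), image_path(pair[2], pair[3])
    raise ValueError("unexpected pair format: %r" % (pair,))


def report_missing(lfw_dir, pairs, file_ext="png"):
    """Return (number of usable pairs, list of paths that do not exist)."""
    lfw_dir = os.path.expanduser(lfw_dir)
    usable = 0
    missing = []
    for pair in pairs:
        absent = [p for p in expected_paths(lfw_dir, pair, file_ext)
                  if not os.path.exists(p)]
        if absent:
            missing.extend(absent)
        else:
            usable += 1
    return usable, missing
```

If the usable count comes back as 0, the first few entries of the missing list usually make the cause obvious: a wrong directory, a wrong extension, or an unexpected file-name suffix.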

Please check that your dataset path is entered correctly, for example:

~/datasets/lfw/lfw_mtcnnpy_160 (correct)

~/dataset/lfw/lfw_mtcnnpy_160 (error)
@chaotaklon

tensorflowt on 10 Mar 2018

I was getting this error because the number of hard_triplets dropped to 0 after a few (40~50) epochs. Hence, in validate_on_lfw.py, the number of samples is 0. It just means that your model is getting better.

For now I am working around this by wrapping the call in a try/except block.
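
A rough sketch of such a workaround is shown below. `evaluate_or_skip` is a hypothetical wrapper, not part of facenet; it only swallows the specific "n_splits" ValueError from this thread and re-raises anything else. Keep in mind that skipping the error can also hide a genuine dataset-loading problem like the ones discussed above.

```python
def evaluate_or_skip(evaluate, *args, **kwargs):
    """Call evaluate(*args, **kwargs); return None if there are too few samples.

    Only the ValueError that mentions n_splits (too few samples for the
    requested number of folds) is caught; other errors propagate unchanged.
    """
    try:
        return evaluate(*args, **kwargs)
    except ValueError as err:
        if "n_splits" not in str(err):
            raise
        print("Skipping evaluation: %s" % err)
        return None
```

The function to guard would be passed in as the first argument, followed by the arguments it would normally receive.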