Hello, thanks for the new autotune feature.
I'm trying to test it on my data, but I got this exception:
$ fasttext supervised -input fasttext_train.txt -output fasttext_model -autotune-validation fasttext_valid.txt
Progress: 14.0% Trials: 6 Best score: 0.647911 ETA: 0h 4m17s
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
I'm attaching my training data so you can reproduce this bug. Is there any other info I can provide?
Thanks.
Hi @dveselov ,
Thank you for reporting the issue. We will have a look.
Regards,
Onur
Most likely this was caused by insufficient memory, because with the -wordNgrams 1 flag everything finishes successfully.
I have the same error with autotune, on a VM with 6 vCPU and 6 GB RAM.
P.S. Same error with 10 GB of RAM.
Dataset: 700 KB, 4500 lines, 1 label per line, 8 different labels in total.
I can't reproduce on my machines. I am also guessing a memory issue depending on the explored parameters and the machine itself. I will continue my investigations.
@dveselov , @arhipovdv , can you please try these:
- In autotune.cc, place loggers right before the training:
  LOG_VAL(Trial, trials_)
  printArgs(trainArgs, autotuneArgs);
  just before fastText_->train(trainArgs);
- Compile and launch your executable with -verbose 4 and check whether it actually tries a much bigger model right before crashing.
Can you also send the core file and your executable (as well as the platform details)?
output with -verbose 4:
log.txt
executable:
fasttext.zip
Core file: what is it, and where can I find it?
System Information:
Ubuntu 19.04, VM under Hyper-V, 6 vCPU, 2/6/10 GB RAM
Linux version 5.0.0-25-generic (buildd@lgw01-amd64-008) (gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #26-Ubuntu SMP Thu Aug 1 12:04:58 UTC 2019
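For anyone else unsure about the core file: it is the memory snapshot the OS can write when a process aborts (the "core dumped" part of the error message). Below is a minimal sketch, assuming a Linux machine, that reports whether core dumps are enabled for the current process and where the kernel sends them. The function name core_dump_settings is just an illustration. On Ubuntu the pattern typically pipes to apport, which stores crash reports under /var/crash, but that depends on the system configuration.

```python
import resource
from pathlib import Path


def core_dump_settings():
    """Report the core-dump size limits and the kernel's core pattern (Linux only)."""
    # RLIMIT_CORE is the maximum size of a core file; 0 means core dumps are disabled.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # core_pattern tells the kernel where to write the core (or which program to pipe it to).
    pattern_file = Path("/proc/sys/kernel/core_pattern")
    pattern = pattern_file.read_text().strip() if pattern_file.exists() else None
    return {"soft_limit": soft, "hard_limit": hard, "core_pattern": pattern}


settings = core_dump_settings()
print(settings)
if settings["soft_limit"] == 0:
    # The limit is per shell/process; it has to be raised before launching fasttext.
    print("Core dumps are disabled here; run `ulimit -c unlimited` in the shell first.")
```

If the soft limit is 0, no core file is written at all, which would explain not finding one after the crash.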
I ran into a similar problem when using the Python API (building from source returned error 1). The RAM usage spiked and the PC froze.
After adding the argument logging in autotune.cc, I could see that it did try to train a large model before crashing:
Trial = 10
epoch = 100
lr = 0.0219047
dim = 657
minCount = 1
wordNgrams = 3
minn = 0
maxn = 0
bucket = 10000000
dsub = 2
loss = hs
Since I was forced to restart the PC, there's no crash file in /var/crash.
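A rough back-of-the-envelope check of why a trial like this can exhaust memory. This is a sketch under assumptions, not fastText's actual accounting: it assumes the input embedding matrix has (nwords + bucket) rows of dim 32-bit floats, and that the bucket rows are only allocated when wordNgrams > 1 or maxn > 0. The vocabulary size of 50,000 is a hypothetical placeholder, and the output matrix and other buffers are ignored.

```python
def input_matrix_bytes(nwords, bucket, dim, word_ngrams, maxn, bytes_per_float=4):
    """Estimate the size of the input embedding matrix in bytes.

    Assumption: hashed n-gram buckets only take up rows when word n-grams
    or character n-grams are actually in use.
    """
    effective_bucket = bucket if (word_ngrams > 1 or maxn > 0) else 0
    return (nwords + effective_bucket) * dim * bytes_per_float


nwords = 50_000  # hypothetical vocabulary size, not taken from the thread

# Parameters from the logged trial above (Trial = 10).
trial = dict(bucket=10_000_000, dim=657, word_ngrams=3, maxn=0)
crashing = input_matrix_bytes(nwords, **trial)
print(f"wordNgrams=3: {crashing / 1e9:.2f} GB")  # prints "wordNgrams=3: 26.41 GB"

# Same trial with -wordNgrams 1: the bucket rows drop out of the estimate.
unigram = input_matrix_bytes(nwords, **{**trial, "word_ngrams": 1})
print(f"wordNgrams=1: {unigram / 1e9:.2f} GB")  # prints "wordNgrams=1: 0.13 GB"
```

Under these assumptions the estimate is dominated by bucket × dim, which would be consistent with both the bad_alloc on machines with 6 to 10 GB of RAM and the earlier observation that -wordNgrams 1 runs through.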
Hi @Celebio !
I have also experienced this. In my case, it fails when it tries to set the dsub parameter to 16. By the way, I am a bit confused by this parameter: it is listed under the quantization options in the docs, but I have not set a size constraint. What is it doing?
If I manually set dsub to 2, it seems to work. But if I try to set other options together with autotuning, like -epoch 50 or -bucket 200000, I get an unspecified "floating point exception" after the first trial, probably during the second.
I have the same problem.
Same problem, anyone know how to solve it?
Progress: 10.0% Trials: 5 Best score: 0.608669 ETA: 8h59m42s
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Hi @dveselov, @arhipovdv , @DanzaMacabra , @lucaventurini , @sisiruowan , @OpenWaygate ,
Could you please try at the commit https://github.com/facebookresearch/fastText/commit/a54ce12ccc3b5f37fa3758958d946e43f63969f8 ?
Best regards,
Onur
Same issue at the head of master with 32 GB of RAM. The application doesn't even approach 4% of available memory, yet it appears to fail to allocate and then tries to access the resulting null pointer.
Faulting application name: fasttext-latest.exe, version: 0.0.0.0, time stamp: 0x600686f3
Faulting module name: ucrtbased.dll, version: 10.0.19041.1, time stamp: 0xe7caee08
Exception code: 0xc0000005
Fault offset: 0x0000000000156210
It's a result of providing args that autotune wants to work out itself.
fasttext-latest.exe supervised -thread 7 -input Data-Refined\eng_train.txt.train -output Models\out_file -loss ova -autotune-validation Data\eng_train.txt.validation -autotune-modelsize 6M -autotune-duration 600
Warning : loss is manually set to a specific value. It will not be automatically optimized.
If I remove any params that are warned about (like ova loss), then this crash doesn't happen.