Fasttext: Core dumped when using autotune

Created on 27 Aug 2019 · 14 Comments · Source: facebookresearch/fastText

Hello, thanks for the new autotune feature.
I'm trying to test it on my data, but I got this exception:

$ fasttext supervised -input fasttext_train.txt -output fasttext_model -autotune-validation fasttext_valid.txt

Progress:  14.0% Trials:    6 Best score:  0.647911 ETA:   0h 4m17s

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

I'm attaching my training data so you can reproduce this bug. Is there any other info I can provide?
Thanks.

fasttext_valid.txt
fasttext_train.txt

All 14 comments

Hi @dveselov ,
Thank you for reporting the issue. We will have a look.

Regards,
Onur

Most likely this was caused by insufficient memory, because with the -wordNgrams 1 flag everything finishes successfully.

I have the same error with autotune on a VM with 6 vCPUs and 6 GB of RAM.
P.S. Same error with 10 GB of RAM.
Dataset: 700 KB, 4500 lines, 1 label per line, 8 different labels in total.

I can't reproduce on my machines. I am also guessing a memory issue depending on the explored parameters and the machine itself. I will continue my investigations.

@dveselov , @arhipovdv , can you please try the following:

  • in autotune.cc, place loggers right before the training :
    LOG_VAL(Trial, trials_)
    printArgs(trainArgs, autotuneArgs);

just before fastText_->train(trainArgs);

  • compile and launch your executable with -verbose 4

and check if it actually tries a much bigger model right before crashing.

Can you also send the core file and your executable (as well as the platform details) ?

output with -verbose 4:
log.txt
executable:
fasttext.zip

Core file: what is it, and where do I find it?

Ubuntu 19.04, VM under Hyper-V, 6 vCPU, 2/6/10 Gb RAM
Linux version 5.0.0-25-generic (buildd@lgw01-amd64-008) (gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #26-Ubuntu SMP Thu Aug 1 12:04:58 UTC 2019
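
On the core file question above: it is the memory snapshot the kernel can write when a process aborts (the "core dumped" part of the message). Here is a minimal sketch for checking whether the current process is allowed to write one and where the kernel would send it, assuming Linux. The helper name core_dump_settings is made up for illustration.

```python
import resource

def core_dump_settings(pattern_path="/proc/sys/kernel/core_pattern"):
    """Return (soft_limit, hard_limit, core_pattern) for this process."""
    # RLIMIT_CORE is the maximum core file size; 0 means no core file is written.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        with open(pattern_path) as f:
            pattern = f.read().strip()
    except OSError:
        pattern = None  # not Linux, or /proc is unavailable
    return soft, hard, pattern

soft, hard, pattern = core_dump_settings()
if soft == 0:
    print("core dumps are disabled (soft limit is 0); try `ulimit -c unlimited`")
print("core_pattern:", pattern)
```

If the pattern starts with a pipe character, the dump is handed to a crash handler instead of being written next to the executable. On Ubuntu that handler is usually apport, which stores its reports under /var/crash.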

System Information:

  • OS Platform and Distribution
    Ubuntu 16.04
  • Python version:
    Python 3.6.8
  • GCC version:
    GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
  • CPU model and memory:
    Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz
    RAM 15.6G

I ran into a similar problem when using the Python API (the build from source returned error 1). RAM usage spiked and the PC froze.
After adding the argument logging in autotune.cc, I could see that it did try a large model before crashing:
Trial = 10
epoch = 100
lr = 0.0219047
dim = 657
minCount = 1
wordNgrams = 3
minn = 0
maxn = 0
bucket = 10000000
dsub = 2
loss = hs
Since I was forced to restart the PC, there's no crash file in /var/crash.
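
For what it's worth, here is a back-of-the-envelope estimate of the input matrix for the trial logged above. It assumes the matrix is dense, with (vocabulary size + bucket) rows, dim columns and 4-byte float weights. The vocabulary size is not in the log, so nwords defaults to a placeholder of 0, and the function names are made up for illustration.

```python
# Rough size of the input embedding matrix for the trial logged above.
# Assumptions (not from the log): 4-byte float weights and a dense
# (nwords + bucket) x dim layout; nwords is a hypothetical placeholder.

BYTES_PER_FLOAT = 4

def input_matrix_bytes(bucket, dim, nwords=0):
    """Bytes needed for a dense (nwords + bucket) x dim float32 matrix."""
    return (nwords + bucket) * dim * BYTES_PER_FLOAT

def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30

trial_bytes = input_matrix_bytes(bucket=10_000_000, dim=657)
print(f"{to_gib(trial_bytes):.1f} GiB")  # about 24.5 GiB
```

Under these assumptions the input matrix alone is larger than the 15.6G of RAM on this machine, before counting the output matrix or any other buffers. That would be consistent with the RAM spike and freeze described above.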

Hi @Celebio !

I have also experienced this. In my case, it fails when it tries to set the dsub parameter to 16. By the way, I am a bit confused by this parameter: it is listed under the quantization options in the docs, but I have not set a size constraint. What does it do?

If I manually set dsub to 2, it seems to work. But if I try to set other options together with autotuning, like -epoch 50 or -bucket 200000, I get an unspecified "floating point exception" after the first trial, probably during the second.

I have the same problem.

Same problem. Does anyone know how to solve it?

Progress:  10.0% Trials:    5 Best score:  0.608669 ETA:   8h59m42s
terminate called after throwing an instance of 'std::bad_alloc'
what():  std::bad_alloc

Hi @dveselov, @arhipovdv , @DanzaMacabra , @lucaventurini , @sisiruowan , @OpenWaygate ,
Could you please try again at commit https://github.com/facebookresearch/fastText/commit/a54ce12ccc3b5f37fa3758958d946e43f63969f8 ?

Best regards,
Onur

Same issue at the head of master with 32 GB of RAM. The application doesn't even approach 4% of available memory, yet it appears to fail an allocation and then tries to access the resulting null pointer.

Faulting application name: fasttext-latest.exe, version: 0.0.0.0, time stamp: 0x600686f3
Faulting module name: ucrtbased.dll, version: 10.0.19041.1, time stamp: 0xe7caee08
Exception code: 0xc0000005
Fault offset: 0x0000000000156210

It's a result of providing args that autotune wants to work out itself.

fasttext-latest.exe supervised -thread 7 -input Data-Refined\eng_train.txt.train -output Models\out_file -loss ova -autotune-validation Data\eng_train.txt.validation -autotune-modelsize 6M -autotune-duration 600
Warning : loss is manually set to a specific value. It will not be automatically optimized.

If I remove the params that are warned about (like the ova loss), this crash doesn't happen.
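
A small sketch of that workaround: strip the manually set flags from a command line before launching autotune. The flag list is an assumption taken from the parameters printed in the trial log earlier in this thread, not checked against autotune.cc. The helper name and the file names are hypothetical.

```python
# Flags assumed to be explored by autotune, taken from the trial log in
# this thread; this list is an assumption, not checked against the source.
AUTOTUNED_FLAGS = {
    "-epoch", "-lr", "-dim", "-minCount", "-wordNgrams",
    "-minn", "-maxn", "-bucket", "-dsub", "-loss",
}

def drop_autotuned_flags(argv):
    """Return argv without the assumed-autotuned flags and their values."""
    kept = []
    skip_next = False
    for token in argv:
        if skip_next:
            # This token is the value of a flag that was just dropped.
            skip_next = False
            continue
        if token in AUTOTUNED_FLAGS:
            skip_next = True
            continue
        kept.append(token)
    return kept

# Hypothetical command line modelled on the one above.
cmd = ["fasttext", "supervised", "-thread", "7",
       "-input", "train.txt", "-output", "model",
       "-loss", "ova",
       "-autotune-validation", "valid.txt",
       "-autotune-duration", "600"]
print(" ".join(drop_autotuned_flags(cmd)))
```

Note that dropping -loss ova changes what gets trained, so this only avoids the crash. It is not a fix if you actually need that loss.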
