fastText: Running on PowerPC64LE (ppc64le)

Created on 27 Jan 2018 · 15 Comments · Source: facebookresearch/fastText

I am able to compile the stable (0.1.0) version of the code on a powerpc64le machine (IBM Minsky) without any errors or warnings. However, when I run it on any dataset (e.g. the Stack Exchange cooking data) using just the defaults, ./fasttext supervised -input ... -output ..., the program hangs after displaying Reading ... words. I tried make debug as well, with the same result. Details: make 4.1, Ubuntu 16.04.3 LTS. Any ideas?

@ironv Which compiler are you using?

I have tried both with c++ (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 and g++ (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609. Same result.

I imagine to debug this, much more information would be helpful:

  • has this ever run successfully on POWER before? If so, what changed? If not, does the exact same command run successfully elsewhere?
  • are there any logs? (I'm completely unfamiliar with "fastText")
  • are there any "DEBUG" compile options or command line parameters to enable more output?
  • do you know where it hangs? any traceback information? (maybe run with or attach using GDB and get backtraces of all of the threads)

  1. This is the first time we are trying to run fasttext on this ppc64le arch. We have it running _as advertised_ on x86_64 boxes.
  2. It is not producing ANY output at all in release or debug modes. I am trying the cooking stackexchange example documented here.
  3. The Makefile which is part of v0.1.0.zip has make debug.
  4. We currently do not have gdb on that box. Installing soon (today) and will report on this.

@ThinkOpenly to your second point (unfamiliarity with fasttext)...this takes a couple of mins:

wget https://github.com/facebookresearch/fastText/archive/v0.1.0.zip
unzip v0.1.0.zip
cd fastText-0.1.0
make

wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/cooking.stackexchange.tar.gz && tar xvzf cooking.stackexchange.tar.gz
head -n 12404 cooking.stackexchange.txt > cooking.train
tail -n 3000 cooking.stackexchange.txt > cooking.valid
./fasttext supervised -input cooking.train -output model_cooking

Expected output:

Read 0M words
Number of words:  14598
Number of labels: 734
Progress: 100.0%  words/sec/thread: 75109  lr: 0.000000  loss: 5.708354  eta: 0h0m

I get NO output at all. It just hangs.

The README indicates there is a "verbose" option, possibly with an (optional?) verbosity level.

The following arguments are optional:
  -verbose            verbosity level [2]

...could be worth a try at least. :-)

The default is 2 (as shown above), which produces the most output. Setting -verbose 0 suppresses all output to the screen. So when I don't specify it (as in the example above), I should get all output on screen (the model and word vectors are always saved to a file).

Oh, I thought that meant the default was 2 _if_ no level was specified, as in "-verbose". So, it hangs quite early it seems, before any output at all. GDB tracebacks will likely be helpful, then.

gdb has not been installed yet (I don't have sudo on the box). Using good ole' std::cout, I have narrowed it to the while loop between lines 223 and 232 in dictionary.cc.

  while (readWord(in, word)) {
    add(word);
    if (ntokens_ % 1000000 == 0 && args_->verbose > 1) {
      std::cerr << "\rRead " << ntokens_  / 1000000 << "M words" << std::flush;
    }
    if (size_ > 0.75 * MAX_VOCAB_SIZE) {
      minThreshold++;
      threshold(minThreshold, minThreshold);
    }
  }

It is not coming out of this loop.

That's good information. Is it stuck in "readWord"? One thing that comes to mind for a difference between x86 and POWER is that the default signedness for "char" types is signed on x86 and unsigned on POWER. I see the use of "char" type in readWord. If EOF is -1, then you might need a "signed char" there instead. (I'm not sure what "sbumpc()" returns.) Anyway, hopefully that area is a good place to look deeper.
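
The signedness effect described above can be reproduced in isolation. What follows is a minimal sketch, not the fastText source: the helper names are made up for illustration, and the explicit unsigned char and signed char types stand in for what plain char means on ppc64le and x86_64 Linux respectively. It also assumes EOF is -1 and chars are 8 bits wide, as on glibc.

```cpp
#include <cassert>
#include <cstdio>  // EOF

// Hypothetical helpers, not from fastText. Each one stores the int returned
// by a read call in a different type, then tests the stored value against EOF.

// Plain char behaves like this on ppc64le Linux (char is unsigned there).
bool isEofViaUnsignedChar(int readResult) {
  unsigned char c = static_cast<unsigned char>(readResult);
  int promoted = c;        // always in the range 0..255
  return promoted == EOF;  // EOF is negative, so this is never true
}

// Plain char behaves like this on x86_64 Linux (char is signed there).
bool isEofViaSignedChar(int readResult) {
  signed char c = static_cast<signed char>(readResult);
  int promoted = c;  // sign-extended, so -1 survives the round trip
  return promoted == EOF;
}

// Keeping the value in an int loses nothing.
bool isEofViaInt(int readResult) { return readResult == EOF; }

// Assumption: EOF == -1 and 8-bit two's-complement chars.
void checkEofComparisons() {
  assert(!isEofViaUnsignedChar(EOF));  // end of input is missed, loop never exits
  assert(isEofViaSignedChar(EOF));     // end of input is seen
  assert(isEofViaSignedChar(0xFF));    // but a real 0xFF byte also looks like EOF
  assert(isEofViaInt(EOF));
  assert(!isEofViaInt(0xFF));
}
```

Calling checkEofComparisons() from a main function exercises all three variants. The unsigned case is consistent with a read loop that never terminates, and the signed case shows why a signed char works for typical text input (a 0xFF byte never occurs in valid UTF-8) while still being narrower than the value the read call returns.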

That was it!!! Changed line 195 in dictionary.cc from char c to signed char c. Have tested it on a few different files now and it works. Thank you @ThinkOpenly

@ironv I'd leave this open so the facebook devs can change that one line, which is a platform agnostic fix, and hopefully no other POWER (or ARM) users will hit this error.
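
One way to make such a read loop independent of char signedness is to keep the result of sbumpc() in the stream's int_type and compare it with traits::eof(). The sketch below is an assumption-laden illustration, not the fix that was committed to fastText: readToken and tokenize are hypothetical names, and the whitespace handling is deliberately simpler than what readWord does.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Simplified stand-in for a word reader; NOT the fastText readWord.
bool readToken(std::istream& in, std::string& word) {
  using traits = std::char_traits<char>;
  std::streambuf& sb = *in.rdbuf();
  word.clear();
  traits::int_type c;  // wide enough for every byte value plus eof()
  while (!traits::eq_int_type(c = sb.sbumpc(), traits::eof())) {
    const char ch = traits::to_char_type(c);
    if (ch == ' ' || ch == '\t' || ch == '\n') {
      if (word.empty()) continue;  // skip leading whitespace
      return true;                 // a delimiter ends the current token
    }
    word.push_back(ch);
  }
  return !word.empty();  // flush a final token that had no trailing delimiter
}

// Hypothetical driver: the loop ends because eof() is detected reliably.
std::vector<std::string> tokenize(const std::string& text) {
  std::istringstream in(text);
  std::vector<std::string> tokens;
  std::string word;
  while (readToken(in, word)) tokens.push_back(word);
  return tokens;
}

// The input contains a 0xFF byte (octal \377) to show it is not mistaken for eof.
void checkTokenize() {
  const std::vector<std::string> t = tokenize("ab \377z\nd");
  assert(t.size() == 3 && t[0] == "ab" && t[1] == "\377z" && t[2] == "d");
}
```

Because the comparison happens before the value is narrowed to char, the same code behaves identically whether plain char is signed or unsigned on the target platform.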

re-opening as suggested by @grooverdan

Hi @ironv, @ThinkOpenly, @grooverdan,

Thank you for reporting and solving this issue. This should be fixed now.

Best,
Edouard.

thanks @EdouardGrave.
