Models: Textsum error: a bytes-like object is required, not 'str'

Created on 20 Apr 2018 · 12 Comments · Source: tensorflow/models


System information

  • What is the top-level directory of the model you are using: Textsum
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.7
  • Bazel version (if compiling from source): 0.12.0(binary)
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: seq2seq_attention --mode=train --article_key=article --abstract_key=abstract --data_path=D:/TS/data/training-* --vocab_path=D:/TS/data/vocab --log_root=D:/TS/textsum/log_root --train_dir=D:/TS/textsum/log_root/train

Source code / logs

I used the toy data provided with Textsum and renamed it to training-0. The example training command (with the directories changed so it runs from cmd) produced many instances of this error:

Exception in thread Thread-xxx:
Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "\?\C:\Users\xxx\AppData\Local\Temp\Bazel.runfiles_0br8l4_4\runfiles__main__\textsum\batch_reader.py", line 139, in _FillInputQueue
    data.ToSentences(article, include_token=False)]
  File "\?\C:\Users\xxx\AppData\Local\Temp\Bazel.runfiles_0br8l4_4\runfiles__main__\textsum\data.py", line 215, in ToSentences
    return [s for s in s_gen]
  File "\?\C:\Users\xxx\AppData\Local\Temp\Bazel.runfiles_0br8l4_4\runfiles__main__\textsum\data.py", line 215, in <listcomp>
    return [s for s in s_gen]
  File "\?\C:\Users\xxx\AppData\Local\Temp\Bazel.runfiles_0br8l4_4\runfiles__main__\textsum\data.py", line 189, in SnippetGen
    start_p = text.index(start_tok, cur)
TypeError: a bytes-like object is required, not 'str'

awaiting maintainer

All 12 comments

System information

  • What is the top-level directory of the model you are using: Textsum
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): conda
  • TensorFlow version (use command below): tensorflow-gpu 1.3.0
  • Bazel version (if compiling from source): 0.12.0(binary)
  • CUDA/cuDNN version: 8.0/6.0.21
  • GPU model and memory: gtx 1050/2GB
  • Exact command to reproduce: bazel-bin/textsum/seq2seq_attention --mode=eval --article_key=article --abstract_key=abstract --data_path=data/data --vocab_path=data/vocab --log_root=textsum/log_root --eval_dir=textsum/log_root/eval

Got the same error. Could it be caused by a version mismatch?

@peterjliu @nealwu Could you help on the issue? Thanks.

@yhliang2018 Same error here. Have you figured it out yet? My setup is Ubuntu 06.04 and TF 1.7. It seems it was removed from models in r1.7.

@k-w-w Do you have any idea on this issue? Feel free to add people to this thread if they can help.

It looks like this issue appears when using Python 3 (the error does not appear with Python 2).

A quick fix would be to decode the bytes in _GetExFeatureText(self, ex, key) in batch_reader.py:

return ex.features.feature[key].bytes_list.value[0].decode('utf-8')

I'll submit a PR with this change if the code runs to completion without error.
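
For readers who want to see why the decode helps, here is a minimal sketch that does not import TensorFlow. It assumes the feature value arrives as bytes under Python 3 and that the sentence token is a str; the sample text, the token value, and the helper name find_token are illustrative stand-ins, not the actual Textsum code.

```python
# Stand-in for ex.features.feature[key].bytes_list.value[0]:
# under Python 3 this value is a bytes object (sample text is made up).
raw_value = b"<d> <p> <s> some article text . </s> </p> </d>"

# Assumed str token, playing the role of start_tok in the traceback.
start_tok = "<s>"


def find_token(text, token, start=0):
    """Hypothetical helper mirroring the text.index(start_tok, cur) call."""
    return text.index(token, start)


# Searching bytes for a str token raises TypeError in Python 3.
try:
    find_token(raw_value, start_tok)
    mixed_types_failed = False
except TypeError:
    mixed_types_failed = True

# Decoding first makes both operands str, so the search succeeds.
decoded = raw_value.decode("utf-8")
position = find_token(decoded, start_tok)

print(mixed_types_failed, position)
```

The decode happens once, at the point where the value leaves the protobuf, so the rest of the pipeline only ever sees str.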

@k-w-w That trick does not work for me. I still have the same problem. Have you found a way to solve this issue?

@ersinyar Are you getting the same error? Also are you running with python 3?

@k-w-w I am getting a similar error. I was following the post given in https://eilianyu.wordpress.com/2016/10/17/text-summarization-using-sequence-to-sequence-model-in-tensorflow-and-gpu-computing/. When I try to convert from text to binary before training as explained in the post, I get an error stating that

TypeError: "b'AFP'" has type str, but expected one of: bytes

I am running Python 3 and the latest TensorFlow. The post I am following uses TF r0.11 and Python 2.7. At first I thought the latest version of TF might be the problem, so I tried different versions, but I kept getting the same error.
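
This second error is the reverse case: a bytes field is being handed a str. The value in the message, "b'AFP'", is what Python 3 produces when str() is applied to a bytes object. The sketch below shows that behaviour and one possible way to normalise values before they go into a bytes field. The helper name to_feature_bytes is hypothetical and is not part of TensorFlow or the conversion script from the post.

```python
def to_feature_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return bytes suitable for a bytes-typed field."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


word = b"AFP"

# str() on bytes in Python 3 gives the repr, not the decoded text.
as_repr = str(word)

# decode() gives the actual text.
as_text = word.decode("utf-8")

# Encode explicitly before handing the value to a bytes field.
feature_value = to_feature_bytes(as_text)

print(repr(as_repr), repr(as_text), repr(feature_value))
```

Under Python 2, str and bytes are the same type, which is consistent with the original script running without this step.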

@ersinyar The code was most likely intended to run with Python 2, so if that is available to you I would recommend using Py2 instead of applying the fixes. Dealing with differing string encoding can be pretty messy. If you must use Py3, applying the changes in this commit should work.

I got the same error using Python 3, and the quick fix suggested above (decoding the bytes in _GetExFeatureText in batch_reader.py) works for me. Thank you!

System information
  • What is the top-level directory of the model you are using: Object_Detection
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0
  • Bazel version (if compiling from source):
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: python3 legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

I've configured my model, and when I run this command I get the error:
File "/home/anatoli/my_tensorflow/my_tensorflow/models/research/object_detection/utils/label_map_util.py", line 143, in load_labelmap
TypeError: a bytes-like object is required, not 'str'
Could you explain how to fix this? Thank you.

You may follow the changes in this commit to fix this. Closing this bug for now.
