Do not apply the fix described below; it yields very weird results, because only a few examples get passed into processing.
Instead, look at the data processing and make sure text_a and label are passed in correctly.
Do this in _create_examples:
def _create_examples(self, lines, set_type):
    """Creates examples for the training and dev sets."""
    examples = []
    for (i, line) in enumerate(lines):
        if i == 0:
            continue
        guid = "%s-%s" % (set_type, i)
        label = tokenization.convert_to_unicode(line[0])
        text_a = tokenization.convert_to_unicode(line[1])
        # text_b = tokenization.convert_to_unicode(line[2])
        examples.append(
            InputExample(guid=guid, text_a=text_a, text_b=None, label=label))
    random.shuffle(examples)
    return examples
~~~~~~~~~~~~~~~~
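To check that the label and text columns really are where _create_examples expects them, a quick look at the raw rows can help. This is only a sketch: it assumes the data is a tab-separated file with a header row, the label in column 0 and the text in column 1. The helper name summarize_rows and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def summarize_rows(rows, label_col=0, text_col=1):
    """Count labels and record rows too short to hold both columns."""
    label_counts = Counter()
    short_rows = []
    for i, row in enumerate(rows):
        if i == 0:
            continue  # skip the header row, as _create_examples does
        if len(row) <= max(label_col, text_col):
            short_rows.append(i)  # this row would raise IndexError later
            continue
        label_counts[row[label_col]] += 1
    return label_counts, short_rows


# Hypothetical sample data; replace with the contents of your own file.
sample = "label\ttext\nQuality\tThe screen is sharp\nPrice\tToo expensive\nQuality\n"
rows = list(csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE))

label_counts, short_rows = summarize_rows(rows)
print("labels seen:", dict(label_counts))
print("rows missing a column:", short_rows)
```

Every label printed here should also appear in the list your processor returns from get_labels(); anything extra is a candidate for a KeyError.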
I have run into a KeyError, similar to a previous issue I suppose. I followed the same steps as in that issue but did not manage to solve it.
The following is the output shown on screen:
/Paul $ python run_classifier.py --task_name=bosco --do_train=true --do_eval=true --dopredict=true --data_dir=$MY_DATASET --vocab_file=$BERT_BASE_DIR/vocab.txt --bert_config_file=$BERT_BASE_DIR/bert_config.json --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt --max_seq_length=128 --train_batch_size=32 --learning_rate=5e-5 --num_train_epochs=50.0 --output_dir=.data/output
WARNING:tensorflow:Estimator's model_fn (
INFO:tensorflow:Using config: {'_model_dir': '.data/bosco_output', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec':
INFO:tensorflow:_TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
INFO:tensorflow:Writing example 0 of 29206
Traceback (most recent call last):
File "run_classifier.py", line 1010, in
tf.app.run()
File "/home/yuwei/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "run_classifier.py", line 899, in main
train_examples, label_list, FLAGS.max_seq_length, tokenizer, train_file)
File "run_classifier.py", line 518, in file_based_convert_examples_to_features
max_seq_length, tokenizer)
File "run_classifier.py", line 487, in convert_single_example
label_id = label_map[example.label]
KeyError: 'Quality'
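The failing line in the traceback looks up example.label in label_map, so the error suggests that 'Quality' is a label in the data but not in the label list the processor provides. A minimal sketch of that lookup, assuming label_map is built by enumerating label_list; the function name find_unknown_labels and the sample values are hypothetical:

```python
def find_unknown_labels(example_labels, label_list):
    """Return the labels present in the examples but absent from label_list."""
    # Same kind of mapping the traceback line indexes into: label -> integer id.
    label_map = {label: i for i, label in enumerate(label_list)}
    return sorted({label for label in example_labels if label not in label_map})


# Hypothetical values: a processor that still returns the default binary labels
# while the data file uses named classes.
label_list = ["0", "1"]
example_labels = ["Quality", "0", "1", "Quality"]

unknown = find_unknown_labels(example_labels, label_list)
print(unknown)
```

If this list is not empty, either get_labels() needs to return those labels or the wrong column is being read as the label.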
This seems similar to issue #80.
There seems to be a bug on line 489 of run_classifier.py.
I added a tab before the feature line and everything works.
How did you solve this problem? I am not sure which line is line 489. @PaulZhangIsing
Around here: I moved the feature line into the if block, in run_classifier.py at line 489:
for (ex_index, example) in enumerate(examples):
    if ex_index % 10000 == 0:
        tf.logging.info("Writing example %d of %d" % (ex_index, len(examples)))
        feature = convert_single_example(ex_index, example, label_list,
                                         max_seq_length, tokenizer)
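A note on what moving the conversion call into the if block does: the condition is only true for every 10000th example, so the conversion runs for a handful of examples instead of all of them. The sketch below just counts calls and does not touch BERT; count_conversions is a made-up helper, and 29206 is the example count from the log above.

```python
def count_conversions(num_examples, convert_inside_if, log_every=10000):
    """Count how many examples would be converted under each indentation."""
    converted = 0
    for ex_index in range(num_examples):
        if ex_index % log_every == 0:
            # The logging call lives here in the real loop.
            if convert_inside_if:
                converted += 1  # conversion only runs when the log line does
        if not convert_inside_if:
            converted += 1  # conversion runs for every example
    return converted


inside = count_conversions(29206, convert_inside_if=True)
outside = count_conversions(29206, convert_inside_if=False)
print(inside, outside)
```

So the error may go away simply because most examples are never looked at, which is not the same as the labels being handled correctly.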
I'm sorry, but would you mind sharing your run_classifier.py code with me? I have been stuck on this problem for several days. If you don't mind, I'd also like to see your dataset; I have some worries about mine. @PaulZhangIsing. My email is [email protected]
Sorry, I am unable to do so. But you could send yours to me, and I can try to edit it and send it back to you.
I have sent my dataset and code to you; please check your email. @PaulZhangIsing
I will need this weekend to check, as my company currently cannot connect to Outlook.
It's OK, I got this problem fixed. Thank you, man. @PaulZhangIsing
So basically, what did you do?
Hi, I'm having the same problem. I added a tab on line 489 but I am still getting the same error. How did you solve it?
You should take a look at the data processor instead.
I managed to solve it by adding:
[CLS]
[SEP]
[UNK]
[MASK]
To my vocab file.
@AsafBanana I have done this for [CLS] and [SEP] but I still get: KeyError: '[CLS]'
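If the KeyError on '[CLS]' persists after editing the vocab file, it may be worth confirming that the tokens are really present, one per line, in the file that --vocab_file points to. A rough sketch, assuming a plain UTF-8 vocab file with one token per line; missing_special_tokens is a hypothetical helper and the temporary file only stands in for a real vocab.txt:

```python
import os
import tempfile

REQUIRED_TOKENS = ["[CLS]", "[SEP]", "[UNK]", "[MASK]"]


def missing_special_tokens(vocab_path, required=REQUIRED_TOKENS):
    """Return the required tokens that do not appear as a line in the vocab file."""
    with open(vocab_path, encoding="utf-8") as f:
        vocab = {line.strip() for line in f}
    return [tok for tok in required if tok not in vocab]


# Demo on a throwaway file; point vocab_path at your own vocab.txt instead.
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("[UNK]\n[CLS]\nhello\nworld\n")
    missing = missing_special_tokens(vocab_path)

print(missing)
```

An empty list means all four tokens are in the file that was checked; if the error still appears, the run may be reading a different vocab file than the one that was edited.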