IndexError: list index out of range in detokenize #4

@loretoparisi

I get an IndexError in detokenize after running:

for temp in [1.0]:
    bert_sents = generate(n_samples, seed_text=seed_text, batch_size=batch_size, max_len=max_len,
                          sample=sample, top_k=top_k, temperature=temp, burnin=burnin, max_iter=max_iter,
                          cuda=True)
    out_file = "data/%s-len%d-burnin%d-topk%d-temp%.3f.txt" % (model_version, max_len, burnin, top_k, temp)
    write_sents(out_file, bert_sents, should_detokenize=True)

Stacktrace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-23-776125cadf25> in <module>()
     18                           cuda=True)
     19     out_file = "data/%s-len%d-burnin%d-topk%d-temp%.3f.txt" % (model_version, max_len, burnin, top_k, temp)
---> 20     write_sents(out_file, bert_sents, should_detokenize=True)

<ipython-input-19-027cb8b83cc4> in write_sents(out_file, sents, should_detokenize)
     15     with open(out_file, "w") as out_fh:
     16         for sent in sents:
---> 17             sent = detokenize(sent[1:-1]) if should_detokenize else sent
     18             out_fh.write("%s\n" % " ".join(sent))

<ipython-input-16-beace4564740> in detokenize(sent)
     20     for i, tok in enumerate(sent):
     21         if tok.startswith("##"):
---> 22             new_sent[len(new_sent) - 1] = new_sent[len(new_sent) - 1] + tok[2:]
     23         else:
     24             new_sent.append(tok)

IndexError: list index out of range
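
The failing line indexes the last element of new_sent, which raises IndexError when new_sent is still empty. That happens when the first token passed to detokenize starts with "##", which sampled output can produce. Below is a minimal sketch of a guarded version. Only lines 20-24 of the original function are visible in the traceback, so the initialization, the return value, and the choice to keep an orphaned piece as its own token are assumptions, not the repository's actual code.

```python
def detokenize(sent):
    """Merge WordPiece continuation tokens ("##xyz") into the preceding token.

    Sketch only: the empty-list guard is the proposed change; everything
    outside the loop body is assumed, since the traceback does not show it.
    """
    new_sent = []
    for tok in sent:
        if tok.startswith("##") and new_sent:
            # Normal case: glue the continuation onto the previous token.
            new_sent[-1] += tok[2:]
        elif tok.startswith("##"):
            # No previous token to attach to (sequence starts with a
            # continuation piece): keep it as a standalone token.
            new_sent.append(tok[2:])
        else:
            new_sent.append(tok)
    return new_sent


if __name__ == "__main__":
    # A sequence that begins with a continuation piece no longer crashes.
    print(detokenize(["##ing", "play", "##ed", "."]))
```

Dropping the orphaned piece instead of keeping it would also avoid the crash; which behavior is preferable depends on how the generated samples are evaluated.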

Head of the saved file:

$ head -n3 bert-base-uncased-len40-burnin250-topk100-temp1.000.txt 
sammy harves [ " baby candy " / " dream of baby candy " ( gas station theme ) ) mary ford and baby candy . ( gas station theme ) concept album , featuring mary ford .
3 . contemporary art review ( 2nd ed . october 2008 ) , review with essays on contemporary art , ( london : bateman & partners , february 2009 ) sculpture and the minimalist movement , part .
the truth outside ( matthew greengrass ) psycho ( 1964 ? ) psycho ( orson welles ) monster show ( orson welles ) ( barnacles ) part 3 ( the snare drum ) - narration ;
