Train Data Generation
Module to generate training data.
Returns an iterable of SentencePairs to be used for training. The inputs document_a and document_b are both lists of Sentences made from a parallel corpus.
Returns a tuple (scrambled_a, scrambled_b, correct_alignments) where:
- scrambled_a is a scrambled version of document_a.
- scrambled_b is a scrambled version of document_b.
- correct_alignments are all the correct sentence alignments that exist between scrambled_a and scrambled_b.
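The scrambling described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation. The function name scramble_documents and the seed parameter are hypothetical, plain strings stand in for Sentence objects, and two things are assumed: that document_a[i] is the translation of document_b[i], and that each alignment is an (index_in_scrambled_a, index_in_scrambled_b) pair.

```python
import random


def scramble_documents(document_a, document_b, seed=None):
    """Shuffle two parallel documents independently and track alignments.

    Hypothetical sketch: assumes document_a[i] aligns with document_b[i].
    """
    if len(document_a) != len(document_b):
        raise ValueError("parallel documents must have the same length")

    rng = random.Random(seed)
    n = len(document_a)

    # order_x[k] is the original index of the sentence placed at position k.
    order_a = list(range(n))
    order_b = list(range(n))
    rng.shuffle(order_a)
    rng.shuffle(order_b)

    scrambled_a = [document_a[i] for i in order_a]
    scrambled_b = [document_b[j] for j in order_b]

    # Map each original index to its new position in scrambled_b.
    position_in_b = {original: new for new, original in enumerate(order_b)}

    # Assumed representation: (position in scrambled_a, position in scrambled_b).
    correct_alignments = [
        (new_a, position_in_b[original])
        for new_a, original in enumerate(order_a)
    ]
    return scrambled_a, scrambled_b, correct_alignments
```

Each pair in correct_alignments points at two sentences that were at the same position in the original documents, so a trainer can use those pairs as positive examples and any other pairing as a negative one.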