Train Data Generation
Module to generate training data.
yalign.train_data_generation.training_alignments_from_documents(document_a, document_b)
Returns an iterable of SentencePairs to be used for training. The inputs document_a and document_b are both lists of Sentences made from a parallel corpus.
yalign.train_data_generation.training_scrambling_from_documents(document_a, document_b)
Returns a tuple (scrambled_a, scrambled_b, correct_alignments) where:
- scrambled_a is a scrambled version of document_a.
- scrambled_b is a scrambled version of document_b.
- correct_alignments are all the correct sentence alignments that exist between scrambled_a and scrambled_b.
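
The scrambling idea can be illustrated with a small standalone sketch. This is not yalign's implementation: the helper name scramble_with_alignments, the seed parameter, the use of plain strings instead of Sentence objects, and the representation of alignments as (index_in_scrambled_a, index_in_scrambled_b) tuples are all assumptions made for illustration. It also assumes the two documents are aligned one-to-one by position before scrambling.

```python
import random


def scramble_with_alignments(document_a, document_b, seed=None):
    """Hypothetical helper: shuffle two position-aligned documents
    independently and record where each aligned pair ends up."""
    if len(document_a) != len(document_b):
        raise ValueError("documents must have the same number of sentences")

    rng = random.Random(seed)
    n = len(document_a)

    # Independent random orderings of the original indices.
    order_a = list(range(n))
    order_b = list(range(n))
    rng.shuffle(order_a)
    rng.shuffle(order_b)

    scrambled_a = [document_a[i] for i in order_a]
    scrambled_b = [document_b[i] for i in order_b]

    # Map each original index to its new position in scrambled_b.
    position_in_b = {original: new for new, original in enumerate(order_b)}

    # Sentence at new_a in scrambled_a came from index `original`, so its
    # partner is wherever `original` landed in scrambled_b.
    correct_alignments = [
        (new_a, position_in_b[original])
        for new_a, original in enumerate(order_a)
    ]
    return scrambled_a, scrambled_b, correct_alignments


if __name__ == "__main__":
    doc_a = ["a0", "a1", "a2", "a3"]
    doc_b = ["b0", "b1", "b2", "b3"]
    s_a, s_b, alignments = scramble_with_alignments(doc_a, doc_b, seed=0)
    for i, j in alignments:
        print(s_a[i], "<->", s_b[j])
```

Each printed pair shares the same numeric suffix, showing that the recorded alignments still point at the originally matching sentences after both documents have been shuffled.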