11.12. twol.twexamp module
A module for reading two-level examples
The examples are assumed to be in a space-separated one-level representation, and they are compiled into a single automaton. At the same time, the alphabet used in the examples is collected in several forms.
cfg.examples_fst – the transducer which accepts exactly the examples
cfg.symbol_pair_set – a tuple of string pairs suitable for e.g. hfst.rules.restriction
- twol.twexamp.main()
The twexamp.py module can also be used as a standalone script or command to convert examples in pair string format into a finite-state transducer (FST). Examples in pair string format are plain, human-readable text files with one example per line, where each example is given as a space-separated sequence of pair symbols, e.g.:
k a u p {pØ}:Ø {ao}:a s s {aä}:a
The program could be invoked as follows, for example:
$ twol-examp examples.pstr examples.fst
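To make the pair string format concrete, here is a minimal sketch in plain Python that splits one example line into (input, output) symbol pairs. The helper name `parse_pair_string` is hypothetical and not part of twol, and the sketch assumes that a bare symbol such as `k` abbreviates the identity pair `k:k`. It does not handle a pair symbol that itself contains a colon as a symbol.

```python
def parse_pair_string(line):
    """Split one example line into a list of (input, output) symbol pairs.

    Hypothetical helper for illustration; not the twol implementation.
    """
    pairs = []
    for pairsym in line.split():
        # "x:y" pairs two distinct symbols; a bare "x" is assumed to mean "x:x".
        insym, sep, outsym = pairsym.partition(":")
        pairs.append((insym, outsym if sep else insym))
    return pairs


example = "k a u p {pØ}:Ø {ao}:a s s {aä}:a"
pairs = parse_pair_string(example)
for insym, outsym in pairs:
    print(insym, outsym)
```

Each resulting tuple corresponds to one pair symbol of the line, in the same order as in the file.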
- twol.twexamp.pairs_to_fst(pair_set)
Converts a sequence of symbol pairs into an FST that accepts any one of them.
- twol.twexamp.read_examples(filename_lst=['test.pstr'], build_fsts=True)
Reads the examples from the files named in ‘filename_lst’.
Each file must contain one example per line, and each line consists of a space-separated sequence of pair symbols. The examples are compiled into an FST which is the union of all examples.
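The reading step can be sketched without HFST as follows. This is only an illustration of collecting the examples and the set of symbol pairs they use, not the actual implementation: the function name `collect_examples` is hypothetical, skipping blank lines is an assumption, and where the real function builds a transducer this sketch merely keeps the examples as tuples.

```python
import os
import tempfile


def collect_examples(filename_lst):
    """Read pair string files; return (examples, symbol_pair_set).

    Hypothetical stand-in that builds no FST.
    """
    examples = []
    pair_set = set()
    for filename in filename_lst:
        with open(filename, encoding="utf-8") as infile:
            for line in infile:
                pairsyms = line.split()
                if not pairsyms:
                    continue  # assumption: blank lines are ignored
                pairs = []
                for pairsym in pairsyms:
                    # A bare "x" is assumed to abbreviate the pair "x:x".
                    insym, sep, outsym = pairsym.partition(":")
                    pairs.append((insym, outsym if sep else insym))
                examples.append(tuple(pairs))
                pair_set.update(pairs)
    # A sorted tuple gives a stable, hashable form of the alphabet.
    return examples, tuple(sorted(pair_set))


# Usage: write the example line from this page to a temporary file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.pstr")
    with open(path, "w", encoding="utf-8") as outfile:
        outfile.write("k a u p {pØ}:Ø {ao}:a s s {aä}:a\n")
    examples, symbol_pair_set = collect_examples([path])

print(len(examples), len(symbol_pair_set))
```

Returning the pairs as a tuple mirrors the description of cfg.symbol_pair_set above; a real implementation would additionally compile each example into a path of the examples FST.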