typos
LeetSpeakGenerator
Bases: TyposGenerator
Source code in badgers/generators/text/typos.py
__init__(random_generator=default_rng(seed=0))
:param random_generator: a random number generator
Source code in badgers/generators/text/typos.py
generate(X, y, replacement_proba=0.1)
:param X: A list of words where we apply leet replacement
:param y:
:param replacement_proba: the probability of replacing a letter with its leet counterpart
:return:
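The word-level behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `leetify_words`, the `LEET_MAP` table, and the use of the standard-library `random.Random` in place of a NumPy generator are all assumptions made for this example.

```python
import random

# Hypothetical leet table for illustration only; the library's mapping may differ.
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def leetify_words(words, replacement_proba=0.1, rng=None):
    """Replace each mappable letter with probability `replacement_proba`."""
    # random.Random stands in for a numpy.random.Generator (assumption).
    rng = rng if rng is not None else random.Random(0)
    result = []
    for word in words:
        letters = []
        for char in word:
            counterpart = LEET_MAP.get(char.lower())
            # One independent random draw per letter that has a counterpart.
            if counterpart is not None and rng.random() < replacement_proba:
                letters.append(counterpart)
            else:
                letters.append(char)
        result.append("".join(letters))
    return result


# With probability 1.0 every mappable letter is replaced.
print(leetify_words(["leet", "speak"], replacement_proba=1.0))
```

With `replacement_proba=0.0` the input comes back unchanged, which is a convenient sanity check.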
Source code in badgers/generators/text/typos.py
randomly_replace_letter(letter, replacement_proba)
Randomly replace a letter with its leet counterpart
:param letter:
:param replacement_proba: the probability of replacing a letter with its leet counterpart
:return:
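A per-letter replacement of this kind might look like the sketch below. The name `maybe_leet`, the explicit `rng` argument, and the one-to-one `LEET_MAP` table are hypothetical; the actual method may use a different mapping or choose among several counterparts per letter.

```python
import random

# Hypothetical one-to-one leet table, assumed for this example.
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def maybe_leet(letter, replacement_proba, rng):
    """Return the leet counterpart of `letter` with probability `replacement_proba`."""
    counterpart = LEET_MAP.get(letter.lower())
    # Letters without a counterpart are always returned unchanged.
    if counterpart is not None and rng.random() < replacement_proba:
        return counterpart
    return letter


rng = random.Random(0)  # stand-in for a seeded NumPy generator (assumption)
print("".join(maybe_leet(char, 0.5, rng) for char in "elite"))
```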
Source code in badgers/generators/text/typos.py
SwapLettersGenerator
Bases: TyposGenerator
Randomly swap adjacent letters within words, leaving the first and last letters in place. Example: 'kilogram' --> 'kilogarm'
Source code in badgers/generators/text/typos.py
__init__(random_generator=default_rng(seed=0))
:param random_generator: A random generator
Source code in badgers/generators/text/typos.py
generate(X, y, swap_proba=0.1)
For each word with a length greater than 3, apply a single swap with probability swap_proba.
The position of the swap within the word is chosen at random.
:param X: A list of words where we apply typos
:param y: not used
:param swap_proba: Each word with a length greater than 3 has this probability of containing a swap (at most one per word)
:return: the transformed list of words
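The swap rule described above can be sketched as follows. This is an illustrative approximation, not the library's source: `swap_inner_letters` is a hypothetical name, and the standard-library `random.Random` is assumed here as a stand-in for a NumPy generator.

```python
import random


def swap_inner_letters(words, swap_proba=0.1, rng=None):
    """Swap one pair of adjacent inner letters in some of the words."""
    rng = rng if rng is not None else random.Random(0)
    result = []
    for word in words:
        # Only words longer than 3 characters have two inner letters to swap.
        if len(word) > 3 and rng.random() < swap_proba:
            # Pick i in [1, len(word) - 3] so that positions i and i + 1
            # both exclude the first and the last letter.
            i = rng.randrange(1, len(word) - 2)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        result.append(word)
    return result


# With swap_proba=1.0 every eligible word receives exactly one swap.
print(swap_inner_letters(["kilogram", "cat"], swap_proba=1.0))
```

Short words such as "cat" pass through untouched, and the first and last letters of longer words never move.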
Source code in badgers/generators/text/typos.py
TyposGenerator
Bases: GeneratorMixin
Base class for transformers creating typos in a list of words
Source code in badgers/generators/text/typos.py
__init__(random_generator=default_rng(seed=0))
:param random_generator: numpy.random.Generator, default default_rng(seed=0). A random generator
Source code in badgers/generators/text/typos.py