text_to_word_sequence
Convert text to a sequence of words (or tokens).
Description
Convert text to a sequence of words (or tokens).
Usage
text_to_word_sequence(
text,
filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
lower = TRUE,
split = " "
) Arguments
| Arguments | Description |
|---|---|
| text | Input text (string). |
| filters | Sequence of characters to filter out such as punctuation. Default includes basic punctuation, tabs, and newlines. |
| lower | Whether to convert the input to lowercase. |
| split | Sentence split marker (string). |
Value
Words (or tokens)
See Also
Other text preprocessing: make_sampling_table(), pad_sequences(), skipgrams(), text_hashing_trick(), text_one_hot()