library(tfdatasets)
data(hearts)

# write the vocabulary to a file, one entry per line
file <- tempfile()
writeLines(unique(hearts$thal), file)

hearts <- tensor_slices_dataset(hearts) %>% dataset_batch(32)

# use the formula interface
spec <- feature_spec(hearts, target ~ thal) %>%
  step_categorical_column_with_vocabulary_file(thal, vocabulary_file = file)

spec_fit <- fit(spec)
final_dataset <- hearts %>% dataset_use_spec(spec_fit)
step_categorical_column_with_vocabulary_file
Creates a categorical column with vocabulary file
Description
Use this function when the vocabulary of a categorical variable is written to a file.
Usage
step_categorical_column_with_vocabulary_file(
spec,
...,
vocabulary_file, vocabulary_size = NULL,
dtype = tf$string,
default_value = NULL,
num_oov_buckets = 0L
)
Arguments
Arguments | Description |
---|---|
spec | A feature specification created with feature_spec(). |
... | Comma-separated list of variable names to apply the step to. Selectors can also be used. |
vocabulary_file | The vocabulary file name. |
vocabulary_size | Number of elements in the vocabulary. This must be no greater than the length of vocabulary_file; if it is less, later values are ignored. If NULL, it is set to the length of vocabulary_file. |
dtype | The type of features. Only string and integer types are supported. |
default_value | The integer ID value to return for out-of-vocabulary feature values; defaults to -1. This cannot be specified together with a positive num_oov_buckets. |
num_oov_buckets | Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs are assigned IDs in the range [vocabulary_size, vocabulary_size + num_oov_buckets) based on a hash of the input value. A positive num_oov_buckets cannot be specified together with default_value. |
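
The interaction between vocabulary_size, default_value and num_oov_buckets can be easier to follow with a small base R sketch. This is only an illustration of the rules described in the table, not TensorFlow's implementation: lookup_id() is a hypothetical helper, the hash is a toy stand-in for the real one, and the zero-based numbering of in-vocabulary IDs is assumed to follow the line order of the vocabulary file.

```r
# Hypothetical helper mimicking the ID assignment rules described above.
# It is not part of tfdatasets or TensorFlow.
lookup_id <- function(value, vocab, vocabulary_size = NULL,
                      default_value = NULL, num_oov_buckets = 0L) {
  if (num_oov_buckets > 0L && !is.null(default_value)) {
    stop("default_value cannot be combined with a positive num_oov_buckets")
  }

  # vocabulary_size defaults to the full vocabulary; entries beyond it are ignored.
  if (is.null(vocabulary_size)) vocabulary_size <- length(vocab)
  vocabulary_size <- as.integer(vocabulary_size)
  stopifnot(vocabulary_size <= length(vocab))
  vocab <- vocab[seq_len(vocabulary_size)]

  # In-vocabulary values map to their zero-based position (assumed convention).
  idx <- match(value, vocab)
  if (!is.na(idx)) return(idx - 1L)

  if (num_oov_buckets > 0L) {
    # Toy hash (sum of character codes); the real hash function differs.
    h <- sum(utf8ToInt(value))
    return(as.integer(vocabulary_size + h %% num_oov_buckets))
  }

  # No buckets: fall back to default_value, which defaults to -1.
  if (is.null(default_value)) -1L else as.integer(default_value)
}

vocab <- c("fixed", "normal", "reversible")

lookup_id("fixed", vocab)                            # in vocabulary: 0
lookup_id("unknown", vocab)                          # out of vocabulary: -1
lookup_id("reversible", vocab, vocabulary_size = 2L) # truncated away: -1
lookup_id("unknown", vocab, num_oov_buckets = 2L)    # an ID in [3, 5)
```

The last call shows why the two options are mutually exclusive: with buckets enabled, every out-of-vocabulary value already receives an ID from the bucket range, so there is nothing left for default_value to cover.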
Value
A FeatureSpec object.
Examples
See Also
steps for a complete list of allowed steps.
Other Feature Spec Functions: dataset_use_spec(), feature_spec(), fit.FeatureSpec(), step_bucketized_column(), step_categorical_column_with_hash_bucket(), step_categorical_column_with_identity(), step_categorical_column_with_vocabulary_list(), step_crossed_column(), step_embedding_column(), step_indicator_column(), step_numeric_column(), step_remove_column(), step_shared_embeddings_column(), steps