library(tfdatasets)
data(hearts)
file <- tempfile()
writeLines(unique(hearts$thal), file)
hearts <- tensor_slices_dataset(hearts) %>% dataset_batch(32)

# use the formula interface
spec <- feature_spec(hearts, target ~ thal) %>%
  step_categorical_column_with_vocabulary_list(thal) %>%
  step_embedding_column(thal, dimension = 3)
spec_fit <- fit(spec)
final_dataset <- hearts %>% dataset_use_spec(spec_fit)
step_embedding_column
Creates embedding columns
Description
Use this step to create embedding columns from categorical columns.
Usage
step_embedding_column(
  spec,
  ...,
  dimension = function(x) {
    as.integer(x^0.25)
  },
  combiner = "mean",
  initializer = NULL,
  ckpt_to_load_from = NULL,
  tensor_name_in_ckpt = NULL,
  max_norm = NULL,
  trainable = TRUE
)
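The default `dimension` in the usage above is a function of the vocabulary size rather than a fixed integer. As a rough illustration in plain R (no TensorFlow needed), the sketch below applies that same function to a few vocabulary sizes. The name `default_dimension` and the sizes are hypothetical, chosen only for this example.

```r
# Copy of the default `dimension` argument shown in the usage.
default_dimension <- function(x) {
  as.integer(x^0.25)
}

# Hypothetical vocabulary sizes, chosen only for illustration.
vocab_sizes <- c(100, 1000, 50000)

# Fourth root of each vocabulary size, truncated to an integer.
dims <- sapply(vocab_sizes, default_dimension)
print(dims)  # 3 5 14
```

Because the fourth root grows slowly, the default stays small even for large vocabularies. Passing an integer such as `dimension = 3` overrides it.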
Arguments
Arguments | Description |
---|---|
spec | A feature specification created with feature_spec(). |
... | Comma-separated list of variable names to apply the step to. Selectors can also be used. |
dimension | An integer specifying the dimension of the embedding; must be > 0. Can also be a function of the size of the vocabulary. |
combiner | A string specifying how to reduce if there are multiple entries in a single row. Currently ‘mean’, ‘sqrtn’ and ‘sum’ are supported, with ‘mean’ the default. ‘sqrtn’ often achieves good accuracy, in particular with bag-of-words columns. Each of these can be thought of as an example-level normalization on the column. For more information, see tf.embedding_lookup_sparse. |
initializer | A variable initializer function to be used in embedding variable initialization. If not specified, defaults to tf.truncated_normal_initializer with mean 0.0 and standard deviation 1/sqrt(dimension). |
ckpt_to_load_from | String representing the checkpoint name/pattern from which to restore column weights. Required if tensor_name_in_ckpt is not NULL. |
tensor_name_in_ckpt | Name of the Tensor in ckpt_to_load_from from which to restore the column weights. Required if ckpt_to_load_from is not NULL. |
max_norm | If not NULL, embedding values are l2-normalized to this value. |
trainable | Whether or not the embedding is trainable. Default is TRUE. |
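To make the `combiner` options concrete, here is a minimal base-R sketch of the three reductions, assuming the weighted-sum definitions documented for tf.embedding_lookup_sparse: ‘sum’ is the weighted sum, ‘mean’ divides it by the sum of the weights, and ‘sqrtn’ divides it by the square root of the sum of squared weights. The helper `combine` and the sample values are hypothetical and not part of tfdatasets.

```r
# Hypothetical embeddings (one row per entry, dimension 3) for a single
# example whose categorical feature has two entries.
embeddings <- rbind(
  c(1, 2, 3),
  c(3, 4, 5)
)

# Per-entry weights; assumed to be all ones when none are supplied.
weights <- c(1, 1)

# Illustrative helper (not a tfdatasets function): reduce the rows of
# `embeddings` to one vector according to `combiner`.
combine <- function(embeddings, weights, combiner = c("mean", "sqrtn", "sum")) {
  combiner <- match.arg(combiner)
  # Row i is scaled by weights[i], then columns are summed.
  weighted_sum <- colSums(embeddings * weights)
  switch(combiner,
    sum = weighted_sum,
    mean = weighted_sum / sum(weights),
    sqrtn = weighted_sum / sqrt(sum(weights^2))
  )
}

combine(embeddings, weights, "sum")    # 4 6 8
combine(embeddings, weights, "mean")   # 2 3 4
combine(embeddings, weights, "sqrtn")  # the sums divided by sqrt(2)
```

With unit weights, ‘mean’ divides by the number of entries and ‘sqrtn’ by its square root, which is why both act as per-example normalizations.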
Value
A FeatureSpec object.
Examples
See Also
steps for a complete list of allowed steps.
Other Feature Spec Functions: dataset_use_spec(), feature_spec(), fit.FeatureSpec(), step_bucketized_column(), step_categorical_column_with_hash_bucket(), step_categorical_column_with_identity(), step_categorical_column_with_vocabulary_file(), step_categorical_column_with_vocabulary_list(), step_crossed_column(), step_indicator_column(), step_numeric_column(), step_remove_column(), step_shared_embeddings_column(), steps