Construct a Categorical Column that Returns Identity Values

Use this when your inputs are integers in the range [0, num_buckets) and you want to use the input value itself as the categorical ID. Values outside this range are mapped to default_value if it is specified; otherwise the column's graph operations fail.

column_categorical_with_identity(..., num_buckets, default_value = NULL)

Arguments

...

Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.

num_buckets

Number of unique values. Both inputs and output IDs fall in the range [0, num_buckets).

default_value

If NULL, this column's graph operations will fail for out-of-range inputs. Otherwise, this value must be in the range [0, num_buckets), and it will replace any input that falls outside that range.
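
The mapping described by num_buckets and default_value can be sketched in base R. This is only an illustration of the documented semantics on a plain integer vector, not the library's implementation; identity_ids is a hypothetical helper name, and the real column operates on tensors inside a TensorFlow graph.

```r
# Hypothetical helper: mimics the documented ID mapping on an integer vector.
identity_ids <- function(x, num_buckets, default_value = NULL) {
  # An input is valid when 0 <= x < num_buckets.
  out_of_range <- x < 0L | x >= num_buckets
  if (any(out_of_range)) {
    if (is.null(default_value)) {
      # No default_value: out-of-range inputs are an error.
      stop("input values outside [0, num_buckets)")
    }
    # Otherwise out-of-range inputs are replaced by default_value.
    x[out_of_range] <- as.integer(default_value)
  }
  # In-range inputs are returned unchanged: the value is its own ID.
  x
}

ids <- identity_ids(c(0L, 3L, 7L, -2L), num_buckets = 5L, default_value = 0L)
```

Here 0 and 3 pass through unchanged, while 7 and -2 are replaced by the default. Calling the same helper with default_value = NULL stops with an error instead.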

Value

A categorical column that returns identity values.

Details

Typically, this is used for contiguous ranges of integer indexes, but it doesn't have to be. It can be inefficient, however, if many of the IDs are unused. Consider column_categorical_with_hash_bucket() in that case.

For input dictionary features, features$key is either a tensor or a sparse tensor object. If it is a tensor, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.

Raises

  • ValueError: if num_buckets is less than one.

  • ValueError: if default_value is not in range [0, num_buckets).
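
The two conditions above can be mirrored in a short base R sketch. The function check_identity_args is a hypothetical name used only for illustration, and the error messages are assumptions rather than the text raised by the library.

```r
# Hypothetical helper: reproduces the two documented argument checks.
check_identity_args <- function(num_buckets, default_value = NULL) {
  # num_buckets must be at least one.
  if (num_buckets < 1) {
    stop("num_buckets must be at least 1")
  }
  # default_value, when given, must lie in [0, num_buckets).
  if (!is.null(default_value) &&
      (default_value < 0 || default_value >= num_buckets)) {
    stop("default_value must be in the range [0, num_buckets)")
  }
  invisible(TRUE)
}

ok <- check_identity_args(num_buckets = 10, default_value = 0)
```

With num_buckets = 10, a default_value of 0 passes both checks, whereas a default_value of 10 or a num_buckets of 0 would stop with an error.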

See also