tfdatasets

Create efficient and fast data loading pipelines

Creating Datasets

Function(s) Description
text_line_dataset() A dataset comprising lines from one or more text files.
tfrecord_dataset() A dataset comprising records from one or more TFRecord files.
sql_record_spec() sql_dataset() sqlite_dataset() A dataset consisting of the results from a SQL query
tensors_dataset() Creates a dataset with a single element, comprising the given tensors.
tensor_slices_dataset() Creates a dataset whose elements are slices of the given tensors.
sparse_tensor_slices_dataset() Splits each rank-N tf$SparseTensor in this dataset row-wise.
fixed_length_record_dataset() A dataset of fixed-length records from one or more binary files.
file_list_dataset() A dataset of all files matching a pattern
range_dataset() Creates a dataset of a step-separated range of values.
read_files() Read a dataset from a set of files
delim_record_spec() csv_record_spec() tsv_record_spec() Specification for reading a record from a text file with delimited values
make_csv_dataset() Reads CSV files into a batched dataset
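
As a rough illustration of the constructors listed above, the following sketch builds a few small datasets. It assumes a working TensorFlow installation with eager execution; the matrix `m`, the temporary file, and the variable names are made up for the example and do not come from the package documentation.

```r
library(tfdatasets)

# A dataset of the integers 0..9 (the `to` bound is exclusive).
numbers <- range_dataset(from = 0, to = 10)

# A hypothetical 3 x 2 matrix used only for illustration.
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE)

# One element per row of the matrix: 3 elements in total.
rows <- tensor_slices_dataset(m)

# A single element that holds the whole matrix.
whole <- tensors_dataset(m)

# One element per line of a text file (a temporary file created here).
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), path)
text_lines <- text_line_dataset(path)
```

The difference between `tensor_slices_dataset()` and `tensors_dataset()` is the one the table describes: the former slices along the first dimension, the latter wraps its input as one element.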

Transforming Datasets

Function(s) Description
dataset_map() Map a function across a dataset.
dataset_map_and_batch() Fused implementation of dataset_map() and dataset_batch()
dataset_prepare() Prepare a dataset for analysis
dataset_skip() Creates a dataset that skips count elements from this dataset
dataset_filter() Filter a dataset by a predicate
dataset_shard() Creates a dataset that includes only 1 / num_shards of this dataset.
dataset_shuffle() Randomly shuffles the elements of this dataset.
dataset_shuffle_and_repeat() Shuffles and repeats a dataset returning a new permutation for each epoch.
dataset_prefetch() Creates a dataset that prefetches elements from this dataset.
dataset_batch() Combines consecutive elements of this dataset into batches.
dataset_repeat() Repeats a dataset count times.
dataset_cache() Caches the elements in this dataset.
dataset_take() Creates a dataset with at most count elements from this dataset
dataset_flat_map() Maps map_func across this dataset and flattens the result.
dataset_padded_batch() Combines consecutive elements of this dataset into padded batches.
dataset_decode_delim() Transform a dataset with delimited text lines into a dataset with named columns
dataset_concatenate() Creates a dataset by concatenating given dataset with this dataset.
dataset_interleave() Maps map_func across this dataset, and interleaves the results
dataset_prefetch_to_device() A transformation that prefetches dataset values to the given device
dataset_window() Combines input elements into a dataset of windows.
dataset_collect() Collects a dataset
zip_datasets() Creates a dataset by zipping together the given datasets.
sample_from_datasets() Samples elements at random from the datasets in datasets.
with_dataset() Execute code that traverses a dataset
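
The transformations above are typically chained with the pipe. The sketch below is one possible pipeline, assuming a working TensorFlow installation with eager execution; the input values, buffer size, and batch size are arbitrary choices for illustration.

```r
library(tfdatasets)

# Ten numeric elements, used only as toy input.
ds <- tensor_slices_dataset(as.numeric(1:10)) %>%
  dataset_map(function(x) x * 2) %>%      # double every element
  dataset_shuffle(buffer_size = 10) %>%   # buffer covers the whole dataset
  dataset_batch(batch_size = 4) %>%       # batches of 4, 4 and 2 elements
  dataset_prefetch(buffer_size = 1)       # prepare one batch ahead

# Materialize the batches as an R list (eager execution assumed).
batches <- dataset_collect(ds)
length(batches)
```

Shuffling before batching, as done here, mixes individual elements; placing `dataset_shuffle()` after `dataset_batch()` would only reorder whole batches.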

Dataset Properties

Function(s) Description
output_types() output_shapes() Output types and shapes

Dataset Iterators

Function(s) Description
input_fn.tf_dataset() Construct a tfestimators input function from a dataset
make_iterator_one_shot() make_iterator_initializable() make_iterator_from_structure() make_iterator_from_string_handle() Creates an iterator for enumerating the elements of this dataset.
iterator_get_next() Get next element from iterator
iterator_initializer() An operation that should be run to initialize this iterator.
iterator_string_handle() String-valued tensor that represents this iterator
iterator_make_initializer() Create an operation that can be run to initialize this iterator
until_out_of_range() out_of_range_handler() Execute code that traverses a dataset until an out of range condition occurs
next_batch() Tensor(s) for retrieving the next batch from a dataset

Feature Spec API

Function(s) Description
feature_spec() Creates a feature specification.
dense_features() Dense Features
dataset_use_spec() Transform the dataset using the provided spec.
fit() Fits a feature specification.
scaler List of pre-made scalers
scaler_min_max() Creates an instance of a min max scaler
scaler_standard() Creates an instance of a standard scaler
step_bucketized_column() Creates bucketized columns
step_categorical_column_with_hash_bucket() Creates a categorical column with hash buckets specification
step_categorical_column_with_identity() Create a categorical column with identity
step_categorical_column_with_vocabulary_file() Creates a categorical column with vocabulary file
step_categorical_column_with_vocabulary_list() Creates a categorical column specification
step_crossed_column() Creates crosses of categorical columns
step_embedding_column() Creates embedding columns
step_indicator_column() Creates indicator columns
step_numeric_column() Creates a numeric column specification
step_remove_column() Creates a step that can remove columns
step_shared_embeddings_column() Creates shared embeddings for categorical columns
steps Steps for feature columns specification.
all_nominal() Find all nominal variables.
all_numeric() Specify all numeric variables.
has_type() Identify the type of the variable.
cur_info_env Selectors
layer_input_from_dataset() Creates a list of inputs from a dataset

Data

Function(s) Description
hearts Heart Disease Data Set