tfdatasets
Create efficient and fast data loading pipelines
Creating Datasets
Function(s) | Description |
---|---|
text_line_dataset() | A dataset comprising lines from one or more text files. |
tfrecord_dataset() | A dataset comprising records from one or more TFRecord files. |
sql_record_spec() sql_dataset() sqlite_dataset() | A dataset consisting of the results from a SQL query. |
tensors_dataset() | Creates a dataset with a single element, comprising the given tensors. |
tensor_slices_dataset() | Creates a dataset whose elements are slices of the given tensors. |
sparse_tensor_slices_dataset() | Splits each rank-N tf$SparseTensor in this dataset row-wise. |
fixed_length_record_dataset() | A dataset of fixed-length records from one or more binary files. |
file_list_dataset() | A dataset of all files matching a pattern. |
range_dataset() | Creates a dataset of a step-separated range of values. |
read_files() | Read a dataset from a set of files. |
delim_record_spec() csv_record_spec() tsv_record_spec() | Specification for reading a record from a text file with delimited values. |
make_csv_dataset() | Reads CSV files into a batched dataset. |
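A minimal sketch of a few of these constructors (the file name `lines.txt` is a placeholder for a local text file):

```r
library(tfdatasets)

# Dataset from in-memory R data: each element is one row of mtcars
dataset <- tensor_slices_dataset(mtcars)

# Dataset with one element per line of a text file
# ("lines.txt" is a hypothetical path)
lines <- text_line_dataset("lines.txt")

# Dataset of integers from 1 up to (but not including) 10
numbers <- range_dataset(from = 1, to = 10)
```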
Transforming Datasets
Function(s) | Description |
---|---|
dataset_map() | Map a function across a dataset. |
dataset_map_and_batch() | Fused implementation of dataset_map() and dataset_batch() |
dataset_prepare() | Prepare a dataset for analysis |
dataset_skip() | Creates a dataset that skips count elements from this dataset |
dataset_filter() | Filter a dataset by a predicate |
dataset_shard() | Creates a dataset that includes only 1 / num_shards of this dataset. |
dataset_shuffle() | Randomly shuffles the elements of this dataset. |
dataset_shuffle_and_repeat() | Shuffles and repeats a dataset returning a new permutation for each epoch. |
dataset_prefetch() | Creates a Dataset that prefetches elements from this dataset. |
dataset_batch() | Combines consecutive elements of this dataset into batches. |
dataset_repeat() | Repeats a dataset count times. |
dataset_cache() | Caches the elements in this dataset. |
dataset_take() | Creates a dataset with at most count elements from this dataset |
dataset_flat_map() | Maps map_func across this dataset and flattens the result. |
dataset_padded_batch() | Combines consecutive elements of this dataset into padded batches. |
dataset_decode_delim() | Transform a dataset with delimited text lines into a dataset with named columns
dataset_concatenate() | Creates a dataset by concatenating given dataset with this dataset. |
dataset_interleave() | Maps map_func across this dataset, and interleaves the results |
dataset_prefetch_to_device() | A transformation that prefetches dataset values to the given device |
dataset_window() | Combines input elements into a dataset of windows. |
dataset_collect() | Collects the elements of a dataset into R
zip_datasets() | Creates a dataset by zipping together the given datasets. |
sample_from_datasets() | Samples elements at random from the given datasets.
with_dataset() | Execute code that traverses a dataset |
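A sketch of how these transformations chain together with the pipe, again using the in-memory `mtcars` data:

```r
library(tfdatasets)

dataset <- tensor_slices_dataset(mtcars) %>%
  dataset_map(function(record) {
    # record is a named list of tensors; rescale one feature
    record$hp <- record$hp / 100
    record
  }) %>%
  dataset_shuffle(buffer_size = 32) %>%
  dataset_batch(8) %>%
  dataset_repeat(2) %>%
  dataset_prefetch(1)
```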
Dataset Properties
Function(s) | Description |
---|---|
output_types() output_shapes() | Output types and shapes
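For example (a sketch):

```r
library(tfdatasets)

dataset <- tensor_slices_dataset(mtcars)
output_types(dataset)   # tf$DType of each component
output_shapes(dataset)  # static shape of each component
```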
Dataset Iterators
Function(s) | Description |
---|---|
input_fn.tf_dataset() | Construct a tfestimators input function from a dataset |
make_iterator_one_shot() make_iterator_initializable() make_iterator_from_structure() make_iterator_from_string_handle() | Creates an iterator for enumerating the elements of this dataset. |
iterator_get_next() | Get next element from iterator |
iterator_initializer() | An operation that should be run to initialize this iterator. |
iterator_string_handle() | String-valued tensor that represents this iterator |
iterator_make_initializer() | Create an operation that can be run to initialize this iterator |
until_out_of_range() out_of_range_handler() | Execute code that traverses a dataset until an out-of-range condition occurs
next_batch() | Tensor(s) for retrieving the next batch from a dataset |
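A minimal sketch of the one-shot iterator pattern:

```r
library(tfdatasets)

dataset <- range_dataset(from = 1, to = 7) %>%
  dataset_batch(2)

# One-shot iterators require no explicit initialization
iter <- make_iterator_one_shot(dataset)

# Loop until the iterator raises an out-of-range condition
until_out_of_range({
  batch <- iterator_get_next(iter)
  print(batch)
})
```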
Feature Spec API
Function(s) | Description |
---|---|
feature_spec() | Creates a feature specification. |
dense_features() | Retrieves the dense features of a feature specification.
dataset_use_spec() | Transform the dataset using the provided spec. |
fit() | Fits a feature specification.
scaler | List of pre-made scalers |
scaler_min_max() | Creates an instance of a min max scaler |
scaler_standard() | Creates an instance of a standard scaler |
step_bucketized_column() | Creates bucketized columns |
step_categorical_column_with_hash_bucket() | Creates a categorical column with hash buckets specification |
step_categorical_column_with_identity() | Create a categorical column with identity |
step_categorical_column_with_vocabulary_file() | Creates a categorical column with vocabulary file |
step_categorical_column_with_vocabulary_list() | Creates a categorical column specification |
step_crossed_column() | Creates crosses of categorical columns |
step_embedding_column() | Creates embeddings columns |
step_indicator_column() | Creates Indicator Columns |
step_numeric_column() | Creates a numeric column specification |
step_remove_column() | Creates a step that can remove columns |
step_shared_embeddings_column() | Creates shared embeddings for categorical columns |
steps | Steps for feature columns specification. |
all_nominal() | Find all nominal variables. |
all_numeric() | Specify all numeric variables.
has_type() | Identify the type of the variable. |
cur_info_env | Selectors |
layer_input_from_dataset() | Creates a list of inputs from a dataset |
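A sketch of the feature spec workflow using the bundled `hearts` data (the column names `target` and `thal` come from that data set):

```r
library(tfdatasets)
data(hearts)

# Declare how columns should be encoded; target is the response
spec <- feature_spec(hearts, target ~ .) %>%
  step_numeric_column(all_numeric(), normalizer_fn = scaler_standard()) %>%
  step_categorical_column_with_vocabulary_list(thal) %>%
  step_indicator_column(thal)

# Fitting computes vocabularies and normalization statistics
spec <- fit(spec)

# The fitted spec yields feature columns for keras or tfestimators models
str(dense_features(spec))
```

From there, `layer_input_from_dataset()` can build the matching list of Keras input layers for the fitted spec.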
Data
Function(s) | Description |
---|---|
hearts | Heart Disease Data Set |
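The data set ships with the package and can feed the dataset API directly (a sketch):

```r
library(tfdatasets)
data(hearts)
head(hearts)

# A data frame converts straight into a dataset, one element per row
hearts_dataset <- tensor_slices_dataset(hearts)
```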