Using the TensorFlow API from R
The TensorFlow API is composed of a set of Python modules that enable constructing and executing TensorFlow graphs. The tensorflow package provides access to the complete TensorFlow API from within R. This article describes the basic syntax and mechanics of using TensorFlow from R.
Modules
The TensorFlow API is divided into modules. The top-level entry point to the API is tf, which provides access to the main TensorFlow module (tf is exported from the tensorflow package). For example, here is a simple “Hello, World” script:
library(tensorflow)
sess = tf$Session()
hello <- tf$constant('Hello, TensorFlow!')
sess$run(hello)
a <- tf$constant(10L)
b <- tf$constant(32L)
sess$run(a + b)
Functions & Classes
Functions that begin with a lowercase letter (e.g. tf$constant) are normal R functions. Functions that begin with an uppercase letter (e.g. tf$Session) are functions that create new instances of TensorFlow classes.
The main TensorFlow module (tf) includes a wide variety of functions and classes, which you can read more about in the TensorFlow API documentation (see also the Getting Help section below, which describes accessing help for the TensorFlow API directly within the RStudio IDE editor and console).
Submodules
The TensorFlow API also includes many submodules, including the tf$nn module, which provides specialized functions for neural networks, and the tf$train module, which provides a set of classes and functions that help train models. You can access these modules the same way that functions are accessed (via the $ operator). For example:
# Call the conv2d function within the nn submodule
tf$nn$conv2d(x, W, strides=c(1L, 1L, 1L, 1L), padding='SAME')
# Create an optimizer from the train submodule
optimizer <- tf$train$GradientDescentOptimizer(0.5)
Simple Example
Here’s a simple example of using R to define a TensorFlow model:
library(tensorflow)
# Create 100 phony x, y data points, y = x * 0.1 + 0.3
x_data <- runif(100, min=0, max=1)
y_data <- x_data * 0.1 + 0.3
# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W <- tf$Variable(tf$random_uniform(shape(1L), -1.0, 1.0))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- W * x_data + b
# Minimize the mean squared errors.
loss <- tf$reduce_mean((y - y_data) ^ 2)
optimizer <- tf$train$GradientDescentOptimizer(0.5)
train <- optimizer$minimize(loss)
# Launch the graph and initialize the variables.
sess = tf$Session()
sess$run(tf$global_variables_initializer())
# Fit the line (Learns best fit is W: 0.1, b: 0.3)
for (step in 1:201) {
  sess$run(train)
  if (step %% 20 == 0)
    cat(step, "-", sess$run(W), sess$run(b), "\n")
}
If you’ve seen TensorFlow code written in Python you’ll recognize this code as nearly identical, save for some minor syntactic differences (e.g. the use of $ rather than . as an object delimiter). You’ll also note that R numeric vectors and random distribution functions are used, whereas Python code would typically use their NumPy equivalents.
The remainder of this article describes the core patterns used for interacting with the TensorFlow API from R.
Numeric Types
The TensorFlow API is stricter about numeric types than is customary in R (which often automatically casts from integer to float and vice versa as necessary). Many TensorFlow function parameters require integers (e.g. for tensor dimensions), and in those cases it’s important to use an R integer literal (e.g. 1L). Here’s an example of specifying the strides parameter for a 4-dimensional tensor using integer literals:
tf$nn$conv2d(x, W, strides=c(1L, 1L, 1L, 1L), padding='SAME')
Here’s another example of using integer literals when defining a set of integer flags:
flags$DEFINE_integer('max_steps', 2000L, 'Number of steps to run trainer.')
flags$DEFINE_integer('hidden1', 128L, 'Number of units in hidden layer 1.')
flags$DEFINE_integer('hidden2', 32L, 'Number of units in hidden layer 2.')
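The distinction between the two literal forms can be checked in plain R without TensorFlow. The following is a small sketch; the variable names strides and hidden_units are illustrative only:

```r
# Base R only; no TensorFlow session is needed to inspect these types.
typeof(1)     # "double": a plain numeric literal
typeof(1L)    # "integer": the L suffix makes an integer literal

# c() keeps the integer type only if every element is an integer.
strides <- c(1L, 1L, 1L, 1L)
is.integer(strides)            # TRUE
is.integer(c(1L, 1, 1L, 1L))   # FALSE: one double promotes the whole vector

# Values computed at run time can be converted explicitly.
hidden_units <- as.integer(256 / 2)
hidden_units                   # 128
```

When a dimension or count is computed rather than typed as a literal, wrapping it in as.integer before passing it to TensorFlow avoids the same type mismatch.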
Numeric Lists
Some TensorFlow APIs call for lists of a numeric type. Typically you can use the c function (as illustrated above) to create lists of numeric types. However, there are a couple of special cases (mostly involving specifying the shapes of tensors) where you may need to create a numeric list with an embedded NULL or a numeric list with only a single item. In those cases you’ll want to use the list function rather than c in order to force the argument to be treated as a list rather than a scalar, and to ensure that NULL elements are preserved. For example:
x <- tf$placeholder(tf$float32, list(NULL, 784L))
W <- tf$Variable(tf$zeros(list(784L, 10L)))
b <- tf$Variable(tf$zeros(list(10L)))
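The reason c falls short in these two cases can be seen in base R alone. This sketch only inspects the R values and makes no TensorFlow calls:

```r
# c() silently drops NULL, so a placeholder shape would lose a dimension.
length(c(NULL, 784L))      # 1
length(list(NULL, 784L))   # 2

# A single value passed through c() is just a length-one vector;
# list() keeps it a one-element list.
is.list(c(10L))            # FALSE
is.list(list(10L))         # TRUE
```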
Tensor Shapes
This need to use list rather than c is very common for shape arguments (since they are often of one dimension and, in the case of placeholders, often have a NULL dimension). For these cases there is a shape function which you can use to make the calling syntax a bit clearer. For example, the above code could be rewritten as:
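A sketch of that rewrite, assuming shape accepts the same dimension values as the list calls above (it is used this way with tf$placeholder later in this article):

```r
library(tensorflow)

# Same three definitions as above, with shape() in place of list()
x <- tf$placeholder(tf$float32, shape(NULL, 784L))
W <- tf$Variable(tf$zeros(shape(784L, 10L)))
b <- tf$Variable(tf$zeros(shape(10L)))
```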
Tensor Values
A tensor is a typed multidimensional array. Tensors can take the form of a single value, a vector, a matrix, or an array in many dimensions. When you initialize the value of a tensor you can use the following R data types for various tensor shapes:
Dimensions  R Type  Example
----------  ------  -------
1           vector  c(1.0, 2.0, 3.0, 4.0)
2           matrix  matrix(c(1.0, 2.0, 3.0, 4.0), nrow = 2, ncol = 2)
3+          array   array(rep(1, 365*5*4), dim = c(365, 5, 4))
Correspondingly, when a TensorFlow computation yields a value back to R the appropriate data type (vector, matrix, or array) will be returned. You may see references to NumPy arrays in TensorFlow documentation or examples written in Python. The TensorFlow R API doesn’t make use of NumPy arrays but rather their R analogs as described above.
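The three R types in the table can be inspected in base R before they are handed to TensorFlow. This is a minimal sketch; the names v, m, and a are illustrative:

```r
v <- c(1.0, 2.0, 3.0, 4.0)
m <- matrix(c(1.0, 2.0, 3.0, 4.0), nrow = 2, ncol = 2)
a <- array(rep(1, 365 * 5 * 4), dim = c(365, 5, 4))

length(v)   # 4 (a plain vector has no dim attribute)
dim(m)      # 2 2
dim(a)      # 365 5 4

# matrix() fills column by column unless byrow = TRUE is given
m[1, 2]     # 3
```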
Tensor Indexes
Tensor indexes within the TensorFlow API are 0-based (rather than 1-based as R vectors are). This typically comes up when specifying the dimension of a tensor to operate on (e.g. with a function like tf$reduce_mean or tf$argmax). The first dimension of a tensor is specified as 0L, the second as 1L, and so on. For example:
# call tf$reduce_sum on the second dimension of the specified tensor
cross_entropy <- tf$reduce_mean(
  -tf$reduce_sum(y_ * tf$log(y_conv), reduction_indices=1L)
)
# call tf$argmax on the second dimension of the specified tensor
correct_prediction <- tf$equal(tf$argmax(y_conv, 1L), tf$argmax(y_, 1L))
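If you tend to think in R’s 1-based dimension numbers, a small conversion function can make the offset explicit. The helper tf_axis below is hypothetical (it is not part of the tensorflow package) and is shown only as a sketch:

```r
# Hypothetical helper (not part of the tensorflow package): convert a 1-based
# R dimension number to the 0-based integer axis TensorFlow expects.
tf_axis <- function(r_dim) {
  stopifnot(r_dim >= 1)
  as.integer(r_dim - 1)
}

tf_axis(1)   # 0: first dimension
tf_axis(2)   # 1: second dimension, as used in the examples above
```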
Tensor Extraction
Tensors can be created by extracting elements from other tensors, either using functions like tf$gather and tf$slice, or by using [ syntax. Extraction using [ also expects 0-based indexes, but otherwise follows R conventions: the index range is inclusive (whereas Python omits the last element of an indexing vector) and blank indices indicate that all elements in that dimension should be retained. For example:
# take the elements in the first 100 rows and first 5 columns of W
W_small <- W[0:99, 0:4]
# take the last 3 columns, but all rows, of W
W_smallish <- W[, 7:9]
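To see how these 0-based, inclusive ranges line up with ordinary R indexing, the same selections can be reproduced on a plain R matrix of the same size (784 by 10, matching W above). The matrix m is a stand-in used only for this sketch:

```r
m <- matrix(seq_len(784 * 10), nrow = 784, ncol = 10)

# Tensor extraction W[0:99, 0:4] corresponds to these 1-based R indexes
rows <- (0:99) + 1
cols <- (0:4) + 1
dim(m[rows, cols])    # 100 5

# W[, 7:9] keeps all rows and the last 3 of 10 columns
dim(m[, (7:9) + 1])   # 784 3
```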
Generics
The tensorflow package provides Tensor implementations for a number of R generic operators and functions. You can use these in place of the equivalent tf function to make your code more readable and compact. The following generics are implemented:
Generic  TensorFlow
-------  ----------------
-        tf$neg, tf$sub
+        tf$add
*        tf$mul
/        tf$truediv
%/%      tf$floordiv
%%       tf$mod
^        tf$pow
&        tf$logical_and
|        tf$logical_or
!        tf$logical_not
==       tf$equal
!=       tf$not_equal
<        tf$less
<=       tf$less_equal
>        tf$greater
>=       tf$greater_equal
[        tf$strided_slice
abs      tf$abs
sign     tf$sign
sqrt     tf$sqrt
floor    tf$floor
ceiling  tf$ceiling
round    tf$round
exp      tf$exp
log      tf$log
cos      tf$cos
sin      tf$sin
tan      tf$tan
acos     tf$acos
asin     tf$asin
atan     tf$atan
lgamma   tf$lgamma
digamma  tf$digamma
Dictionaries
Some TensorFlow APIs accept dictionaries as arguments, the most common of which is the feed_dict argument, which feeds training and test data to various functions. If a dictionary is keyed by character string you can simply pass an R named list. However, if a dictionary is keyed by another object type like a tensor (as feed_dict is) then you should use the dict function rather than a named list. For example:
sess$run(train_step, feed_dict = dict(x = batch_xs, y_ = batch_ys))
The x and y_ variables in the above example are tensor placeholders, which are replaced by the specified training data.
Tuples
Some TensorFlow APIs accept tuples as arguments. In many cases, if you pass a list, the API will automatically convert it to a tuple; however, there may be instances where you need to pass a tuple explicitly. For these cases you can use the tuple function, for example:
tuple(1L, 2L, 3L)
With Contexts
The TensorFlow API includes several functions that yield scoped execution contexts (i.e. blocks of code which execute code on enter and exit, for example to set a default or to open and close a resource).
The R with generic function can be used with TensorFlow objects that define a scoped execution context. For example:
with(tf$name_scope('input'), {
  x <- tf$placeholder(tf$float32, shape(NULL, 784L), name='xinput')
  y_ <- tf$placeholder(tf$float32, shape(NULL, 10L), name='yinput')
})
In this case the tf$name_scope execution context will automatically prepend "input/" to the specified names, so they will become "input/xinput" and "input/yinput" respectively.
It is sometimes convenient to gain access to the execution context via an R object. For this purpose there is also a custom %as% operator defined, for example:
with(tf$Session() %as% sess, {
  sess$run(hello)
})
The sess variable will be assigned from the execution context and be available only within the expression passed to with.
Iterators
The TensorFlow API includes several functions that return Python iterators or generators (these are typically used for streaming back results from training or evaluation). You can use the iterate function to apply an R function to each item returned from an iterator or generator. For example:
iterate(tf$python_io$tf_record_iterator(filename), function(record) {
  # Process the record
})
If you just want to collect all of the items into an R list or vector, you can call the iterate function with just the iterator and no function to apply:
records <- iterate(tf$python_io$tf_record_iterator(filename))
The result will be automatically simplified to an R vector if all values returned from the iteration are length 1 vectors of single primitive type “character”, “complex”, “double”, “integer”, or “logical” (otherwise the result will be an R list).
Note that Python iterators and generators are stateful objects. This means that after making a single pass through an iterator it will be empty (i.e. you can’t execute another iteration against the same underlying iterator).
NumPy Types
When you use a TensorFlow API that expects NumPy arrays you can simply use the equivalent R matrix or multidimensional array and it will be automatically converted to the required NumPy array type.
There are, however, some parts of the TensorFlow API which require direct specification of NumPy data types (e.g. the CSV loading functions in tflearn). For these contexts you can import the NumPy module explicitly and refer to its types as needed.
For example, here’s a call to tflearn’s load_csv_with_header function that specifies data types using the NumPy module:
np <- import("numpy")
training_set <- tf$contrib$learn$datasets$base$load_csv_with_header(
  filename = "iris_training.csv",
  target_dtype = np$int,
  features_dtype = np$float32
)
Note that we explicitly import NumPy via np <- import("numpy") prior to accessing the np$int and np$float32 types.
Training Flags
TensorFlow includes a facility for driving training hyperparameters using command-line flags. You can use this facility by:
1. Obtaining a reference to the global tf$app$flags object.
2. Defining whatever flags are appropriate using the various DEFINE functions provided by the object.
3. Populating a FLAGS object, which will carry the specified values, by calling the parse_flags function.
Here’s a complete example:
# Basic model parameters as external flags.
flags <- tf$app$flags
flags$DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags$DEFINE_integer('max_steps', 5000L, 'Number of steps to run trainer.')
flags$DEFINE_integer('hidden1', 128L, 'Number of units in hidden layer 1.')
flags$DEFINE_integer('hidden2', 32L, 'Number of units in hidden layer 2.')
flags$DEFINE_integer('batch_size', 100L, 'Batch size.')
flags$DEFINE_string('train_dir', 'MNISTdata', 'Directory to put the training data.')
flags$DEFINE_boolean('fake_data', FALSE, 'If true, uses fake data for unit testing.')
FLAGS <- parse_flags()
Getting Help
As you use TensorFlow from R you’ll want to get help on the various functions and classes available within the API. If you are running the very latest Preview Release (v1.0.18 or later) of the RStudio IDE, you can get code completion and inline help for the TensorFlow API within RStudio. Inline help is also available for function parameters.
You can press the F1 key while viewing inline help (or whenever your cursor is over a TensorFlow API symbol) and you will be navigated to the location of that symbol’s help within the TensorFlow API documentation.
API Reference
The main TensorFlow API reference documents all of the modules, classes, and functions within TensorFlow. This documentation is for the Python API; however, since the R API is based on the Python API, the documentation is also easily adapted for use with R.
Python data types in the TensorFlow API map to R as follows:
Python             R                       Examples
-----------------  ----------------------  ---------------------------------------
Scalar             Single-element vector   1, 1L, TRUE, "foo"
List               Multi-element vector    c(1.0, 2.0, 3.0), c(1L, 2L, 3L)
Tuple              List of multiple types  list(1L, TRUE, "foo")
Dict               Named list or dict      list(a = 1L, b = 2.0), dict(x = x_data)
NumPy ndarray      Matrix/Array            matrix(c(1,2,3,4), nrow = 2, ncol = 2)
None, True, False  NULL, TRUE, FALSE       NULL, TRUE, FALSE