# Guide to Keras Basics

Keras is a high-level API to build and train deep learning models. It’s used for fast prototyping, advanced research, and production, with three key advantages:

- *User friendly*: Keras has a simple, consistent interface optimized for common use cases. It provides clear and actionable feedback for user errors.
- *Modular and composable*: Keras models are made by connecting configurable building blocks together, with few restrictions.
- *Easy to extend*: Write custom building blocks to express new ideas for research. Create new layers and loss functions, and develop state-of-the-art models.

## Import keras

To get started, load the `keras` library with `library(keras)`.

## Build a simple model

### Sequential model

In Keras, you assemble *layers* to build *models*. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the `sequential` model.

To build a simple, fully-connected network (i.e., a multi-layer perceptron):

```
model <- keras_model_sequential()

model %>%
  # Adds a densely-connected layer with 64 units to the model:
  layer_dense(units = 64, activation = 'relu') %>%
  # Add another:
  layer_dense(units = 64, activation = 'relu') %>%
  # Add a softmax layer with 10 output units:
  layer_dense(units = 10, activation = 'softmax')
```
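To make the stack of layers more concrete, the following base-R sketch shows what a dense layer followed by a softmax layer computes, assuming randomly drawn weights. The helper names (`relu`, `softmax_rows`, `dense_forward`) and the weight matrices are illustrative assumptions, not part of the keras package.

```r
set.seed(1)

relu <- function(z) pmax(z, 0)

# Row-wise softmax; subtracting the row maximum keeps exp() numerically stable.
softmax_rows <- function(z) {
  z <- z - apply(z, 1, max)
  e <- exp(z)
  e / rowSums(e)
}

# A dense layer computes activation(x %*% W + b).
dense_forward <- function(x, W, b, activation = identity) {
  activation(sweep(x %*% W, 2, b, `+`))
}

x  <- matrix(rnorm(4 * 32), nrow = 4)            # 4 samples, 32 features
W1 <- matrix(rnorm(32 * 64, sd = 0.1), 32, 64)   # hypothetical weights
b1 <- rep(0, 64)
W2 <- matrix(rnorm(64 * 10, sd = 0.1), 64, 10)
b2 <- rep(0, 10)

hidden <- dense_forward(x, W1, b1, relu)
probs  <- dense_forward(hidden, W2, b2, softmax_rows)

dim(probs)      # 4 rows (samples) by 10 columns (classes)
rowSums(probs)  # each row sums to 1
```

In a real model, Keras creates and trains these weights for you; the sketch only mirrors the arithmetic of the forward pass.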

### Configure the layers

There are many `layers` available with some common constructor parameters:

- `activation`: Set the activation function for the layer. By default, no activation is applied.
- `kernel_initializer` and `bias_initializer`: The initialization schemes that create the layer’s weights (kernel and bias). The kernel defaults to the `Glorot uniform` initializer and the bias to zeros.
- `kernel_regularizer` and `bias_regularizer`: The regularization schemes that apply to the layer’s weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied.
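As a rough illustration of what these parameters control, the base-R sketch below draws a kernel with the Glorot uniform scheme and computes L1 and L2 penalties on it. The function names are hypothetical helpers written for this example; Keras performs the equivalent steps internally.

```r
set.seed(42)

# Glorot (Xavier) uniform: draw from U(-limit, limit),
# where limit = sqrt(6 / (fan_in + fan_out)).
glorot_uniform <- function(fan_in, fan_out) {
  limit <- sqrt(6 / (fan_in + fan_out))
  matrix(runif(fan_in * fan_out, min = -limit, max = limit),
         nrow = fan_in, ncol = fan_out)
}

# Regularization adds a penalty on the weights to the loss.
l1_penalty <- function(w, factor) factor * sum(abs(w))
l2_penalty <- function(w, factor) factor * sum(w^2)

kernel <- glorot_uniform(fan_in = 32, fan_out = 64)

range(kernel)             # stays within +/- sqrt(6 / 96) = +/- 0.25
l1_penalty(kernel, 0.01)  # L1 term with factor 0.01
l2_penalty(kernel, 0.01)  # L2 term with factor 0.01
```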

The following instantiates `dense` layers using constructor arguments:

```
# Create a sigmoid layer:
layer_dense(units = 64, activation = 'sigmoid')
```

`<keras.src.layers.core.dense.Dense object at 0x7f954cab7be0>`

```
# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layer_dense(units = 64, kernel_regularizer = regularizer_l1(0.01))
```

```
Warning in keras$regularizers$l1(l = l): partial argument match of 'l' to
'l1'
```

`<keras.src.layers.core.dense.Dense object at 0x7f954cb74100>`

```
# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layer_dense(units = 64, bias_regularizer = regularizer_l2(0.01))
```

```
Warning in keras$regularizers$l2(l = l): partial argument match of 'l' to
'l2'
```

`<keras.src.layers.core.dense.Dense object at 0x7f954cb744f0>`

```
# A linear layer with a kernel initialized to a random orthogonal matrix:
layer_dense(units = 64, kernel_initializer = 'orthogonal')
```

`<keras.src.layers.core.dense.Dense object at 0x7f954cb74070>`

```
# A linear layer with a bias vector initialized to 2.0:
layer_dense(units = 64, bias_initializer = initializer_constant(2.0))
```

`<keras.src.layers.core.dense.Dense object at 0x7f954cb74c40>`

## Train and evaluate

### Set up training

After the model is constructed, configure its learning process by calling the `compile` method:

```
model %>% compile(
  optimizer = 'adam',
  loss = 'categorical_crossentropy',
  metrics = list('accuracy')
)
```

`compile` takes three important arguments:

- `optimizer`: This object specifies the training procedure. Commonly used optimizers include `adam`, `rmsprop`, and `sgd`.
- `loss`: The function to minimize during optimization. Common choices include mean squared error (`mse`), `categorical_crossentropy`, and `binary_crossentropy`.
- `metrics`: Used to monitor training. In classification, this is usually accuracy.
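To show what the loss and metric names stand for, here is a small base-R sketch that computes mean squared error, categorical cross-entropy, and accuracy by hand. The prediction and label matrices are made-up values for illustration only.

```r
# Hypothetical predictions for 3 samples and 3 classes (rows sum to 1).
y_pred <- matrix(c(0.7, 0.2, 0.1,
                   0.1, 0.8, 0.1,
                   0.3, 0.3, 0.4),
                 nrow = 3, byrow = TRUE)

# One-hot encoded true labels: classes 1, 2, and 1.
y_true <- matrix(c(1, 0, 0,
                   0, 1, 0,
                   1, 0, 0),
                 nrow = 3, byrow = TRUE)

# Mean squared error, averaged over all elements.
mse <- mean((y_true - y_pred)^2)

# Categorical cross-entropy: -sum(y_true * log(y_pred)) per sample, then averaged.
cce <- mean(-rowSums(y_true * log(y_pred)))

# Accuracy: share of samples whose highest-probability class matches the label.
accuracy <- mean(max.col(y_pred, ties.method = "first") ==
                 max.col(y_true, ties.method = "first"))

mse
cce
accuracy  # 2 of 3 samples are classified correctly
```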

The following shows a few examples of configuring a model for training:

```
# Configure a model for mean-squared error regression.
model %>% compile(
  optimizer = 'adam',
  loss = 'mse',           # mean squared error
  metrics = list('mae')   # mean absolute error
)

# Configure a model for categorical classification.
model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 0.01),
  loss = "categorical_crossentropy",
  metrics = list("categorical_accuracy")
)
```

### Input data

You can train Keras models directly on R matrices and arrays (possibly created from R `data.frames`). A model is fit to the training data using the `fit` method:

```
# Random values stand in for real inputs and labels.
data <- matrix(rnorm(1000 * 32), nrow = 1000, ncol = 32)
labels <- matrix(rnorm(1000 * 10), nrow = 1000, ncol = 10)

model %>% fit(
  data,
  labels,
  epochs = 10,
  batch_size = 32
)
```

```
Epoch 1/10
32/32 - 1s - loss: -2.6513e-01 - categorical_accuracy: 0.0970 - 802ms/epoch - 25ms/step
Epoch 2/10
32/32 - 0s - loss: -3.8759e+00 - categorical_accuracy: 0.1110 - 79ms/epoch - 2ms/step
Epoch 3/10
32/32 - 0s - loss: -1.8322e+01 - categorical_accuracy: 0.1020 - 55ms/epoch - 2ms/step
Epoch 4/10
32/32 - 0s - loss: -5.0000e+01 - categorical_accuracy: 0.1070 - 59ms/epoch - 2ms/step
Epoch 5/10
32/32 - 0s - loss: -1.1058e+02 - categorical_accuracy: 0.0810 - 59ms/epoch - 2ms/step
Epoch 6/10
32/32 - 0s - loss: -1.6624e+02 - categorical_accuracy: 0.1150 - 59ms/epoch - 2ms/step
Epoch 7/10
32/32 - 0s - loss: -2.8676e+02 - categorical_accuracy: 0.1070 - 55ms/epoch - 2ms/step
Epoch 8/10
32/32 - 0s - loss: -4.4149e+02 - categorical_accuracy: 0.1140 - 57ms/epoch - 2ms/step
Epoch 9/10
32/32 - 0s - loss: -6.6879e+02 - categorical_accuracy: 0.1170 - 64ms/epoch - 2ms/step
Epoch 10/10
32/32 - 0s - loss: -8.8606e+02 - categorical_accuracy: 0.1050 - 61ms/epoch - 2ms/step
```

`fit` takes three important arguments:

- `epochs`: Training is structured into *epochs*. An epoch is one iteration over the entire input data (this is done in smaller batches).
- `batch_size`: When passed matrix or array data, the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch. Be aware that the last batch may be smaller if the total number of samples is not divisible by the batch size.
- `validation_data`: When prototyping a model, you want to easily monitor its performance on some validation data. Passing this argument (a list of inputs and labels) allows the model to display the loss and metrics in inference mode for the passed data at the end of each epoch.
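The relationship between sample count, batch size, and steps per epoch can be checked with a few lines of base R. This sketch uses the 1000 samples and batch size of 32 from the example above; the variable names are illustrative.

```r
n_samples  <- 1000
batch_size <- 32

# Number of batches (steps) per epoch; the last one may be partial.
steps_per_epoch <- ceiling(n_samples / batch_size)

# Assign each sample index to a batch and count the samples per batch.
batch_id    <- ceiling(seq_len(n_samples) / batch_size)
batch_sizes <- as.integer(table(batch_id))

steps_per_epoch       # 32, matching the "32/32" shown in the training log
tail(batch_sizes, 1)  # 8 samples in the final batch (1000 - 31 * 32)
```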

Here’s an example using `validation_data`:

```
data <- matrix(rnorm(1000 * 32), nrow = 1000, ncol = 32)
labels <- matrix(rnorm(1000 * 10), nrow = 1000, ncol = 10)

# Draw exactly 100 * 32 values so the data length matches the matrix size.
val_data <- matrix(rnorm(100 * 32), nrow = 100, ncol = 32)
val_labels <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)

model %>% fit(
  data,
  labels,
  epochs = 10,
  batch_size = 32,
  validation_data = list(val_data, val_labels)
)
```

```
Epoch 1/10
32/32 - 0s - loss: 377.5257 - categorical_accuracy: 0.1010 - val_loss: -9.1629e+02 - val_categorical_accuracy: 0.1500 - 184ms/epoch - 6ms/step
Epoch 2/10
32/32 - 0s - loss: 268.6779 - categorical_accuracy: 0.1070 - val_loss: -9.0091e+02 - val_categorical_accuracy: 0.1400 - 80ms/epoch - 2ms/step
Epoch 3/10
32/32 - 0s - loss: 17.4030 - categorical_accuracy: 0.1140 - val_loss: -1.0150e+03 - val_categorical_accuracy: 0.1400 - 76ms/epoch - 2ms/step
Epoch 4/10
32/32 - 0s - loss: -5.5530e+01 - categorical_accuracy: 0.1070 - val_loss: -1.1427e+03 - val_categorical_accuracy: 0.1100 - 74ms/epoch - 2ms/step
Epoch 5/10
32/32 - 0s - loss: 49.4249 - categorical_accuracy: 0.0810 - val_loss: -1.1145e+03 - val_categorical_accuracy: 0.1500 - 74ms/epoch - 2ms/step
Epoch 6/10
32/32 - 0s - loss: -8.4104e+01 - categorical_accuracy: 0.1090 - val_loss: -1.6088e+03 - val_categorical_accuracy: 0.0800 - 75ms/epoch - 2ms/step
Epoch 7/10
32/32 - 0s - loss: -2.5985e+02 - categorical_accuracy: 0.0960 - val_loss: -1.9842e+03 - val_categorical_accuracy: 0.1100 - 82ms/epoch - 3ms/step
Epoch 8/10
32/32 - 0s - loss: -6.4850e+02 - categorical_accuracy: 0.0980 - val_loss: -1.8305e+03 - val_categorical_accuracy: 0.0900 - 86ms/epoch - 3ms/step
Epoch 9/10
32/32 - 0s - loss: -6.2961e+02 - categorical_accuracy: 0.1050 - val_loss: -3.0540e+03 - val_categorical_accuracy: 0.1200 - 75ms/epoch - 2ms/step
Epoch 10/10
32/32 - 0s - loss: -1.4109e+03 - categorical_accuracy: 0.0980 - val_loss: -4.7981e+03 - val_categorical_accuracy: 0.0800 - 78ms/epoch - 2ms/step
```

### Evaluate and predict

Like `fit`, the `evaluate` and `predict` methods can use raw R data as well as a `dataset`.

To *evaluate* the inference-mode loss and metrics for the data provided:

```
model %>% evaluate(test_data, test_labels, batch_size = 32)

model %>% evaluate(test_dataset, steps = 30)
```

And to *predict* the output of the last layer in inference mode for the data provided, again either as R data or as a `dataset`:

```
model %>% predict(test_data, batch_size = 32)

model %>% predict(test_dataset, steps = 30)
```
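For a classifier with a softmax output, predictions are typically a matrix with one row per sample and one column per class. The base-R sketch below shows one way to turn such a matrix into class labels; the `preds` and `true_class` values are made up for illustration and stand in for the output of `predict` and for real labels.

```r
# Hypothetical softmax output for 4 samples and 3 classes,
# standing in for the matrix returned by predict().
preds <- matrix(c(0.1, 0.7, 0.2,
                  0.8, 0.1, 0.1,
                  0.2, 0.2, 0.6,
                  0.3, 0.5, 0.2),
                nrow = 4, byrow = TRUE)

# Index of the most probable class per row (1-based in R).
predicted_class <- apply(preds, 1, which.max)
predicted_class  # 2 1 3 2

# Compare with hypothetical true classes to get an accuracy.
true_class <- c(2, 1, 1, 2)
mean(predicted_class == true_class)  # 0.75
```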

## Build advanced models

### Functional API

The `sequential` model is a simple stack of layers that cannot represent arbitrary models. Use the Keras functional API to build complex model topologies such as:

- multi-input models,
- multi-output models,
- models with shared layers (the same layer called several times),
- models with non-sequential data flows (e.g., residual connections).

Building a model with the functional API works like this:

- A layer instance is callable and returns a tensor.
- Input tensors and output tensors are used to define a `keras_model` instance.
- This model is trained just like the `sequential` model.

The following example uses the functional API to build a simple, fully-connected network:

```
inputs <- layer_input(shape = c(32))  # Returns a placeholder tensor

predictions <- inputs %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')

# Instantiate the model given inputs and outputs.
model <- keras_model(inputs = inputs, outputs = predictions)

# The compile step specifies the training configuration.
model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 0.001),
  loss = 'categorical_crossentropy',
  metrics = list('accuracy')
)

# Trains for 5 epochs
model %>% fit(
  data,
  labels,
  batch_size = 32,
  epochs = 5
)
```

```
Epoch 1/5
32/32 - 1s - loss: 0.4507 - accuracy: 0.0960 - 586ms/epoch - 18ms/step
Epoch 2/5
32/32 - 0s - loss: 0.1948 - accuracy: 0.1240 - 54ms/epoch - 2ms/step
Epoch 3/5
32/32 - 0s - loss: -3.9052e-03 - accuracy: 0.1500 - 55ms/epoch - 2ms/step
Epoch 4/5
32/32 - 0s - loss: -2.1561e-01 - accuracy: 0.1430 - 126ms/epoch - 4ms/step
Epoch 5/5
32/32 - 0s - loss: -4.4740e-01 - accuracy: 0.1530 - 61ms/epoch - 2ms/step
```

### Custom layers

To create a custom Keras layer, you create an R6 class derived from `KerasLayer`. There are three methods to implement (only one of which, `call()`, is required for all types of layer):

- `build(input_shape)`: This is where you define your weights. If your layer doesn’t define trainable weights, you need not implement this method.
- `call(x)`: This is where the layer’s logic lives. Unless you want your layer to support masking, you only have to care about the first argument passed to `call`: the input tensor.
- `compute_output_shape(input_shape)`: If your layer modifies the shape of its input, specify the shape transformation logic here. This allows Keras to do automatic shape inference. If you don’t modify the shape of the input, you need not implement this method.
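The shape bookkeeping behind these three methods can be previewed in base R for a layer that multiplies its input by a kernel matrix. The dimensions below are arbitrary assumptions chosen for the sketch.

```r
batch_size <- 5
input_dim  <- 32
output_dim <- 16

x      <- matrix(rnorm(batch_size * input_dim), nrow = batch_size)
kernel <- matrix(rnorm(input_dim * output_dim), nrow = input_dim)

# build(): the kernel needs shape (input_dim, output_dim).
# call(): the forward pass is a matrix product.
y <- x %*% kernel

# compute_output_shape(): (batch, input_dim) -> (batch, output_dim).
dim(y)  # 5 16
```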

Here is an example custom layer that performs a matrix multiplication:

```
library(keras)

CustomLayer <- R6::R6Class("CustomLayer",

  inherit = KerasLayer,

  public = list(

    output_dim = NULL,

    kernel = NULL,

    initialize = function(output_dim) {
      self$output_dim <- output_dim
    },

    # Create the trainable kernel once the input shape is known.
    build = function(input_shape) {
      self$kernel <- self$add_weight(
        name = 'kernel',
        shape = list(input_shape[[2]], self$output_dim),
        initializer = initializer_random_normal(),
        trainable = TRUE
      )
    },

    # Forward pass: multiply the input by the kernel.
    call = function(x, mask = NULL) {
      k_dot(x, self$kernel)
    },

    compute_output_shape = function(input_shape) {
      list(input_shape[[1]], self$output_dim)
    }
  )
)
```

To use the custom layer within a Keras model, you also need to create a wrapper function that instantiates the layer using the `create_layer()` function. For example:

```
# define layer wrapper function
layer_custom <- function(object, output_dim, name = NULL, trainable = TRUE) {
  create_layer(CustomLayer, object, list(
    output_dim = as.integer(output_dim),
    name = name,
    trainable = trainable
  ))
}
```

You can now use the layer in a model as usual:

```
model <- keras_model_sequential()
model %>%
  layer_dense(units = 32, input_shape = c(32, 32)) %>%
  layer_custom(output_dim = 32)
```

### Custom models

In addition to creating custom layers, you can also create a custom model. This might be necessary if you wanted to use TensorFlow eager execution in combination with an imperatively written forward pass.

In cases where this is not needed, but flexibility in building the architecture is required, it is recommended to just stick with the functional API.

A custom model is defined by calling `new_model_class()`, passing the methods that specify the layers to be created and the operations to be executed on the forward pass.

```
# define a custom model type
my_model_constructor <- new_model_class(
  "MyModel",
  initialize = function(output_dim, ...) {
    super$initialize(...)
    # store our output dim in self until build() is called
    self$output_dim <- output_dim
  },
  build = function(input_shape) {
    # create layers we'll need for the call (this code executes once)
    # note: the layers have to be created on the self object!
    self$dense1 <- layer_dense(units = 64,
                               activation = 'relu',
                               input_shape = input_shape)
    self$dense2 <- layer_dense(units = 64, activation = 'relu')
    self$dense3 <- layer_dense(units = self$output_dim, activation = 'softmax')
  },
  # implement call (this code executes during training & inference)
  call = function(inputs) {
    x <- inputs %>%
      self$dense1() %>%
      self$dense2() %>%
      self$dense3()
    x
  },
  # define a `get_config()` method in custom objects
  # to enable model saving and restoring
  get_config = function() {
    list(output_dim = self$output_dim)
  }
)

model <- my_model_constructor(output_dim = 10)

model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 0.001),
  loss = 'categorical_crossentropy',
  metrics = list('accuracy')
)

# Trains for 5 epochs
model %>% fit(
  data,
  labels,
  batch_size = 32,
  epochs = 5
)
```

```
Epoch 1/5
32/32 - 1s - loss: 0.4321 - accuracy: 0.0980 - 653ms/epoch - 20ms/step
Epoch 2/5
32/32 - 0s - loss: 0.1700 - accuracy: 0.1190 - 61ms/epoch - 2ms/step
Epoch 3/5
32/32 - 0s - loss: -3.2296e-02 - accuracy: 0.1220 - 58ms/epoch - 2ms/step
Epoch 4/5
32/32 - 0s - loss: -2.6427e-01 - accuracy: 0.1290 - 56ms/epoch - 2ms/step
Epoch 5/5
32/32 - 0s - loss: -5.0211e-01 - accuracy: 0.1330 - 53ms/epoch - 2ms/step
```

## Callbacks

A callback is an object passed to a model to customize and extend its behavior during training. You can write your own custom callback, or use the built-in `callbacks`, which include:

- `callback_model_checkpoint`: Save checkpoints of your model at regular intervals.
- `callback_learning_rate_scheduler`: Dynamically change the learning rate.
- `callback_early_stopping`: Interrupt training when validation performance has stopped improving.
- `callback_tensorboard`: Monitor the model’s behavior using TensorBoard.
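To give a feel for what "stopped improving" means in early stopping, the following base-R sketch applies a simplified patience rule to a vector of validation losses. The function `early_stopping_epoch` and the loss values are hypothetical; this mirrors the general idea rather than the library's exact implementation.

```r
# Returns the epoch at which training would stop, or NA if it runs to the end.
early_stopping_epoch <- function(val_loss, patience) {
  best <- Inf
  wait <- 0
  for (epoch in seq_along(val_loss)) {
    if (val_loss[epoch] < best) {
      # New best value: remember it and reset the patience counter.
      best <- val_loss[epoch]
      wait <- 0
    } else {
      # No improvement: stop once `patience` such epochs have accumulated.
      wait <- wait + 1
      if (wait >= patience) return(epoch)
    }
  }
  NA_integer_
}

# Hypothetical validation losses: improvement stalls after epoch 3.
val_loss <- c(0.90, 0.70, 0.65, 0.66, 0.67, 0.64)
early_stopping_epoch(val_loss, patience = 2)  # 5
```

With `patience = 2`, training stops at epoch 5 because epochs 4 and 5 both fail to beat the best loss seen at epoch 3.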

To use a `callback`, pass it to the model’s `fit` method:

```
callbacks <- list(
  callback_early_stopping(patience = 2, monitor = 'val_loss'),
  callback_tensorboard(log_dir = './logs')
)

model %>% fit(
  data,
  labels,
  batch_size = 32,
  epochs = 5,
  callbacks = callbacks,
  validation_data = list(val_data, val_labels)
)
```

```
Epoch 1/5
32/32 - 0s - loss: -7.4056e-01 - accuracy: 0.1470 - val_loss: -1.5806e+00 - val_accuracy: 0.0800 - 199ms/epoch - 6ms/step
Epoch 2/5
32/32 - 0s - loss: -1.0141e+00 - accuracy: 0.1400 - val_loss: -1.6356e+00 - val_accuracy: 0.1300 - 96ms/epoch - 3ms/step
Epoch 3/5
32/32 - 0s - loss: -1.2859e+00 - accuracy: 0.1340 - val_loss: -1.7233e+00 - val_accuracy: 0.1100 - 94ms/epoch - 3ms/step
Epoch 4/5
32/32 - 0s - loss: -1.5790e+00 - accuracy: 0.1350 - val_loss: -1.6630e+00 - val_accuracy: 0.0800 - 88ms/epoch - 3ms/step
Epoch 5/5
32/32 - 0s - loss: -1.8551e+00 - accuracy: 0.1450 - val_loss: -1.8464e+00 - val_accuracy: 0.0500 - 89ms/epoch - 3ms/step
```

## Save and restore

### Weights only

Save and load the weights of a model using `save_model_weights_tf` and `load_model_weights_tf`, respectively:

```
# Save the weights in the TensorFlow checkpoint format
model %>% save_model_weights_tf('my_model/')

# Restore the model's state,
# this requires a model with the same architecture.
model %>% load_model_weights_tf('my_model/')
```

### Configuration only

A model’s configuration can be saved on its own; this serializes the model architecture without any weights. A saved configuration can recreate and initialize the same model, even without the code that defined the original model. Keras supports JSON and YAML serialization formats:

```
# Serialize a model to JSON format
json_string <- model %>% model_to_json()

# Recreate the model (freshly initialized)
fresh_model <- model_from_json(json_string,
                               custom_objects = list('MyModel' = my_model_constructor))
```

### Entire model

The entire model can be saved to a file that contains the weight values, the model’s configuration, and even the optimizer’s configuration. This allows you to checkpoint a model and resume training later, from the exact same state, without access to the original code.

```
# Save entire model to the SavedModel format
model %>% save_model_tf('my_model/')

# Recreate the exact same model, including weights and optimizer.
model <- load_model_tf('my_model/')
```