library(tensorflow)
library(keras)
set.seed(1234)
Timeseries classification from scratch
Introduction
This example shows how to do timeseries classification from scratch, starting from raw TSV timeseries files on disk. We demonstrate the workflow on the FordA dataset from the UCR/UEA archive.
Setup
<- "https://raw.githubusercontent.com/hfawaz/cd-diagram/master/FordA"
url
<- "FordA_TRAIN.tsv" %>%
train_df get_file(., file.path(url, .)) %>%
::read_tsv(col_names = FALSE)
readr<- as.matrix(train_df[, -1])
x_train <- as.matrix(train_df[, 1])
y_train
<- "FordA_TEST.tsv" %>%
test_df get_file(., file.path(url, .)) %>%
::read_tsv(col_names = FALSE)
readr<- as.matrix(test_df[, -1])
x_test <- as.matrix(test_df[, 1]) y_test
Visualize the data
Here we visualize one timeseries example for each class in the dataset.
library(ggfortify)
Loading required package: ggplot2
autoplot(ts(tibble::tibble(
  "Class -1" = x_train[1, ],
  "Class 1" = x_train[2, ]
)), ts.geom = 'line', facets = FALSE)
Standardize the data
Our timeseries already have a single, uniform length (500). However, their values typically span different ranges, which is not ideal for a neural network; in general we should normalize the input values. For this specific dataset, the data is already z-normalized: each timeseries sample has a mean equal to zero and a standard deviation equal to one. This type of normalization is very common for timeseries classification problems, see Bagnall et al. (2016).
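As a quick sanity check (not part of the original workflow), we can confirm the per-sample statistics with base R; each row of x_train is one sample:
summary(rowMeans(x_train))      # per-sample means should be ~ 0
summary(apply(x_train, 1, sd))  # per-sample standard deviations should be ~ 1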
In order to use sparse_categorical_crossentropy, we will have to count the number of classes beforehand.
(num_classes <- length(unique(y_train)))
[1] 2
Now we shuffle the training set, because we will be using the validation_split option later when training: fit() carves the validation set off the end of the provided data without shuffling it first, so an ordered dataset could yield an unrepresentative validation split.
shuffle_ind <- sample(nrow(x_train))
x_train <- x_train[shuffle_ind, ]
y_train <- y_train[shuffle_ind, ]
Standardize the labels to positive integers. The expected labels will then be 0 and 1.
y_train[y_train == -1] <- 0
y_test[y_test == -1] <- 0
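Optionally, a quick check that the remapping worked; the only remaining labels should be 0 and 1:
table(y_train)  # counts per class after remapping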
Note that the timeseries data used here are univariate, meaning we only have one channel per timeseries example. We will therefore transform each timeseries into a multivariate one with a single channel using a simple reshaping. This allows us to construct a model that is easily applicable to multivariate timeseries.
# add channel dim of size 1
dim(x_train) <- c(dim(x_train), 1)
dim(x_test) <- c(dim(x_test), 1)
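After the reshape, each input array has three dimensions: samples, timesteps, and channels. A quick check (optional):
dim(x_train)  # samples x timesteps x channels, with a trailing 1
dim(x_test)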
Build a model
We build a Fully Convolutional Neural Network originally proposed in this paper. The implementation is based on the TF 2 version provided here. The following hyperparameters (kernel_size, filters, the usage of BatchNorm) were found via random search using KerasTuner.
input_shape <- dim(x_train)[-1] # drop batch dim
input_layer <- layer_input(input_shape)

output_layer <- input_layer %>%
# First convolutional layer
layer_conv_1d(64, 3, padding = "same") %>%
layer_batch_normalization() %>%
layer_activation_relu() %>%
# Second convolutional layer
layer_conv_1d(64, 3, padding = "same") %>%
layer_batch_normalization() %>%
layer_activation_relu() %>%
# Third convolutional layer
layer_conv_1d(64, 3, padding = "same") %>%
layer_batch_normalization() %>%
layer_activation_relu() %>%
layer_global_average_pooling_1d() %>%
layer_dense(num_classes, activation = "softmax")
model <- keras_model(input_layer, output_layer)
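To inspect the resulting architecture layer by layer, you can print a summary:
summary(model)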
Train the model
epochs <- 300
batch_size <- 32

callbacks <- list(
  # keep the weights with the lowest validation loss on disk
  callback_model_checkpoint("best_model.h5", monitor = "val_loss",
                            save_best_only = TRUE),
  # halve the learning rate if validation loss plateaus for 20 epochs
  callback_reduce_lr_on_plateau(monitor = "val_loss", factor = 0.5,
                                patience = 20, min_lr = 0.0001),
  # stop early if validation loss shows no improvement for 50 epochs
  callback_early_stopping(monitor = "val_loss", patience = 50,
                          verbose = 1)
)
model %>% compile(
  optimizer = "adam",
  loss = "sparse_categorical_crossentropy",
  metrics = list("sparse_categorical_accuracy")
)
history <- model %>%
  fit(x_train, y_train,
      batch_size = batch_size,
      epochs = epochs,
      callbacks = callbacks,
      validation_split = 0.2,
      verbose = 1)
Evaluate model on test data
loaded_model <- load_model_hdf5("best_model.h5")
result <- loaded_model %>% evaluate(x_test, as.matrix(y_test))
sprintf("Test loss: %s", result[["loss"]])
[1] "Test loss: 0.103852786123753"
sprintf("Test accuracy: %s", result[["sparse_categorical_accuracy"]])
[1] "Test accuracy: 0.966666638851166"
Plot the model’s training and validation loss
plot(history)
The training accuracy reaches almost 0.95 after 100 epochs. The validation accuracy, however, shows that the network still benefits from further training: after 200 epochs both the training and the validation accuracy reach almost 0.97. Beyond the 200th epoch, if we continued training, the validation accuracy would start to decrease while the training accuracy kept increasing: the model would start overfitting.
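If you prefer exact numbers over reading them off the plot, the history object stores the per-epoch metrics. A minimal sketch, assuming the default metric naming that follows from the compile() call above:
metrics <- history$metrics
best_epoch <- which.max(metrics$val_sparse_categorical_accuracy)
sprintf("Best validation accuracy: %.4f at epoch %d",
        metrics$val_sparse_categorical_accuracy[best_epoch], best_epoch)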