
lime v0.4: The kitten picture edition

This is a guest post contributed by Thomas Lin Pedersen, creator of the lime package. The post originally appeared on Thomas' blog. I'm happy to report that a new major release of lime has landed on CRAN. lime is an R port of the Python library of the same name by Marco Ribeiro, which lets the user pry open black-box machine learning models and explain their outcomes on a per-observation basis. Read more →
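The per-observation workflow the post describes can be sketched with the lime package's two core functions, `lime()` and `explain()`; the caret model and the iris data here are illustrative stand-ins, not the post's own example:

```r
library(caret)
library(lime)

# Fit any supported black-box model (a random forest via caret here)
model <- train(Species ~ ., data = iris, method = "rf")

# Build an explainer from the training features and the fitted model
explainer <- lime(iris[, -5], model)

# Explain individual predictions: for each observation, report the
# top label and the 2 features that most influenced that prediction
explanation <- explain(iris[1:3, -5], explainer,
                       n_labels = 1, n_features = 2)
```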

Deep Learning for Cancer Immunotherapy

This is a guest post from Leon Eyrich Jessen, a postdoctoral researcher in the Immunoinformatics and Machine Learning Group at the Technical University of Denmark. Introduction [Image: Simon Caulton, Adoptive T-cell therapy, CC BY-SA 3.0] In my research, I apply deep learning to unravel molecular interactions in the human immune system. One application of my research is cancer immunotherapy (immuno-oncology), a cancer treatment strategy that aims to harness the patient's own immune system to fight the cancer. Read more →

Analyzing rtweet data with kerasformula

This is a guest post contributed by Pete Mohanty, creator of the kerasformula package. Overview The kerasformula package offers a high-level interface to the R interface to Keras. Its main interface is the kms function, a regression-style interface to keras_model_sequential that uses formulas and sparse matrices. The kerasformula package is available on CRAN, and can be installed with: # install the kerasformula package install.packages("kerasformula") # or devtools::install_github("rdrr1990/kerasformula") library(kerasformula) # install the core keras library (if you haven't already done so) # see ? Read more →
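Because kms takes a formula and a data frame, a model fit looks much like lm or glm; a minimal sketch, assuming a hypothetical rtweet data frame `tweets` with `retweet_count`, `screen_name`, and `source` columns:

```r
library(kerasformula)

# kms builds and fits a keras_model_sequential from a formula;
# the outcome and predictors here are illustrative, not from the post
fit <- kms(retweet_count ~ screen_name + source, data = tweets)

# Inspect out-of-sample fit on the automatic holdout split
fit$evaluations
```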

Predicting Fraud with Autoencoders and Keras

Overview In this post we will train an autoencoder to detect credit card fraud. We will also demonstrate how to train Keras models in the cloud using CloudML. The basis of our model will be the Kaggle Credit Card Fraud Detection dataset, which was collected during a research collaboration of Worldline and the Machine Learning Group of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. The dataset contains credit card transactions by European cardholders made over a two-day period in September 2013. Read more →
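The core idea is that an autoencoder trained to reconstruct normal transactions will reconstruct fraudulent ones poorly, so reconstruction error can serve as an anomaly score. A minimal sketch with the keras R package, assuming `x_train` is a scaled numeric matrix of transaction features (the layer sizes are illustrative, not the post's architecture):

```r
library(keras)

# Symmetric encoder/decoder that squeezes the input through a bottleneck
model <- keras_model_sequential() %>%
  layer_dense(units = 15, activation = "tanh",
              input_shape = ncol(x_train)) %>%
  layer_dense(units = 10, activation = "tanh") %>%  # bottleneck
  layer_dense(units = 15, activation = "tanh") %>%
  layer_dense(units = ncol(x_train))                # reconstruction

model %>% compile(optimizer = "adam", loss = "mean_squared_error")

# Note the target is the input itself: the network learns to reconstruct
model %>% fit(x_train, x_train, epochs = 100, batch_size = 32)

# Transactions with high reconstruction error are flagged as suspicious
mse <- rowMeans((predict(model, x_train) - x_train)^2)
```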

Deep Learning With Keras To Predict Customer Churn

This is a guest post contributed by Matt Dancho, CEO of Business Science. The post was originally published on the Business Science blog. Customer churn is a problem that all companies need to monitor, especially those that depend on subscription-based revenue streams. The simple fact is that most organizations have data that can be used to target these individuals and to understand the key drivers of churn, and we now have Keras for Deep Learning available in R (Yes, in R! Read more →

R Interface to Google CloudML

Overview We are excited to announce the availability of the cloudml package, which provides an R interface to Google Cloud Machine Learning Engine. CloudML provides a number of services including: Scalable training of models built with the keras, tfestimators, and tensorflow R packages. On-demand access to training on GPUs, including the new Tesla P100 GPUs from NVIDIA®. Hyperparameter tuning to optimize key attributes of model architectures in order to maximize predictive accuracy. Read more →
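With the cloudml package, submitting a local training script to CloudML is a single call; a minimal sketch, where "train.R" is a hypothetical script containing your keras/tensorflow training code:

```r
library(cloudml)

# Submit the script to Cloud ML Engine with default resources
cloudml_train("train.R")

# Or request a GPU-equipped machine for training
cloudml_train("train.R", master_type = "standard_gpu")
```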

Classifying duplicate questions from Quora with Keras

In this post we will use Keras to classify duplicate questions from Quora. The dataset first appeared in the Kaggle competition Quora Question Pairs and consists of approximately 400,000 pairs of questions along with a column indicating whether the question pair is considered a duplicate. Our implementation is inspired by the Siamese Recurrent Architecture, with modifications to the similarity measure and the embedding layers (the original paper uses pre-trained word vectors). Read more →
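The defining feature of a Siamese architecture is that both questions pass through the *same* embedding and recurrent layers before their encodings are compared. A minimal sketch with the keras functional API in R (vocabulary size, sequence length, and unit counts are illustrative assumptions, and the similarity head here is a simple normalized dot product, not necessarily the post's choice):

```r
library(keras)

vocab_size <- 50000
seq_length <- 20

input1 <- layer_input(shape = seq_length)
input2 <- layer_input(shape = seq_length)

# Shared layers: created once, applied to both inputs
embedding <- layer_embedding(input_dim = vocab_size, output_dim = 128)
encoder   <- layer_lstm(units = 64)

encoded1 <- input1 %>% embedding() %>% encoder()
encoded2 <- input2 %>% embedding() %>% encoder()

# Cosine-style similarity, squashed to a duplicate probability
pred <- layer_dot(list(encoded1, encoded2), axes = 1, normalize = TRUE) %>%
  layer_dense(units = 1, activation = "sigmoid")

model <- keras_model(inputs = list(input1, input2), outputs = pred)
model %>% compile(optimizer = "adam", loss = "binary_crossentropy")
```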

Word Embeddings with Keras

Word embedding is a method used to map words of a vocabulary to dense vectors of real numbers where semantically similar words are mapped to nearby points. Representing words in this vector space helps algorithms achieve better performance in natural language processing tasks like syntactic parsing and sentiment analysis by grouping similar words. For example, we expect that in the embedding space "cats" and "dogs" are mapped to nearby points since they are both animals, mammals, pets, etc. Read more →
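In Keras this mapping is a trainable lookup table, layer_embedding, which turns integer word indices into dense vectors; the sizes below are illustrative, not taken from the post:

```r
library(keras)

model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 10000,  # vocabulary size
                  output_dim = 50,    # embedding dimension
                  input_length = 20)  # words per input sequence

# Each input is a sequence of 20 integer word indices; the output is
# one 50-dimensional vector per word, i.e. shape (batch, 20, 50)
```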

Time Series Forecasting with Recurrent Neural Networks

In this section, we'll review three advanced techniques for improving the performance and generalization power of recurrent neural networks. By the end of the section, you'll know most of what there is to know about using recurrent networks with Keras. We'll demonstrate all three concepts on a temperature-forecasting problem: given a time series of data points from sensors installed on the roof of a building (temperature, air pressure, humidity), predict the temperature 24 hours after the last data point. Read more →
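A baseline for this kind of forecasting problem can be sketched as a single recurrent layer over windows of sensor readings; the window length and feature count below are assumptions for illustration:

```r
library(keras)

lookback   <- 240  # timesteps per input window (assumed)
n_features <- 14   # sensor readings per timestep (assumed)

model <- keras_model_sequential() %>%
  layer_gru(units = 32,
            dropout = 0.2, recurrent_dropout = 0.2,  # regularization
            input_shape = c(lookback, n_features)) %>%
  layer_dense(units = 1)  # the predicted temperature, 24h ahead

# Mean absolute error is a natural loss for a scalar regression target
model %>% compile(optimizer = "rmsprop", loss = "mae")
```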

Image Classification on Small Datasets with Keras

Training a convnet with a small dataset. Having to train an image-classification model using very little data is a common situation, one you'll likely encounter in practice if you ever do computer vision in a professional context. "Few" samples can mean anywhere from a few hundred to a few tens of thousands of images. As a practical example, we'll focus on classifying images as dogs or cats, in a dataset containing 4,000 pictures of cats and dogs (2,000 cats, 2,000 dogs). Read more →
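For binary dog-vs-cat classification, a small stack of convolution and pooling layers feeding a sigmoid output is the standard starting point; a sketch with the keras R package (the input size and layer widths are illustrative, not necessarily the post's architecture):

```r
library(keras)

model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(150, 150, 3)) %>%   # 150x150 RGB images
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3),
                activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")    # P(dog) vs. P(cat)

model %>% compile(optimizer = "rmsprop",
                  loss = "binary_crossentropy",
                  metrics = "accuracy")
```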