class: center, middle, inverse, title-slide

# Deep learning with TensorFlow and Keras
## UseR! 2018 Tutorial
kevinykuo.com/talk/2018/07/user-tutorial
### Kevin Kuo
### July 2018

---

# Administrative stuff

.Large[Get your RStudio instance at [http://colorado.rstudio.com:3939/classroom-assignment-deeplearning/](http://colorado.rstudio.com:3939/classroom-assignment-deeplearning/)]

---

# Schedule

.large[
13:30 - 15:00 Fun & Games

15:00 - 15:30 Coffee?

15:30 - 17:00 Fun & Games
]

--

.large[
17:00 - **PARTYTIEM** 🎉
]

---

# "Learning objectives"

.Large[
- Describe what neural networks are and how they work.
- Name some software packages used for deep learning and describe how they relate to each other.
- Build some basic neural network models for predictive modeling problems.
]

--

.Large[
Overarching goal: be able to sprinkle AI jargon into a conversation at the bar so you sound like you know what you're talking about, while actually knowing enough that you don't get called out.
]

---

# Why might you care about this AI stuff

## Academia?

1. "A novel approach to [what you're working on] with [insert fancy AI term]"
2. ???
3. New publication!

--

## Business?

1. "[Your product or service], now enhanced using [insert fancy AI term]"
2. ???
3. New deal!

---

# Pedagogical considerations

(Keeping in mind we only have 3 hours)

.large[
Breadth first.

- Talk about intuition and skip the maths
- Go through a variety of problem types
- Code and get things done
]

--

There's a lot of free, high-quality material online for deep learning (much of which we'll utilize today)... this tutorial is meant to be an opinionated distillation. We want to hit the ground running ASAP and know where to refer back to the sources as needed.

--

.large[Fake it till u make it]

---

class: inverse, center, middle

# Deep learning time!

---

# Deep learning

.large[Really just a rebranding of neural networks...]

--

.large[...which is a machine learning technique]

--

.large[...i.e.
it's a way to learn a function that takes some inputs to predict some outputs (if you're familiar with the jargon, we're going to focus on *supervised learning* here)
]

---

# Heuristics with some symbols

Basically, given some inputs `\(X\)`, we wanna predict `\(Y\)` via some function `\(f\)`, like

$$\widehat{Y} = f(X)$$

--

For neural nets, our function `\(f\)` looks something like

`$$f = f_{N}\circ f_{N-1}\circ\dots\circ f_1$$`

where `\(f_1, \dots, f_N\)` are functions parameterized by *weights*.

--

For the most part, you can think of these functions as a matrix multiplication followed by some type of thresholding/nonlinearity.

--

(Without the nonlinearity, our `\(f\)` would just be a simple linear transformation, and that won't sound very impressive 🤔)

---

# Deep learning in pictures

The next few images are taken from JJ's TensorFlow talk ([repo](https://github.com/jjallaire/rstudio-conf-2018)), which also contains material from the Deep Learning with R book (adapted from Francois Chollet's Deep Learning with Python).

<img src="figs/Allaire-DLwithR-HI.png" width="30%" />

---

# Deep learning in pictures

<img src="figs/nn-basics-2.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/nn-basics-1.jpeg" width="80%" />

---

# Deep learning in pictures

<img src="figs/mnist_representations.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-1.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-2.png" width="60%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-3.png" width="60%" />

---

# A couple words on software

We'll focus on

- TensorFlow
- Keras

--

Other things out there

- PyTorch (incl. Caffe2)
- CNTK
- MXNet
- DyNet
- Chainer

---

# Software

TensorFlow: general-purpose numerical computing library (with deep learning capabilities, of course!)

Keras: high-level APIs for deep learning; supports TensorFlow (and also CNTK, MXNet, ...)
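--

For intuition: everything these libraries ultimately compute boils down to the layer composition from a few slides back. A minimal sketch in base R (toy shapes and random weights just for illustration, not a real model):

```r
# Each layer f_i: matrix multiply, add a bias, apply a nonlinearity.
relu  <- function(x) pmax(x, 0)
layer <- function(x, W, b) relu(sweep(x %*% W, 2, b, `+`))

set.seed(42)
X  <- matrix(rnorm(4 * 3), nrow = 4)      # 4 samples, 3 features
W1 <- matrix(rnorm(3 * 5), nrow = 3); b1 <- rnorm(5)
W2 <- matrix(rnorm(5 * 2), nrow = 5); b2 <- rnorm(2)

# f = f_2 o f_1
Y_hat <- layer(layer(X, W1, b1), W2, b2)
dim(Y_hat)  # 4 2
```

Frameworks like Keras wrap exactly this kind of computation, plus the machinery to learn the weights.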
--

<img src="https://pbs.twimg.com/media/DXy_uc0VAAAIhKG.jpg" width="40%" />

Source: [https://twitter.com/fchollet/status/971863128341323776](https://twitter.com/fchollet/status/971863128341323776)

---

# Quick word on tensors

They're the "data" in TF-land. Really just multidimensional arrays.

| Dimension | R object |
|---|---|
| 0D | `42` |
| 1D | `c(42, 42, 42)` |
| 2D | `matrix(42, nrow = 2, ncol = 2)` |
| 3D | `array(42, dim = c(2, 3, 2))` |
| 4D | `array(42, dim = c(2, 3, 2, 3))` |

---

# Tensor examples

| Data | Tensor |
|---|---|
| Vector data | 2D tensors of shape `(samples, features)` |
| Timeseries data | 3D tensors of shape `(samples, timesteps, features)` |
| Images | 4D tensors of shape `(samples, height, width, channels)` |
| Video | 5D tensors of shape `(samples, frames, height, width, channels)` |

<br/>

Note that `samples` is always the first dimension.

--

These tensors "flow" through the computation graphs we've been looking at, hence Tensor*Flow*. Get it?

---

# Tensors are literally flowing

<img src="figs/tensors_flowing.gif" style="display: block; margin: auto;" />

---

class: inverse, center, middle

# Hands-on stuff!

---

# Hello world

MNIST with a feedforward neural net

---

# Conv nets for image classification

The convolutional neural network (CNN), or ConvNet, is where it's at when it comes to image processing.

--

We're gonna ~~shamelessly copy~~ use some slides ([http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf](http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf)) from a Stanford University course. See [http://vision.stanford.edu/teaching/cs231n/](http://vision.stanford.edu/teaching/cs231n/) for the details, including links to YouTube videos.

---

# CNN hands-on

Notebook time!

---

# Recurrent networks

CNN : image data :: RNN : sequential data

--

These could be things like time series data or natural language data.

--

Just as before, we're gonna utilize some existing slides!
[http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture10.pdf](http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture10.pdf)

Also check out [http://colah.github.io/posts/2015-08-Understanding-LSTMs/](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) and [http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

---

# RNN - text classification

Hands-on demo!

---

# RNN - time series forecasting

Since RNNs are good for sequential data, they *should* be useful for doing time series stuff, too.

--

If you do some research on the topic, you'll see there are a bazillion ways to do forecasting on a univariate time series with neural networks. We'll go through four of these... on to the demo!

---

# Structured data

.large[Most of the time, for most people, we're working with structured data in tables, not images or videos. Let's go through an example of applying deep learning to an old-school predictive modeling problem!]

---

# That's a wrap!

.large[
Resources:

- Stanford course on convolutional neural nets: [http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)
- Andrew Ng's Coursera courses: [https://www.coursera.org/specializations/deep-learning](https://www.coursera.org/specializations/deep-learning)
- JJ/Francois's book: [https://www.manning.com/books/deep-learning-with-r](https://www.manning.com/books/deep-learning-with-r)
- Deep Learning book (Goodfellow, Bengio, and Courville): [https://github.com/janishar/mit-deep-learning-book-pdf](https://github.com/janishar/mit-deep-learning-book-pdf)
- RStudio TensorFlow blog: [https://tensorflow.rstudio.com/blog.html](https://tensorflow.rstudio.com/blog.html)

Link to slides and demos: [https://kevinykuo.com/talk/2018/07/user-tutorial/](https://kevinykuo.com/talk/2018/07/user-tutorial/)
]
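---

# Bonus: "learning the weights"

We said the `\(f_i\)`'s are parameterized by weights that get *learned*. For the curious, here's the idea in base R: gradient descent on a single linear neuron, fit to hypothetical toy data (a sketch of the concept, not what keras actually runs under the hood):

```r
set.seed(1)
X      <- cbind(1, rnorm(100))          # toy design matrix with intercept
true_w <- c(2, -3)                      # weights we hope to recover
y      <- X %*% true_w + rnorm(100, sd = 0.1)

w  <- c(0, 0)                           # start from arbitrary weights
lr <- 0.1                               # learning rate
for (i in 1:500) {
  y_hat <- X %*% w                      # forward pass
  grad  <- t(X) %*% (y_hat - y) / nrow(X)   # gradient of mean squared error / 2
  w     <- w - lr * drop(grad)          # step downhill
}
round(w, 1)                             # close to c(2, -3)
```

Deep learning frameworks do the same loop, but compute the gradients automatically (backpropagation) through all the stacked layers.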