class: center, middle, inverse, title-slide

# Deep learning with TensorFlow and Keras
## UseR! 2018 Tutorial
kevinykuo.com/talk/2018/07/user-tutorial
### Kevin Kuo
### July 2018

---

# Administrative stuff

.Large[Get your RStudio instance at [http://colorado.rstudio.com:3939/classroom-assignment-deeplearning/](http://colorado.rstudio.com:3939/classroom-assignment-deeplearning/)]

---

# Schedule

.large[
13:30 - 15:00 Fun & Games

15:00 - 15:30 Coffee?

15:30 - 17:00 Fun & Games
]

--

.large[
17:00 - **PARTYTIEM** 🎉
]

---

# "Learning objectives"

.Large[
- Describe what neural networks are and how they work.
- Name some software packages used for deep learning and describe how they relate to each other.
- Build some basic neural network models for predictive modeling problems.
]

--

.Large[
Overarching goal: be able to sprinkle AI jargon into a conversation at the bar so you sound like you know what you're talking about, while actually knowing enough that you don't get called out.
]

---

# Why might you care about this AI stuff

## Academia?

1. "A novel approach to [what you're working on] with [insert fancy AI term]"
2. ???
3. New publication!

--

## Business?

1. "[Your product or service], now enhanced using [insert fancy AI term]"
2. ???
3. New deal!

---

# Pedagogical considerations

(Keeping in mind we only have 3 hours)

.large[
Breadth first.

- Talk about intuition and skip the maths
- Go through a variety of problem types
- Code and get things done
]

--

There's a lot of free, high-quality material online for deep learning (much of which we'll utilize today)... this tutorial is meant to be an opinionated distillation. We want to hit the ground running ASAP and know where to refer back to the sources as needed.

--

.large[Fake it till u make it]

---

class: inverse, center, middle

# Deep learning time!

---

# Deep learning

.large[Really just a rebranding of neural networks...]

--

.large[...which is a machine learning technique]

--

.large[...i.e.
it's a way to learn a function that takes some inputs to predict some outputs (if you're familiar with the jargon, we're going to focus on *supervised learning* here)
]

---

# Heuristics with some symbols

Basically, given some inputs `\(X\)`, we wanna predict `\(Y\)` via some function `\(f\)`, like

$$\widehat{Y} = f(X)$$

--

For neural nets, our function `\(f\)` looks something like

`$$f = f_{N}\circ f_{N-1}\circ\dots\circ f_1$$`

where `\(f_1, \dots, f_N\)` are functions parameterized by *weights*.

--

For the most part, you can think of these functions as a matrix multiplication followed by some type of thresholding/nonlinearity.

--

(Without the nonlinearity, our `\(f\)` would just be a simple linear transformation, and that won't sound very impressive 🤔)

---

# Deep learning in pictures

The next few images are taken from JJ's TensorFlow talk ([repo](https://github.com/jjallaire/rstudio-conf-2018)), which also contains material from the Deep Learning with R book (adapted from Francois Chollet's Deep Learning with Python).

<img src="figs/Allaire-DLwithR-HI.png" width="30%" />

---

# Deep learning in pictures

<img src="figs/nn-basics-2.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/nn-basics-1.jpeg" width="80%" />

---

# Deep learning in pictures

<img src="figs/mnist_representations.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-1.png" width="80%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-2.png" width="60%" />

---

# Deep learning in pictures

<img src="figs/deep-learning-in-3-figures-3.png" width="60%" />

---

# A couple words on software

We'll focus on

- TensorFlow
- Keras

--

Other things out there

- PyTorch (incl. Caffe2)
- CNTK
- MXNet
- DyNet
- Chainer

---

# Software

TensorFlow: general-purpose numerical computing library (with deep learning capabilities, of course!)

Keras: high-level APIs for deep learning; supports TensorFlow (and also CNTK, MXNet, ...)
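--

For intuition: everything these libraries ultimately compute boils down to the layer composition from a few slides back. A minimal sketch in base R (toy shapes and random weights just for illustration, not a real model):

```r
# Each layer f_i: matrix multiply, add a bias, apply a nonlinearity.
relu  <- function(x) pmax(x, 0)
layer <- function(x, W, b) relu(sweep(x %*% W, 2, b, `+`))

set.seed(42)
X  <- matrix(rnorm(4 * 3), nrow = 4)      # 4 samples, 3 features
W1 <- matrix(rnorm(3 * 5), nrow = 3); b1 <- rnorm(5)
W2 <- matrix(rnorm(5 * 2), nrow = 5); b2 <- rnorm(2)

# f = f_2 o f_1
Y_hat <- layer(layer(X, W1, b1), W2, b2)
dim(Y_hat)  # 4 2
```

Frameworks like Keras wrap exactly this kind of computation, plus the machinery to learn the weights.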
--

<img src="https://pbs.twimg.com/media/DXy_uc0VAAAIhKG.jpg" width="40%" />

Source: [https://twitter.com/fchollet/status/971863128341323776](https://twitter.com/fchollet/status/971863128341323776)

---

# Quick word on tensors

They're the "data" in TF-land. Really just multidimensional arrays.

| Dimension | R object |
|---|---|
| 0D | `42` |
| 1D | `c(42, 42, 42)` |
| 2D | `matrix(42, nrow = 2, ncol = 2)` |
| 3D | `array(42, dim = c(2, 3, 2))` |
| 4D | `array(42, dim = c(2, 3, 2, 3))` |

---

# Tensor examples

| Data | Tensor |
|---|---|
| Vector data | 2D tensors of shape `(samples, features)` |
| Timeseries data | 3D tensors of shape `(samples, timesteps, features)` |
| Images | 4D tensors of shape `(samples, height, width, channels)` |
| Video | 5D tensors of shape `(samples, frames, height, width, channels)` |

<br/>

Note that `samples` is always the first dimension.

--

These tensors "flow" through the computation graphs we've been looking at, hence Tensor*Flow*. Get it?

---

# Tensors are literally flowing

<img src="figs/tensors_flowing.gif" style="display: block; margin: auto;" />

---

class: inverse, center, middle

# Hands-on stuff!

---

# Hello world

MNIST with a feedforward neural net

---

# Conv nets for image classification

The convolutional neural network (CNN), or ConvNet, is where it's at when it comes to image processing.

--

We're gonna ~~shamelessly copy~~ use some slides ([http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf](http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf)) from a Stanford University course. See [http://vision.stanford.edu/teaching/cs231n/](http://vision.stanford.edu/teaching/cs231n/) for the details, including links to YouTube videos.

---

# CNN hands-on

Notebook time!

---

# Recurrent networks

CNN : image data :: RNN : sequential data

--

These could be things like time series data or natural language data.

--

Just as before, we're gonna utilize some existing slides!
[http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture10.pdf](http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture10.pdf)

Also check out [http://colah.github.io/posts/2015-08-Understanding-LSTMs/](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) and [http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

---

# RNN - text classification

Hands-on demo!

---

# RNN - time series forecasting

Since RNNs are good for sequential data, they *should* be useful for doing time series stuff, too.

--

If you do some research on the topic, you'll see there are a bazillion ways to do forecasting on a univariate time series with neural networks. We'll go through four of these... on to the demo!

---

# Structured data

.large[Most of the time, for most people, we're working with structured data in tables, not images or videos. Let's go through an example of applying deep learning to an old-school predictive modeling problem!]

---

# That's a wrap!

.large[
Resources:

- Stanford course on convolutional neural nets: [http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)
- Andrew Ng's Coursera courses: [https://www.coursera.org/specializations/deep-learning](https://www.coursera.org/specializations/deep-learning)
- JJ/Francois's book: [https://www.manning.com/books/deep-learning-with-r](https://www.manning.com/books/deep-learning-with-r)
- Deep Learning book (Goodfellow, Bengio, and Courville): [https://github.com/janishar/mit-deep-learning-book-pdf](https://github.com/janishar/mit-deep-learning-book-pdf)
- RStudio TensorFlow blog: [https://tensorflow.rstudio.com/blog.html](https://tensorflow.rstudio.com/blog.html)

Link to slides and demos: [https://kevinykuo.com/talk/2018/07/user-tutorial/](https://kevinykuo.com/talk/2018/07/user-tutorial/)
]
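---

# Bonus: "learning the weights"

We said the `\(f_i\)`'s are parameterized by weights that get *learned*. For the curious, here's the idea in base R: gradient descent on a single linear neuron, fit to hypothetical toy data (a sketch of the concept, not what keras actually runs under the hood):

```r
set.seed(1)
X      <- cbind(1, rnorm(100))          # toy design matrix with intercept
true_w <- c(2, -3)                      # weights we hope to recover
y      <- X %*% true_w + rnorm(100, sd = 0.1)

w  <- c(0, 0)                           # start from arbitrary weights
lr <- 0.1                               # learning rate
for (i in 1:500) {
  y_hat <- X %*% w                      # forward pass
  grad  <- t(X) %*% (y_hat - y) / nrow(X)   # gradient of mean squared error / 2
  w     <- w - lr * drop(grad)          # step downhill
}
round(w, 1)                             # close to c(2, -3)
```

Deep learning frameworks do the same loop, but compute the gradients automatically (backpropagation) through all the stacked layers.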