An overview of machine learning in general and trends in deep learning in particular, followed by suggestions on how to get started with ML on Google Cloud, then demos of various approaches (Colab, Kubeflow notebooks, Cloud Vision API, custom instances with Swift for TensorFlow + TPU).
At a high level, today we're going to talk about machine learning in general, then we'll look at some different ways to do it on Google Cloud. The ultimate goal of my presentation is just to show you how to get started with machine learning on Google Cloud. But towards that end, we'll spend a bunch of time on the background and the theory of machine learning to try to give you a bird's eye view of this subject and this field. From there, we'll jump over into some different ways to get started. Then we'll do some Google Cloud demos. At the end, I'll do a quick recap of everything, and if people have questions, I'll try to answer them to the best of my ability.
What is machine learning? To me, it's an intersection of data, statistics, and computing altogether. Historically, whenever we've wanted to do large-scale data analysis, we basically had to use real-world sampling. To use an example, we might try to calculate the average height of people in the United States. Towards that end, you would go out and find a group of people – say, a hundred – measure their heights, and then try to extrapolate that to cover the entire United States.
But nowadays, we have computers. So basically, we can put everybody's height into a gigantic database column and then just run the average there. While statistics is not a new field, this ability to use computers to do it at scale has really changed what we can accomplish, and made many things that were previously very difficult almost commonplace.
Three traditional techniques of machine learning are linear regression, random forests, and gradient boosted trees. If you're interested in machine learning, these are the three key techniques you need to know – long before you go after anything fancy like neural networks or anything like that.
Linear regression is hopefully easy to understand conceptually. It's basically just fitting a line to a set of points. Random forests are a technique that a lot of people probably have not actually seen, but basically, they build many decision trees on random samples of the data and combine their results. If you have a robust dataset, this is actually a really good way to build fast classifiers, and stuff like that. You'll see them a lot in the real world. The problem with random forests is that they sort of break down sometimes. So we have this concept of gradient boosted trees, which gives us a way to combine different models together in order to, in theory, get the best of both worlds.
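To make the linear regression idea concrete, here is a minimal sketch in plain NumPy – the data points and the underlying line y = 2x + 1 are made up purely for illustration:

```python
import numpy as np

# Noisy points scattered roughly along y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# Fit a line by ordinary least squares: solve for [slope, intercept]
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print(slope, intercept)  # should come out close to 2 and 1
```

That's the whole technique: a line that minimizes the squared distance to the points.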
The basic problem with all these approaches has historically been what's called high dimensional data. A picture is probably a good example of this. You have a red channel, a green channel, and a blue channel. You can analyze the data by looking at the pieces, but really you need to see the interconnections. You're not looking at any single individual channel, but rather all of them combined together.
So there are a few different ways of dealing with high dimensional data that have historically been used in machine learning. There is a technique called principal component analysis, which is kind of a statistical approach. There is this thing called singular value decomposition, which is kind of like a mathy version of the same thing, we’ll say, to try and collapse high dimensional data down.
The other thing you get into is called kernel methods. This is where you start to use mathematical tricks to try and change the space of the data that you're analyzing. And then finally you get into full-blown feature engineering, which is where you're trying to apply human brainpower to modify the dataset into something easier for the computer to understand. A good example of this would be speech recognition. We think of words as combinations of letters, but the way we actually speak is in combinations of what are called phonemes. And so by modeling the speech at that level, you can actually significantly improve your results. But everything up here, we'll say, is roughly the state of the art as of the year 2000 or so.
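The dimensionality-reduction idea above can be sketched in a few lines of NumPy – this is PCA done by hand via the SVD. The five-dimensional dataset here is synthetic, built so that it really only varies along two directions:

```python
import numpy as np

# 100 samples of 5-dimensional data that mostly varies along 2 directions
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
data = latent @ mixing + rng.normal(scale=0.01, size=(100, 5))

# PCA by hand: center the data, then take the top singular vectors
centered = data - data.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ Vt[:2].T  # collapse 5 dimensions down to 2

# The first two components should capture almost all of the variance
explained = (S[:2] ** 2).sum() / (S ** 2).sum()
print(reduced.shape, round(explained, 3))
```

The singular values tell you how much structure each direction carries, which is how you decide how far you can safely collapse the data down.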
Today we're going to talk a lot about neural networks. This is an area that's become very popular in the last few years. Conceptually, the very basic form of a neural network is just a mathematical function. It takes a weight times an input, and then adds a bias variable. Quite literally, this becomes ax + b, if you look at how it's implemented mathematically. But what's interesting is that by combining many of these individual neurons together with activation functions, we can actually start to approximate arbitrarily complicated functions. Many of the traditional machine learning approaches try to reduce things down to fewer dimensions, and that's where they hit their limitations; this sort of goes in the opposite direction. The problem, though, is that neural networks are really expensive to build and train. So historically they have not been used as much.
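The weight-times-input-plus-bias idea can be written out directly. Here's a hypothetical two-neuron layer with a ReLU activation – the numbers are arbitrary, just to show the shape of the math:

```python
import numpy as np

def layer(x, W, b):
    # Each neuron: weight times input, plus bias, then an activation.
    # Stacking neurons turns w into a matrix W; ReLU is the activation.
    return np.maximum(0.0, W @ x + b)

x = np.array([1.0, 2.0])          # a tiny input
W = np.array([[0.5, -1.0],        # weights for neuron 1
              [2.0,  0.5]])       # weights for neuron 2
b = np.array([0.1, -0.1])         # one bias per neuron

out = layer(x, W, b)
print(out)
```

Neuron 1 computes 0.5·1 − 1.0·2 + 0.1 = −1.4, which ReLU clamps to 0; neuron 2 computes 2.0·1 + 0.5·2 − 0.1 = 2.9. A deep network is just many of these layers chained together.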
Then we get into deep learning. This is a field that's blown up in the last few years. I don't really think there's one thing in particular that happened to cause all this; I sort of think it was all of these things together. Computing power has become very cheap. Multicore CPUs have become really popular, and now we're using GPUs as co-processors, so we're able to do lots and lots of math on demand very cheaply.
Then there is this whole big data thing. Collecting large datasets used to be difficult. If you had a thousand data samples, or even a million, that was considered a lot. Now there are datasets with billions or even trillions of data points in them. Sensors are something else, we'll say – a lot of work has gone into improving sensors and things like that, so we're able to access all sorts of data nowadays that historically would have been very expensive to gather, but now is very easy.
The other part of this is that we can handle data at much higher resolutions now. You used to have to model audio at a very low resolution, whereas modern computers can basically model it at a fidelity beyond what humans can even perceive. Another big thing that's happened is that we've become really good at building large-scale software. It used to be difficult to build things larger than, say, one computer, but now with Google Cloud and various other technologies, it's easy to spin up a cluster to run these sorts of things.
Then finally, I think the big thing that's happened is that commercial applications of neural networks, and of machine learning in general, have started to appear. What this means is that while the goalposts are oftentimes a little bit simpler nowadays, there are real-world payoffs. Take search, for example – now every time Google can make a slightly better language model, they can improve their search engine. So as a result, they're putting lots and lots of resources into building bigger and better neural networks.
The other thing I'll say is that there is this essay on the internet called The Bitter Lesson. What it says, basically, is that the last seventy or so years of AI research has been people trying to build more and more complicated models. The fundamental recognition behind this whole big data and deep learning thing is that simple models scaled up large will usually do way better than complicated approaches. This is a point we'll come back to again here in a second.
Convolutional neural networks
So now let's look at some examples of actual real-world neural networks. Convolutional neural networks are historically one of the most well understood, and the best way to really introduce this field, I think. Conceptually, we have some different networks here. The perceptron – this is the actual invention of neural networks, from 1957, so this field is not quite as new as you might think. With the feed forward network, we have a layer of neurons in between our input and our output. And if you look at this deep feed forward network, you can see that we have two layers of neurons between our input and our output.
This sort of approach is really common in this field and you’ll see it all over, so I think this basic idea is important to understand. If we take this Deep Feed Forward network, we can then simply add convolutions on top. And so we take our input picture, run things through a set of convolutions, and then we use our two layers of neural network, hidden layers at the end, in order to actually make sense of what the network sees.
This is the VGG image recognition network, from 2014. I think conceptually it's reasonably easy to understand. We take an input way over here on the left side, we map it in, run some convolutions, and then squash it down a little bit. We run some more convolutions and squash it down again. More convolutions, squash. More convolutions, squash. Then finally at the end, we run it through a couple of layers of densely connected nodes – or fully connected layers, as they're called – and then we can output our prediction for whatever we're looking at, like a cat or a dog in this example. This network looks pretty simple, but this was actually the state of the art in the field as of about five years ago. So I think it's a really good way to conceptually see where things have come from.
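The convolve-then-squash pattern can be sketched without any framework at all. Here's a naive 2D convolution followed by 2x2 max pooling, with a made-up edge-detector kernel – real networks learn their kernels during training rather than hand-coding them:

```python
import numpy as np

def convolve2d(image, kernel):
    # "Valid" 2D convolution: slide the kernel over the image,
    # taking a weighted sum at each position
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

def max_pool(fm, size=2):
    # "Squash it down": keep only the biggest value in each 2x2 block
    h, w = fm.shape
    h, w = h - h % size, w - w % size
    return fm[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # a fake 6x6 "picture"
edge_kernel = np.array([[1.0, -1.0],
                        [1.0, -1.0]])              # crude vertical-edge detector

features = max_pool(convolve2d(image, edge_kernel))
print(features.shape)  # 6x6 image -> 5x5 feature map -> 2x2 after pooling
```

A network like VGG just repeats this convolve-and-pool pattern many times, with many learned kernels per layer, before handing the result to the fully connected layers.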
Recurrent neural networks
The other big area I think you need to know about is what's called recurrent neural networks. Whenever people draw these, they usually use this rolled form over here, input mapping to an output. I really don't like this way of presenting it, because I feel like it makes things much more complicated than they really should be. So whenever I think about recurrent neural networks, I really like to use this unrolled form. We have an input mapping to an output, a second input mapping to a second output, a third input mapping to a third output, and so on and so forth. But then you can see, with this set of red lines on the right side, that each step's input is not only its own input, but also a little bit of the previous step's output, and so on and so forth.
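The unrolled form can be written out directly – each step takes its own input plus the previous hidden state. The sizes and random weights here are arbitrary, just to show the recurrence:

```python
import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(4, 3))  # current input -> hidden state
Wh = rng.normal(scale=0.1, size=(4, 4))  # previous hidden -> hidden state
b = np.zeros(4)

def rnn(inputs):
    # The unrolled form: each step sees its own input plus the
    # previous step's hidden state (the "red lines" in the diagram)
    h = np.zeros(4)
    outputs = []
    for x in inputs:
        h = np.tanh(Wx @ x + Wh @ h + b)
        outputs.append(h)
    return outputs

sequence = [rng.normal(size=3) for _ in range(3)]  # three made-up inputs
outs = rnn(sequence)
print(len(outs), outs[0].shape)
```

That single line inside the loop – current input mixed with the previous hidden state – is the entire idea; everything from seq2seq onward elaborates on it.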
I think if you take this idea and look at the seq2seq paper, which came out in 2014, you can see the same idea. We have an input – A, B, C – and we're trying to get an output – X, Y, Z, we'll say. What they do in this paper is feed the output sequence back in as a second set of inputs. The idea then is that we can take a sentence like, “I am a student,” and map it to the output in Spanish, “Yo soy estudiante.” By training on pairs like these, we can show it a new sentence, “I am a teacher,” and it will be able to correctly predict the Spanish version of that sentence. So this is a really important paper in this field, and I think it's a really good jumping-off place for recurrent neural networks in general.
DETR is a paper that came out of Facebook last month. I don't think you really need to understand it per se, but the reason I threw it up here is that you can see a lot of modern networks use all these ideas together. If you look over here on the upper left, what they're literally using is a convolutional neural network, just like the VGG network we looked at before, to look at the image initially. Then if you look over here at these encoder/decoder steps, they're using a transformer, which is like a fancy version of our RNN, to actually figure out what the boxes should be for this object detection network.
So by combining these convolutional techniques with these recurrent techniques, they’re able to produce a state of the art object detection network. We might think of this as combining different pieces of neural networks together to make a larger neural network.
Generative adversarial networks
If we can imagine combining pieces of neural networks together, then we can start to imagine combining multiple neural networks together. So generative adversarial networks, or GANs as they're often called, date from 2014 – another technique that's about five years old. The picture I'm demoing here is StyleGAN, which is a couple of years old now at this point. Conceptually, what you're looking at is the computer hallucinating what it thinks a face will look like. What we're looking at here is the training loss of a GAN network as it trains. The key idea I want you to get out of this is this little star down here, where we see the orange network and the blue network sort of switch places. Conceptually, one network learns, and after a while it reaches a stable state, and then the second network can start to learn from it. That's what we're seeing right here, where this orange part overlaps this blue part by the star.
The reason I think this is important is that we can then start to look at full-blown reinforcement learning papers. So this is the AlphaZero engine from 2018. Conceptually, if you look over here on the left, it's just a really large convolutional neural network, much like the ones we looked at before. The key concept, if you look over here on the right, is this idea where the star is at. This network is actually not that clever, we'll say. Whenever it tries to learn the game of Go, it actually doesn't do that good of a job. It's much simpler than the other approach, which is this purple line over here. But because this approach is simpler, it also scales better. So ultimately – we'll say 20 hours or a day into training – it's able to surpass the previous state of the art, which would be the AlphaGo Lee engine. Ultimately, it's able to learn to play Go at a level beyond what any human player has ever reached. But conceptually, it's just a lot of reinforcement learning – neural networks battling each other – based around a CNN-style architecture.
Then we can also look at the AlphaStar paper, which came out last year. This is where they trained a bot to play the game of StarCraft. To me, if you look at this network over here, while it looks pretty complicated – if you look by the star here, at its core it's a recurrent neural network with a bunch of fancy inputs and outputs. Conceptually, it's just a large-scale version of the recurrent neural network that we looked at before. If you look at the upper right over here, which shows the training process of this network, each of these lines is a different bot that learns a strategy and then uses it to beat the other players, so to speak. Then over time, bots learn strategies to beat the bots that learned the earlier strategies, so we ultimately get this set of neural network agents that are able to play the game at a level that's competitive with humans. That's what the progression of this chart is showing.
Artificial general intelligence
Which then leads us to this question of artificial general intelligence, or AGI as it's sometimes called – is the key to developing AI just having lots and lots of computing power? I think this is really an interesting question, and one that's going to be really interesting to explore in the upcoming decade. The basic question is almost existential: are humans special? We have this ability to think and whatnot, but is that something innate to us? Or is it simply an emergent behavior from having a certain amount of computational density? Having studied a little bit of physics and biology, I would say that mother nature does not give up her secrets easily, to quote a famous scientist. So I don't know that we'll necessarily figure this out any time soon. But conceptually, if you think AGI is possible, then by extension, I think it becomes the most important question of our time. Honestly, I'm on the fence on all this, we'll say. But I will say that nothing I've shown you today was possible ten years ago. So by extension, I don't think people can really predict even ten years ahead in this field right now. To me, this upcoming decade is going to be really interesting for the world of machine learning.
We’ll take a quick breath here. Hopefully, I’ve shown you some different areas in machine learning in what’s going on right now, but the flip side of this is I don’t want you to think that all of this is out of your reach. So to quote Douglas Adams, “Don’t panic!”
Anybody can do this stuff. You don’t need a Ph.D. It’s just a very fancy form of programming, but conceptually, it’s accessible to all of you. The second level I would say is that you can learn the basics of this field for free on your own. You don’t need a fancy computer or any of this stuff. You just need a little bit of time. Then broadly, my advice is always to people to just focus on the fundamentals and get the basics down, and then you can slowly add complexity. The other thing I would say would be to follow the herd – there’s a lot of research and stuff going on in this field, and so I think if you just try to keep up with what’s going on, you’ll learn a lot that way. A lot of people try to forge ahead or figure out new things, but I think, to be honest, it’s a much better approach to just try and look at what other people are doing, and see if you can figure out ways to apply it to your own stuff.
If you're interested in learning machine learning, I think you basically need to do three things. You need to pick a framework, which I think should probably be narrowed down to TensorFlow or PyTorch – we'll talk more about both here in a second. You need to pick a tool, which is just a place to actually run your code: Google Colab, which we'll show you here in a bit, or Google Cloud, which we'll look at as well. Then finally, you can also run it on your own machine if you know how to build one. And I think you need to pick a teacher, or an approach, or a learning process that works for you.
So, TensorFlow – maybe you're well aware of it – is a pretty well-understood library in this space, from Google. There are a bunch of different versions of TensorFlow 1, up through TensorFlow 1.15. About a year ago, they came out with TensorFlow 2, and I think in the last six months or so, it's become pretty solid and well hammered on. So my advice to you would be to just start with TensorFlow 2.2, which came out a couple of months ago, and use it with Python 3.
In general, though, I would advise you not to use raw TensorFlow, we'll say. I think using Keras, the high-level API that's part of TensorFlow, is really the best way to use it. What I have over here is a picture of a simple neural network that you could build with Keras and then run on your computer. There are a number of different courses on this; Coursera and Andrew Ng have this whole DeepLearning.AI thing, and I think that's a really solid way to learn TensorFlow. They have a lot of videos, they have little notebooks you can look at, and there are a lot of other people taking the course, so you can ask questions on the internet. In general, I think this is a solid approach if you're wanting to go with TensorFlow.
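As a rough sketch of what such a Keras model looks like – the layer sizes here are illustrative (a small dense network for MNIST-style 28x28 images), not taken from the slide:

```python
import tensorflow as tf

# A small fully connected classifier built with the Keras Sequential API
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),                   # 28x28 grayscale image
    tf.keras.layers.Flatten(),                        # -> 784 values
    tf.keras.layers.Dense(128, activation="relu"),    # hidden layer
    tf.keras.layers.Dropout(0.2),                     # regularization
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 digit classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

From here, training is just `model.fit(x_train, y_train)` on whatever dataset you load – the layer list is the whole architecture definition.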
And now Google has this program where they're starting to offer certificates. So basically, after doing the course, you can answer a bunch of questions, and then they'll give you this little thing you can print out, if that's your style. I would label this approach as a little bit more academic, we'll say. They explain the math and stuff like that, but I think it's all very accessible.
PyTorch is the other big machine learning library I think you should be aware of. Over here, we have a different approach – how PyTorch would do a neural network similar to the one we looked at on the last slide. I think the best way to get going with PyTorch is Jeremy Howard and his fast.ai series, though it's a little more seat-of-your-pants, perhaps. PyTorch in general is still maturing, but on the flip side, if you like that sort of ad hoc approach, or getting your hands dirty, then I think this is a really valuable tool to have in your toolbox as well.
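For comparison, roughly the same kind of small network in PyTorch's style – a subclass of `nn.Module` with an explicit forward pass; again, the sizes are illustrative rather than taken from the slide:

```python
import torch
from torch import nn

class SimpleNet(nn.Module):
    # The PyTorch idiom: define layers in __init__, wire them up in forward
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),             # 28x28 image -> 784 values
            nn.Linear(28 * 28, 128),  # hidden layer
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 10),       # 10 output classes
        )

    def forward(self, x):
        return self.layers(x)

net = SimpleNet()
logits = net(torch.zeros(1, 28, 28))  # run one fake 28x28 image through
print(logits.shape)
```

The contrast with Keras is the point: PyTorch asks you to write the forward pass yourself, which is more hands-on but also more flexible.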
Then you get into the frontier. There are a bunch of other frameworks in this space, but for me, it's hard to recommend that you start with one of those if you haven't mastered one of these other ones first. Amazon has a framework called MXNet, and there are various other libraries out there, but I think in general, TensorFlow and PyTorch are going to be your best bet.
There are a couple of interesting projects from Google that I think you should have on your radar. The core of TensorFlow – the way it actually runs math – is this library called XLA. So Google has this project called JAX, which is a NumPy-to-XLA bridge, if you're familiar with NumPy. This basically lets you write reasonably simple, math-looking code, and then run it on GPUs very easily. So if you like to get down to the raw math, this is a really good trick to have in your toolbox.
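A small taste of what JAX code looks like – NumPy-style math that JAX can both compile through XLA and differentiate automatically. The quadratic here is just a toy function:

```python
import jax
import jax.numpy as jnp

def loss(w):
    # A simple quadratic: (w - 3)^2, minimized at w = 3
    return (w - 3.0) ** 2

grad_loss = jax.grad(loss)  # automatic differentiation
fast_loss = jax.jit(loss)   # XLA compilation

print(float(fast_loss(0.0)), float(grad_loss(0.0)))  # 9.0 and -6.0
```

Chaining `jax.grad` with a small gradient-descent loop is essentially how you train models in JAX, all while writing what looks like plain NumPy.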
The other interesting project that's out there is Swift for TensorFlow. This builds on XLA as well. Because differentiation is built into the compiler, you get automatic differentiation, and you get type-safety checking and a functional-style programming approach to make the process of building these networks a little bit simpler. This is where I've spent a lot of my time over the last year – I've been working on a book on the subject. If you're interested in learning more, you can go to this website and sign up, and I'll send an email out whenever the book gets closer to publication. Hopefully, it will be out later this year.
Google Cloud demos
Okay, so let's do some demos on Google Cloud. I think these are the five basic Google Cloud techniques you should be aware of; they're all kind of related, and you should be able to find a way to use one of them on your own. We'll look at Google Colab proper, which is just their way of running notebooks in the cloud. We'll look at Kubeflow, which is kind of a new project to make it easier for you to run notebooks yourself. We'll look at the Google Cloud Vision tools and their REST API, then we'll look at the prebuilt Deep Learning VM images, and then finally at building your own custom virtual machines.
Here we go. So this is Google Colab. Hopefully, it will load here in a second. What's nice about this is that it's just running Python up in the cloud. You can basically go in here, copy/paste code from the internet, hit the run button, and it will run for you. They have this whole runtime menu right here, so you can click this button, and now you can get access to GPUs and TPUs, even for free.
What's nice is that Google is giving us access to a GPU in the cloud for free. So you can actually do a whole bunch of stuff just using these simple Colab notebooks without spending any money. For example, you can go to the Keras documentation, we'll say – you can just copy/paste things off the internet and hit the run button… and we'll wait a few seconds for it to get going. But, here we go. Now we're doing machine learning in the cloud, and all we really have to know how to do is click around a web browser a little bit.
Kubeflow – this is an open-source project, we'll say – is basically a way to run these sorts of notebooks on top of Kubernetes, which is a platform for building cloud stuff. You can download this and run it on your own machine if you want. Or, Google has it inside their Google Cloud platform as part of their AI Platform. So basically you can go on here and create new little bitty notebooks like this – you'd go here and create one with a T4 GPU, and it will pop up this thing. What's nice is that all the logic of bundling this stuff together has been figured out for you. You just hit the create button and you'd have a new cloud instance. I did that a little while back, so it's running here on the cloud. Then we can just open it up, and just like before, we can copy/paste our code here… and now we can run a demo. Just like before, when we were using Google Colab, except now we're using a T4, which is a little bit faster. We'll give it another second or two to get going, and voilà – now we're doing machine learning using the AI notebook functionality in Google Cloud.
Then we get into building your own virtual machines to solve these problems yourself. You can type in Deep Learning, and you can get this Deep Learning VM. What's nice about this is that it has the same kind of pre-configured stacks as the notebooks we were looking at before. So if you can get your code working in a Jupyter notebook, then you can use this sort of preconfigured framework to get things running in the cloud very easily. Here are literally all the same options – and what I also like is that if you have code that's tied to a specific version of CUDA or a specific version of TensorFlow, with a few clicks, you can get a base install working with this right here.
Then finally, you can actually build your own machines to do your AI stuff on your own. So I’ve preconfigured a machine right here, and it’s going to be running Swift in the Cloud. Then I’ve gone over here and I’ve created a Cloud TPU to work with it. So we have a TPU running, and then we have a virtual machine. Then we can go here and run our same MNIST demo – we’ll give it a second to catch up… But now we’re running our MNIST demo that we were looking at before, but we’re using Swift in order to run it on a TPU in the cloud. So that’s one of the big reasons why I’m excited about all of this. We’ll let it run for a while, and it’ll get up to 98% accuracy.
So to recap – we've looked at what machine learning is, we've looked at deep learning and some different variants of neural networks, I've shown you some tools and different ways to get going, and I've shown you some different cloud-based approaches that you can utilize today if you want. And with Google Cloud, if you sign up, you get $300 in free credits, which can really go a long way if you're careful with how you spend it. With that, I'll say thank you all for listening, and I'm happy to try to answer some questions.
Nikiya: Thanks, Brett. This was a really great presentation. Thank you so much for putting everything together. I had a question about GCP and the Notebook instances. Did they give you a cost estimate for running each one of those Notebook instances before you run it or do you get an idea of how much it costs to run each one?
That's a good question. If you go into the VM creation tool, it will tell you how much an instance costs to run per hour, so that's a good way to know. In general, I've had a lot of luck using these T4 instances. It's a new GPU that came out last year. Google will charge you about 30 cents an hour to use one of those, and I think the virtual machine will be like 40 cents an hour altogether, which I think is pretty reasonable if you're wanting to get started.
Nikiya: Cool, thank you so much for that. There are some more questions in the channel so let’s sort of just check in and see… Can you combine this with Node/Express?
Yeah. Basically, you would add the Google Cloud AI libraries to your Node project, and then say you can take images, basically, you would call out to a server, the answer would come back, and then you can add AI to your app pretty easily. The only thing to be aware of is some of these services have a certain cost you should be aware of, but yeah that’s exactly how it works.
Ivan: Can you run PyTorch code on Google Colab?
You can run PyTorch code on Google Colab. The way you would do that, basically, is to open a command-line cell and do a pip install of PyTorch. If you look around on the internet, people have made pre-configured notebooks for Google Colab, so you can very much copy/paste some Python initialization code into Colab and make it work with different networks.
[Shyka? 02:33] asked if there is a built-in network for detecting inappropriate pictures.
Google has an AI service that will do that, and people will also sometimes build their own. But I would usually advise you to use an API or something like this to get started, just so you can proof-of-concept your code and make sure you understand how it's all going to work together before you start trying to do something custom.
[Manoman? 03:03] asked – does it come with an API endpoint where you can use an Auth header or an API key?
Yeah, that's exactly how it works. I didn't show you the spot, but you would have to set up your API key, so that you would have access to the servers – and by extension, they would know where to send the bill, as well.
Nikiya: There’s some more questions. Is there any book that you prefer? Like for introduction to machine learning, or is there a specific book?
There are some books – like the book on deep learning by Ian Goodfellow, which is pretty well known – but a lot of those books get pretty technical in a hurry, so it depends on whether that's the style you like. As for me personally, I have not had a lot of luck with that style; I really like internet videos. That's more my approach, and that's how I've done a lot of this stuff.
But certainly there are a lot of books out there – O'Reilly and Apress both have many different books on getting started with TensorFlow and PyTorch as well.
Which do you think is better for a newbie? TensorFlow or PyTorch?
I like both frameworks; I think it very much depends on what your style is. So I would advise you to do one lesson from the DeepLearning.AI thing and see if you like that style, then do one lesson from the fast.ai sequence, and see which particular style works for you.
[Tenvesh? 05:01] asks if you should do Google’s ML crash course.
The ML crash course is based around TensorFlow 1, but it's my understanding that they're going to update it shortly for TensorFlow 2. Whenever that comes out, I would definitely advise you to check it out as well.
Billy asked, what tangible applications do you see machine learning accomplishing in the near future?
We see a lot of stuff with computer vision. Computer vision has been out for a while, and we're able to run it on devices in the field now, so I think computer vision is really big. And then I think NLP is really starting to come into its own, so I think that's the other thing that's going to be really big in the next few years.
Chris: do you recommend any certifications?
I’m not big on certifications. I just like messing around with things on my own and historically that’s been my approach. So yeah, I don’t have any advice there.
Nikiya: Any other questions anybody has before we end the meeting? What are your thoughts on Tensorflow.js?
I think it's really cool to do all this stuff in the browser – this ability to just run stuff on device. I think it's a little wonky right now; there's still some stuff to be figured out there. But the idea of just adding machine learning to your random web app is certainly something that's really cool. I've done a few demos around it, and I think it's going to be big pretty soon.
What math would you recommend people study?
For math topics, I think in general you should study some statistics. A lot of people don't study as much statistics as I think they should. So, for example, learn some basic games like craps and roulette, and really understand what each bet corresponds to and what the odds are of winning. Then I think linear algebra is always valuable to understand. I don't think you need to go full-blown into Jacobians and stuff like that, but the basics of linear algebra, and being able to multiply matrices, are really good tools to have in your toolbox as well.
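To make the roulette point concrete, here's the expected value of two standard American roulette bets worked out in a few lines of Python – both bets lose about 5.3 cents per dollar wagered, on average:

```python
def expected_value(payout, ways_to_win, pockets=38):
    # American roulette wheel: 38 pockets (1-36, 0, and 00).
    # Win the payout with probability ways_to_win / pockets,
    # otherwise lose the $1 stake.
    p_win = ways_to_win / pockets
    return payout * p_win - 1 * (1 - p_win)

straight_up = expected_value(payout=35, ways_to_win=1)   # single number, pays 35:1
red = expected_value(payout=1, ways_to_win=18)           # red, pays 1:1

print(round(straight_up, 4), round(red, 4))  # both about -0.0526
```

The punchline is that every bet on the table has the same negative expectation – exactly the kind of reasoning-from-probabilities that carries over to machine learning.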
With that, I think I’ll call it a day. Thank you all for your time.
Nikiya: We appreciate you and the presentation and answering all of our questions. Thank you again.
Hannah: Yeah, thanks for inviting us and reaching out to Brett. It was definitely a fun time. And again, Brett, thank you for presenting this for all of us, and everyone else for joining. Super cool. Also, if you don't know Danny Thompson – he's the lead for GDG Memphis, so you can go check out his meetup group. He usually has some really good talks very frequently, so I highly recommend him. Anyway, thanks again, everybody – it was nice to see you all.