The definition of the expectation in (3.9) assumes summation over all possible values of x (or an integral over all possible x values in the case of a continuous variable). That is only possible if we know the exact form of p(x), which is not our case. What we have is a sample of m points, so we can only approximate the expected value by defining an estimator (see section 5.4 of the Deep Learning book for more details on estimators). An estimator is not guaranteed to return the true value of the expectation (that's why it is called an estimator), but we can usually achieve arbitrary precision by increasing the sample size, as stated by the law of large numbers (Wikipedia provides a pretty good explanation: Law of large numbers).
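To make this concrete, here is a minimal NumPy sketch of the sample-mean estimator converging toward a known expectation (I picked Uniform(0, 1), whose true mean 0.5 we can check against; the distribution and seed are my own choices, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 0.5  # exact expectation of Uniform(0, 1)

# Sample-mean estimator: the average of m draws approximates E[x],
# and by the law of large numbers the error typically shrinks as m grows.
for m in (10, 1_000, 100_000):
    sample = rng.uniform(0.0, 1.0, size=m)
    estimate = sample.mean()
    print(f"m = {m:>7}: estimate = {estimate:.5f}, error = {abs(estimate - true_mean):.5f}")
```

The estimate for any single m is still random, so the error is not guaranteed to decrease monotonically run-to-run; only the typical error shrinks with m.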
As I've said, the main weakness of the GBC book (Goodfellow, Bengio, Courville) is the lack of practical programming exercises, which are left to a companion web site.
So, I dream of the same content in the form of a series of IPython Notebooks with all exercises and code samples using Keras, TensorFlow and Theano.
Recently, I made a modest contribution to my dream by coding section 6.1, Example: Learning XOR, pp. 166 to 171, using TensorFlow (The revenge of Perceptron! — Learning XOR with TensorFlow). OK, I know I should have used Theano, which is mainly developed and maintained by the MILA lab at UdeM… Next time!
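For anyone who wants to check the result without installing TensorFlow: section 6.1 of the book also gives a closed-form solution to XOR with a single ReLU hidden layer, and those weights can be verified in a few lines of NumPy (this is just the book's hand-picked solution, not the trained network):

```python
import numpy as np

# Closed-form XOR solution from section 6.1 of the book:
# f(x) = w^T max(0, W^T x + c) + b
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # the four XOR inputs
W = np.array([[1, 1], [1, 1]])  # hidden-layer weights
c = np.array([0, -1])           # hidden-layer biases
w = np.array([1, -2])           # output weights
b = 0                           # output bias

h = np.maximum(0, X @ W + c)    # ReLU hidden layer
y = h @ w + b
print(y)  # [0 1 1 0] — exactly XOR
```

The second hidden unit only activates on input (1, 1), and its weight of -2 cancels the first unit's contribution there, which is what makes the non-linearly-separable XOR function representable.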
I have a background in AI, ML and particularly NLP. Below is my general review of the GBC book as published on Amazon.com.
The GBC (Goodfellow, Bengio, Courville) book is worth reading. It's definitely THE authoritative reference on Deep Learning, but you should not be allergic to maths. That said, reinforcement learning is only superficially covered and deserves an additional chapter. [Note]
The main weakness of this masterpiece is the lack of practical programming exercises, which are left to a companion web site. But to cover all the practical stuff, the book would have had to exceed the 775 pages it already has.
I dream of the same content in the form of a series of IPython Notebooks with all exercises and code samples using Keras, TensorFlow and Theano.
[Note] To be completely honest, the authors wrote a short disclaimer about reinforcement learning in chapter 5, «Machine Learning Basics», page 103: «Such algorithms are beyond the scope of this book».
Could someone explain what the last sentence on page 342 means:
"In linear algebra notation, we index into arrays using 1 for the first entry"
I'm rather confused. Say I want to find Z_{1,1,1}, and m = n = 0 (i.e. I'm trying to find the contribution of the entry in the first row and first column of the input); doesn't this yield V_{l,0,0} K_{1,l,0,0} as the first term in the sum? Given that rows and columns start from index 1, what do the 0's in the index of V mean?
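To make my confusion concrete, here is how I would translate the book's eq. 9.7, Z_{i,j,k} = sum over l, m, n of V_{l, j+m-1, k+n-1} K_{i,l,m,n}, into NumPy. In 0-indexed code the "-1" disappears, which (if I read it right) is exactly the role of that term: with 1-based indices the sum over m, n starts at 1, so j+m-1 starts at j and the 0 indices never actually occur. A literal-loop sketch (shapes and names are my own):

```python
import numpy as np

def conv_valid(V, K):
    """Valid convolution per eq. 9.7, written 0-indexed.

    V: input, shape (channels_in, H, W)
    K: kernel, shape (channels_out, channels_in, kH, kW)
    """
    c_out, c_in, kh, kw = K.shape
    _, H, W = V.shape
    Z = np.zeros((c_out, H - kh + 1, W - kw + 1))
    for i in range(c_out):
        for j in range(H - kh + 1):
            for k in range(W - kw + 1):
                for l in range(c_in):
                    for m in range(kh):
                        for n in range(kw):
                            # 1-indexed V[l, j+m-1, k+n-1] becomes V[l, j+m, k+n]
                            Z[i, j, k] += V[l, j + m, k + n] * K[i, l, m, n]
    return Z

# One channel in/out, 3x3 input, 2x2 all-ones kernel:
V = np.arange(9.0).reshape(1, 3, 3)
K = np.ones((1, 1, 2, 2))
print(conv_valid(V, K))  # each output is the sum of a 2x2 window
```

So with everything 1-indexed, the first term of Z_{1,1,1} uses m = n = 1, giving V_{l,1,1}, not V_{l,0,0} — is that the right reading?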
Hello folks, I am Gourav Agrwal, a 20-year-old final-year student at the Indian Institute of Technology, Kanpur, majoring in Mechanical Engineering. I started ML two years ago with a basic workshop on Introduction to ML. After that I joined Andrew Ng's course on Coursera, then pursued courses like Bayesian Machine Learning and Probabilistic Machine Learning at my college. I love probability, stats and linear algebra. I have read a few research papers on mixture models and non-parametric kernel learning, and I love to discuss machine learning algorithms at length.