Featured Contributions

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 1y

Kirk, I'm blown away by the range of your expertise and experience. Could you describe for us the one or two projects you found most interesting over your career? Thanks!

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

Nikhil, I had some feedback on your post. I loved everything up to the section *Learning Rates and the Delta Rule*. Starting in this section, I think the distinction between calculating the gradient and calculating the delta could be made clearer. For example, the delta rule is always the following:

$$\Delta w_k = -\epsilon \frac{\partial E}{\partial w_k}$$

What changes is the derivative of E w.r.t. the weight, which depends on the architecture of the neural network and the type of neurons in it.

Also, I think it's important to note that the gradients can be calculated automatically in any standard deep learning library. It's important to understand what is happening in back-propagation, but the focus isn't on the algebra.

Again, all this being said, I still think that overa...
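
To make the gradient/delta distinction concrete, here is a minimal sketch in plain Python. It assumes a single linear neuron with squared error E = 0.5 (t - y)^2, and the function names are made up for illustration; none of this comes from the original post. The update step is the same regardless of the model (with the sign chosen so the step goes downhill), and only the gradient function depends on the neuron type. The finite-difference check is just a stand-in for the gradient a deep learning library would compute automatically.

```python
# Hypothetical helper names; a sketch, not the post's implementation.

def predict(weights, inputs):
    # Linear neuron: weighted sum of the inputs.
    return sum(w * x for w, x in zip(weights, inputs))

def error(weights, inputs, target):
    # Squared error for a single training example.
    return 0.5 * (target - predict(weights, inputs)) ** 2

def analytic_gradient(weights, inputs, target):
    # For a linear neuron, dE/dw_k = -(t - y) * x_k.
    y = predict(weights, inputs)
    return [-(target - y) * x for x in inputs]

def numeric_gradient(weights, inputs, target, h=1e-6):
    # Central differences, used here only to check the algebra.
    grads = []
    for k in range(len(weights)):
        up = list(weights)
        up[k] += h
        down = list(weights)
        down[k] -= h
        grads.append((error(up, inputs, target) - error(down, inputs, target)) / (2 * h))
    return grads

def delta_rule_step(weights, grads, epsilon):
    # The delta rule itself: step each weight against its gradient.
    return [w - epsilon * g for w, g in zip(weights, grads)]

weights = [0.0, 0.0]
inputs, target = [1.0, 2.0], 3.0
for _ in range(50):
    grads = analytic_gradient(weights, inputs, target)
    weights = delta_rule_step(weights, grads, epsilon=0.1)
```

Swapping in a different neuron type would change only `analytic_gradient`; `delta_rule_step` stays as it is.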

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

I found it helpful while reading the tutorial to keep recapping, and reminding myself what the big picture is.

- A *neuron* takes multiple inputs and outputs a single value. There are many different types of neurons. A linear neuron outputs a weighted sum; a sigmoid neuron applies the logistic function to its weighted sum.
- A *neural network* is formed of many neurons, where the outputs of some neurons serve as the input for other neurons. If there are no cycles in this, then the neural network is called a feed-forward neural network.
- The process of *training* a neural network is how we learn the values of the weights and biases associated with each neuron. These weights and biases for all the neurons combined are also called the *parameters* of the neural network.
- Since there is no closed-form solution for finding good values for the weights and biases, we use an *iterative* process called *gradient descent*. We start with rand...
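
The first two points of the recap can be sketched in a few lines of plain Python. This is only an illustration: the function names, the tiny two-neuron hidden layer, and the parameter values are all made up here, not taken from the tutorial.

```python
import math

def linear_neuron(weights, bias, inputs):
    # Weighted sum of the inputs plus a bias.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def sigmoid_neuron(weights, bias, inputs):
    # Logistic function applied to the same weighted sum.
    z = linear_neuron(weights, bias, inputs)
    return 1.0 / (1.0 + math.exp(-z))

def feed_forward(hidden_layer, output_neuron, inputs):
    # Outputs of the hidden neurons become the inputs of the output neuron.
    # There are no cycles, so this is a feed-forward network.
    hidden = [sigmoid_neuron(w, b, inputs) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return linear_neuron(w_out, b_out, hidden)

# The parameters of the network: (weights, bias) for every neuron.
hidden_layer = [([1.0, -1.0], 0.0), ([0.5, 0.5], -1.0)]
output_neuron = ([2.0, -3.0], 0.5)
y = feed_forward(hidden_layer, output_neuron, [1.0, 1.0])
```

Training would then mean adjusting the numbers in `hidden_layer` and `output_neuron`, which is where gradient descent comes in.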

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

Adit, this series of CNN posts is truly amazing. I already knew quite a bit about CNNs but I decided to read them anyway. Thought I would leave a review and some notes on each one. :)

I was blown away by this part, learnt so much! Again, you covered a lot in this post. I knew about most of the content in part 1 and part 2 previously, but all of this was entirely new to me. Thanks for this comprehensive literature review of CNNs in the years following their resurgence.

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

This, just like your previous article on CNNs, is very well done. I felt a bit saddened when I saw that pooling, ReLU activations and stride length were not covered in Part 1; I thought you had skipped them. But Part 1 plus the first half of this post is definitely a complete introduction to CNNs.

I also like the sections you chose to cover in the second half of this tutorial: dropout layers, other related tasks like localization and detection, transfer learning, and data augmentation. Very comprehensive, very useful.

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

This post is amazing! I **loved** this tutorial. What I like about it the most is how every section has realistic examples. It makes everything so concrete and easy to follow.

- What we see vs what the computer gets as input
- The convolution operation
- How a convolution operation acts as a filter
- The overall architecture of the CNN

It's really nice! Most tutorials I have come across before this add a lot of equations and algebra, and don't use enough examples.

The only flaw, in my opinion, would be that back-propagation is simply too tough to cover in the limited amount of space available as a sub-section of a post focusing mostly on CNNs. Given that ...
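
As a small illustration of how a convolution operation acts as a filter, here is a sketch in plain Python. The image, the kernel, and the function name are invented for this example and are not from the tutorial. It assumes stride 1, no padding, and no kernel flipping (strictly a cross-correlation, which is what CNN tutorials usually call convolution).

```python
def convolve2d(image, kernel):
    # Slide the kernel over the image; at each position, sum the
    # elementwise products of the kernel and the window under it.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# What the computer gets as input: a grid of numbers.
# Here, a dark left half (0) and a bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A hypothetical filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

response = convolve2d(image, vertical_edge)
# response == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The filter's output is non-zero only in the middle column, which is exactly where the window straddles the edge.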

Kavita Rawat · Machine Learning and Deep Learning enthusiast · 2y

This is a really nice post Noah, thanks for writing it. I especially like how you separated the main idea behind machine learning from its mathematical implementation. Although mathematics is necessary for implementing machine learning models, it's definitely not a requirement for understanding the central idea behind it.

Apart from that, I'd also like to suggest a correction / clarification for those reading this post. *Simple linear regression* is linear regression with exactly one feature per data point. In the case of simple linear regression, m and X will be single numbers. In the case of general *linear regression*, m and X will each be an array of numbers (i.e. a vector), and m * X is the dot product of the two vectors. Just wanted to clarify this, since you used the terms simple linear regression and linear regression quite interchangeably.
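
The difference is easy to see in code. This is a minimal sketch in plain Python; the intercept `b` and the function names are assumptions added for illustration, since the comment above only mentions m and X.

```python
def simple_linear_regression(m, b, x):
    # One feature per data point: m and x are single numbers.
    return m * x + b

def linear_regression(m, b, x):
    # Several features per data point: m and x are vectors,
    # and "m * X" is their dot product.
    return sum(m_i * x_i for m_i, x_i in zip(m, x)) + b

y_simple = simple_linear_regression(2.0, 1.0, 3.0)                # 2*3 + 1 = 7.0
y_general = linear_regression([2.0, -1.0], 1.0, [3.0, 4.0])       # 6 - 4 + 1 = 3.0
```

With a single feature the two agree: `linear_regression([2.0], 1.0, [3.0])` gives the same 7.0 as the simple version.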
