Not necessarily, @marxav. If the derivative of the cost function with respect to the activation of the output layer already takes the 1/m factor into account, i.e. d(cost_fn)/d(activation) = (1/m) * ((1 - y)/(1 - a) - y/a), then there is no need to divide the parameters or other gradients by m again: once the 1/m is introduced at the output, it propagates through the chain rule to all the parameters and gradients.
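A minimal numpy sketch of that equivalence, assuming a single sigmoid output unit and a binary cross-entropy cost (the names `A_prev`, `W`, `Y` and the shapes are illustrative, not taken from the example's code): putting the 1/m factor in dJ/dA and letting it propagate, or keeping dJ/dA per-example and dividing by m when averaging dW, yields the same parameter gradient.

```python
import numpy as np

np.random.seed(0)
m = 4                                   # batch size (illustrative)
A_prev = np.random.rand(3, m)           # activations of the previous layer
W = np.random.rand(1, 3)                # output-layer weights
Y = np.array([[0, 1, 1, 0]])            # labels

Z = W @ A_prev
A = 1.0 / (1.0 + np.exp(-Z))            # sigmoid output

# Convention 1: include 1/m in dJ/dA and let it propagate.
dA = (1.0 / m) * ((1 - Y) / (1 - A) - Y / A)
dZ = dA * A * (1 - A)                   # chain rule through the sigmoid
dW_1 = dZ @ A_prev.T                    # no extra division by m here

# Convention 2: keep dJ/dA per-example, divide by m when averaging dW.
dA = (1 - Y) / (1 - A) - Y / A
dZ = dA * A * (1 - A)
dW_2 = (1.0 / m) * (dZ @ A_prev.T)

assert np.allclose(dW_1, dW_2)          # both conventions give the same gradient
```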
Thank you for this wonderful example, which helped me understand the gradient descent implementation.
I just noticed a minor mistake:
should be:
In addition:
should also be:
Otherwise, the code will not work if one wants to extend it, for instance to implement a regression use case instead of a classification use case (i.e. "none" instead of "softmax" in the final layer, plus bypassing the final activation function in the code).
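As a rough sketch of that regression extension (not the actual code from this example; `forward_last_layer` and its arguments are hypothetical names for this illustration), the final layer could select its activation by name, with "none" returning the raw linear output instead of a softmax:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=0, keepdims=True)

def forward_last_layer(A_prev, W, b, activation="softmax"):
    """Compute the output layer; activation="none" bypasses the
    non-linearity so the same code serves a regression use case."""
    Z = W @ A_prev + b
    if activation == "softmax":
        return softmax(Z)              # classification output
    if activation == "none":
        return Z                       # regression: raw linear output
    raise ValueError(f"unsupported activation: {activation}")

# Example: 2 output units, batch of 3
A_prev = np.random.rand(4, 3)
W, b = np.random.rand(2, 4), np.zeros((2, 1))
print(forward_last_layer(A_prev, W, b, activation="none"))
```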