Numpy deep neural network #29

Open
marxav opened this issue Jun 29, 2020 · 1 comment

marxav commented Jun 29, 2020

Thank you for this wonderful example, which helped me understand the gradient descent implementation.
I just noticed a minor mistake:

  • dW_curr = np.dot(dZ_curr, A_prev.T) / m
  • db_curr = np.sum(dZ_curr, axis=1, keepdims=True) / m

should be:

  • dW_curr = np.dot(dZ_curr, A_prev.T)
  • db_curr = np.sum(dZ_curr, axis=1, keepdims=True)

In addition:

  • params_values["W" + str(layer_idx)] -= learning_rate * grads_values["dW" + str(layer_idx)]
  • params_values["b" + str(layer_idx)] -= learning_rate * grads_values["db" + str(layer_idx)]

should also be:

  • params_values["W" + str(layer_idx)] -= learning_rate / m * grads_values["dW" + str(layer_idx)]
  • params_values["b" + str(layer_idx)] -= learning_rate / m * grads_values["db" + str(layer_idx)]

Otherwise, the code will not work if one wants to extend it, for instance, to implement a regression use case instead of a classification use case (i.e. "none" instead of "softmax" in the final layer, plus bypassing the final activation function in the code).
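
For illustration, here is a minimal sketch of the change proposed above, reusing the variable names from this repository (`dZ_curr`, `A_prev`, `params_values`, `grads_values`, `learning_rate`); the `activation_backward` callable and the extra `m` argument of the update step are placeholders added here, not the repository's exact signatures:

```python
import numpy as np

def single_layer_backward(dA_curr, W_curr, Z_curr, A_prev, activation_backward):
    # derivative of the activation, applied to the pre-activation of this layer
    dZ_curr = activation_backward(dA_curr, Z_curr)
    # gradients summed over the whole batch -- no division by m here
    dW_curr = np.dot(dZ_curr, A_prev.T)
    db_curr = np.sum(dZ_curr, axis=1, keepdims=True)
    # gradient passed back to the previous layer
    dA_prev = np.dot(W_curr.T, dZ_curr)
    return dA_prev, dW_curr, db_curr

def update(params_values, grads_values, nn_architecture, learning_rate, m):
    # the 1/m averaging moves into the gradient-descent step instead
    for layer_idx, layer in enumerate(nn_architecture, 1):
        params_values["W" + str(layer_idx)] -= learning_rate / m * grads_values["dW" + str(layer_idx)]
        params_values["b" + str(layer_idx)] -= learning_rate / m * grads_values["db" + str(layer_idx)]
    return params_values
```

Either way the resulting parameter updates are the same; the only question is where the 1/m factor is applied.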


pranftw commented Jul 10, 2020

Not necessarily, @marxav. If the derivative of the cost function with respect to the activation of the output layer already takes the 1/m factor into account, i.e. d(cost_fn)/d(activation) = (1/m) * ((1 - y)/(1 - a) - y/a), then there is no need to divide the parameters or the other gradients by m again, because the 1/m factor introduced there propagates to all the parameters and gradients during backpropagation.
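
For illustration, a minimal sketch of that convention (assuming a binary cross-entropy cost and `Y`, `Y_hat` of shape `(1, m)`; this is not the repository's exact code):

```python
import numpy as np

def get_cost_derivative(Y_hat, Y):
    # For J = -(1/m) * sum(Y*log(Y_hat) + (1-Y)*log(1-Y_hat)),
    # the derivative dJ/dY_hat already carries the 1/m factor:
    m = Y.shape[1]
    return (1 / m) * ((1 - Y) / (1 - Y_hat) - Y / Y_hat)

# Because the 1/m factor enters at the very top of the backward pass, it is
# carried through dZ = dA * g'(Z), dW = dZ @ A_prev.T, db = sum(dZ) and
# dA_prev = W.T @ dZ, so no additional division by m is needed later on.
```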
