Numpy deep neural network #29

Open
@marxav

Thank you for this wonderful example, which helped me understand the gradient descent implementation.
I just noticed a minor mistake:

  • dW_curr = np.dot(dZ_curr, A_prev.T) / m
  • db_curr = np.sum(dZ_curr, axis=1, keepdims=True) / m

should be:

  • dW_curr = np.dot(dZ_curr, A_prev.T)
  • db_curr = np.sum(dZ_curr, axis=1, keepdims=True)
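
To make the first change concrete, here is a minimal sketch of the backward step as I would write it; the function signature and the relu_backward helper are my assumptions about the surrounding code, not verbatim from the notebook:

```python
import numpy as np

# Stand-in for the notebook's activation-derivative helper (assumed).
def relu_backward(dA, Z):
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0  # ReLU passes the gradient only where Z > 0
    return dZ

def single_layer_backward_propagation(dA_curr, W_curr, b_curr, Z_curr, A_prev):
    dZ_curr = relu_backward(dA_curr, Z_curr)
    # Proposed change: keep the raw sums; do not divide by the batch size m here.
    dW_curr = np.dot(dZ_curr, A_prev.T)
    db_curr = np.sum(dZ_curr, axis=1, keepdims=True)
    dA_prev = np.dot(W_curr.T, dZ_curr)
    return dA_prev, dW_curr, db_curr
```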

In addition:

  • params_values["W" + str(layer_idx)] -= learning_rate * grads_values["dW" + str(layer_idx)]
  • params_values["b" + str(layer_idx)] -= learning_rate * grads_values["db" + str(layer_idx)]

should also be:

  • params_values["W" + str(layer_idx)] -= learning_rate / m * grads_values["dW" + str(layer_idx)]
  • params_values["b" + str(layer_idx)] -= learning_rate / m * grads_values["db" + str(layer_idx)]

Otherwise, the code will not work if one wants to extend it, for instance to implement a regression use case instead of a classification one (i.e. "none" instead of "softmax" in the final layer, plus short-circuiting the final activation function in the code).
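
For completeness, here is a sketch of the update step with the 1/m averaging moved into it; passing the batch size m as an extra argument is my assumption, since the original update signature does not include it:

```python
def update(params_values, grads_values, nn_architecture, learning_rate, m):
    # m is the batch size; dividing the learning rate by m applies the
    # averaging that was removed from the backward pass above.
    for layer_idx in range(1, len(nn_architecture) + 1):
        params_values["W" + str(layer_idx)] -= learning_rate / m * grads_values["dW" + str(layer_idx)]
        params_values["b" + str(layer_idx)] -= learning_rate / m * grads_values["db" + str(layer_idx)]
    return params_values
```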
