
Conversation

@SauravP97

I was working with micrograd and the loss values were not converging when I was training it on one of my datasets. It looks like the tutorial encourages the use of tanh() as an activation function, but the repo lacked an implementation.

The PR includes the following changes:

  • Implementation of the tanh() activation function, letting end users choose between tanh() and relu() as their activation function when initializing the Multi Layer Perceptron (see the sketch after this list).
  • Support for a label on Value, which helps with debugging when the graph is visualized with Digraph :)
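
For context, here is a minimal sketch (not the exact PR code) of what these two changes could look like, following the Value/Neuron structure of micrograd's engine.py and nn.py. The `label` keyword, the `activation` keyword, and the simplified Neuron are illustrative names for the options described above, not necessarily the PR's actual signatures:

```python
import math
import random

class Value:
    """Minimal scalar autograd node in the spirit of micrograd's engine.py.
    `label` is the optional debugging aid described above; drawing code
    (not shown) can use it to name nodes in a graphviz Digraph."""

    def __init__(self, data, _children=(), _op='', label=''):
        self.data = data
        self.grad = 0.0
        self.label = label
        self._backward = lambda: None
        self._prev = set(_children)
        self._op = _op

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), '+')
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,), 'relu')
        def _backward():
            self.grad += (out.data > 0) * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        # forward: t = tanh(x); backward: d tanh(x)/dx = 1 - t**2
        t = math.tanh(self.data)
        out = Value(t, (self,), 'tanh')
        def _backward():
            self.grad += (1 - t ** 2) * out.grad
        out._backward = _backward
        return out

class Neuron:
    """One neuron; `activation` is the per-network choice ('relu' or 'tanh')
    that an MLP constructor could pass down through its layers."""

    def __init__(self, nin, activation='relu'):
        self.w = [Value(random.uniform(-1, 1)) for _ in range(nin)]
        self.b = Value(0.0)
        self.activation = activation

    def __call__(self, x):
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        return act.tanh() if self.activation == 'tanh' else act.relu()
```

Usage would then look something like `n = Neuron(3, activation='tanh')` and `x = Value(2.0, label='x')`; the label only annotates the node so that a rendered Digraph is easier to read while debugging.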

Please review when you get some time, @karpathy.
Big fan! :)
