NN code #86

abeerM · 2021-01-12T14:51:18Z

According to the paper, "our policies are parameterized by a two-layer ReLU MLP with 64 units per layer. To support discrete communication messages, we use the Gumbel-Softmax estimator [14]." However, I could not find this in the code.
The policy in policy.py appears to be hardcoded to keyboard input. What should I do if my environment does not require input from the user?

I would appreciate an explanation of this point.
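For reference, here is a minimal sketch of what I understand the quoted sentence to describe. It is not code from this repository. The class name `MLPPolicy`, the helper names, the weight initialization, the temperature default, and my reading of "two-layer" as two hidden layers followed by an output head are all my own assumptions, and it uses only the Python standard library to show the forward pass (no training).

```python
import math
import random

HIDDEN_UNITS = 64  # from the paper: 64 units per layer


def init_layer(n_in, n_out, rng):
    """Return (weights, biases). The init scheme is an assumption, not from the paper."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Affine map: weights @ x + biases."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    return softmax(noisy)


class MLPPolicy:
    """Hypothetical policy: observation -> relaxed one-hot communication message."""

    def __init__(self, obs_dim, n_messages, seed=0):
        self.rng = random.Random(seed)
        self.layer1 = init_layer(obs_dim, HIDDEN_UNITS, self.rng)
        self.layer2 = init_layer(HIDDEN_UNITS, HIDDEN_UNITS, self.rng)
        self.head = init_layer(HIDDEN_UNITS, n_messages, self.rng)

    def message(self, obs, temperature=1.0):
        h = relu(linear(self.layer1, obs))
        h = relu(linear(self.layer2, h))
        logits = linear(self.head, h)
        return gumbel_softmax(logits, temperature, self.rng)


policy = MLPPolicy(obs_dim=4, n_messages=3)
msg = policy.message([0.1, -0.2, 0.3, 0.0])  # three probabilities summing to 1
```

A policy like this takes the environment observation as input instead of keyboard events, which is the behaviour I was expecting to find.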
