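I'd like to ask how AlphaZero avoids playing at an illegal position, for example one that is already occupied. During the MCTS select step, the probability of each action depends on the policy output, and early in training the policy does not know which positions are legal and which are not. Could this produce illegal moves?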
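After the policy is produced, set the prob of every position that cannot be played to -INF or 0 (use 0 if the output has already gone through softmax; otherwise use -INF).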
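A minimal sketch of the masking described in the reply above, in plain Python. The function names (`mask_logits`, `mask_probs`, `softmax`) and the boolean `legal` list are hypothetical and not taken from this repository. Renormalizing after zeroing is an added assumption: the reply only says to set the entries to 0. The sketch also assumes at least one move is legal.

```python
import math

def mask_logits(logits, legal):
    """Before softmax: replace the logit of each illegal move with -inf."""
    return [x if ok else -math.inf for x, ok in zip(logits, legal)]

def softmax(logits):
    """Numerically stable softmax; exp(-inf) evaluates to 0.0."""
    m = max(logits)  # finite as long as at least one move is legal
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mask_probs(probs, legal):
    """After softmax: zero out illegal moves, then renormalize (assumption)."""
    masked = [p if ok else 0.0 for p, ok in zip(probs, legal)]
    total = sum(masked)
    if total == 0.0:
        # The policy put all of its mass on illegal moves:
        # fall back to a uniform distribution over the legal ones.
        n_legal = sum(1 for ok in legal if ok)
        return [1.0 / n_legal if ok else 0.0 for ok in legal]
    return [p / total for p in masked]

# Hypothetical 3-cell board where the middle cell is already occupied.
logits = [1.0, 2.0, 3.0]
legal = [True, False, True]

priors_from_logits = softmax(mask_logits(logits, legal))
priors_from_probs = mask_probs(softmax(logits), legal)
```

Both routes give the occupied cell a prior of exactly 0, so if MCTS only expands children for legal moves, or uses these masked priors, the search never selects an illegal action even when the raw network output favors one.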
Got it, thanks for the answer!