Question about sampletimes = 3 #15

Open
luoyueyi opened this issue Sep 16, 2018 · 1 comment


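In rlmodel, sampletimes = 3 is defined, but the network structure does not change. Does the loop just recompute prob three times? It seems list_state and list_action should hold the same values in all three samples.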

for j in range(sampletimes):
    # reset environment
    state = env.reset(batch_en1, batch_en2, batch_sentence_ebd, batch_reward)
    list_action = []
    list_state = []
    old_prob = []

    # get action
    # start = time.time()
    for i in range(batch_len):
        state_in = np.append(state[0], state[1])
        feed_dict = {}
        feed_dict[myAgent.entity1] = [state[2]]
        feed_dict[myAgent.entity2] = [state[3]]
        feed_dict[myAgent.state_in] = [state_in]
        prob = sess2.run(myAgent.prob, feed_dict=feed_dict)

        old_prob.append(prob[0])
        action = get_action(prob)
        # store the produced data for training the CNN model
        list_action.append(action)
        list_state.append(state)
        state = env.step(action)

xuyanfu (Owner) commented Sep 16, 2018

Inside get_action, the action is sampled randomly using np.random.rand(), so the actions from the three runs differ.
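
To illustrate the point, here is a minimal sketch of how a stochastic `get_action` could behave. It is not the repository's actual implementation: the function body, the assumption that `prob[0]` is the probability of choosing action 1, and the helper `sample_actions` are all hypothetical. It only shows why repeated passes over the same probabilities can yield different action sequences.

```python
import numpy as np


def get_action(prob):
    """Hypothetical sketch: sample a binary action from a probability.

    Assumption: the first element of `prob` is P(action = 1).
    """
    p_select = float(np.ravel(prob)[0])
    # np.random.rand() draws a uniform sample from [0, 1), so the
    # returned action is random even when `prob` is unchanged.
    return 1 if np.random.rand() < p_select else 0


def sample_actions(probs, sampletimes=3):
    """Hypothetical helper: draw `sampletimes` independent action sequences."""
    return [[get_action([p]) for p in probs] for _ in range(sampletimes)]


if __name__ == "__main__":
    # Same probabilities on every pass, yet the sampled actions can differ.
    for run in sample_actions([0.5] * 10, sampletimes=3):
        print(run)
```

In the loop from the question, each state also depends on the previous action through `env.step(action)`, so once the sampled actions diverge, the states in `list_state` would presumably diverge as well.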
