user_goals_first_turn_template.part.movie.v1.p: a subset of user goals. [Please use this one; the upper-bound success rate on movie_kb.1k.json is 0.9765.]

[NLG rule template]
dia_act_nl_pairs.v6.json: some predefined NLG rule templates for both the user simulator and the agent.

[Dialog Act Intent]:
dia_acts.txt

[Dialog Act Slot]:
slot_set.txt

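To get a feel for the goal file before training, here is a minimal inspection sketch. It assumes the pickle deserializes to a list of goal dicts; the key names (e.g. request_slots, inform_slots) are an assumption for illustration, not confirmed here:

    import pickle

    # Assumption: the .p file holds a list of user-goal dicts; key names such
    # as 'request_slots' and 'inform_slots' are illustrative only.
    with open('user_goals_first_turn_template.part.movie.v1.p', 'rb') as f:
        goals = pickle.load(f)

    print('number of user goals:', len(goals))
    print('example goal:', goals[0])
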
2. Some Parameters:

-agt: the agent id
-usr: the user (simulator) id
-max_turn: the maximum number of turns per dialogue
-episodes: how many dialogues you want to run
-slot_err_prob: slot-level error probability
-slot_err_mode: which kind of slot error mode to use
-intent_err_prob: intent-level error probability

-movie_kb_path: the movie KB path for the agent side
-goal_file_path: the user goal file path for the user simulator side

-dqn_hidden_size: hidden layer size for the RL (DQN) agent
-batch_size: batch size for DQN training
-simulation_epoch_size: how many dialogues to simulate in one epoch

-warm_start: use the rule policy to fill the experience replay buffer at the beginning
-warm_start_epochs: how many dialogues to run during warm start

-run_mode: 0 for display mode (NL); 1 for debug mode (dia_act); 2 for debug mode (dia_act and NL); >3 for no display (i.e. training)
-auto_suggest: 0 for no auto-suggest; 1 for auto-suggest
-act_level: 0 for a dia_act-level user simulator; 1 for an NL-level user simulator
-cmd_input_mode: 0 for NL input; 1 for dia_act input (AgentCmd only)

-write_model_dir: the directory to write the models to
-trained_model_path: the trained RL agent model; load the trained model for prediction

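These flags are consumed by the run script's argument parser. As a rough sketch of how a few of them might be registered (assuming argparse; the types and defaults shown are illustrative assumptions, not the repository's actual values):

    import argparse

    # Illustrative registration of a few flags from the list above; defaults
    # and types here are assumptions, not the repository's actual values.
    parser = argparse.ArgumentParser()
    parser.add_argument('-agt', dest='agt', type=int, default=0, help='the agent id')
    parser.add_argument('-usr', dest='usr', type=int, default=0, help='the user (simulator) id')
    parser.add_argument('-max_turn', dest='max_turn', type=int, default=20, help='maximum turns per dialogue')
    parser.add_argument('-episodes', dest='episodes', type=int, default=1, help='how many dialogues to run')
    parser.add_argument('-slot_err_prob', dest='slot_err_prob', type=float, default=0.05, help='slot-level error probability')
    parser.add_argument('-run_mode', dest='run_mode', type=int, default=0, help='0: NL; 1: dia_act; 2: both; >3: no display')
    params = vars(parser.parse_args())
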
3. Commands to run the different agents and user simulators
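
For example, a hypothetical invocation (the script name run.py, the paths, and the concrete values are placeholders for illustration, not the repository's exact commands):

    python run.py -agt 0 -usr 1 -max_turn 40 -episodes 150 -movie_kb_path ./deep_dialog/data/movie_kb.1k.p -goal_file_path ./deep_dialog/data/user_goals_first_turn_template.part.movie.v1.p -run_mode 0 -act_level 0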
""" Take the current state and return an action according to the current exploration/exploitation policy
41
+
42
+
We define the agents flexibly so that they can either operate on act_slot representations or act_slot_value representations.
43
+
We also define the responses flexibly, returning a dictionary with keys [act_slot_response, act_slot_value_response]. This way the command-line agent can continue to operate with values
44
+
45
+
Arguments:
46
+
state -- A tuple of (history, kb_results) where history is a sequence of previous actions and kb_results contains information on the number of results matching the current constraints.
47
+
user_action -- A legacy representation used to run the command line agent. We should remove this ASAP but not just yet
48
+
available_actions -- A list of the allowable actions in the current state
49
+
50
+
Returns:
51
+
act_slot_action -- An action consisting of one act and >= 0 slots as well as which slots are informed vs requested.
52
+
act_slot_value_action -- An action consisting of acts slots and values in the legacy format. This can be used in the future for training agents that take value into account and interact directly with the database
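
A minimal sketch of an agent that honors this response contract, assuming only what the docstring above states (the uniform-random choice is a placeholder policy, not the repository's actual agent):

    import random

    class RandomEchoAgent(object):
        """ Toy agent illustrating the response-dictionary contract only. """

        def state_to_action(self, state, user_action, available_actions):
            # state is (history, kb_results) per the docstring; this toy
            # policy ignores both and samples among the allowable actions.
            history, kb_results = state
            act_slot_response = random.choice(available_actions)
            # act_slot_value_response is left empty; a database-aware agent
            # would fill in concrete values here instead.
            return {'act_slot_response': act_slot_response,
                    'act_slot_value_response': None}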