I have trouble understanding where the list of per-agent action vectors (the one passed to the MujocoMulti env) is reassembled into the single-agent Mujoco env action vector so that each value reaches the correct actuator. For example, from line
multiagent_mujoco/src/multiagent_mujoco/mujoco_multi.py
Line 111 in 97eab01
it seems that the multi-agent action list is simply flattened and then passed to the Mujoco single agent env. I do not see how this could handle both the 2-Agent Ant and 2-Agent Ant Diag setups. If we look at Figure 4 of the FACMAC paper, in Figure 4 H and I we have:
2-Agent Ant (Figure 4 H):
MA action list = [blue agent, green agent] = [[a1, a2, a5, a6], [a3, a4, a7, a8]]
Flattened single agent action = [a1, a2, a5, a6, a3, a4, a7, a8]
2-Agent Ant Diag (Figure 4 I):
MA action list = [blue agent, green agent] = [[a3, a4, a5, a6], [a1, a2, a7, a8]]
Flattened single agent action = [a3, a4, a5, a6, a1, a2, a7, a8]
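In the two setups, the same position in the flattened vector therefore holds actions intended for different actuators, so at least one of the two flattened vectors cannot match the actuator order expected by the single-agent Mujoco env.
I think this would amount to agents observing one limb but controlling another.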
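To make the concern concrete, here is a minimal sketch. It is not code from the repository: the partitions are my reading of Figure 4 H and I with 0-based actuator indices, and `flatten_actions`, `scatter_actions`, and `demo` are hypothetical helpers. It compares plain flattening with writing each agent's actions back to the actuator indices that agent owns. I am assuming the single-agent env expects the order a1..a8, which I have not verified.

```python
import numpy as np

# Assumed partitions (0-based actuator indices), read off Figure 4 H and I.
ANT_2X4 = [[0, 1, 4, 5], [2, 3, 6, 7]]    # 2-Agent Ant
ANT_2X4D = [[2, 3, 4, 5], [0, 1, 6, 7]]   # 2-Agent Ant Diag


def flatten_actions(agent_actions):
    """Concatenate the per-agent action vectors in agent order."""
    return np.concatenate([np.asarray(a, dtype=float) for a in agent_actions])


def scatter_actions(agent_actions, partition, n_actuators=8):
    """Write each agent's actions to the actuator indices that agent owns."""
    full = np.zeros(n_actuators)
    for acts, idxs in zip(agent_actions, partition):
        full[idxs] = acts
    return full


def demo(partition):
    """Each agent emits the 1-based number of the actuator it intends to drive."""
    agent_actions = [[i + 1 for i in idxs] for idxs in partition]
    return flatten_actions(agent_actions), scatter_actions(agent_actions, partition)


for name, partition in [("2-Agent Ant", ANT_2X4), ("2-Agent Ant Diag", ANT_2X4D)]:
    flat, scattered = demo(partition)
    print(name, "flattened:", flat, "scattered:", scattered)
```

Under these assumptions, only the scattered version places value k at actuator position k in both setups; the flattened version does so in neither.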
Am I missing something here?