Hello! I am fairly new to PettingZoo and have a question about the aec_to_parallel_wrapper, in particular its step method:
def step(self, actions):
    rewards = defaultdict(int)
    terminations = {}
    truncations = {}
    infos = {}
    observations = {}
    for agent in self.aec_env.agents:
        if agent != self.aec_env.agent_selection:
            if self.aec_env.terminations[agent] or self.aec_env.truncations[agent]:
                raise AssertionError(
                    f"expected agent {agent} got termination or truncation agent {self.aec_env.agent_selection}. Parallel environment wrapper expects all agent death (setting an agent's self.terminations or self.truncations entry to True) to happen only at the end of a cycle."
                )
            else:
                raise AssertionError(
                    f"expected agent {agent} got agent {self.aec_env.agent_selection}, Parallel environment wrapper expects agents to step in a cycle."
                )
        obs, rew, termination, truncation, info = self.aec_env.last()
        self.aec_env.step(actions[agent])
        for agent in self.aec_env.agents:
            rewards[agent] += self.aec_env.rewards[agent]

    terminations = dict(**self.aec_env.terminations)
    truncations = dict(**self.aec_env.truncations)
    infos = dict(**self.aec_env.infos)
    observations = {
        agent: self.aec_env.observe(agent) for agent in self.aec_env.agents
    }
    while self.aec_env.agents and (
        self.aec_env.terminations[self.aec_env.agent_selection]
        or self.aec_env.truncations[self.aec_env.agent_selection]
    ):
        self.aec_env.step(None)

    self.agents = self.aec_env.agents
    return observations, rewards, terminations, truncations, infos
While testing this wrapper I noticed that self.aec_env.observe(agent) is called twice per agent in each step: once inside obs, rew, termination, truncation, info = self.aec_env.last() and once in observations = {agent: self.aec_env.observe(agent) for agent in self.aec_env.agents}. Each agent's observation is therefore computed twice, which can hurt performance noticeably if observe is expensive and the environment does not cache its result.
Would it be sensible to store the observation returned by obs, rew, termination, truncation, info = self.aec_env.last() in an auxiliary variable and reuse it when building {agent: self.aec_env.observe(agent) for agent in self.aec_env.agents}, instead of calling observe again?
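To make the question concrete, here is a minimal sketch that counts observe calls using a toy stand-in environment. CountingEnv and parallel_step are hypothetical names for illustration only, not PettingZoo code, and the sketch assumes that last() accepts an observe flag, which as far as I can tell is how AECEnv.last(observe=True) is defined.

```python
class CountingEnv:
    """Hypothetical stand-in for an AEC env; mimics only the calls used in step()."""

    def __init__(self, agents):
        self.agents = list(agents)
        self.agent_selection = self.agents[0]
        self.observe_calls = 0
        self._index = 0

    def observe(self, agent):
        # Count every observation request; a real env might do expensive work here.
        self.observe_calls += 1
        return f"obs-{agent}"

    def last(self, observe=True):
        # Assumed to mirror AECEnv.last: observes the selected agent unless told not to.
        obs = self.observe(self.agent_selection) if observe else None
        return obs, 0, False, False, {}

    def step(self, action):
        # Advance to the next agent in a fixed cycle.
        self._index = (self._index + 1) % len(self.agents)
        self.agent_selection = self.agents[self._index]


def parallel_step(env, actions, observe_in_last=True):
    # Simplified version of the wrapper loop: last() and step() per agent,
    # then one observation per agent at the end of the cycle.
    for agent in env.agents:
        env.last(observe=observe_in_last)
        env.step(actions[agent])
    return {agent: env.observe(agent) for agent in env.agents}


agents = ["a0", "a1", "a2"]
actions = {agent: 0 for agent in agents}

env = CountingEnv(agents)
parallel_step(env, actions)
calls_default = env.observe_calls

env = CountingEnv(agents)
parallel_step(env, actions, observe_in_last=False)
calls_skipped = env.observe_calls

print(calls_default, calls_skipped)  # prints "6 3"
```

One caveat I can see with my own suggestion: the observation returned by last() is taken before the remaining agents have stepped, so it may differ from the one collected at the end of the cycle. If that is the case, skipping the observation in last() might be safer than reusing it.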
Sorry if I have missed or misunderstood something, and thank you for your time.