Hello! Thank you for your amazing work and for sharing this implementation 🙏
I have a small question regarding the Critic Learning section in the DreamerV3 paper.
In this section, the λ-return is computed using the following formula:
However, in the DreamerV2 paper, the λ-return is computed differently, as shown here:
My question is about the use of the value term
From my understanding, the reward
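To make the comparison concrete, here is a minimal sketch of the generic λ-return recursion as it is usually written: the return at step t is the reward plus the discounted mix of the next value estimate and the next λ-return, bootstrapped from the final value. This is my assumption about the standard form, not a transcription of the equation in either paper, and the indexing of rewards and values may differ from what DreamerV2 or DreamerV3 actually use. The function name `lambda_returns` and its arguments are hypothetical.

```python
def lambda_returns(rewards, values, discount=0.99, lam=0.95):
    """Generic lambda-return recursion (assumed standard form, not from either paper).

    rewards: list of H rewards, one per imagined step.
    values:  list of H + 1 value estimates; values[-1] is the bootstrap value.
    Returns a list of H lambda-returns.
    """
    horizon = len(rewards)
    if len(values) != horizon + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    returns = [0.0] * horizon
    # Start the backward recursion from the bootstrap value at the horizon.
    next_return = values[-1]
    for t in reversed(range(horizon)):
        # Mix the one-step value estimate with the longer lambda-return.
        mixed = (1.0 - lam) * values[t + 1] + lam * next_return
        next_return = rewards[t] + discount * mixed
        returns[t] = next_return
    return returns
```

With `lam=0` this reduces to one-step targets (reward plus discounted next value), and with `lam=1` it becomes the discounted sum of rewards plus the bootstrap value at the horizon.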
Looking forward to your response, and thank you again for this great work!

