Replies: 2 comments
-
I'm not sure if I understand your question correctly. But the idea behind it is that, after each …
-
I'm afraid there's a misunderstanding here. The 1:N loop is there to collect the performance of each independent rollout, so I don't think anything should be passed to the next iteration.
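For illustration, here is a minimal sketch of that pattern (hypothetical code, not the notebook's actual repeated_runs; the helper names are made up): each iteration constructs a fresh agent, trains it, records its rewards, and discards it, so nothing is carried from run m to run m+1.

```julia
# Minimal sketch (hypothetical, not the notebook's code): N independent runs.
# Nothing is passed between iterations of the outer loop; each run trains a
# brand-new agent and only its per-episode rewards are recorded.
function repeated_runs_sketch(create_agent, run_one_episode!; N = 10, n_episodes = 100)
    rewards = zeros(N, n_episodes)
    for m in 1:N
        agent = create_agent()                       # fresh learner every run
        for ep in 1:n_episodes
            rewards[m, ep] = run_one_episode!(agent) # learning mutates `agent`
        end
    end                                              # `agent` is discarded here
    return vec(sum(rewards; dims = 1) ./ N)          # mean learning curve over runs
end
```

The learning happens *inside* each run, through the agent's own mutable state; the outer loop only averages the resulting learning curves.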
------------------ Original ------------------
From: GrutmanE
Date: Thu, May 20, 2021 6:02 AM
Subject: Re: [JuliaReinforcementLearning/ReinforcementLearning.jl] Agent training (#299)
Thanks for your reply. Let me clarify some more.
I do not see how the learner gets updated. How does the information from run number m get passed to run number m+1? To rephrase my question: some mutable variable in the scope of the 1:N loop in the repeated_run function must be passed to the learner to make it different from the previous iteration, right?
In the case of tabular Q-learning, there must be a Q matrix with dimensions size(state space) × size(action space) that is maintained and updated.
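For concreteness, a minimal sketch of such a learner in plain Julia (hypothetical names, not the package's actual types): the Q matrix is a field of the learner and is updated in place during a single run.

```julia
# Hypothetical sketch of a tabular Q-learner: the Q matrix is a mutable array
# held by the learner and updated in place during one run. It is never passed
# across the outer 1:N loop; each run constructs a new learner (new Q matrix).
struct TabularQLearner
    Q::Matrix{Float64}   # n_states × n_actions
    α::Float64           # learning rate
    γ::Float64           # discount factor
end

TabularQLearner(n_states, n_actions; α = 0.1, γ = 1.0) =
    TabularQLearner(zeros(n_states, n_actions), α, γ)

# One-step Q-learning update, mutating the learner's Q matrix in place
function update!(L::TabularQLearner, s, a, r, s′)
    L.Q[s, a] += L.α * (r + L.γ * maximum(L.Q[s′, :]) - L.Q[s, a])
end
```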
-
One of the Pluto notebooks in the Zoo (Chapter06_Cliff_Walking.jl) has a function repeated_runs. It seems repeated_runs uses create_agent (via calling a constructor) to create an identical agent in each pass of 1:N, but clearly this cannot be so. What is the trick, please? In other words, how is the information passed along?
P.S.
For completeness, I'm adding the create_agent function as well.
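To make the "trick" concrete, here is a tiny hypothetical demonstration (reusing the TabularQLearner sketch above, not the package's API): agents from create_agent are identical at construction, but each one's state diverges as soon as training mutates it, so the runs stay independent.

```julia
# Hypothetical demonstration, reusing the TabularQLearner sketch above:
# two freshly constructed agents are identical, but a training step on one
# of them mutates only its own Q matrix.
n_states, n_actions = 48, 4      # e.g. a 4×12 cliff-walking grid
create_agent() = TabularQLearner(n_states, n_actions; α = 0.1, γ = 1.0)

a1, a2 = create_agent(), create_agent()
@assert a1.Q == a2.Q             # identical at construction ...
update!(a1, 1, 1, -1.0, 2)       # one Q-learning step on a1 only
@assert a1.Q != a2.Q             # ... but independent afterwards
```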