Intrinsic reward calculation, sum or mean? #33

Hi!

I have a question about how the intrinsic rewards are calculated.
Why do you use sum(1) instead of mean(1)?
https://github.com/jcwleo/random-network-distillation-pytorch/blob/e383fb95177c50bfdcd81b43e37c443c8cde1d94/agents.py#L76

That computes the sum over the 512 output neurons, which differs from the mean over those outputs by a constant factor of 512.
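
To make the difference concrete, here is a minimal sketch in plain Python (no PyTorch, so it runs anywhere). The batch size, the random stand-in features, and the helper names `squared_error` and `scale_by_std` are assumptions for illustration and are not taken from either repository. It only shows that a per-sample sum and a per-sample mean over 512 outputs differ by the factor 512, and that this factor cancels if the rewards are afterwards divided by their own standard deviation (whether and how either code base normalizes is not something this sketch claims).

```python
import math
import random
import statistics

FEATURE_DIM = 512  # output width mentioned in this issue
BATCH_SIZE = 4     # hypothetical batch size, chosen only for illustration

random.seed(0)
# Random stand-ins for the target and predictor outputs (not real features).
target = [[random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
          for _ in range(BATCH_SIZE)]
predict = [[random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
           for _ in range(BATCH_SIZE)]


def squared_error(t_row, p_row):
    """Element-wise squared difference between two feature vectors."""
    return [(t - p) ** 2 for t, p in zip(t_row, p_row)]


errors = [squared_error(t, p) for t, p in zip(target, predict)]

# Per-sample reduction over the feature axis.
reward_sum = [sum(row) for row in errors]                # analogue of .sum(1)
reward_mean = [statistics.fmean(row) for row in errors]  # analogue of .mean(1)

# Each sample's sum is FEATURE_DIM times its mean.
ratios = [s / m for s, m in zip(reward_sum, reward_mean)]


def scale_by_std(values):
    """Divide every value by the population standard deviation of the batch."""
    std = statistics.pstdev(values)
    return [v / std for v in values]


# After scaling by the standard deviation, the constant factor cancels.
normalized_sum = scale_by_std(reward_sum)
normalized_mean = scale_by_std(reward_mean)

print(ratios)
print(normalized_sum)
print(normalized_mean)
```

So the two reductions rank states identically and only change the reward scale, which matters when the raw value is used without any rescaling.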

The original TensorFlow release uses reduce_mean, so I'm a little confused.
https://github.com/openai/random-network-distillation/blob/f75c0f1efa473d5109d487062fd8ed49ddce6634/policies/cnn_gru_policy_dynamics.py#L241

I hope you can clear this up for me.
Thank you in advance.
