Incorrectly implemented MultiHeadAttention in PseTae #40

sbgeophd · 2021-10-05T19:20:15Z

The implementation of the PseTae MultiHeadAttention appears to have a mistake.

The original implementation applies two fully connected layers to the query tensor (see here: https://github.com/VSainteuf/pytorch-psetae/blob/master/models/tae.py#L133).

In the eo-flow implementation, both fully connected layers are defined, but the second (defined here: https://github.com/sentinel-hub/eo-flow/blob/master/eoflow/models/pse_tae_layers.py#L50) is never applied where one would expect it, here: https://github.com/sentinel-hub/eo-flow/blob/master/eoflow/models/pse_tae_layers.py#L66. In fact, it is not used anywhere in the code.
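To illustrate the difference, here is a minimal NumPy sketch rather than the actual code of either repository. The names `dense`, `query_one_layer`, and `query_two_layers` and all shapes are hypothetical, and details of the real layers (normalization, pooling over the sequence, reshaping into heads) are deliberately left out. It only shows that a layer which is defined but never applied has no effect on the query.

```python
import numpy as np


def dense(x, w, b):
    """Plain fully connected layer: x @ w + b (no activation)."""
    return x @ w + b


def query_one_layer(x, w1, b1, w2, b2):
    # Mirrors the reported behaviour: the second layer's parameters
    # exist, but they are never applied to the query.
    return dense(x, w1, b1)


def query_two_layers(x, w1, b1, w2, b2):
    # What the issue describes for the reference implementation:
    # both fully connected layers are applied to the query.
    return dense(dense(x, w1, b1), w2, b2)


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
batch, d_in, d_q = 4, 8, 6
x = rng.normal(size=(batch, d_in))
w1, b1 = rng.normal(size=(d_in, d_q)), np.zeros(d_q)
w2, b2 = rng.normal(size=(d_q, d_q)), np.zeros(d_q)

q_one = query_one_layer(x, w1, b1, w2, b2)
q_two = query_two_layers(x, w1, b1, w2, b2)

# Perturb the second layer's weights and check which path notices.
w2_alt = w2 + 1.0
one_layer_ignores_w2 = np.allclose(
    q_one, query_one_layer(x, w1, b1, w2_alt, b2)
)
two_layer_uses_w2 = not np.allclose(
    q_two, query_two_layers(x, w1, b1, w2_alt, b2)
)
```

In a real model the practical consequence would be that the unused layer's weights receive no gradient and never influence the attention scores, so any fix should be checked against the linked reference code rather than against this sketch.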

devisperessutti · 2021-10-06T09:19:56Z

Hi @sbgeophd !

Yes, you are correct, the second fully connected layer is not used. I can't recall whether that was done on purpose, but I'd guess we just overlooked this part.

Do you have the capacity to make a pull request fixing this issue? If not, we could do it but it might not happen very soon.

Thanks for reporting the issue.

sbgeophd · 2021-10-06T18:46:33Z

In principle I'm happy to make a pull request, but although I think I've fixed this, the model still doesn't work for me. I'll submit a pull request if/when I get it working.

devisperessutti self-assigned this Oct 6, 2021
