Since `encoder_hidden_states` and `attention_mask` won't be changed in the decoder, creating a new view of them is more memory-efficient than `repeat_interleave`, because PyTorch's `repeat`-style operations copy the underlying data storage. Using `index_select` with a proper index would be better, and makes for a simple modification.
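A minimal sketch of the idea (shapes and variable names here are illustrative, not taken from BRIO's code): `index_select` with a suitable index reproduces the `repeat_interleave` layout, while `expand` goes further and returns a view that shares storage with the original tensor:

```python
import torch

# Illustrative shapes: a batch of encoder states broadcast over candidates.
batch, cand_num, seq_len, hidden = 2, 4, 8, 16
encoder_hidden_states = torch.randn(batch, seq_len, hidden)

# Option 1: repeat_interleave allocates a new (batch * cand_num, ...) tensor.
repeated = encoder_hidden_states.repeat_interleave(cand_num, dim=0)

# Option 2: index_select with index [0,0,0,0,1,1,1,1] gives the same layout
# (it still copies, but lets you pick arbitrary rows).
idx = torch.arange(batch).repeat_interleave(cand_num)
selected = encoder_hidden_states.index_select(0, idx)

# Option 3: expand returns a view over the original storage (no copy;
# the candidate dimension gets stride 0).
expanded = (
    encoder_hidden_states.unsqueeze(1)           # (batch, 1, seq, hidden)
    .expand(batch, cand_num, seq_len, hidden)    # view, nothing allocated
)

# All three hold the same values...
assert torch.equal(selected, repeated)
assert torch.equal(expanded.reshape(batch * cand_num, seq_len, hidden), repeated)
# ...but only the expanded version shares storage with the original.
assert expanded.data_ptr() == encoder_hidden_states.data_ptr()
assert repeated.data_ptr() != encoder_hidden_states.data_ptr()
```

One caveat on the view route: the saving only holds as long as downstream ops accept the broadcast shape — a later `.reshape(...)` or `.contiguous()` on the expanded tensor materializes a full copy again.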
Thanks a lot for the suggestion! Have you tried this modification and measured the difference in memory usage? It would be really great if you could share some statistics on the change and open a pull request with your suggested changes :)
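One way such statistics could be gathered, as a rough sketch (the sizes are made up, and the helper `extra_bytes` is hypothetical; on GPU one would instead bracket the call with `torch.cuda.reset_peak_memory_stats()` and `torch.cuda.max_memory_allocated()`):

```python
import torch

# Made-up sizes, roughly in the range of a BART-large encoder output.
batch, cand_num, seq_len, hidden = 4, 16, 512, 768
enc = torch.randn(batch, seq_len, hidden)

def extra_bytes(out: torch.Tensor, src: torch.Tensor) -> int:
    # Bytes newly allocated for `out`; a view shares `src`'s storage.
    if out.data_ptr() == src.data_ptr():
        return 0
    return out.element_size() * out.nelement()

copied = enc.repeat_interleave(cand_num, dim=0)
viewed = enc.unsqueeze(1).expand(-1, cand_num, -1, -1)

print(extra_bytes(copied, enc))  # cand_num x the encoder states' own size
print(extra_bytes(viewed, enc))  # 0: the view allocates nothing
```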
Referenced code: `BRIO/modeling_bart.py`, lines 1863 to 1869 at commit `135f0e5`.