[fix]: correct condition for restoring layer embeddings in TransformerDecorator/AdaptiveLayerLoss
#3560
base: master
Conversation
When training with AdaptiveLayerLoss, the `all_layer_embeddings` are deleted erroneously when `output_hidden_states` is True.
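In rough terms, the decorator's cache-growing step temporarily forces hidden states on, caches them, restores the caller's setting, and should then drop `all_layer_embeddings` only when the caller had not asked for it. A minimal sketch of the corrected restore logic (not the library's exact code; everything beyond the names quoted in this thread is an assumption):

```python
# Hypothetical sketch of TransformerDecorator.call_grow_cache with the corrected
# restore condition. Only call_grow_cache, original_forward, output_hidden_states
# and all_layer_embeddings are taken from this thread; the rest is assumed.
def call_grow_cache(self, features: dict) -> dict:
    # Remember whether the caller originally asked for per-layer outputs.
    original_output_hidden_states = self.transformer.auto_model.config.output_hidden_states
    # Force per-layer outputs so they can be cached for the adaptive-layer loss.
    self.transformer.auto_model.config.output_hidden_states = True

    output = self.original_forward(features)
    # (caching of the per-layer outputs for later reuse happens here; the exact
    # contents of that cache are not spelled out in this thread)

    # Restore the caller's configuration before returning.
    self.transformer.auto_model.config.output_hidden_states = original_output_hidden_states
    # The fix: drop the key only when it was NOT originally requested. The old,
    # inverted condition deleted it exactly when output_hidden_states was True.
    if not original_output_hidden_states:
        del output["all_layer_embeddings"]
    return output
```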
Pull Request Overview
This PR fixes a logic error in the `AdaptiveLayerLoss` class where the condition for removing `all_layer_embeddings` from the output was inverted.
- Corrects the conditional logic to properly clean up `all_layer_embeddings` when it was not originally requested
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
    if self.transformer.auto_model.config.output_hidden_states and "all_layer_embeddings" in features:
        output["all_layer_embeddings"] = features["all_layer_embeddings"]
Copilot AI · Nov 4, 2025
The `call_use_cache` method returns cached data without calling `original_forward`, so the input `features` parameter will not contain `all_layer_embeddings`. This check will always fail because `features` is the raw input to the forward method, not the output. If `all_layer_embeddings` needs to be included when `output_hidden_states` is enabled, it should be reconstructed from the cached `self.embeddings` data instead of reading from `features`.
Suggested change:
-    if self.transformer.auto_model.config.output_hidden_states and "all_layer_embeddings" in features:
-        output["all_layer_embeddings"] = features["all_layer_embeddings"]
+    if self.transformer.auto_model.config.output_hidden_states:
+        # Reconstruct all_layer_embeddings from cached self.embeddings
+        output["all_layer_embeddings"] = self.embeddings[self.call_idx]
`self.embeddings` doesn't contain all of the hidden states / layer embeddings. Perhaps `call_grow_cache()` should also store `self.all_layer_embeddings`.
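A minimal sketch of that suggestion (the helper name and output structure are hypothetical; only the names quoted in this thread come from the code under discussion):

```python
# Hypothetical sketch: call_grow_cache also caches the full layer embeddings so
# that call_use_cache can serve them when output_hidden_states is enabled,
# instead of (incorrectly) reading them from the raw input features.
def call_grow_cache(self, features: dict) -> dict:
    output = self.original_forward(features)
    # ... existing caching into self.embeddings ...
    # New: also keep the full per-layer embeddings, since self.embeddings alone
    # does not contain all of the hidden states.
    self.all_layer_embeddings.append(output["all_layer_embeddings"])
    return output

def call_use_cache(self, features: dict) -> dict:
    # Serve cached data without calling original_forward again.
    output = self.build_output_from_cache(features)  # hypothetical helper
    if self.transformer.auto_model.config.output_hidden_states:
        # Read from the new cache; the input features never contain this key.
        output["all_layer_embeddings"] = self.all_layer_embeddings[self.call_idx]
    return output
```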
This pull request makes a small fix to the logic that restores the original output settings in the `call_grow_cache` method of `AdaptiveLayerLoss.py`. The change ensures that the `all_layer_embeddings` key is deleted from the output only if the original setting for `output_hidden_states` was `False`.
- Updated `call_grow_cache` to correctly remove `all_layer_embeddings` from the output only when `output_hidden_states` was originally `False` (`sentence_transformers/losses/AdaptiveLayerLoss.py`).