Fix CVE issues #447

nathaliellenaa · 2025-02-05T22:13:39Z

Description

Fix CVE issues by updating the torch and transformers versions.

Issues Resolved

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Yerzhaisang · 2025-02-07T02:50:04Z

Hi @nathaliellenaa,

It looks like there’s a CI failure. Could you please rebase your branch onto opensearch-project:main?

Also, could you review the integration test failure logs? We should ensure backward compatibility with previous versions of torch and transformers.

Let me know if you need any help!

Yerzhaisang · 2025-02-12T19:47:08Z

I looked into why the integration tests fail. The previous and latest torch versions generate models with different structures. The model generated by torch 2.0.6 is very hard to deploy, and the UT fails.

I've attached the structure of these models. Please take a look:
torch_2_0_1.pdf
torch_2_0_6.pdf

nathaliellenaa · 2025-02-12T21:08:27Z

Thanks for the deep dive, @Yerzhaisang. I compared the two files and saw that the model from version 2.0.1 uses MultiHeadSelfAttention, while the model from version 2.0.6 uses DistilBertSdpaAttention. I'm not really familiar with these mechanisms, but isn't Scaled Dot-Product Attention less complex and thus easier to deploy than Multi-Head Attention?
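
For context on the question above, here is a minimal sketch in plain Python (not the transformers implementation) of how the two mechanisms relate: multi-head attention runs scaled dot-product attention once per head, so SDPA is the inner computation rather than a simpler alternative. The function names are illustrative, and the learned query/key/value/output projections, masking, and dropout found in real attention layers are deliberately left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """q, k, v are lists of vectors (seq_len x dim); returns seq_len x dim."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        # Similarity of this query with every key, scaled by 1/sqrt(dim).
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append(
            [sum(w * v_row[d] for w, v_row in zip(weights, v)) for d in range(len(v[0]))]
        )
    return out


def take_columns(matrix, start, stop):
    """Slice the feature dimension of every row."""
    return [row[start:stop] for row in matrix]


def multi_head_attention(q, k, v, n_heads):
    """Split features into n_heads chunks, run SDPA per head, concatenate."""
    dim = len(q[0])
    if dim % n_heads != 0:
        raise ValueError("dim must be divisible by n_heads")
    head_dim = dim // n_heads
    heads = []
    for h in range(n_heads):
        start, stop = h * head_dim, (h + 1) * head_dim
        heads.append(
            scaled_dot_product_attention(
                take_columns(q, start, stop),
                take_columns(k, start, stop),
                take_columns(v, start, stop),
            )
        )
    # Concatenate the per-head outputs row by row.
    return [sum((head[i] for head in heads), []) for i in range(len(q))]
```

With a single head the two functions return the same result. That suggests the structural difference between the two exported models is more about how the same computation is expressed than about its complexity, though this is an interpretation and not something confirmed in this thread.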

Note: I've also been working on upgrading the PyTorch version in our ml-commons repo (ref), and I found that we should use PyTorch 2.5.1 for compatibility with DJL. I created a cluster with the upgraded version and ran the UT again, and the model now deploys successfully. However, I'm still encountering connection errors with these changes, so I'm currently investigating our codebase to identify what additional modifications might be needed to resolve this issue.

Yerzhaisang · 2025-02-13T01:36:37Z

Could you please take a look at this commit?
I just replaced DistilBertSdpaAttention with MultiHeadSelfAttention. I think it would also be good for backward compatibility.

nathaliellenaa · 2025-02-13T20:20:28Z

Accidentally pushed the latest commit.

nathaliellenaa · 2025-02-13T22:04:30Z

@dhrubo-os The changes in this commit fixed the compatibility issues with the upgraded PyTorch version, and all integration tests pass. Is there any concern with replacing DistilBertSdpaAttention with MultiHeadSelfAttention?

dhrubo-os · 2025-02-13T22:07:59Z

Let me re-run the integ tests.

nathaliellenaa · 2025-02-13T22:11:44Z

> let me re-run the integ tests again.

I haven't included the changes from this commit in this PR yet. I ran the integration tests on my forked repo earlier (ref).

dhrubo-os · 2025-02-13T22:20:42Z

We need to be backward compatible in both places: ML-Commons and Py-ml.

I mean that we already have pre-trained models: https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/

If we upgrade PyTorch, will we still be able to run the models that are already on the model server? We need to do more testing on this.

nathaliellenaa · 2025-02-26T21:26:42Z

I tested all pre-trained models with the upgraded PyTorch version and verified that each model can register, deploy, and predict successfully. I also compared the prediction results from the old and upgraded PyTorch versions, and all results fall within the specified tolerance levels (relative tolerance of 1e-03 and absolute tolerance of 1e-05).
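
As an illustration of what such a check involves, here is a minimal sketch that assumes the comparison follows the element-wise rule used by `numpy.allclose` and `torch.allclose`, namely |actual - expected| <= atol + rtol * |expected|. The comment above does not say which function was used, so `within_tolerance` and the sample values are hypothetical.

```python
def within_tolerance(actual, expected, rtol=1e-3, atol=1e-5):
    """Return True if every element of `actual` is close to `expected`.

    Uses the asymmetric rule |a - e| <= atol + rtol * |e|, where `expected`
    plays the role of the reference (here: output from the old PyTorch version).
    """
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# Hypothetical embedding values, for illustration only.
old_output = [0.12345, -0.98765, 0.00000]
new_output = [0.12350, -0.98800, 0.000004]
print(within_tolerance(new_output, old_output))
```

Note that for reference values near zero the relative term vanishes, so only the absolute tolerance of 1e-05 applies there.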

codecov · 2025-02-26T21:39:27Z

Codecov Report

Attention: Patch coverage is 92.59259% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.89%. Comparing base (529ee34) to head (7ebd86b).
Report is 22 commits behind head on main.

Files with missing lines	Patch %	Lines
...search_py_ml/ml_models/sentencetransformermodel.py	92.59%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #447      +/-   ##
==========================================
- Coverage   91.53%   90.89%   -0.65%     
==========================================
  Files          42       43       +1     
  Lines        4395     4656     +261     
==========================================
+ Hits         4023     4232     +209     
- Misses        372      424      +52

dhrubo-os · 2025-02-26T21:45:47Z

opensearch_py_ml/ml_models/sentencetransformermodel.py

+            parent = getattr(parent, part)
+        return parent, parts[-1]
+
+    def patch_model_weights(self, model):


Let's add more comments to explain what we are doing here.
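
To make the intent of the excerpt above easier to follow, here is a minimal, commented sketch of the general pattern it appears to use: resolve a dotted attribute path to its parent object and final attribute name, then swap the child with `setattr`. This is not the code from this PR. `Node`, `get_parent_and_attr`, and `replace_submodule` are hypothetical names, and `Node` stands in for the `torch.nn.Module` objects the real code operates on.

```python
class Node:
    """Stand-in for a model module; holds named children as attributes."""

    def __init__(self, **children):
        for name, child in children.items():
            setattr(self, name, child)


def get_parent_and_attr(root, dotted_path):
    """Walk `dotted_path` from `root` and return (parent_object, last_name).

    For "encoder.layer.attention" this returns the object at "encoder.layer"
    together with the string "attention".
    """
    parts = dotted_path.split(".")
    parent = root
    for part in parts[:-1]:
        parent = getattr(parent, part)
    return parent, parts[-1]


def replace_submodule(root, dotted_path, new_child):
    """Replace the child at `dotted_path` with `new_child`; return the old child."""
    parent, attr = get_parent_and_attr(root, dotted_path)
    old_child = getattr(parent, attr)
    setattr(parent, attr, new_child)
    return old_child


# Usage: swap one attention implementation for another in a toy model tree.
model = Node(encoder=Node(layer=Node(attention="sdpa_attention")))
previous = replace_submodule(model, "encoder.layer.attention", "multi_head_attention")
```

Returning the old child lets the caller copy its weights into the replacement before discarding it, which is one plausible reason for a method named `patch_model_weights`.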

nathaliellenaa · 2025-02-26T23:39:08Z

Hi @Yerzhaisang, since you initiated this commit with the code change in the sentencetransformermodel.py file, could you help review the comments I've added to that file? I want to make sure that the comments are accurate and don't contain any misleading information. Thank you!

Yerzhaisang

LGTM, thanks! (Please don't forget to fix the lint.)

nathaliellenaa requested review from dhrubo-os, greaa-aws, ylwu-amzn, b4sjoo, jngz-es and rbhavna as code owners February 5, 2025 22:13

nathaliellenaa added 2 commits February 7, 2025 09:41

Fix CVE issues

42f9d4f

Signed-off-by: Nathalie Jonathan <[email protected]>

Update PR number

134806f

Signed-off-by: Nathalie Jonathan <[email protected]>

nathaliellenaa requested a review from Yerzhaisang as a code owner February 13, 2025 20:12

Change torch version to 2.5.1, replaced DistilBertSdpaAttention with MultiHeadSelfAttention

e8c3c40

Signed-off-by: Nathalie Jonathan <[email protected]>

nathaliellenaa force-pushed the fix-cve branch from 7076d1b to e8c3c40 Compare February 26, 2025 21:25

dhrubo-os reviewed Feb 26, 2025

Add more comments in sentencetransformermodel.py

7ebd86b

Signed-off-by: Nathalie Jonathan <[email protected]>

Yerzhaisang approved these changes Mar 2, 2025

dhrubo-os approved these changes Mar 19, 2025

dhrubo-os merged commit fca546c into opensearch-project:main Mar 19, 2025
13 of 14 checks passed
