Skip to content

Quantselect-implementation#706

Open
vuductung wants to merge 14 commits intomainfrom
quantselect-implementation
Open

Quantselect-implementation#706
vuductung wants to merge 14 commits intomainfrom
quantselect-implementation

Conversation

@vuductung
Copy link
Copy Markdown
Contributor

No description provided.

…ultiple normalization methods (directLFQ, quantselect)
…uding model, training, optimizer, and prediction configurations.
…ion and enhance parameter handling. Introduced feature dictionary for input data and improved logging for normalization processes. Updated documentation for parameters and added compatibility for legacy configurations.
…zation. Introduced fixtures for MS2 features and PSM files to enhance testing coverage. Updated logging to reflect quantselect processing. Refactored existing tests to accommodate new feature dictionary input structure.
…or required psm_df input and improving configuration handling. Default settings are now merged with user-provided configurations, and the seed for random determinism is dynamically set based on the configuration.
…on method directly in the intensity filtering step. Updated logging for fragment absence and adjusted input handling for lfq method to use the feature dictionary.
…ity. Improved logging format for normalization processes and adjusted DataFrame handling in unit tests for better clarity and maintainability.
…d of 'normalize_lfq' for consistency with recent changes in the quantification process.
…turn a dictionary instead of a tuple, improving clarity in handling fragment data. Update SearchPlanOutput to handle cases where no fragment data is found, enhancing logging for better debugging.
…turn a tuple instead of a dictionary, clarifying the handling of empty data cases. Update SearchPlanOutput to remove redundant checks for fragment data absence, improving code efficiency.
Copy link
Copy Markdown
Collaborator

@mschwoer mschwoer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some upfront comments regarding the integration and overall structure

Comment on lines +337 to +339
# Normalization method for label-free quantification values
# Options: "directlfq", "quantselect", "none" (or false for backwards compatibility)
normalization_method: "quantselect"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please set the default to directlfq (to prevent previous behaviour)

also, add ... .normalize_lfq to config.py : TOLERATED_KEYS

then the change is only breaking for people that used normalize_lfq: False (probably few)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "(or false for backwards compatibility)" can be removed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, set the default as directlfq. - normalize_lfq was kept in defautl_yaml, so adding to TOLERATED_KEYS not necessary?

min_nonnan: 3
# Enable normalization of label-free quantification values
normalize_lfq: True
# Normalization method for label-free quantification values
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, frontend needs to be adapted to use the new key and default

) # here you can chose wether to log the processed proteins or not
# Apply normalization based on the selected method
if normalize == "quantselect":
if psm_df is None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move all code in this if branch to a new class/file

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(cf also @GeorgWa 's recent refactorings(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"num_samples_quadratic": 50,
"min_nonnan": 1,
"normalize_lfq": True,
"normalization_method": "quantselect",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here, please use directlfq as default

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment thread README.md
- Empirical library and fully predicted library search
- End-to-end transfer learning for custom RT, mobility, and MS2 models
- Label free quantification
- Label free quantification with multiple normalization methods (directLFQ, quantselect)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add links to the repective repos?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

min_nonan: int = 1,
num_cores: int = 8,
normalize: bool = True,
normalize: str = "quantselect",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normalization_method: str | None = "directlfq"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed normalize.

lfq_df = lfqutils.index_and_log_transform_input_df(intensity_df)
lfq_df = lfqutils.remove_allnan_rows_input_df(lfq_df)

if normalize == "directLFQ":
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put the strings directLFQ and quantselect into a class

class NormalizationMethods(metaclass=ConstantsClass):
   DIRECTLFQ: str = "directlfq"
...

and use it throughout the code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, moved class

class NormalizationMethods(metaclass=ConstantsClass):
    String constants for LFQ normalization methods.

    DIRECT_LFQ: str = "directLFQ"
    QUANT_SELECT: str = "QuantSelect"

to key.py

psutil==7.0.0
pyahocorasick==2.1.0
pyarrow==19.0.1
pyarrow==20.0.0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quantselect=0.1.0 is missing here,
also probably some others?

this file should anyway be autogenerated .. maybe we add the quantselect dependency (together with a bump of all requirement version) in a dedicated PR, @GeorgWa ?



@pytest.fixture
def create_psm_file():
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) as it's a fixture, should be called as what it is, e.g. test_psm_df

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@mschwoer
Copy link
Copy Markdown
Collaborator

this can be closed now I guess @vuductung ? if so, please also delete the associated branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants