Skip to content

Conversation

@baixiac
Copy link
Member

@baixiac baixiac commented Oct 22, 2025

Py3.12 support has also been added and Py3.8 support was deprecated and removed.

Copy link
Collaborator

@kawsarnoor kawsarnoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all looks like it makes sense...might be worth running this past Mart perhaps since he is most knowledgable on v2

@baixiac baixiac force-pushed the medcat2 branch 3 times, most recently from ff8a58e to 3cb5ffb Compare November 3, 2025 12:10
@baixiac
Copy link
Member Author

baixiac commented Nov 3, 2025

all looks like it makes sense...might be worth running this past Mart perhaps since he is most knowledgable on v2

Added @mart-r in here. Just noticed that both MedCAT v2.0 and v2.1 got yanked. Are they the only versions that support Py3.9?

@mart-r
Copy link

mart-r commented Nov 5, 2025

Regarding version support, python 3.9 is end of life now, so we're not supporting that anymore. Everything from now on will be 3.10 and up (currently 3.10 to 3.13).

Copy link

@mart-r mart-r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good overall from medcat v2 side of things! A few nitpicks more than anything else.

There's another few nitpicks I'd like to comment on in general:

  • Since python 3.9, you don't need to use typing.Set (or List, Tuple, Dict, probably a few others) for generic type hinting. You can just use the types themselves (i.e set[str])
  • If you were to drop 3.9 support, you could also use python 3.10-specific type hinting (e.g unions with |: set[str] | None instead of Union[set[str]]; and so on)
  • Both of the above do probably fall outside the scope of this PR, though - just noticed thse changes - even new ones, I guess it's just about consistency at this point

EDIT:
I'd like to also point out that just because I didn't see anything that may break with these changes, doesn't mean these things don't exist. Especially given I only reviewed the changes - so if something was omitted, I didn't review that.

"per_concept_tp": tps.get(cui, 0),
"per_concept_counts": cc.get(cui, 0),
"per_concept_count_train": model.cdb.cui2count_train.get(cui, 0),
"per_concept_count_train": cast(Dict[str, Any], model.cdb.cui2info.get(cui, {})).get("count_train", 0),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why that's cast to a dict? Wouldn't it be an int as before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

train_count.append(model.cdb.cui2count_train[c] if c in model.cdb.cui2count_train else 0)
concept_names.append(model.cdb.get_name(c))
train_count.append(model.cdb.cui2info.get(c, {}).get("count_train", 0)) # type: ignore
concept_names.append(model.cdb.cui2info.get(c, {}).get("preferred_name", "")) # type: ignore
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDB.get_name still exists. But the behvaiour here seems to be different. I.e before it would return the CUI if the CDB didn't have the concept (and the method still would), but the new implementation returns an empty string instead in those cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

self._tracker_client.send_model_stats(model.cdb.make_stats(), step)
before_cui2count_train = dict(model.cdb.cui2count_train)
self._tracker_client.send_model_stats(dict(model.cdb.get_basic_info()), step)
before_cui2count_train = {c: info["count_train"] for c, info in model.cdb.cui2info.items()}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There does exist a CDB.get_cui2count_train method that builds the cui2count_train mapping (that is a mapping of only CUIs that do in fact have training examples). The difference being that the new implementation here would create a mapping with all concepts (even ones with no training). And that's likely to be considerably bigger since (at least in my experience) most concepts don't have training.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

key=lambda item: item[1],
c: info["count_train"]
for c, info in sorted(
model.cdb.cui2info.items(),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, could use CDB.get_cui2count_train to avoid the untrained concepts in here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for meta_cat in model._meta_cats:
model.config.meta.description = description or model.config.meta.description
meta_cat_addons = [
addon for addon in model.get_addons()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use the CAT.get_addons_of_type method instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

self._tracker_client.log_model_config(self.get_flattened_config(meta_cat, category_name))
assert self._model_service.model is not None, "Model should not be None"
meta_cat_addons = [
addon for addon in self._model_service.model.get_addons()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, could use the CAT.get_addons_of_type method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@baixiac
Copy link
Member Author

baixiac commented Nov 6, 2025

Yeah, the old style of type hinting is mainly there for making older Python versions happy. Alright, will deprecate Py3.9 support for MedCAT v2 in a separate PR then.

@baixiac baixiac merged commit 5acddc8 into main Nov 10, 2025
8 checks passed
@baixiac baixiac deleted the medcat2 branch November 11, 2025 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants