Collaborator

@mart-r mart-r commented Oct 29, 2025

This PR adds a command line interface to download medcat-scripts.

The way it works is that it:

  • Finds the currently installed version of medcat
  • Finds the latest tag for the current minor version (e.g. medcat-scripts/v2.2.0 for medcat==2.2.0)
  • Downloads the zip
  • Extracts it to the specified folder
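
The tag-selection step in the list above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper names and the `TAG_PREFIX` constant are hypothetical, the tag format is inferred from the single example tag in the description, and the list of tags is assumed to have been fetched already (the download and extraction steps are left out).

```python
import re
from typing import List, Optional, Tuple

# Hypothetical prefix, inferred from the example tag "medcat-scripts/v2.2.0".
TAG_PREFIX = "medcat-scripts/v"


def minor_of(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '2.2.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def latest_tag_for(version: str, tags: List[str]) -> Optional[str]:
    """Pick the highest-patch scripts tag matching the installed major.minor."""
    wanted = minor_of(version)
    pattern = re.compile(re.escape(TAG_PREFIX) + r"(\d+)\.(\d+)\.(\d+)$")
    best = None  # (patch, tag) of the best candidate seen so far
    for tag in tags:
        m = pattern.match(tag)
        if not m:
            continue  # not a scripts tag, or not a plain X.Y.Z version
        major, minor, patch = (int(g) for g in m.groups())
        if (major, minor) != wanted:
            continue
        if best is None or patch > best[0]:
            best = (patch, tag)
    return best[1] if best else None
```

Comparing the patch number as an integer rather than as a string matters once a minor version has ten or more patch releases.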

The CLI can be used as follows:

python -m medcat download-scripts [DEST] [log_level]

Where both arguments are optional.
If [DEST] is not specified, the current working directory is used. The contents of the medcat-scripts folder will be copied into this destination.
By specifying log_level you can change the logging level (defaults to INFO).
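
As a rough sketch of how the two optional positional arguments and their defaults could be resolved (the function name `parse_args` is hypothetical, and accepting the level name case-insensitively is an assumption; the PR's actual parsing is not shown in this thread):

```python
import logging
from pathlib import Path
from typing import Tuple


def parse_args(*args: str) -> Tuple[Path, int]:
    """Resolve the optional DEST and log_level positional arguments."""
    # DEST defaults to the current working directory.
    dest = Path(args[0]) if len(args) >= 1 else Path.cwd()
    # log_level defaults to INFO; case-insensitivity is an assumption here.
    level_name = args[1].upper() if len(args) >= 2 else "INFO"
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        raise ValueError(f"Unknown log level: {args[1]!r}")
    return dest, level
```

For example, `parse_args("/some/file/path", "debug")` would yield that path together with `logging.DEBUG`.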

The idea is that this can be used instead of cloning working_with_cogstack (which is what was used in v1), but with the additional safeguard of getting the version of the scripts that is (somewhat) guaranteed to work with your current installation.

@tomolopolis
Member

Collaborator

@alhendrickson alhendrickson left a comment

Hey, this code looks good to me.

I've definitely got wider thoughts about the packaging process in general, but ultimately the Python CLI looks like a good plan.


def main(*args: str):
    if not args:
        print("Usage: python -m medcat download-scripts [DEST] [log_level]",
Collaborator

Minor - would make a const for these usage statements, so it's the same as below

Collaborator Author

Makes sense
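
A minimal sketch of what the suggested constant could look like. The name `USAGE`, the return codes, printing to stderr, and the "unknown command" branch are all assumptions made for illustration; only the usage text itself comes from the diff.

```python
import sys

# Hypothetical constant name; the text is the usage string quoted in the diff.
USAGE = "Usage: python -m medcat download-scripts [DEST] [log_level]"


def main(*args: str) -> int:
    if not args:
        print(USAGE, file=sys.stderr)
        return 1
    if args[0] != "download-scripts":
        # Both error paths reuse the same constant, so the text cannot drift.
        print(f"Unknown command: {args[0]}", file=sys.stderr)
        print(USAGE, file=sys.stderr)
        return 1
    return 0  # the real command would be dispatched here
```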

with open(target, "wb") as f:
    f.write(zf.read(m))

logger.info("Scripts extracted to: %s", dest / SCRIPTS_PATH)
Collaborator

@alhendrickson alhendrickson Oct 29, 2025

Minor & pure opinion - can we make dest the whole path? E.g. default to "./medcat-scripts", but if I provide a destination then just use it without forcing SCRIPTS_PATH on me. For example, git clone url.git my-folder would put it all in my-folder, not in a subfolder in there.

To clarify, I'd be looking for this behavior:

python -m medcat download-scripts /some/file/path

ls /some/file/path
<pure contents of the zip>

# Compared to the current behavior, where ls /some/file/path would always contain a folder "medcat-scripts"

Collaborator Author

Yeah, I did look at that as well and go "do we need the extra implicit folder?".
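
For illustration, the "no implicit subfolder" behavior requested above might look roughly like this. It is a sketch under assumptions: the value of `SCRIPTS_PATH` is not shown in the diff and is guessed here, the archive is assumed to contain that folder at its root, and the function name is hypothetical.

```python
import zipfile
from pathlib import Path

# Assumed folder name inside the archive; the diff refers to SCRIPTS_PATH
# but does not show its value.
SCRIPTS_PATH = "medcat-scripts"


def extract_scripts(zf: zipfile.ZipFile, dest: Path) -> None:
    """Copy the contents of SCRIPTS_PATH straight into dest (no subfolder)."""
    prefix = SCRIPTS_PATH + "/"
    for member in zf.namelist():
        if not member.startswith(prefix) or member.endswith("/"):
            continue  # skip unrelated entries and directory entries
        rel = member[len(prefix):]
        if ".." in Path(rel).parts:
            continue  # refuse entries that would escape dest
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "wb") as f:
            f.write(zf.read(member))
```

With this shape, the default destination would be chosen by the caller (for example "./medcat-scripts"), and an explicit destination is used as given.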


def _get_medcat_version() -> str:
    """Return the installed MedCAT version as 'major.minor'."""
    version = importlib.metadata.version("medcat")
Collaborator

Not important - what happens if this is run from the main branch?

Collaborator

@alhendrickson alhendrickson Oct 29, 2025

Yeah, I'm thinking that in future an optional arg that lets me specify a tag would be really useful, e.g. if I want to run this from my main branch for testing or whatever. My prediction is that not always forcing the version lookup will be needed one day.

Collaborator Author

@mart-r mart-r Oct 29, 2025

Currently, if you're running it off the main branch, it'll have the version specified in pyproject.toml. It's at 2.0.0 currently, but we can bump it to 2.2.0 so it finds at least something.

Though the other option would be to look for the latest tag in the history if we've got a local build instead.
EDIT: That would only work for editable installs - you normally don't have git history available.
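
To make the discussion concrete, here is a sketch of a version lookup with an optional override, along the lines of the optional argument suggested earlier in this thread. It is not part of the PR: the `explicit` parameter and both function names are hypothetical, and only `importlib.metadata.version` is taken from the diff.

```python
import importlib.metadata
from typing import Optional


def to_major_minor(version: str) -> str:
    """Reduce a full version such as '2.2.0' to 'major.minor'."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"Unexpected version format: {version!r}")
    return ".".join(parts[:2])


def resolve_version(explicit: Optional[str] = None,
                    package: str = "medcat") -> str:
    """Use an explicitly requested version if given, else the installed one."""
    if explicit is not None:
        # Hypothetical override, e.g. when running from a main-branch build.
        return to_major_minor(explicit)
    # Reads the installed package metadata (for a local build this is the
    # version declared in pyproject.toml, as noted above).
    return to_major_minor(importlib.metadata.version(package))
```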

Collaborator

That's cool - I don't see the suggested arg being useful any time soon then if it all runs fine, so all good to merge.

@mart-r mart-r merged commit fb3b656 into main Oct 29, 2025
20 checks passed
@mart-r mart-r deleted the feat/medcat/CU-869azeyvz-add-scripts-dl-endpoint branch October 29, 2025 18:14

4 participants