feat: add inference and evaluation script with dataset transformations #733
base: main
Conversation
```python
tokenizer_vocab_file_path="/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-vocab.json",
tokenizer_merges_file_path="/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-merges.txt",
```
If Metaseq has a standardized path for the vocab and merges files then we'll need to use it here :) If not, we might need to remove the default values.
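For the second option, a minimal sketch of what dropping the defaults could look like; `load_tokenizer` and its parameter layout are illustrative only, not the PR's actual code:

```python
# Hypothetical sketch: the reviewer's second option, i.e. dropping the
# hard-coded defaults so every caller must supply the tokenizer files
# instead of inheriting a machine-specific /mnt/... path.
def load_tokenizer(
    tokenizer_vocab_file_path: str,   # was defaulted to .../gpt2-vocab.json
    tokenizer_merges_file_path: str,  # was defaulted to .../gpt2-merges.txt
):
    ...
```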
tupini07 left a comment:
left some comments :)
```dockerfile
RUN pip install \
    aim==3.16.2 \
    py-rouge==1.1 \
    rouge_score==0.1.2 \
    parlai==1.7.1 \
    evaluate==0.4.0

ENV NLTK_DATA="/usr/share/nltk_data"
RUN python -c "import nltk; nltk.download('punkt', download_dir='${NLTK_DATA}')"
```
This likely isn't the correct place to make this change.
It is only a snippet from our whole Dockerfile; this is the part that adds the evaluation libraries.
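For context, a hedged sketch of how the libraries installed above might be exercised during evaluation; this uses the public `evaluate` and `nltk` APIs (with `rouge_score` as the ROUGE backend), and the sample strings are made up:

```python
import nltk
import evaluate  # backed by rouge_score for the ROUGE metric

nltk.download("punkt")  # mirrors the NLTK_DATA step in the Dockerfile

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
print(scores)  # rouge1 / rouge2 / rougeL / rougeLsum scores
```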
```python
from metaseq.data.datasets.types import CommonDatasetConfiguration, DatasetConfiguration, DatasetConfigurationTeacherGenerated, DatasetModelConfig, DatasetModelHooks, DatasetTeacherGeneratedDataHooks, IdentityDict

# Visual diagram of where hooks/functions are called during inference or data generation
# https://excalidraw.com/#json=zoAk_TdynBHQnP9vZufGm,ekcVg_HqiF79cAp58_HKRQ
```
This visualization may be important for understanding how the hooks fit together.
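Since the linked diagram may go stale, here is a rough, illustrative-only sketch of the call order it depicts. The real hook signatures live in `metaseq.data.datasets.types` (`DatasetModelHooks` etc.); everything below is a simplified stand-in:

```python
# Simplified stand-in for the hook flow: pre-inference transform -> model ->
# post-inference transform. Names and signatures are assumptions, not the PR's API.
from typing import Callable, Dict, List

def run_pipeline(
    samples: List[Dict],
    pre_inference_hook: Callable[[Dict], Dict],       # e.g. build the few-shot prompt
    model_fn: Callable[[Dict], str],                  # model generates a completion
    post_inference_hook: Callable[[Dict, str], str],  # e.g. strip prompt, normalize
) -> List[str]:
    outputs = []
    for sample in samples:
        sample = pre_inference_hook(sample)
        raw_generation = model_fn(sample)
        outputs.append(post_inference_hook(sample, raw_generation))
    return outputs

if __name__ == "__main__":
    result = run_pipeline(
        samples=[{"question": "2 + 2?"}],
        pre_inference_hook=lambda s: {**s, "prompt": f"Q: {s['question']} A:"},
        model_fn=lambda s: s["prompt"] + " 4",  # stand-in for the real model
        post_inference_hook=lambda s, raw: raw.removeprefix(s["prompt"]).strip(),
    )
    print(result)  # ['4']
```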
Issue
Solutions
- Add a script for model inference and evaluation
- Add mappings between datasets and the pipeline configuration of eval libraries, metrics, and transformation functions (a sketch of such a mapping follows this list)
- Add the necessary evaluation libraries and re-implement some metrics
- Add PromptGenerator to create few-shot prompts based on configuration, using Jinja templates (a sketch follows as well)
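A hedged sketch of what the dataset-to-pipeline mapping above could look like; the names (`DATASET_EVAL_CONFIG`, `lowercase_and_strip`) and the dataset keys are illustrative, not the PR's actual API:

```python
from typing import Dict

def lowercase_and_strip(text: str) -> str:
    # Example transformation applied to model output before scoring
    return text.lower().strip()

# Maps a dataset name to the metrics and output transformations its
# evaluation pipeline should use.
DATASET_EVAL_CONFIG: Dict[str, dict] = {
    "summarization_dataset": {
        "metrics": ["rouge1", "rouge2", "rougeL"],
        "transformations": [lowercase_and_strip],
    },
    "dialogue_dataset": {
        "metrics": ["f1", "bleu"],
        "transformations": [lowercase_and_strip],
    },
}
```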
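And a minimal, illustrative-only sketch of few-shot prompt construction with Jinja, in the spirit of the PromptGenerator mentioned above; the template layout and field names are assumptions:

```python
from jinja2 import Template

# Assumed template shape: few-shot examples followed by the unanswered query
FEW_SHOT_TEMPLATE = Template(
    "{% for shot in shots %}"
    "Question: {{ shot.question }}\n"
    "Answer: {{ shot.answer }}\n\n"
    "{% endfor %}"
    "Question: {{ query }}\n"
    "Answer:"
)

def build_prompt(shots, query):
    # Renders each few-shot example, then the query awaiting a completion
    return FEW_SHOT_TEMPLATE.render(shots=shots, query=query)

print(build_prompt(
    shots=[{"question": "2 + 2?", "answer": "4"}],
    query="3 + 3?",
))
```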
This PR is quite large, so it may be hard to make sense of. It was originally only going to be inference.py and a few other modifications, but I kept bringing in missing dependencies to avoid gaps and it grew a lot 🤔
Testing
Did not test 😔
Related to: #726
Much of this work was done by @sahajgg, @tupini07, and @anselmwang 🙏