feat: v0.1.0 Refactor & Packaging #89
base: main
Conversation
feat: Add CI workflow and remove pre-commit hooks
I ran it, but I was not able to replicate the results in the paper. I am running on a much smaller GPU, so I had to decrease the batch size and learning rate, which I believe is the core issue with my results.
@zbloss were you able to run evaluation of their ARC-2 checkpoint? I'm getting errors regarding a size mismatch.
I did not try to load the existing checkpoints; I couldn't get them to load before these changes either, with similar issues.
So this checkpoint works after your changes?
No, it does not work before or with the changes.
@alexander-rakhlin I have opened a PR to add this model to Hugging Face's Transformers library with working checkpoints in safetensors format. I'm waiting on the 🤗 team to review and approve, so you'll have to pull my fork if you want to use it immediately. huggingface/transformers#41272 Weights are here:
@zbloss thank you. I also trained Sudoku and it works just fine, except for invalid puzzles with multiple solutions. Currently, I am training ARC-1. I think I found the reason why their checkpoint fails and will let you know once I verify it.
This pull request introduces infrastructure and documentation improvements to streamline development, testing, and deployment for the HRM repository. The most significant changes are the addition of a multi-stage GPU-enabled Dockerfile, a GitHub Actions CI workflow for Python linting and tests, and expanded installation and usage instructions in the README.md. There are also updates to dataset and training script usage, and minor code cleanups in the evaluation notebook.

Infrastructure & Deployment
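A multi-stage image along these lines is one common way to structure such a build. The base image tags, stage names, and lockfile layout below are illustrative assumptions, not the PR's actual Dockerfile:

```dockerfile
# syntax=docker/dockerfile:1
# Builder stage (assumed layout): resolve dependencies with uv
# against the CUDA 12.6 devel image so native extensions can compile.
FROM nvidia/cuda:12.6.0-devel-ubuntu24.04 AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
# Install Python 3.12 and project dependencies into a local venv.
COPY pyproject.toml uv.lock ./
RUN uv python install 3.12 && uv sync --frozen --no-dev

# Runtime stage: slim CUDA runtime image; copy only the built venv.
FROM nvidia/cuda:12.6.0-runtime-ubuntu24.04
COPY --from=builder /app/.venv /app/.venv
ENV PATH="/app/.venv/bin:$PATH"
WORKDIR /app
COPY . .
```

Splitting builder and runtime stages like this keeps compilers and build caches out of the final image, which is the main reason a multi-stage layout shrinks reproducible GPU builds.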
- Added a multi-stage Dockerfile supporting both FlashAttention 2 and 3 for Ampere and Hopper GPUs, including CUDA 12.6, Python 3.12, and optimized dependency installation using uv. This enables reproducible GPU builds for both development and production.
- Added a .dockerignore to exclude unnecessary files and directories (such as caches, data, checkpoints, notebooks, and test artifacts) from the Docker build context, reducing image size and improving build performance.

Continuous Integration
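A matrix workflow of this shape would cover the three interpreter versions. The job name, linter, and commands here are assumptions rather than the PR's exact ci.yml:

```yaml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
        with:
          python-version: ${{ matrix.python-version }}
      # Lint and test steps are illustrative; the actual tools may differ.
      - name: Lint
        run: uv run ruff check .
      - name: Test
        run: uv run pytest
```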
- Added a GitHub Actions workflow (.github/workflows/ci.yml) to automatically lint and test the codebase on Python 3.11, 3.12, and 3.13, ensuring code quality and compatibility across multiple Python versions.
- Added a .python-version for consistent local and CI environments.

Documentation & Usage
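Under this layout, day-to-day commands would run through uv. The script names below are hypothetical placeholders for whatever the scripts/ directory actually contains; consult the README.md for the real commands:

```shell
# Install project dependencies into .venv from the lockfile.
uv sync

# Hypothetical examples of dataset preparation, training, and
# evaluation via the new scripts/ directory; actual names may differ.
uv run scripts/build_dataset.py
uv run scripts/train.py
uv run scripts/evaluate.py
```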
- Expanded README.md with detailed package structure, installation options (including uv, pip, and Docker), FlashAttention setup instructions, a Python API usage example, and updated commands for dataset preparation, training, and evaluation to use the new scripts/ directory and uv run.

Notebook Cleanup
- Cleaned up arc_eval.ipynb for clarity and consistency.

Issues
Closes #88