Skip to content

[feat] Generalize dynamic config classes #126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jlamypoirier opened this issue Jan 22, 2025 · 5 comments · May be fixed by #245
Open

[feat] Generalize dynamic config classes #126

jlamypoirier opened this issue Jan 22, 2025 · 5 comments · May be fixed by #245
Assignees
Labels
enhancement New feature or request

Comments

@jlamypoirier
Copy link
Collaborator

jlamypoirier commented Jan 22, 2025

🧐 Problem Description

#104 introduced a mechanism tor selecting config classes dynamically. This can be made useful elsewhere, especially for user-made plugins.

💡 Proposed Solution

  • Generalize the config mechanism to all configs classes by moving it to the base Config class
  • Add a mechanism for loading modules dynamically so plugins can be added to the registry (argument to the fast-llm command, or together with the config class?)

🔄 Alternatives Considered

Having tons of forks for custom features is not practical in the long term.

📈 Potential Benefits

More flexible configuration, facilitate user contributions outside the main branch

@jlamypoirier jlamypoirier added the enhancement New feature or request label Jan 22, 2025
@tscholak
Copy link
Collaborator

I think this can work well with Python's implicit namespace packages, https://peps.python.org/pep-0420/
In a nutshell, we can leave the fast_llm namespace open, so that third-party packages can add to it. As long as these packages are in the python path, python will find them and create the full fast_llm namespace dynamically. All that's needed is removing the init.py files in modules that should be open and their parents.

@jlamypoirier
Copy link
Collaborator Author

jlamypoirier commented Jan 22, 2025

Not sure how much this helps though, since we still need to add a way to auto-import. We could import them all, but it would add to the startup time and the config/implementation separation complicates things a bit. I was thinking about somethings that makes the imports explicit, some options:

  1. In the fast-llm command .Maybe not ideal since it's too separate from the config.
    fast-llm --plugins my_package.my_module other_package.other_module train ....

2 Somewhere near the config root. All in one place but distant from the classes that use them

training: ...
plugins: 
  - my_package.my_module
  - other_package.other_module
...
  1. Next to the class name. Fully modular, but a bit scattered over the place.
data:
  datasets:
    Training:
      type: my_dataset
      module: my_package.my_module
...

@tscholak
Copy link
Collaborator

Probably 2. is best.

Do we really need this though, especially now?
I'd rather focus on making the codebase more friendly to work with for people who need or want to directly contribute to Fast-LLM's development.

@jlamypoirier
Copy link
Collaborator Author

I'm not planning to implement plugins now, I think it should come once we have 2-3 places with dynamic classes and/or other teams working on this that would benefit from it.

@jlamypoirier
Copy link
Collaborator Author

Short-term use case: components for testing, ex GPTTestSlowDatasetConfig from #129

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants