glum v3.0 #677

Merged: 77 commits, Apr 27, 2024
Changes from 49 commits
7f2bbb0  Make tests green with densematrix-refactor branch (stanmart, Jul 25, 2023)
cc58bbe  Remove most Matrixbase subclass checks (stanmart, Jul 26, 2023)
cdb564e  Simplify _group_sum (stanmart, Jul 26, 2023)
6b66d57  Pre-commit autoupdate (#672) (quant-ranger[bot], Aug 7, 2023)
bc6a5ec  Use boa in CI. (#673) (jtilly, Aug 7, 2023)
2b8ae3b  Fix covariance matrix mutating feature names (#671) (stanmart, Aug 8, 2023)
95af4ff  Add the option to store the covariance matrix to avoid recomputing it… (stanmart, Aug 8, 2023)
fad75ff  Pre-commit autoupdate (#676) (quant-ranger[bot], Aug 14, 2023)
af87010  Fix covariance_matrix dtypes (stanmart, Aug 14, 2023)
6999e46  Merge branch 'main' into glum-v3 (stanmart, Aug 16, 2023)
049943d  Merge branch 'main' into glum-v3 (stanmart, Aug 16, 2023)
940b260  Make CI use pre-release tabmat (stanmart, Aug 16, 2023)
fb026c5  Column names à la Tabmat #278 (#678) (stanmart, Aug 17, 2023)
0622f7d  Merge branch 'main' into glum-v3 (stanmart, Aug 24, 2023)
9a20282  Formula interface (#670) (stanmart, Aug 28, 2023)
003fcec  Formula- and term-based Wald-tests (#689) (stanmart, Aug 28, 2023)
91e0408  Support for missing values in categorical columns (#684) (stanmart, Aug 28, 2023)
44c9cd9  Fix formula context (#691) (stanmart, Oct 13, 2023)
96d5a30  Merge remote-tracking branch 'origin/main' into glum-v3 (MarcAntoineSchmidtQC, Oct 13, 2023)
dd1a2e8  pyupgrade (MarcAntoineSchmidtQC, Oct 13, 2023)
fe30a8e  ensure_full_rank != drop_first (MarcAntoineSchmidtQC, Oct 16, 2023)
ede0c63  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Dec 8, 2023)
b0b2d3e  fix (MatthiasSchmidtblaicherQC, Dec 8, 2023)
16bd925  move feature name assignment to right spot (MatthiasSchmidtblaicherQC, Dec 8, 2023)
a000baa  fix (MatthiasSchmidtblaicherQC, Dec 8, 2023)
2df83c1  remove blank line (MatthiasSchmidtblaicherQC, Dec 8, 2023)
447c348  bump minimum formulaic version (stateful transforms) (MatthiasSchmidtblaicherQC, Dec 8, 2023)
bb0a188  improve backward compatibility (MatthiasSchmidtblaicherQC, Dec 8, 2023)
248c1dc  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Dec 12, 2023)
512740d  Remove code that is not needed in tabmat v4 / glum v3 (#741) (MatthiasSchmidtblaicherQC, Dec 12, 2023)
ba5597f  Fix formula test: consider presence of intercept in full rankness che… (MatthiasSchmidtblaicherQC, Jan 9, 2024)
41df2b9  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Jan 10, 2024)
fd943e4  test varying significance level in coef table test (#749) (MatthiasSchmidtblaicherQC, Jan 11, 2024)
cff6ec4  pin formulaic to 0.6 (#752) (MatthiasSchmidtblaicherQC, Jan 15, 2024)
f6f5d7c  Add illustration of formula interface to example in README (#751) (MatthiasSchmidtblaicherQC, Jan 15, 2024)
7d0b8ad  Determine presence of intercept only by `fit_intercept` argument (#747) (MatthiasSchmidtblaicherQC, Jan 15, 2024)
9cd1d8e  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Jan 15, 2024)
72971f4  consistent linebreaks in docstring (MatthiasSchmidtblaicherQC, Jan 15, 2024)
6b2b844  remove obsolete arg in docstring (MatthiasSchmidtblaicherQC, Jan 22, 2024)
1ad8be2  Informative error when encountering categories that were not seen in … (MatthiasSchmidtblaicherQC, Jan 29, 2024)
20eb1ca  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Jan 29, 2024)
64f2b98  docstring cosmetics (MatthiasSchmidtblaicherQC, Jan 29, 2024)
b185fe4  even more docstring cosmetics (MatthiasSchmidtblaicherQC, Jan 29, 2024)
7e86e3f  Do not fail when an estimator misses class members that are new in v3… (MatthiasSchmidtblaicherQC, Jan 31, 2024)
6816dad  tiny cosmetics [skip ci] (MatthiasSchmidtblaicherQC, Jan 31, 2024)
137d9fb  No regularization as default (#758) (MatthiasSchmidtblaicherQC, Feb 1, 2024)
4af7de6  Improve code readability (stanmart, Feb 1, 2024)
5dfd446  Merge branch 'main' into glum-v3 (MarcAntoineSchmidtQC, Feb 6, 2024)
9c04a08  Merge branch 'main' into glum-v3 (stanmart, Feb 6, 2024)
3ce7fc0  Make arguments to public methods except `X`, `y`, `sample_weight` and… (MatthiasSchmidtblaicherQC, Feb 20, 2024)
ecc098f  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Feb 20, 2024)
b72379a  fix import (MatthiasSchmidtblaicherQC, Feb 20, 2024)
1978e12  clean up changelog (MatthiasSchmidtblaicherQC, Feb 20, 2024)
5bea925  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Feb 21, 2024)
e948e6a  Restructure distributions (#768) (lbittarello, Feb 26, 2024)
1dc4f30  Explain `scale_predictors` more (#778) (MatthiasSchmidtblaicherQC, Mar 8, 2024)
fb3a790  Move helpers into `_utils` (#782) (lbittarello, Mar 25, 2024)
6c83386  Patch docstring (lbittarello, Mar 25, 2024)
b8f6f8f  Update CHANGELOG.rst (MatthiasSchmidtblaicherQC, Apr 12, 2024)
326b99c  Apply suggestions from code review (MatthiasSchmidtblaicherQC, Apr 12, 2024)
b512a5c  shorten docstrings of private functions; typos in defaults; other sug… (MatthiasSchmidtblaicherQC, Apr 12, 2024)
05fd221  context docstring (MatthiasSchmidtblaicherQC, Apr 12, 2024)
2acdcbf  kwargs (lbittarello, Apr 12, 2024)
82cb60c  no context as default; small cleanups (MatthiasSchmidtblaicherQC, Apr 15, 2024)
517522b  add explanation to get calling scope (MatthiasSchmidtblaicherQC, Apr 15, 2024)
a121dbe  adjust to tabmat release (MatthiasSchmidtblaicherQC, Apr 23, 2024)
4412550  keep whitespace (MatthiasSchmidtblaicherQC, Apr 23, 2024)
18d5b0e  temporarily add tabmat_dev channel again to investigate env solving f… (MatthiasSchmidtblaicherQC, Apr 26, 2024)
aeeb19e  remove tabmat_dev channel again (MatthiasSchmidtblaicherQC, Apr 26, 2024)
74550c3  for now, disable conda build test on osx and Python 3.12 (MatthiasSchmidtblaicherQC, Apr 26, 2024)
4b8b84c  Add a different environment for macos (#786) (MatthiasSchmidtblaicherQC, Apr 26, 2024)
2476652  Merge branch 'main' into glum-v3 (MatthiasSchmidtblaicherQC, Apr 26, 2024)
745a6a0  replace deprecated scipy.sparse.*_matrix.A (MatthiasSchmidtblaicherQC, Apr 26, 2024)
9f57080  replace other instance of .A (MatthiasSchmidtblaicherQC, Apr 26, 2024)
891fed2  two more (MatthiasSchmidtblaicherQC, Apr 26, 2024)
d003e41  simply replace all instances of .A by .toarray() (tabmat knows both) (MatthiasSchmidtblaicherQC, Apr 27, 2024)
5fd62a0  update CHANGELOG for release (MatthiasSchmidtblaicherQC, Apr 27, 2024)
14 changes: 7 additions & 7 deletions .github/workflows/conda-build.yml
@@ -20,13 +20,13 @@ jobs:
       fail-fast: false
       matrix:
         include:
-          - { conda_build_yml: linux_64_python3.9.____cpython, os: ubuntu-latest, conda-build-args: '' }
-          - { conda_build_yml: linux_64_python3.12.____cpython, os: ubuntu-latest, conda-build-args: '' }
-          - { conda_build_yml: osx_64_python3.9.____cpython, os: macos-latest, conda-build-args: '' }
-          - { conda_build_yml: osx_64_python3.12.____cpython, os: macos-latest, conda-build-args: '' }
-          - { conda_build_yml: osx_arm64_python3.10.____cpython, os: macos-latest, conda-build-args: ' --no-test' }
-          - { conda_build_yml: win_64_python3.9.____cpython, os: windows-latest, conda-build-args: '' }
-          - { conda_build_yml: win_64_python3.12.____cpython, os: windows-latest, conda-build-args: '' }
+          - { conda_build_yml: linux_64_python3.9.____cpython, os: ubuntu-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
+          - { conda_build_yml: linux_64_python3.12.____cpython, os: ubuntu-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
+          - { conda_build_yml: osx_64_python3.9.____cpython, os: macos-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
+          - { conda_build_yml: osx_64_python3.12.____cpython, os: macos-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
+          - { conda_build_yml: osx_arm64_python3.10.____cpython, os: macos-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge --no-test' }
+          - { conda_build_yml: win_64_python3.9.____cpython, os: windows-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
+          - { conda_build_yml: win_64_python3.12.____cpython, os: windows-latest, conda-build-args: ' -c conda-forge/label/tabmat_dev -c conda-forge' }
     steps:
       - name: Checkout branch
         uses: actions/checkout@v4
2 changes: 1 addition & 1 deletion .github/workflows/daily.yml
@@ -48,7 +48,7 @@ jobs:
           pip install --extra-index-url https://pypi.fury.io/arrow-nightlies/ --prefer-binary --pre --no-deps pyarrow
           echo Install tabmat nightly
           micromamba remove -y --force tabmat
-          pip install --no-use-pep517 --no-deps git+https://github.com/Quantco/tabmat
+          pip install --no-use-pep517 --no-deps git+https://github.com/Quantco/tabmat@tabmat-v4
       - name: Install repository
         shell: bash -el {0}
         run: pip install --no-use-pep517 --no-deps --disable-pip-version-check -e .
21 changes: 18 additions & 3 deletions CHANGELOG.rst
@@ -7,8 +7,23 @@
 Changelog
 =========

-Unreleased
-----------
+3.0.0 - UNRELEASED
+------------------
+
+**Breaking change:**
+
+- :class:`~glum.GeneralizedLinearRegressor`'s default value for `alpha` is now `0`, i.e. no regularization.
+
+**New features:**
+
+- Added a formula interface for specifying models.
+- Improved feature name handling. Feature names are now created for non-pandas input matrices, too. Furthermore, the format of categorical features can be specified by the user.
+- Term names are now stored in the model's attributes. This is useful for categorical features, where they refer to the whole variable, not just single levels.
+- Added more options for treating missing values in categorical columns. They can either raise a `ValueError` (`"fail"`), be treated as all-zero indicators (`"zero"`) or represented as a new category (`"convert"`).
+- `meth:GeneralizedLinearRegressor.wald_test` can now perform tests based on a formula string and term names.
+
+2.7.0 - UNRELEASED
+------------------

 **Bug fix:**

@@ -38,7 +53,7 @@ Unreleased

 - When computing the covariance matrix, check whether the design matrix is ill-conditioned for all types of input. Furthermore, do it in a more efficient way.
 - Pin ``tabmat<4.0.0`` (the new release will bring breaking changes).
-
+- Added the option to specify models using Wilkinson-formulas.

 2.5.2 - 2023-06-02
 ------------------
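The three treatments of missing categorical values named in the changelog entry above (`"fail"`, `"zero"`, `"convert"`) can be illustrated with a small standalone sketch. This is not glum's implementation: the function `encode_categorical`, its parameters `missing_method` and `missing_name`, and the placeholder label `"(MISSING)"` are all hypothetical names chosen for illustration, and `None` stands in for a missing entry.

```python
def encode_categorical(values, categories, missing_method="fail",
                       missing_name="(MISSING)"):
    """One-hot encode `values`; `None` marks a missing entry.

    Hypothetical helper, not part of glum. It only mirrors the three
    behaviours described in the changelog.
    """
    if missing_method not in {"fail", "zero", "convert"}:
        raise ValueError(f"unknown missing_method: {missing_method!r}")

    levels = list(categories)
    has_missing = any(v is None for v in values)

    if has_missing and missing_method == "fail":
        # "fail": refuse to encode a column that contains missing values.
        raise ValueError("missing values found in categorical column")
    if has_missing and missing_method == "convert":
        # "convert": missing values become a category of their own.
        levels.append(missing_name)

    index = {level: j for j, level in enumerate(levels)}
    rows = []
    for v in values:
        row = [0] * len(levels)
        if v is None:
            if missing_method == "convert":
                row[index[missing_name]] = 1
            # "zero": leave every indicator at 0 for this row.
        else:
            row[index[v]] = 1
        rows.append(row)
    return levels, rows


values = ["a", None, "b"]
print(encode_categorical(values, ["a", "b"], missing_method="zero"))
# (['a', 'b'], [[1, 0], [0, 0], [0, 1]])
print(encode_categorical(values, ["a", "b"], missing_method="convert"))
# (['a', 'b', '(MISSING)'], [[1, 0, 0], [0, 0, 1], [0, 1, 0]])
```

With `"zero"`, a missing row contributes nothing to any level's coefficient, whereas `"convert"` adds one extra column and therefore one extra coefficient to estimate.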
11 changes: 10 additions & 1 deletion README.md
@@ -68,7 +68,7 @@ Why did we choose the name `glum`? We wanted a name that had the letters GLM and
 >>>
 >>> _ = model.fit(X=X, y=y)
 >>>
->>> # .report_diagnostics shows details about the steps taken by the iterative solver
+>>> # .report_diagnostics shows details about the steps taken by the iterative solver.
 >>> diags = model.get_formatted_diagnostics(full_report=True)
 >>> diags[['objective_fct']]
 objective_fct
@@ -79,6 +79,15 @@ n_iter
 3 0.443681
 4 0.443498
 5 0.443497
+>>>
+>>> # Models can also be built with formulas from formulaic.
+>>> model_formula = GeneralizedLinearRegressor(
+... family='binomial',
+... l1_ratio=1.0,
+... alpha=0.001,
+... formula="bedrooms + np.log(bathrooms + 1) + bs(sqft_living, 3) + C(waterfront)"
+... )
+>>> _ = model_formula.fit(X=house_data.data, y=y)

 ```

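To make the structure of the formula string in the README example easier to see, the sketch below splits it into its top-level additive terms. It is only an illustration under simplifying assumptions: `split_top_level_terms` is a hypothetical helper, not part of glum or formulaic, and it handles nothing beyond `+` outside parentheses, whereas a real Wilkinson-formula parser also understands operators such as `*`, `:` and `-`.

```python
def split_top_level_terms(formula):
    """Split a formula's right-hand side at '+' signs outside parentheses.

    Hypothetical helper for illustration only; it is not how formulaic
    parses formulas.
    """
    terms, current, depth = [], [], 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "+" and depth == 0:
            # A '+' at nesting depth 0 separates two terms.
            terms.append("".join(current).strip())
            current = []
        else:
            # Everything else, including '+' inside a call, belongs to the term.
            current.append(ch)
    terms.append("".join(current).strip())
    return [t for t in terms if t]


formula = "bedrooms + np.log(bathrooms + 1) + bs(sqft_living, 3) + C(waterfront)"
for term in split_top_level_terms(formula):
    print(term)
# bedrooms
# np.log(bathrooms + 1)
# bs(sqft_living, 3)
# C(waterfront)
```

The `+` inside `np.log(bathrooms + 1)` is ordinary arithmetic within a transformation, so it stays part of a single term. A term such as `C(waterfront)` refers to the whole categorical variable rather than to one of its levels.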
3 changes: 2 additions & 1 deletion conda.recipe/meta.yaml
@@ -35,7 +35,8 @@ requirements:
     - pandas
     - scikit-learn >=0.23
     - scipy
-    - tabmat >=3.1.0, <4.0.0
+    - formulaic >=0.6
+    - tabmat >=4.0.0a3

 test:
   requires: