
method='LU' memory use in spreg.ML_Error()? #82

Open
rsbivand opened this issue Sep 2, 2021 · 7 comments

rsbivand commented Sep 2, 2021

I was trying to fit an ML error model with 71' observations, but memory use grew very quickly. This may be because some kind of parallelization comes into play (it shouldn't), but it feels more like the weights matrix going dense. The messages before I killed the process were:

> py_mlerror <- spr$ML_Error(y, X, w=nb_q0, method="LU")
/usr/lib64/python3.9/site-packages/scipy/optimize/_minimize.py:779: RuntimeWarning: Method 'bounded' does not support relative tolerance in x; defaulting to absolute tolerance.
  warn("Method 'bounded' does not support relative tolerance in x; "
/usr/lib64/python3.9/site-packages/scipy/sparse/_index.py:125: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_arrayXarray(i, j, x)
/usr/lib64/python3.9/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:318: SparseEfficiencyWarning: splu requires CSC matrix format
  warn('splu requires CSC matrix format', SparseEfficiencyWarning)
/usr/lib64/python3.9/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:215: SparseEfficiencyWarning: spsolve is more efficient when sparse b is in the CSC matrix format
  warn('spsolve is more efficient when sparse b '

I think the sparse weights matrix is CSR, not CSC. Is the problem in the densifying of the variance-covariance matrix, around the line `a = -self.lam * W`? Could a finite difference Hessian help? Does spinv() go dense on return?
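
To see the CSR/CSC point concretely, a minimal sketch (lat2W only provides a small example W in place of the real GAL file; this is not spreg internals):

    import scipy.sparse as sp
    from scipy.sparse.linalg import splu
    from libpysal.weights import lat2W

    # libpysal stores W.sparse as CSR, while splu wants CSC -- hence the
    # SparseEfficiencyWarning in the traceback above.
    w = lat2W(30, 30, rook=True)
    w.transform = "r"
    print(type(w.sparse))                        # a CSR matrix

    a = sp.identity(w.n, format="csc") - 0.5 * w.sparse.tocsc()
    lu = splu(a)                                 # no format warning once a is CSC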


ljwolf commented Sep 2, 2021

Does this happen with other backends? If not, it may be specific to the scipy.sparse.SuperLU solver.

Could you post the GAL file? I'm happy to generate fake data to correspond to it.


ljwolf commented Sep 2, 2021

> Does spinv() go dense on return?

It should remain sparse, since the input remains sparse at line 192, but I can instrument this.


pedrovma commented Sep 2, 2021

> Does spinv() go dense on return?
>
> It should remain sparse, since the input remains sparse at line 192, but I can instrument this.

If the input is a sparse object, spinv() will return a sparse object, as can be seen in sputils.py, lines 233-237.

    if SP.issparse(a):
        ai = SPla.inv(a)
    else:
        ai = la.inv(a)
    return ai

However, depending on the matrix, its inverse can be effectively dense (nearly every entry non-zero), even though it is still returned as a sparse object.
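
A small sketch of that point, using a random sparse W as a stand-in for a real weights matrix:

    import scipy.sparse as sp
    import scipy.sparse.linalg as SPla

    # The inverse of a sparse (I - lam*W) comes back as a sparse *container*,
    # but for a connected W essentially every entry is non-zero, so it costs
    # about as much memory as a dense matrix.
    n = 500
    W = sp.random(n, n, density=0.01, format="csc", random_state=1)
    a = sp.identity(n, format="csc") - 0.5 * W

    ai = SPla.inv(a)              # what spinv() does for sparse input
    print(type(ai))               # still a sparse matrix object
    print(ai.nnz / (n * n))       # fill fraction, close to 1 for connected W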


rsbivand commented Sep 3, 2021

Yes, (I - \lambda W)^{-1} is dense by definition, so spinv() returns a sparse matrix with all elements non-zero. @ljwolf - here is the compressed GAL file.
large_nb.zip

I think that for method='LU', the variance-covariance matrix for the betas is all that can be returned if N is large. An LR test of OLS against ML tests for the inclusion of \lambda.
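
A minimal sketch of the LR test described here; the logll values are assumed to be the log-likelihoods reported by the spreg OLS and ML_Error results objects (check the spreg docs for the attribute names):

    from scipy.stats import chi2

    def lr_test_lambda(logll_ols, logll_err, df=1):
        """Likelihood-ratio test of the error model against OLS: one restriction (lambda = 0)."""
        stat = 2.0 * (logll_err - logll_ols)
        pval = chi2.sf(stat, df)
        return stat, pval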


ljwolf commented Sep 3, 2021

> Does spinv() go dense on return?
>
> It should remain sparse, since the input remains sparse at line 192, but I can instrument this.
>
> If the input is a sparse object, spinv() will return a sparse object, as can be seen in sputils.py, lines 233-237.
>
>     if SP.issparse(a):
>         ai = SPla.inv(a)
>     else:
>         ai = la.inv(a)
>     return ai
>
> However, depending on the matrix, its inverse will be dense, albeit still a sparse object.

Ahh, I see... In that case, @rsbivand's suggestion is very reasonable, I think. We should avoid the rest of the baseclass (e.g., avoid taking spinv(ai)) for folks who request method="LU".

My hope in implementing LU was that it would avoid instantiating a dense inverse, both during the optimization step and elsewhere. The ord and full methods immediately instantiate w.full().

We'd need to do this as well for ml_lag.py, where the weights are also densified in all methods except LU.
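
For reference, a hedged sketch of what the LU path is meant to buy (not the spreg implementation itself): the log-Jacobian ln|I - \lambda W| from a sparse LU factorisation, with no dense matrix ever formed.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    def lu_log_jacobian(lam, W):
        """ln|I - lam*W| via sparse LU; W is the sparse weights matrix."""
        n = W.shape[0]
        a = sp.identity(n, format="csc") - lam * W.tocsc()
        lu = splu(a)
        # With unit-diagonal L, |det(A)| is the product of |diag(U)|
        # (the permutation signs drop out under abs()).
        return np.sum(np.log(np.abs(lu.U.diagonal())))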


ljwolf commented Sep 3, 2021

> Yes, (I - \lambda W)^{-1} is dense by definition, so spinv() returns a sparse matrix with all elements non-zero. @ljwolf - here is the compressed GAL file.
> large_nb.zip
>
> I think that for method='LU', the variance-covariance matrix for the betas is all that can be returned if N is large. An LR test of OLS against ML tests for the inclusion of \lambda.

Thank you for the file! I'll use it as the test case for this.


rsbivand commented Sep 3, 2021

The way around this for large N may be an FD Hessian, if one (that works) is available in a Python package you already use - maybe from a numerical optimizer? Unlike the line search, it starts from the \lambda and \beta we have and searches close to them to estimate the shape of the joint distribution. Here we only need \lambda (and maybe \sigma^2), because there are no interactions with the var-covar of the \betas.
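
A minimal sketch of the FD idea; concentrated_loglik is a placeholder for whatever function the optimizer already maximises, not a spreg API:

    def fd_var_lambda(concentrated_loglik, lam_hat, h=1e-4):
        """Approximate Var(lambda_hat) from a central finite-difference second
        derivative of the concentrated log-likelihood at the ML estimate."""
        f_plus = concentrated_loglik(lam_hat + h)
        f_mid = concentrated_loglik(lam_hat)
        f_minus = concentrated_loglik(lam_hat - h)
        d2 = (f_plus - 2.0 * f_mid + f_minus) / h ** 2
        return -1.0 / d2          # inverse of the observed information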

While you are looking at that bit of ML_Error, maybe consider adding the Pace-LeSage Hausman test, at least for 'full' and 'ord'. For 'LU', it is possible to approximate it by repeatedly applying \lambda W to an x vector or matrix (provided the powers give a dampened series, no explosions please!), as the powered W never gets stored, only accumulated into the target vector (up to an arbitrary cutoff). Unfortunately, as it tests for differences between the OLS and *_error \betas, it often finds mis-specification.
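
A minimal sketch of the powering idea (placeholder names; it assumes |\lambda| times the spectral radius of W is below 1, so the series dampens):

    import numpy as np

    def series_solve(W, lam, x, max_power=50, tol=1e-12):
        """Approximate (I - lam*W)^{-1} x by accumulating x + lam*W x + (lam*W)^2 x + ...
        Only matrix-vector products are formed; powers of W are never stored."""
        out = x.copy()
        term = x.copy()
        for _ in range(max_power):
            term = lam * (W @ term)
            out += term
            if np.max(np.abs(term)) < tol:
                break
        return out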
