Generate multiple predictors and coefficients and outputs with a single function call #127

Open
kgoldfeld opened this issue Dec 30, 2021 · 3 comments

Comments

@kgoldfeld
Owner

kgoldfeld commented Dec 30, 2021

I would like to implement some version of a function I created in a blog post a while back.

This is how I started the post: I was contacted about the possibility of creating a simple function in simstudy to generate a large data set that could include tens or hundreds of potential predictors and an outcome. In this function, only a subset of the variables would actually be predictors. The idea is to be able to easily generate data for exploring ridge regression, Lasso regression, or other “regularization” methods. Alternatively, this can be used to generate correlated data very quickly (with one line of code) without going through the definition process.

In the post, I created the function genMultPred. I would like to implement something similar to this in simstudy.
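
To make the idea concrete, here is a minimal, language-neutral sketch of the kind of one-call generator described above. It is not the genMultPred code from the blog post and not simstudy's API: the function name `gen_mult_pred`, its parameters, the standard-normal predictors, and the uniform coefficient range are all assumptions chosen for illustration. For simplicity the predictors are drawn independently, so the correlated-data use case is not covered here.

```python
import random


def gen_mult_pred(n, n_vars, n_true, seed=None, noise_sd=1.0):
    """Hypothetical helper (not part of simstudy).

    Generates n rows with n_vars candidate predictors, of which only
    n_true (chosen at random) have a nonzero effect on the outcome.
    Returns (rows, coefs), where each row is a (predictors, outcome) pair.
    """
    if not 0 <= n_true <= n_vars:
        raise ValueError("n_true must be between 0 and n_vars")
    rng = random.Random(seed)

    # Choose which candidate columns are true predictors.
    true_idx = rng.sample(range(n_vars), n_true)

    # Coefficients are zero everywhere except the chosen columns.
    # The (-1, 1) range is an arbitrary assumption.
    coefs = [0.0] * n_vars
    for j in true_idx:
        coefs[j] = rng.uniform(-1.0, 1.0)

    rows = []
    for _ in range(n):
        # Independent standard-normal predictors (an assumption).
        x = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        # Linear outcome plus Gaussian noise.
        y = sum(b * v for b, v in zip(coefs, x)) + rng.gauss(0.0, noise_sd)
        rows.append((x, y))
    return rows, coefs


# One call produces 200 rows with 50 candidate predictors, 5 of them real.
rows, coefs = gen_mult_pred(n=200, n_vars=50, n_true=5, seed=1)
```

Returning the coefficient vector alongside the data lets a user check afterwards which variables a regularization method should have recovered.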

@kgoldfeld kgoldfeld added the feature feature request or enhancement label Dec 30, 2021
@kgoldfeld kgoldfeld self-assigned this Dec 30, 2021
@assignUser
Collaborator

I have skimmed the blog post and it looks interesting. I'm guessing that they wanted to use this in an ML context?

My only issue with this is that it does not adhere to the usual API/workflow of simstudy. That is of course possible, but we should think about how to handle these non-definition-table functions so that we don't add a bunch of different functions that all work differently and are hard to remember and maintain.

@kgoldfeld
Owner Author

I agree - but that cat is already out of the bag, with functions like genOrdCat, genMarkov, and genSplines. I totally get your point, but this is something that could be quite useful to folks. Are you thinking it would be better in a different package, like simstudyExtra?

@assignUser
Collaborator

I agree - but that cat is already out of the bag

😹 That is true, and I'm not sure how to improve that situation. I think an extra package is too much at this point. Maybe we can homogenize the API of these functions in some way for simstudy 2.0? I'll think about it, but I think it is a useful function!
