Commit

Merge pull request #76 from kgoldfeld/joss-submission
Joss submission
kgoldfeld authored Oct 26, 2020
2 parents 75b5d8d + b2d3382 commit 529ee03
Showing 12 changed files with 794 additions and 16 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -18,3 +18,5 @@
^tests/\.lintr$
^File_management$
^simstudy\.code-workspace$
+^codemeta\.json$
+^paper$
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Type: Package
Package: simstudy
Title: Simulation of Study Data
-Version: 0.2.1.9000
-Date: 2020-10-07
+Version: 0.2.2
+Date: 2020-10-26
Authors@R:
c(person(given = "Keith",
family = "Goldfeld",
3 changes: 2 additions & 1 deletion NEWS.md
@@ -1,4 +1,5 @@
-# simstudy (development version)
+# simstudy 0.2.2
+* Improve documentation and vignettes.

# simstudy 0.2.1
* Add 'backports' for compatibility with R < 4.0
4 changes: 3 additions & 1 deletion R/add_correlated_data.R
@@ -292,13 +292,15 @@ addCorFlex <- function(dt, defs, rho = 0, tau = NULL, corstr = "cs",
#' @param method Two methods are available to generate correlated data. (1) "copula" uses
#' the multivariate Gaussian copula method that is applied to all other distributions; this
#' applies to all available distributions. (2) "ep" uses an algorithm developed by
-#' Emrich and Piedmonte.
+#' Emrich and Piedmonte (1991).
#' @param formSpec The formula (as a string) that was used to generate the binary
#' outcome in the `defDataAdd` statement. This is only necessary when method "ep" is
#' requested.
#' @param periodvar A string value that indicates the name of the field that indexes
#' the repeated measurement for an individual unit. The value defaults to "period".
#' @return Original data.table with added column(s) of correlated data
+#' @references Emrich LJ, Piedmonte MR. A Method for Generating High-Dimensional
+#' Multivariate Binary Variates. The American Statistician 1991;45:302-4.
#' @examples
#' # Wide example
#'
4 changes: 3 additions & 1 deletion R/generate_correlated_data.R
@@ -250,10 +250,12 @@ genCorFlex <- function(n, defs, rho = 0, tau = NULL, corstr = "cs", corMatrix =
#' @param method Two methods are available to generate correlated data. (1) "copula" uses
#' the multivariate Gaussian copula method that is applied to all other distributions; this
#' applies to all available distributions. (2) "ep" uses an algorithm developed by
-#' Emrich and Piedmonte.
+#' Emrich and Piedmonte (1991).
#' @param idname Character value that specifies the name of the id variable.
#'
#' @return data.table with added column(s) of correlated data
+#' @references Emrich LJ, Piedmonte MR. A Method for Generating High-Dimensional
+#' Multivariate Binary Variates. The American Statistician 1991;45:302-4.
#' @examples
#' set.seed(23432)
#' l <- c(8, 10, 12)
4 changes: 3 additions & 1 deletion README.Rmd
@@ -16,6 +16,7 @@ knitr::opts_chunk$set(
<!-- badges: start -->
[![R build status](https://github.com/kgoldfeld/simstudy/workflows/R-CMD-check/badge.svg?branch=main)](https://github.com/kgoldfeld/simstudy/actions){target="_blank"}
[![CRAN status](https://www.r-pkg.org/badges/version/simstudy)](https://CRAN.R-project.org/package=simstudy){target="_blank"}
+[![status](https://joss.theoj.org/papers/640fd4333948933b2817343e86df3424/status.svg)](https://joss.theoj.org/papers/640fd4333948933b2817343e86df3424){target="_blank"}
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/grand-total/simstudy)](https://CRAN.R-project.org/package=simstudy){target="_blank"}
[![codecov](https://codecov.io/gh/kgoldfeld/simstudy/branch/main/graph/badge.svg)](https://codecov.io/gh/kgoldfeld/simstudy){target="_blank"}
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://www.tidyverse.org/lifecycle/#stable){target="_blank"}
@@ -25,7 +26,8 @@ The `simstudy` package is a collection of functions that allow users to generate

Simulation using `simstudy` has two fundamental steps. The user (1) **defines** the data elements of a data set and (2) **generates** the data based on these definitions. Additional functionality exists to simulate observed or randomized **treatment assignment/exposures**, to create **longitudinal/panel** data, to create **multi-level/hierarchical** data, to create datasets with **correlated variables** based on a specified covariance structure, to **merge** datasets, to create data sets with **missing** data, and to create non-linear relationships with underlying **spline** curves.

-The overarching philosophy of `simstudy` is to create data generating processes that mimic the typical models used to fit those types of data. So, the parameterization of some of the data generating processes may not follow the standard parameterizations for the specific distributions. For example, in `simstudy` *gamma*-distributed data are generated based on the specification of a mean &mu; (or log(&mu;)) and a dispersion $d$, rather than shape &alpha; and rate &beta; parameters that more typically characterize the *gamma* distribution. When we estimate the parameters, we are modeling &mu; (or some function of &mu;), so we should explicitly recover the `simstudy` parameters used to generate the model, thus illuminating the relationship between the underlying data generating processes and the models.
+The overarching philosophy of `simstudy` is to create data generating processes that mimic the typical models used to fit those types of data. So, the parameterization of some of the data generating processes may not follow the standard parameterizations for the specific distributions. For example, in `simstudy` *gamma*-distributed data are generated based on the specification of a mean &mu; (or log(&mu;)) and a dispersion $d$, rather than shape &alpha; and rate &beta; parameters that more typically characterize the *gamma* distribution. When we estimate the parameters, we are modeling &mu; (or some function of &mu;), so we should explicitly recover the `simstudy` parameters used to generate the model, thus illuminating the relationship between the underlying data generating processes and the models. For more details on the
+package, use cases, examples, and function reference see the [documentation page](https://kgoldfeld.github.io/simstudy/articles/simstudy.html).
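
The gamma example in the paragraph above can be made concrete with a short sketch. It is not part of the committed README, and it assumes the dispersion $d$ is defined so that the variance equals $d\mu^2$, which is my understanding of the convention behind simstudy's `gammaGetShapeRate()` helper. Under that assumption, the mean/dispersion pair maps onto the usual shape/rate pair as:

$$
\alpha = \frac{1}{d}, \qquad \beta = \frac{1}{d\mu}, \qquad E[Y] = \frac{\alpha}{\beta} = \mu, \qquad \operatorname{Var}(Y) = \frac{\alpha}{\beta^{2}} = d\mu^{2}
$$

For instance, $\mu = 10$ and $d = 0.5$ would correspond to $\alpha = 2$ and $\beta = 0.2$, giving a mean of 10 and a variance of 50. A model fitted to such data estimates $\mu$ directly, which is why the mean-based parameterization lines up with the fitted model.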


## Installation
19 changes: 11 additions & 8 deletions README.md
@@ -9,6 +9,7 @@ simstudy
status](https://github.com/kgoldfeld/simstudy/workflows/R-CMD-check/badge.svg?branch=main)](https://github.com/kgoldfeld/simstudy/actions)
[![CRAN
status](https://www.r-pkg.org/badges/version/simstudy)](https://CRAN.R-project.org/package=simstudy)
+[![status](https://joss.theoj.org/papers/640fd4333948933b2817343e86df3424/status.svg)](https://joss.theoj.org/papers/640fd4333948933b2817343e86df3424)
[![CRAN
downloads](https://cranlogs.r-pkg.org/badges/grand-total/simstudy)](https://CRAN.R-project.org/package=simstudy)
[![codecov](https://codecov.io/gh/kgoldfeld/simstudy/branch/main/graph/badge.svg)](https://codecov.io/gh/kgoldfeld/simstudy)
@@ -48,7 +49,9 @@ typically characterize the *gamma* distribution. When we estimate the
parameters, we are modeling μ (or some function of μ), so we should
explicitly recover the `simstudy` parameters used to generate the model,
thus illuminating the relationship between the underlying data
-generating processes and the models.
+generating processes and the models. For more details on the package,
+use cases, examples, and function reference see the [documentation
+page](https://kgoldfeld.github.io/simstudy/articles/simstudy.html).

## Installation

@@ -83,16 +86,16 @@ dd <- trtAssign(dd, nTrt = 4, grpName = "grp", balanced = TRUE)
dd
#> id x y grp
#> 1: 1 11.191960 8.949389 4
-#> 2: 2 10.418375 7.372060 2
-#> 3: 3 8.512109 6.925844 4
+#> 2: 2 10.418375 7.372060 4
+#> 3: 3 8.512109 6.925844 3
#> 4: 4 11.361632 9.850340 4
-#> 5: 5 9.928811 6.515463 2
+#> 5: 5 9.928811 6.515463 4
#> ---
-#> 246: 246 8.220609 7.898416 4
-#> 247: 247 8.531483 8.681783 4
-#> 248: 248 10.507370 8.552350 4
+#> 246: 246 8.220609 7.898416 2
+#> 247: 247 8.531483 8.681783 2
+#> 248: 248 10.507370 8.552350 3
#> 249: 249 8.621339 6.652300 1
-#> 250: 250 9.508164 7.083845 4
+#> 250: 250 9.508164 7.083845 3
```

## Contributing & Support