This help file describes the distributions used for data creation in simstudy.

Arguments

formula

The desired mean as a Number, or an R expression for the mean as a String. Variables defined via defData() and variables within the parent environment (prefixed with ..) can be used within the formula. Functions from the parent environment can be used without a prefix.

variance

Number. Default is 0.

link

String identifying the link function to be used. Default is identity.

Details

For details about the statistical distributions, see stats::distributions; any non-statistical distributions are explained below. The required variables and expected format for each distribution can be found in this table:

name           formula                 format                               variance          link
beta           mean                    String or Number                     dispersion value  identity or logit
binary         probability for 1       String or Number                     NA                identity or logit
binomial       probability of success  String or Number                     number of trials  identity or logit
categorical    probabilities           p_1;p_2;..;p_n                       NA                NA
exponential    mean (lambda)           String or Number                     NA                identity or log
gamma          mean                    String or Number                     dispersion value  identity or log
mixture        formula                 x_1 | p_1 + x_2 | p_2 ... x_n | p_n  NA                NA
negBinomial    mean                    String or Number                     dispersion value  identity or log
nonrandom      formula                 String or Number                     NA                NA
normal         mean                    String or Number                     variance          NA
noZeroPoisson  mean                    String or Number                     NA                identity or log
poisson        mean                    String or Number                     NA                identity or log
uniform        range                   from;to                              NA                NA
uniformInt     range                   from;to                              NA                NA
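
As a rough illustration of how the formula formats and link options in the table might be combined, the sketch below defines a categorical variable, a uniform variable, and two outcomes that depend on the uniform variable through a non-identity link. The variable names (grp, u, y, n), the coefficients, and the seed are arbitrary choices for this example and are not part of the package documentation.

```r
library(simstudy)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# categorical: probabilities separated by ";" (here three categories)
def <- defData(varname = "grp", formula = "0.2;0.3;0.5", dist = "categorical")

# uniform: range given as "from;to"
def <- defData(def, varname = "u", formula = "10;20", dist = "uniform")

# binary with a logit link: the formula is on the log-odds scale
def <- defData(def, varname = "y", formula = "-2 + 0.1 * u",
               dist = "binary", link = "logit")

# poisson with a log link: the formula is on the log-mean scale
def <- defData(def, varname = "n", formula = "0.5 + 0.05 * u",
               dist = "poisson", link = "log")

dd <- genData(500, def)
head(dd)
```

With a non-identity link, the formula is evaluated on the link scale and then transformed back, so the coefficients above are log-odds and log-mean values rather than probabilities or counts.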

Mixture

The mixture distribution makes it possible to mix two or more previously defined distributions/variables. Each variable that should be part of the new distribution, x_1,...,x_n, is assigned a probability p_1,...,p_n. For more information see rdatagen.net.
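
A minimal sketch of a mixture definition, assuming two normal components: x1 and x2 are defined first, and xMix then takes the value of x1 with probability 0.3 and the value of x2 with probability 0.7. The variable names, means, probabilities, and seed are illustrative assumptions only.

```r
library(simstudy)

set.seed(2718)  # arbitrary seed, only so the sketch is reproducible

# Two previously defined component variables
def <- defData(varname = "x1", formula = 0, variance = 1, dist = "normal")
def <- defData(def, varname = "x2", formula = 10, variance = 1, dist = "normal")

# Mixture: each row takes x1 with probability 0.3 or x2 with probability 0.7
def <- defData(def, varname = "xMix", formula = "x1 | 0.3 + x2 | 0.7",
               dist = "mixture")

dd <- genData(1000, def)

# Share of rows in which the mixture took its value from x2
mean(dd$xMix == dd$x2)
```

Because the two component means are far apart in this sketch, a histogram of xMix should show two clearly separated modes, with the larger one around 10.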

Examples

ext_var <- 2.9
def <- defData(varname = "external", formula = "3 + log(..ext_var)", variance = .5)
def
#>     varname            formula variance   dist     link
#> 1: external 3 + log(..ext_var)      0.5 normal identity
genData(5, def)
#>    id external
#> 1:  1 4.864108
#> 2:  2 3.665887
#> 3:  3 5.448120
#> 4:  4 3.531613
#> 5:  5 5.689579