Create correlated data

genCorData(
  n,
  mu,
  sigma,
  corMatrix = NULL,
  rho,
  corstr = "ind",
  cnames = NULL,
  idname = "id"
)

Arguments

n

Number of observations

mu

A vector of means. The length of mu determines the number of variables generated (k in the Value section).

sigma

Standard deviation of the variables. If the standard deviation differs across variables, enter a vector with the same length as the mean vector mu. If the standard deviation is constant across variables, a single value can be entered.

corMatrix

A correlation matrix, entered directly. It must be symmetric and positive semi-definite. This argument is optional; if no matrix is provided, a correlation structure (corstr) and a correlation coefficient (rho) must be specified.

rho

Correlation coefficient, -1 <= rho <= 1. Use if corMatrix is not provided.

corstr

Correlation structure of the variance-covariance matrix defined by sigma and rho. Options include "ind" for an independence structure, "cs" for a compound symmetry structure, and "ar1" for an autoregressive structure.

cnames

Explicit column names. A single string with names separated by commas. If no string is provided, the default names will be V#, where # represents the column.

idname

The name of the index (id) column. Defaults to "id".

Value

A data.table with n rows and k + 1 columns, where k is the number of means in the vector mu.
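
To make the corstr options concrete, the following base R sketch builds the correlation matrix that each structure is usually understood to imply, then scales one of them by sigma to get a variance-covariance matrix. The helper buildCorMatrix is hypothetical and not part of the package; it assumes the standard definitions (identity for "ind", a common rho for every pair under "cs", and rho^|i - j| under "ar1") and is not necessarily how genCorData constructs the matrix internally.

```r
# Hypothetical helper (not part of the package): build a k x k correlation
# matrix for the structures named by corstr, using their usual definitions.
buildCorMatrix <- function(k, rho = 0, corstr = c("ind", "cs", "ar1")) {
  corstr <- match.arg(corstr)
  lag <- abs(outer(seq_len(k), seq_len(k), "-"))  # |i - j| for every pair
  switch(corstr,
    ind = diag(k),                   # no correlation between variables
    cs  = ifelse(lag == 0, 1, rho),  # same rho for every pair of variables
    ar1 = rho^lag                    # correlation decays with |i - j|
  )
}

csMat  <- buildCorMatrix(3, rho = 0.7, corstr = "cs")
ar1Mat <- buildCorMatrix(3, rho = 0.7, corstr = "ar1")

# Scale a correlation matrix to a variance-covariance matrix using sigma
sigma <- c(1, 2, 3)
csCov <- diag(sigma) %*% csMat %*% diag(sigma)

csMat
ar1Mat
csCov
```

Under "cs" every off-diagonal entry equals rho, while under "ar1" variables that are further apart in column order are less strongly correlated.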

Examples

mu <- c(3, 8, 15)
sigma <- c(1, 2, 3)

corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)

dtcor1 <- genCorData(1000, mu = mu, sigma = sigma, rho = .7, corstr = "cs")
dtcor2 <- genCorData(1000, mu = mu, sigma = sigma, corMatrix = corMat)

dtcor1
#> Key: <id>
#>          id       V1        V2       V3
#>       <int>    <num>     <num>    <num>
#>    1:     1 1.916677  7.757074 14.27253
#>    2:     2 4.050262  8.437462 15.40863
#>    3:     3 4.213073 10.506273 20.85371
#>    4:     4 3.273890 10.538799 17.32416
#>    5:     5 2.277470  7.527803 12.02371
#>   ---                                  
#>  996:   996 3.884086 10.527082 16.25926
#>  997:   997 3.291662  6.878752 15.52653
#>  998:   998 3.675540  9.156428 18.98782
#>  999:   999 3.064628  5.729347 16.10881
#> 1000:  1000 3.396324  8.167362 14.20209
dtcor2
#> Key: <id>
#>          id       V1        V2       V3
#>       <int>    <num>     <num>    <num>
#>    1:     1 1.434526 10.638850 12.90710
#>    2:     2 3.688735 10.542666 19.22236
#>    3:     3 4.739764  8.660715 17.35989
#>    4:     4 3.171097  8.833131 16.50428
#>    5:     5 2.891700  8.491296 13.76842
#>   ---                                  
#>  996:   996 3.351541  7.919003 15.17333
#>  997:   997 2.865899  9.738079 17.07419
#>  998:   998 2.404324 11.875803 17.07040
#>  999:   999 2.634460  4.532410 13.98190
#> 1000:  1000 1.564944  9.248632 13.21610

round(var(dtcor1[, .(V1, V2, V3)]), 3)
#>       V1    V2    V3
#> V1 0.985 1.470 2.085
#> V2 1.470 4.310 4.513
#> V3 2.085 4.513 9.105
round(cor(dtcor1[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.71 0.70
#> V2 0.71 1.00 0.72
#> V3 0.70 0.72 1.00

round(var(dtcor2[, .(V1, V2, V3)]), 3)
#>       V1    V2    V3
#> V1 1.002 0.318 2.287
#> V2 0.318 3.740 3.096
#> V3 2.287 3.096 8.169
round(cor(dtcor2[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.16 0.80
#> V2 0.16 1.00 0.56
#> V3 0.80 0.56 1.00
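
Before passing a matrix as corMatrix, it can be useful to confirm that it meets the stated requirements and to work out the covariance it implies. The sketch below uses only base R and re-creates corMat and sigma from the example; the tolerance of 1e-8 is an arbitrary choice for illustration, not a value taken from the package.

```r
# Same target matrix and standard deviations as in the example above
corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)
sigma <- c(1, 2, 3)

# Symmetry and positive semi-definiteness (no negative eigenvalues,
# up to a small numerical tolerance)
isSymmetric(corMat)
eigVals <- eigen(corMat, symmetric = TRUE, only.values = TRUE)$values
all(eigVals >= -1e-8)

# Covariance implied by corMat and sigma: element [i, j] is
# corMat[i, j] * sigma[i] * sigma[j]
targetCov <- corMat * outer(sigma, sigma)
targetCov
```

The off-diagonal entries of targetCov (0.4, 2.4, and 3.6) are the population values that the sample covariances of dtcor2 estimate, so the two will differ by sampling error.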