Add correlated data to existing data.table
addCorData(
  dtOld,
  idname,
  mu,
  sigma,
  corMatrix = NULL,
  rho,
  corstr = "ind",
  cnames = NULL
)
dtOld: Data table to which the new columns will be appended.
idname: Character name of the id field in dtOld.
mu: A vector of means. The length of mu sets the number of new variables (nvars).
sigma: Standard deviation of the variables. If the standard deviation differs across variables, enter a vector with the same length as the mean vector mu. If it is constant across variables, a single value can be entered.
corMatrix: Correlation matrix, which can be entered directly. It must be symmetric and positive semi-definite. It is not required; if no matrix is provided, a correlation structure (corstr) and a correlation coefficient (rho) must be specified.
rho: Correlation coefficient, -1 <= rho <= 1. Used only if corMatrix is not provided.
corstr: Correlation structure of the variance-covariance matrix defined by sigma and rho. Options are "ind" for an independence structure, "cs" for a compound symmetry structure, and "ar1" for an autoregressive structure.
cnames: Explicit column names, given as a single string with the names separated by commas. If no string is provided, the default names are V#, where # is the column number.
The original data table with the additional correlated columns
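To make the three corstr options concrete, here is a minimal base R sketch of the correlation matrices those names conventionally denote. The helper build_cor is hypothetical and not part of the package; it assumes the usual definitions (identity matrix for "ind", a constant off-diagonal rho for "cs", and rho^|i - j| for "ar1").

```r
# Hypothetical helper (not part of the package): builds the correlation
# matrix conventionally implied by each structure name.
build_cor <- function(nvars, rho = 0, corstr = c("ind", "cs", "ar1")) {
  corstr <- match.arg(corstr)
  idx <- seq_len(nvars)
  R <- switch(corstr,
    ind = diag(nvars),                    # no correlation between variables
    cs  = matrix(rho, nvars, nvars),      # the same rho for every pair
    ar1 = rho^abs(outer(idx, idx, "-"))   # rho decays with the lag |i - j|
  )
  diag(R) <- 1  # every variable has correlation 1 with itself
  R
}

R_cs  <- build_cor(3, rho = 0.7, corstr = "cs")
R_ar1 <- build_cor(3, rho = 0.7, corstr = "ar1")
R_cs
R_ar1
```

Under these assumptions, "cs" gives 0.7 for every pair, while "ar1" gives 0.7 for adjacent variables and 0.7^2 = 0.49 for variables two positions apart.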
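library(simstudy)

# Define a uniform variable on [10, 20] and a normal variable whose mean
# depends on it; the id column is named "myID".
def <- defData(varname = "xUni", dist = "uniform", formula = "10;20", id = "myID")
def <- defData(def,
  varname = "xNorm", formula = "xUni * 2", dist = "normal",
  variance = 8
)
dt <- genData(250, def)

# Target means and standard deviations for the three new columns.
mu <- c(3, 8, 15)
sigma <- c(1, 2, 3)

# Compound symmetry: every pair of new columns has correlation 0.7.
dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  rho = .7, corstr = "cs"
)
dtAdd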
#> Key: <myID>
#>       myID     xUni    xNorm        V1       V2        V3
#>      <int>    <num>    <num>     <num>    <num>     <num>
#>   1:     1 18.71272 40.45995 2.7635129 8.619277 17.375267
#>   2:     2 14.41956 28.60571 1.4287025 5.142282  8.795482
#>   3:     3 18.30312 39.64001 3.8848843 8.747522 15.071081
#>   4:     4 15.44651 30.88715 4.2535092 8.770428 19.759517
#>   5:     5 12.37611 22.32004 3.0483489 9.851409 17.994274
#>  ---
#> 246:   246 12.53342 25.69416 2.1163018 7.070542 13.418117
#> 247:   247 17.89778 35.65456 1.4176411 5.732233 10.919709
#> 248:   248 16.37297 35.87852 2.3538621 8.981300 13.203232
#> 249:   249 14.28580 25.76195 0.8940699 2.449442  9.448039
#> 250:   250 12.44104 30.32635 2.9852786 8.897935 13.380629
round(var(dtAdd[, .(V1, V2, V3)]), 3)
#>       V1    V2    V3
#> V1 0.912 1.320 1.994
#> V2 1.320 3.761 3.969
#> V3 1.994 3.969 8.924
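For comparison with the sample variance-covariance matrix above, the target covariance under compound symmetry can be worked out by hand. This sketch assumes the standard relation that each covariance equals the correlation times the two standard deviations involved; the object names R and Sigma are illustrative.

```r
sigma <- c(1, 2, 3)
rho <- 0.7

# Compound symmetry correlation matrix: 1 on the diagonal, rho elsewhere.
R <- matrix(rho, nrow = 3, ncol = 3)
diag(R) <- 1

# Scale each correlation by the two standard deviations involved:
# Sigma[i, j] = R[i, j] * sigma[i] * sigma[j].
Sigma <- R * outer(sigma, sigma)
dimnames(Sigma) <- list(c("V1", "V2", "V3"), c("V1", "V2", "V3"))
Sigma
```

The target variances are 1, 4, and 9, and the target covariances are 1.4, 2.1, and 4.2. The sample values above (for example 1.320 for V1 and V2) are close to these targets, and with 250 rows some deviation from sampling variation is expected.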
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.71 0.70
#> V2 0.71 1.00 0.69
#> V3 0.70 0.69 1.00
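# AR-1: correlation decays with the distance between columns
# (0.7 for adjacent columns, 0.7^2 = 0.49 for columns two apart).
dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  rho = .7, corstr = "ar1"
)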
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.68 0.50
#> V2 0.68 1.00 0.71
#> V3 0.50 0.71 1.00
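One standard way to draw data with a given correlation structure is to multiply independent standard normal draws by a Cholesky factor of the target covariance matrix. The sketch below uses only base R and is not necessarily how addCorData generates its columns; the seed, sample size, and object names are illustrative assumptions.

```r
set.seed(123)  # arbitrary seed, for reproducibility only

mu <- c(3, 8, 15)
sigma <- c(1, 2, 3)
rho <- 0.7
n <- 100000

# AR-1 correlation matrix: entry (i, j) is rho^|i - j|.
R <- rho^abs(outer(1:3, 1:3, "-"))
Sigma <- R * outer(sigma, sigma)

# chol() returns an upper-triangular U with t(U) %*% U equal to Sigma,
# so rows of Z %*% U have covariance Sigma when Z holds iid N(0, 1) draws.
U <- chol(Sigma)
Z <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
X <- sweep(Z %*% U, 2, mu, "+")  # shift each column by its target mean

round(cor(X), 2)
round(colMeans(X), 1)
```

With a large n, the sample correlations and means should land close to the AR-1 targets; with only 250 rows, as in the example above, deviations of a few hundredths in the correlations are ordinary.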
# Supply the correlation matrix directly instead of rho and corstr.
corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)
dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  corMatrix = corMat
)
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.14 0.81
#> V2 0.14 1.00 0.54
#> V3 0.81 0.54 1.00
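Because a matrix passed as corMatrix must be symmetric and positive semi-definite, it can be worth checking both properties before the call. The following is a minimal base R sketch of such a check; the tolerance and the names is_symmetric and is_psd are illustrative choices, not part of the package.

```r
corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)

# Symmetry: the matrix must equal its transpose.
is_symmetric <- isSymmetric(corMat)

# Positive semi-definiteness: no eigenvalue may be negative
# (a small tolerance absorbs floating-point error).
eig <- eigen(corMat, symmetric = TRUE, only.values = TRUE)$values
is_psd <- all(eig >= -1e-8)

is_symmetric
is_psd
```

Both checks pass for the matrix used above. A matrix whose pairwise correlations are mutually inconsistent would show a negative eigenvalue and fail the second check.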