Add correlated data to existing data.table
addCorData(
  dtOld,
  idname,
  mu,
  sigma,
  corMatrix = NULL,
  rho,
  corstr = "ind",
  cnames = NULL
)
dtOld: Data table to which the new columns will be appended.
idname: Character name of the id field in dtOld.
mu: A vector of means, one per new variable; the length of mu determines how many columns are added.
sigma: Standard deviation of the variables. If the standard deviation differs for each variable, enter a vector with the same length as the mean vector mu. If the standard deviation is constant across variables, a single value can be entered.
corMatrix: A correlation matrix can be entered directly. It must be symmetric and positive semi-definite. It is not required; if no matrix is provided, a structure (corstr) and a correlation coefficient (rho) must be specified.
rho: Correlation coefficient, -1 <= rho <= 1. Use if corMatrix is not provided.
corstr: Correlation structure of the variance-covariance matrix defined by sigma and rho. Options include "ind" for an independence structure, "cs" for a compound symmetry structure, and "ar1" for an autoregressive structure.
cnames: Explicit column names. A single string with names separated by commas. If no string is provided, the default names will be V#, where # represents the column.
The original data table with the additional correlated columns
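The three corstr options can be illustrated with a short base-R sketch. This is only an illustration of the structures named above, not the package's internal code: buildCorMat is a hypothetical helper, and the covariance step assumes the usual relationship in which the covariance matrix is the correlation matrix scaled by the standard deviations in sigma.

```r
# Hypothetical helper (not part of simstudy): build the correlation matrix
# for nvars variables under one of the three named structures.
buildCorMat <- function(nvars, rho = 0, corstr = "ind") {
  if (corstr == "ind") {
    # independence: identity matrix, rho is ignored
    diag(nvars)
  } else if (corstr == "cs") {
    # compound symmetry: the same rho for every pair of variables
    m <- matrix(rho, nrow = nvars, ncol = nvars)
    diag(m) <- 1
    m
  } else if (corstr == "ar1") {
    # autoregressive: correlation decays as rho^|i - j|
    rho^abs(outer(seq_len(nvars), seq_len(nvars), "-"))
  } else {
    stop("corstr must be one of 'ind', 'cs', or 'ar1'")
  }
}

sigma <- c(1, 2, 3)
R_cs <- buildCorMat(3, rho = 0.7, corstr = "cs")
R_ar1 <- buildCorMat(3, rho = 0.7, corstr = "ar1")

# Assumed relationship: covariance = D %*% R %*% D, with D = diag(sigma)
covCs <- diag(sigma) %*% R_cs %*% diag(sigma)
```

Under these assumptions, covCs has variances 1, 4, and 9 on the diagonal, and R_ar1 has 0.7 between adjacent variables and 0.49 between the first and third. The sample variances and correlations printed in the examples should land near these values, up to sampling error.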
library(simstudy)

# Define a uniform variable and a normal variable whose mean depends on it
def <- defData(varname = "xUni", dist = "uniform", formula = "10;20", id = "myID")
def <- defData(def,
  varname = "xNorm", formula = "xUni * 2", dist = "normal",
  variance = 8
)

dt <- genData(250, def)
# Add three correlated columns using a compound symmetry structure
mu <- c(3, 8, 15)
sigma <- c(1, 2, 3)

dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  rho = .7, corstr = "cs"
)

dtAdd
#>      myID     xUni    xNorm       V1       V2       V3
#>   1:    1 16.15367 35.03730 1.883515 5.411752 10.25452
#>   2:    2 13.05035 26.64136 1.833143 7.015956 15.62153
#>   3:    3 14.06781 30.45742 1.467921 8.785178 14.51930
#>   4:    4 14.92061 30.17468 2.215781 6.163279 12.18902
#>   5:    5 16.39664 34.84301 2.677982 6.872958 13.59774
#>  ---
#> 246:  246 19.67817 46.16123 2.114978 6.353228 14.72503
#> 247:  247 12.97295 24.62927 2.327359 7.395162 10.68424
#> 248:  248 19.70675 41.63103 3.884023 9.288276 17.62831
#> 249:  249 17.37592 32.50707 2.000462 8.102178 13.31731
#> 250:  250 12.99690 21.80780 4.073380 8.475931 20.29220
round(var(dtAdd[, .(V1, V2, V3)]), 3)
#>       V1    V2    V3
#> V1 1.022 1.365 2.068
#> V2 1.365 3.420 3.690
#> V3 2.068 3.690 8.571
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.73 0.70
#> V2 0.73 1.00 0.68
#> V3 0.70 0.68 1.00
# Same means and standard deviations, now with an autoregressive structure
dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  rho = .7, corstr = "ar1"
)
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.66 0.46
#> V2 0.66 1.00 0.68
#> V3 0.46 0.68 1.00
# Supply the correlation matrix directly instead of rho and corstr
corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)

dtAdd <- addCorData(dt, "myID",
  mu = mu, sigma = sigma,
  corMatrix = corMat
)
round(cor(dtAdd[, .(V1, V2, V3)]), 2)
#>      V1   V2   V3
#> V1 1.00 0.24 0.82
#> V2 0.24 1.00 0.60
#> V3 0.82 0.60 1.00
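
Because corMatrix must be symmetric and positive semi-definite, it can be useful to check a hand-built matrix before passing it in. The following is a minimal base-R sketch of such a check. isValidCorMat is a hypothetical helper, not a simstudy function, and the tolerance is an arbitrary assumption to absorb floating-point error.

```r
# Hypothetical helper (not part of simstudy): TRUE if m is a square,
# symmetric matrix with unit diagonal and no negative eigenvalues.
isValidCorMat <- function(m, tol = 1e-8) {
  if (!is.matrix(m) || nrow(m) != ncol(m)) {
    return(FALSE)
  }
  if (!isSymmetric(m)) {
    return(FALSE)
  }
  if (!isTRUE(all.equal(diag(m), rep(1, nrow(m))))) {
    return(FALSE)
  }
  # positive semi-definite: all eigenvalues >= 0 (up to tolerance)
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev >= -tol)
}

# The matrix used in the example above passes the check
corMat <- matrix(c(1, .2, .8, .2, 1, .6, .8, .6, 1), nrow = 3)
okMat <- isValidCorMat(corMat)

# An inconsistent set of pairwise correlations fails it
badMat <- matrix(c(1, .9, -.9, .9, 1, .9, -.9, .9, 1), nrow = 3)
okBad <- isValidCorMat(badMat)
```

Here okMat is TRUE and okBad is FALSE: the second matrix is symmetric with a unit diagonal, but its pairwise correlations cannot all hold at once, so it has a negative eigenvalue.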