R/generate_data.R
genDataDensity.Rd
Data are generated from a density defined by a vector of integers.
genDataDensity(
n,
dataDist,
varname,
uselimits = FALSE,
id = "id",
na.rm = TRUE
)
n: Integer. Number of samples to draw from the density.
dataDist: Numeric vector. Defines the desired density.
varname: Character. Name of the variable.
uselimits: Logical. If TRUE, the minimum and maximum of the input data vector are used as limits for sampling. Defaults to FALSE, in which case a smoothed density that extends beyond these limits is used.
id: Character. A string specifying the field that serves as the record ID. The default field is "id".
na.rm: Logical. If TRUE (default), missing values in `dataDist` are removed. If FALSE, the data will retain the same proportion of missing values.
A data.table with the generated data.
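One way to read the `na.rm = FALSE` behavior is sketched below in base R. This is an illustration under stated assumptions, not the package's internal code: the vector `data_with_na` and the resampling step are hypothetical stand-ins, and the package may assign missing values differently (for example, at random rather than as an exact count).

```r
# Illustrative sketch only: mimic "retain the same proportion of missing values".
set.seed(42)

# Hypothetical input with 2 of 10 values missing
data_with_na <- c(1, 2, NA, 3, 4, 4, NA, 5, 6, 7)
prop_missing <- mean(is.na(data_with_na))

n <- 500

# Stand-in for the density draw: resample the observed (non-missing) values
observed <- data_with_na[!is.na(data_with_na)]
draws <- sample(observed, size = n, replace = TRUE)

# Set a matching share of the generated values to NA
n_missing <- round(n * prop_missing)
draws[sample.int(n, n_missing)] <- NA

# Proportion of missing values in the generated data
mean(is.na(draws))
```

With `na.rm = TRUE`, the equivalent sketch would simply skip the last step and return only non-missing draws.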
data_dist <- c(1, 2, 2, 3, 4, 4, 4, 5, 6, 6, 7, 7, 7, 8, 9, 10, 10)
genDataDensity(500, data_dist, varname = "x1", id = "id")
#> Key: <id>
#> id x1
#> <int> <num>
#> 1: 1 6.921986
#> 2: 2 3.706854
#> 3: 3 9.234597
#> 4: 4 6.103531
#> 5: 5 10.658962
#> ---
#> 496: 496 6.691994
#> 497: 497 6.571446
#> 498: 498 10.888954
#> 499: 499 7.930779
#> 500: 500 5.261284
genDataDensity(500, data_dist, varname = "x1", uselimits = TRUE, id = "id")
#> Key: <id>
#> id x1
#> <int> <num>
#> 1: 1 2.379838
#> 2: 2 6.618362
#> 3: 3 3.917192
#> 4: 4 7.605761
#> 5: 5 5.332133
#> ---
#> 496: 496 6.368137
#> 497: 497 6.432943
#> 498: 498 7.343834
#> 499: 499 9.681368
#> 500: 500 4.174617
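The difference between the two calls above can be illustrated with a base-R sketch built on `stats::density()`. This is not the package's implementation; the helper `sample_from_density()` and its `use_limits` argument are hypothetical names introduced only for this example, and sampling is done from the evaluation grid of the kernel density estimate, so the draws are limited to the grid points.

```r
# Illustrative sketch only: sample from a kernel density estimate,
# with and without restricting the estimate to the observed range.
set.seed(123)

data_dist <- c(1, 2, 2, 3, 4, 4, 4, 5, 6, 6, 7, 7, 7, 8, 9, 10, 10)

# Hypothetical helper (not part of any package)
sample_from_density <- function(n, x, use_limits = FALSE) {
  d <- if (use_limits) {
    # Evaluate the density only between the observed minimum and maximum
    stats::density(x, from = min(x), to = max(x))
  } else {
    # Default grid extends beyond the observed range (cut = 3 bandwidths)
    stats::density(x)
  }
  # Draw grid points with probability proportional to the estimated density
  sample(d$x, size = n, replace = TRUE, prob = d$y)
}

unrestricted <- sample_from_density(500, data_dist)
restricted <- sample_from_density(500, data_dist, use_limits = TRUE)

range(unrestricted)  # may fall outside the observed range of data_dist
range(restricted)    # stays within the observed range of data_dist
```

This mirrors the example output above, where the first call produces values above 10 and the second does not.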