Simulate a dataset from a log-ratio model.
Usage
simu(
  n = 100,
  p = 200,
  model = "linear",
  weak = 4,
  strong = 6,
  weaksize = 0.125,
  strongsize = 0.25,
  pct.sparsity = 0.5,
  rho = 0,
  timedep_slope = NULL,
  timedep_cor = NULL,
  longitudinal_stability = TRUE,
  ncov = 0,
  betacov = 0,
  intercept = FALSE
)
Arguments
- n
An integer giving the sample size.
- p
An integer giving the number of features (taxa).
- model
Type of model for the outcome variable. One of "linear", "binomial", "cox", "finegray", or "timedep" (survival endpoint with time-dependent features).
- weak
Number of features with weak effect size.
- strong
Number of features with strong effect size.
- weaksize
Actual effect size for features with weak effect size. Must be positive.
- strongsize
Actual effect size for features with strong effect size. Must be positive.
- pct.sparsity
Percentage of zero counts for each sample.
- rho
Parameter controlling the correlated structure between taxa. Ranges between 0 and 1.
- timedep_slope
If model is "timedep", this parameter specifies the slope for the feature trajectories. Please refer to the Simulation section of the manuscript for more details.
- timedep_cor
If model is "timedep", this parameter specifies the sample-wise correlations between longitudinal features. Please refer to the Simulation section of the manuscript for more details.
- longitudinal_stability
If model is "timedep", this is a binary indicator which determines whether the trajectories are more stable (TRUE) or more volatile (FALSE).
- ncov
Number of covariates that are not compositional features.
- betacov
Coefficients corresponding to the covariates that are not compositional features.
- intercept
Boolean. If TRUE, then a random intercept will be generated in the model. Only works for linear or binomial models.
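As an illustration of how several of these arguments combine, the sketch below simulates a dataset with a binary outcome. It assumes the FLORAL package, which provides simu(), is installed. The argument values are arbitrary choices for demonstration, not recommendations, and the object name dat_bin is hypothetical.

```r
# Assumes the FLORAL package, which provides simu(), is installed.
library(FLORAL)

set.seed(23420)

# Binary outcome with a lower pct.sparsity than the default (0.3 vs 0.5),
# mildly correlated taxa (rho = 0.2), a larger "strong" effect size,
# and a random intercept. All values are illustrative only.
dat_bin <- simu(
  n = 50,
  p = 30,
  model = "binomial",
  strongsize = 0.5,
  pct.sparsity = 0.3,
  rho = 0.2,
  intercept = TRUE
)

# Class balance of the simulated binary outcome
table(dat_bin$y)
```

Arguments left out of the call (weak, strong, weaksize, and so on) keep the defaults shown under Usage.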
Value
A list with simulated count matrix xcount, log1p-transformed count matrix x, outcome (continuous y, continuous centered y0, binary y, or survival t, d), true coefficient vector beta, list of non-zero features idx, and value of intercept intercept (if applicable).
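A minimal sketch of inspecting the returned list for a linear model, assuming the FLORAL package is installed. Only the element names listed above are used; no output is shown because the simulated values depend on the installed package version.

```r
# Assumes the FLORAL package, which provides simu(), is installed.
library(FLORAL)

set.seed(23420)
dat <- simu(n = 50, p = 30, model = "linear")

# Names of the returned components
names(dat)

# Dimensions of the raw counts and of the log1p-transformed counts
dim(dat$xcount)
dim(dat$x)

# Features flagged as having non-zero effects, as reported by simu()
dat$idx

# Positions and values of the non-zero entries of the true coefficient
# vector, computed directly from beta
nonzero <- which(dat$beta != 0)
dat$beta[nonzero]

# First few values of the continuous outcome
head(dat$y)
```

Computing the non-zero positions from beta, rather than indexing with idx, keeps the sketch independent of how idx is stored.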
References
Fei T, Funnell T, Waters N, Raj SS et al. Enhanced Feature Selection for Microbiome Data using FLORAL: Scalable Log-ratio Lasso Regression. bioRxiv 2023.05.02.538599.
Examples
set.seed(23420)
dat <- simu(n = 50, p = 30, model = "linear")