Sample individual level data with joint effects matching a sim_mv object
Source: R/resample_inddata.R
Usage
resample_inddata(
  N,
  dat = NULL,
  genos = NULL,
  J = NULL,
  R_LD = NULL,
  af = NULL,
  sim_func = gen_genos_mvn,
  new_env_var = NULL,
  new_h2 = NULL,
  new_R_E = NULL,
  new_R_obs = NULL,
  calc_sumstats = FALSE
)
Arguments
- N
Sample size: a scalar, a vector, or a data frame in the special sample size format, see details.
- dat
An object of class `sim_mv` (produced by `sim_mv`). If `dat` is omitted, the function will generate a matrix of genotypes only. If `dat` is provided, phenotypes for the traits in `dat` will also be included.
- genos
Optional matrix of pre-generated genotypes. If `genos` is supplied, `resample_inddata` will only generate phenotypes.
- J
Optional number of variants. `J` is only required if `dat` is missing.
- R_LD
LD pattern (optional). See `?sim_mv` for more details.
- af
Allele frequencies. `af` is required unless `genos` is supplied.
- new_env_var
Optional. The environmental variance in the new population. If missing, the function will assume the environmental variance is the same as in the old population.
- new_h2
Optional. The heritability in the new population. Provide at most one of `new_env_var` and `new_h2`.
- new_R_E
Optional. The environmental correlation in the new population. If missing, the function will assume the environmental correlation is the same as in the original data.
- new_R_obs
Optional. The overall trait correlation in the new population. Specify at most one of `new_R_E` or `new_R_obs`. If missing, the function will assume the environmental correlation is the same as in the original data.
- calc_sumstats
If `TRUE`, associations between genotypes and phenotypes will be calculated and returned.
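The `new_env_var` and `new_h2` arguments describe the same quantity in two ways: under the usual definition of heritability, h2 = V_G / (V_G + V_E), fixing one determines the other once the genetic variance V_G is known. The following is a minimal base R sketch of that relationship. The helper names `h2_to_env_var` and `env_var_to_h2` are hypothetical and are not part of the package, and the sketch is not a description of what `resample_inddata` does internally.

```r
# Hypothetical helpers (not package functions), assuming
# h2 = V_G / (V_G + V_E) for a single trait.

# Environmental variance implied by a target heritability
h2_to_env_var <- function(h2, genetic_var) {
  stopifnot(all(h2 > 0 & h2 <= 1), all(genetic_var >= 0))
  genetic_var * (1 - h2) / h2
}

# Heritability implied by a given environmental variance
env_var_to_h2 <- function(env_var, genetic_var) {
  stopifnot(all(env_var >= 0), all(genetic_var >= 0))
  genetic_var / (genetic_var + env_var)
}

# Illustrative values only
v_g <- 0.2
v_e <- h2_to_env_var(h2 = 0.05, genetic_var = v_g)
h2_back <- env_var_to_h2(env_var = v_e, genetic_var = v_g)
```

Here `h2_back` recovers the target heritability of 0.05, which shows why supplying both arguments at once would over-specify the new population.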
Details
This function can be used to generate individual level genotype and phenotype data. It can be used in three modes:
To generate genotype data only: No `sim_mv` object needs to be included. Supply only `N` as a single integer for the number of individuals, `J` for the number of variants, `af`, and `R_LD` if desired. All other parameters are not relevant if there is no phenotype, so supplying them results in an error. The returned object will include an `N` x `J` matrix of genotypes and a vector of allele frequencies.
To generate both genotype and phenotype data: Supply `dat` (a `sim_mv` object) and leave `genos` missing. `N` and `af` are required and all other options are optional.
To generate phenotype data only: Supply `dat` (a `sim_mv` object) and provide a matrix of genotypes to the `genos` argument. The number of rows in `genos` must be equal to the total number of individuals implied by `N`. For example, if there are two traits with 10 samples each and no overlap, `genos` should have 20 rows. The `R_LD` and `af` arguments should contain the population LD and allele frequencies used to produce the genotypes. These are used to compute the genetic variance-covariance matrix. `N` and `af` are required and all other options are optional.
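To illustrate why the population LD and allele frequencies matter for the genetic variance-covariance matrix, here is a minimal base R sketch of one standard calculation. It assumes genotypes coded 0/1/2 under Hardy-Weinberg equilibrium, so that variant j has variance 2 * af_j * (1 - af_j), and it assumes per-allele joint effects. All values and object names (`B`, `Sigma_G`, `V_G`) are illustrative, and this is not claimed to be the package's internal computation.

```r
# Illustrative inputs (assumed, not taken from the package)
R_LD <- matrix(0.5, nrow = 5, ncol = 5)  # population LD (correlation) matrix
diag(R_LD) <- 1
af <- rep(0.3, 5)                        # population allele frequencies

# Assumed joint per-allele effects: 5 variants (rows) by 2 traits (columns)
B <- cbind(c(0.1, 0, 0, -0.2, 0),
           c(0, 0.05, 0, 0, 0.1))

# Genotype standard deviations under Hardy-Weinberg equilibrium
sd_geno <- sqrt(2 * af * (1 - af))

# Genotype covariance matrix: scale the LD correlation by the SDs
Sigma_G <- diag(sd_geno) %*% R_LD %*% diag(sd_geno)

# Genetic variance-covariance matrix of the traits (2 x 2)
V_G <- t(B) %*% Sigma_G %*% B
```

The diagonal of `V_G` holds the genetic variance of each trait and the off-diagonal entries hold the genetic covariance. Changing `af` or `R_LD` changes `V_G` even when the effects in `B` stay fixed, which is consistent with the message in the examples that the genetic variance in the new population can differ from that in the old one.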
Examples
# Use resample_inddata to generate genotypes only
simple_ld <- matrix(0.5, nrow = 5, ncol = 5)
diag(simple_ld) <- 1
genos_only <- resample_inddata(N = 8,
                               J = 20,
                               R_LD = list(simple_ld),
                               af = rep(0.3, 5))
#> Loading required package: hapsim
#> Loading required package: MASS
#> Generating genotype matrix only.
# Generate genotypes and phenotypes
dat <- sim_mv(N = 0,
              G = 1,
              J = 20,
              pi = 0.5,
              h2 = 0.05,
              R_LD = list(simple_ld),
              af = rep(0.3, 5))
#> SNP effects provided for 20 SNPs and 1 traits.
genos_and_phenos <- resample_inddata(dat = dat,
                                     N = 8,
                                     R_LD = list(simple_ld),
                                     af = rep(0.3, 5))
#> Generating both genotypes and phenotypes.
#> SNP effects provided for 20 SNPs and 1 traits.
#> Genetic variance in the new population differs from the genetic variance in the old population.
#> I will assume that the environmental variance is the same in the old and new population.
#> I will assume that environmental correlation is the same in the old and new population. Note that this could result in different overall trait correlations.
# Generate phenotypes only
phenos_only <- resample_inddata(dat = dat,
                                genos = genos_only$X,
                                N = 8,
                                R_LD = list(simple_ld),
                                af = rep(0.3, 5))
#> Generating phenotypes only.
#> SNP effects provided for 20 SNPs and 1 traits.
#> Genetic variance in the new population differs from the genetic variance in the old population.
#> I will assume that the environmental variance is the same in the old and new population.
#> I will assume that environmental correlation is the same in the old and new population. Note that this could result in different overall trait correlations.