Convert GWAS summary statistics to a standard format

Format GWAS summary statistics for CAUSE

Usage

gwas_format(
  X,
  snp,
  beta_hat,
  se,
  A1,
  A2,
  chrom,
  pos,
  p_value,
  sample_size,
  allele_freq,
  output_file,
  compute_pval = TRUE
)

Arguments

X: A data.frame of GWAS summary statistics
snp: Column name containing SNP ID
beta_hat: Column name containing effect estimate
se: Column name containing standard error of beta_hat
A1: Column name containing effect allele
A2: Column name containing other allele
chrom: Chromosome column (optional)
pos: Position column (optional)
p_value: p-value column (optional)
sample_size: Sample size column (optional) or an integer
allele_freq: Allele frequency column (optional)
output_file: File to write the formatted data to. If missing, the formatted data will be returned.
compute_pval: Logical. If TRUE and p_value is missing, compute the p-value using a normal approximation. Defaults to TRUE.
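
As an illustration of what a normal-approximation p-value typically looks like, the sketch below computes a two-sided p-value from the z-score beta_hat / se in base R. It assumes a two-sided Wald test; the exact computation used inside gwas_format is not stated on this page, and the numbers are made up.

```r
# Hypothetical effect estimates and standard errors (illustrative values only).
beta_hat <- c(0.12, -0.05, 0.30)
se <- c(0.04, 0.05, 0.10)

# Wald z-score for each variant.
z <- beta_hat / se

# Two-sided p-value under a standard normal reference distribution.
# This is an assumption about what "normal approximation" means here.
p_value <- 2 * pnorm(-abs(z))

print(data.frame(beta_hat, se, z, p_value))
```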

Value

A data frame with columns chrom, pos, snp, A1, A2, beta_hat, se, p_value, and sample_size with all SNPs aligned so that A is the effect allele. This is ready to be used with gwas_merge with formatted = TRUE.
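
A minimal call might look like the sketch below. The data frame and its column names (rsid, effect, stderr, ea, oa) are hypothetical, and passing column names as character strings is an assumption based on the argument descriptions above. The call is guarded so that the example still runs when the cause package is not installed.

```r
# Toy summary statistics; column names and values are made up for illustration.
X <- data.frame(
  rsid = c("rs1", "rs2", "rs3"),
  effect = c(0.12, -0.05, 0.30),
  stderr = c(0.04, 0.05, 0.10),
  ea = c("A", "C", "G"),   # effect allele
  oa = c("G", "T", "A"),   # other allele
  stringsAsFactors = FALSE
)

# Only call gwas_format if the cause package is available.
if (requireNamespace("cause", quietly = TRUE)) {
  # Optional arguments (chrom, pos, p_value, ...) are left missing here.
  formatted <- cause::gwas_format(
    X,
    snp = "rsid",
    beta_hat = "effect",
    se = "stderr",
    A1 = "ea",
    A2 = "oa"
  )
  print(head(formatted))
}
```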

Details

This function will try to merge data sets X1 and X2 on the specified columns. Where necessary, it will flip the sign of effects so that the effect allele is the same in both data sets. It will remove variants with ambiguous alleles (G/C or A/T) and variants whose alleles do not match between data sets (e.g. A/G in one data set and A/C in the other). It will not remove variants that are simply strand flipped between the two data sets (e.g. A/C in one data set, T/G in the other).
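
The allele rules described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names complement and harmonize are hypothetical, the inputs are assumed to be already matched by SNP, and the output is expressed relative to the alleles of the first data set.

```r
# Hypothetical helper: complement of each allele (strand flip).
complement <- function(a) chartr("ACGT", "TGCA", a)

# Hypothetical helper: align effects in data set y to the alleles of data set x.
harmonize <- function(a1_x, a2_x, a1_y, a2_y, beta_y) {
  # Ambiguous variants: the two alleles are complements (A/T or G/C).
  ambiguous <- a1_x == complement(a2_x) | a1_y == complement(a2_y)

  same      <- a1_x == a1_y & a2_x == a2_y                          # identical coding
  swapped   <- a1_x == a2_y & a2_x == a1_y                          # effect allele swapped
  same_flip <- complement(a1_x) == a1_y & complement(a2_x) == a2_y  # strand flipped
  swap_flip <- complement(a1_x) == a2_y & complement(a2_x) == a1_y  # flipped and swapped

  # Keep unambiguous variants whose alleles can be reconciled.
  keep <- !ambiguous & (same | swapped | same_flip | swap_flip)

  # Flip the sign of the effect when the effect allele differs.
  beta_aligned <- ifelse(swapped | swap_flip, -beta_y, beta_y)

  data.frame(A1 = a1_x, A2 = a2_x, beta_y = beta_aligned)[keep, ]
}

out <- harmonize(
  a1_x = c("A", "A", "A", "A", "A", "A"),
  a2_x = c("G", "G", "C", "T", "G", "C"),
  a1_y = c("A", "G", "T", "A", "A", "G"),
  a2_y = c("G", "A", "G", "T", "C", "T"),
  beta_y = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
)
print(out)
```

In this toy input, the fourth variant (A/T) is dropped as ambiguous and the fifth (A/G versus A/C) is dropped as mismatched, while the strand-flipped third and sixth variants are kept.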