# Genetic models

## Bayesian alphabet framework

### Summary of Effect Size Distributions for Polygenic Modeling

| Model | Formula | Effect size distribution |
|---|---|---|
| BayesA | \(\beta_i \sim t(\nu, \sigma^2_{\alpha})\) | t distribution |
| BayesB, BayesDπ | \(\beta_i \sim \pi\, t(0, \nu, \sigma^2_{\alpha}) + (1 - \pi)\, \delta_0\) | Point-t mixture |
| BayesC | \(\beta_i \sim \pi\, t(0, \nu, \sigma^2_{\alpha}) + (1 - \pi)\, t(0, \nu, 0.01\sigma^2_{\alpha})\) | t mixture |
| BayesCπ, BVSR | \(\beta_i \sim \pi N(0, \sigma^2_{\alpha}) + (1 - \pi)\, \delta_0\) | Point-normal |
| Bayesian Lasso | \(\beta_i \sim DE(0, \theta)\) | Double exponential (Laplace) |
| BayesR, SBayesR, SBayesRC | \(\beta_i \sim \pi_1 N(0, \sigma^2_{\alpha}) + \pi_2 N(0, 0.1\sigma^2_{\alpha}) + \pi_3 N(0, 0.01\sigma^2_{\alpha}) + (1 - \sum_{c=1}^{3} \pi_c)\, \delta_0\) | Point-normal mixture |
| LMM, BLUP, Ridge regression | \(\beta_i \sim N(0, \sigma^2_{\alpha})\) | Normal |
| NEG | \(\beta_i \sim NEG(0, \kappa, \theta)\) | Normal-exponential-gamma |
| BSLMM | \(\beta_i \sim \pi N(0, \sigma^2_{\alpha} + \sigma^2_{\beta}) + (1 - \pi) N(0, \sigma^2_{\beta})\) | Normal mixture |
| SBayesS | \(\beta_i \sim \pi N(0, [2p_i(1-p_i)]^S \sigma^2_{\beta}) + (1 - \pi)\, \delta_0\) | Point-normal mixture |
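To make these priors concrete, the sketch below draws SNP effects from the point-normal (BayesCπ/BVSR) prior and from a BayesR-style four-component mixture. The number of SNPs, the mixing proportions, and \(\sigma^2_{\alpha}\) are arbitrary illustrative values, not defaults of any particular software.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 10_000           # number of SNPs
sigma2_alpha = 1e-3  # illustrative per-SNP effect variance

# Point-normal prior (BayesCpi / BVSR): pi * N(0, sigma2) + (1 - pi) * delta_0
pi_nonzero = 0.01
nonzero = rng.random(m) < pi_nonzero
beta_cpi = np.where(nonzero, rng.normal(0.0, np.sqrt(sigma2_alpha), m), 0.0)

# BayesR-style prior: point mass plus three normal components with variances
# sigma2, 0.1*sigma2, 0.01*sigma2 (mixing weights are illustrative choices).
weights = np.array([0.001, 0.003, 0.01, 0.986])             # pi_1, pi_2, pi_3, pi_0
variances = np.array([1.0, 0.1, 0.01, 0.0]) * sigma2_alpha  # last component is delta_0
component = rng.choice(4, size=m, p=weights)
beta_r = rng.normal(0.0, np.sqrt(variances[component]))

print("point-normal: nonzero fraction =", (beta_cpi != 0).mean())
print("BayesR:       nonzero fraction =", (beta_r != 0).mean())
```

With these settings roughly 1% of effects are nonzero under the point-normal prior, while the BayesR draw spreads a similar amount of signal over three variance scales.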

### Bayesian Polygenic Models and Their Inference Methods

| Model | Inference |
|---|---|
| BayesA | Gibbs sampling with a Student-t prior on SNP effects; variance components are sampled from inverse-gamma full conditionals. |
| BayesB | Gibbs sampling with a mixture prior (point mass at zero + Student-t); requires latent indicator-variable updates. |
| BayesC | Gibbs sampling with a mixture prior (point mass at zero + normal); SNP effects and variance components are sampled iteratively. |
| BayesCπ | As BayesC, but π (the proportion of SNPs with nonzero effects) is also estimated within the Gibbs sampler. |
| BayesR | Gibbs sampling with a multi-component normal mixture prior (e.g., strong, moderate, and weak effects plus a zero component). |
| Bayesian Lasso | Gibbs sampling with an auxiliary-variable formulation of the Laplace prior. |
| NEG (Normal-Exponential-Gamma) | Hierarchical shrinkage prior estimated via Gibbs sampling. |
| BSLMM (Bayesian Sparse Linear Mixed Model) | A mix of Gibbs sampling and variational inference, depending on the implementation. |
| LMM (Linear Mixed Model) | Typically solved with REML (restricted maximum likelihood) or expectation-maximization (EM), not Gibbs sampling. |
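Most of the Gibbs samplers above share the same core move: a single-site update that, for one SNP at a time, computes the Bayes factor for inclusion with the effect integrated out, and then draws the effect from its normal full conditional. The sketch below implements that update for a point-normal (BayesC/BayesCπ-style) prior with fixed hyperparameters; the function name, toy data, and the values of π, \(\sigma^2_{\beta}\), and \(\sigma^2_{e}\) are illustrative, and a real sampler would also update the hyperparameters and run many MCMC sweeps.

```python
import numpy as np

def gibbs_update_snp(x, resid, beta_old, pi, sigma2_beta, sigma2_e, rng):
    """One single-site Gibbs update under a point-normal prior
    beta ~ pi * N(0, sigma2_beta) + (1 - pi) * delta_0 (illustrative sketch)."""
    # Add this SNP's current contribution back into the residuals.
    y_adj = resid + x * beta_old
    xtx = x @ x
    rhs = x @ y_adj
    c = xtx + sigma2_e / sigma2_beta   # shrunken left-hand side
    # Log Bayes factor for inclusion vs. exclusion, with beta integrated out.
    log_bf = 0.5 * np.log(sigma2_e / (sigma2_beta * c)) + 0.5 * rhs**2 / (sigma2_e * c)
    logit = np.clip(log_bf + np.log(pi / (1.0 - pi)), -700, 700)
    p_incl = 1.0 / (1.0 + np.exp(-logit))
    if rng.random() < p_incl:
        beta_new = rng.normal(rhs / c, np.sqrt(sigma2_e / c))
    else:
        beta_new = 0.0
    # Remove the updated contribution so the residuals stay in sync.
    return beta_new, y_adj - x * beta_new

# Toy usage on simulated data (a single sweep over all SNPs).
rng = np.random.default_rng(1)
n, m = 500, 200
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)
X -= X.mean(axis=0)
beta_true = np.zeros(m)
beta_true[:5] = 0.3
y = X @ beta_true + rng.normal(0.0, 1.0, n)

beta = np.zeros(m)
resid = y.copy()
for j in range(m):
    beta[j], resid = gibbs_update_snp(X[:, j], resid, beta[j], 0.05, 0.05, 1.0, rng)
print("nonzero effects after one sweep:", int((beta != 0).sum()))
```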

## LDSC framework and its extensions

### LD score regression and its extensions

| Method | Formula | Key definitions |
|---|---|---|
| LDSC | \(E[\chi^2_j \mid \ell_j] = \frac{N h^2}{M} \ell_j + N \alpha + 1\) | \(\chi^{2}_{j} = N\hat{\beta}^2_j, \quad \ell_j := \sum_{k=1}^{M} r_{jk}^2\) |
| HDL | \(\text{Cov}[\mathbf{z}] = \frac{N h^2}{M} \mathbf{L} + \mathbf{R}\) | \(\mathbf{L} := \mathbf{R}^\top \mathbf{R}\) |
| HDL-L | \(\text{Cov}[\mathbf{z}_1, \mathbf{z}_2] = \frac{\sqrt{N_1 N_2}\, h_{12}}{M} \mathbf{L}\) | \(\mathcal{R}(h_{12}) \equiv \frac{\mathcal{L}\left(h_{12} \mid \mathbf{z}, \hat{h}_1^2, \hat{h}_2^2\right)}{\mathcal{L}\left(\hat{h}_{12} \mid \mathbf{z}, \hat{h}_1^2, \hat{h}_2^2\right)}\) |
| S-LDSC | \(E[\chi^2_j] = N \sum_{C} \tau_C\, \ell(j, C) + N \alpha + 1\) | \(\ell(j, C) = \sum_{k \in C} r_{jk}^2\) |
| MESC | \(E[\alpha_k^2 \mid \beta_{1k}, \dots, \beta_{Gk}] = E[\alpha^2] \sum_{i=1}^{G} \beta_{ik}^2 + E[\gamma^2]\) | \(\ell(j, C) = \sum_{k \in C} r_{jk}^2\) |
| FMR | \(\frac{d}{dt} E\left(e^{itz + \frac{1}{2} t^2 \hat{\sigma}^2}\right) \approx i t N \sigma^2 \sum_K w_K\, \ell_K(t)\) | \(\ell_K(t) = \sum_j r_{j}^2\, \phi_K(r_j t)\) |
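The LDSC row can be illustrated with a small simulation: construct LD scores from a block AR(1) LD structure, generate χ² statistics from the LDSC expectation, and recover \(h^2\) as the regression slope on \(N \ell_j / M\). The block sizes, ρ values, noise model, and unweighted least-squares fit are deliberate simplifications; the actual LDSC software uses reference-panel LD scores, heteroskedasticity-aware weights, and block-jackknife standard errors, none of which are reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
n_gwas, m, h2_true = 50_000, 2_000, 0.4

# Build a block-diagonal AR(1) LD matrix so LD scores vary across SNPs
# (block size and rho values are arbitrary illustrative choices).
block_size = 100
ld_scores = np.empty(m)
for b in range(m // block_size):
    rho = rng.uniform(0.1, 0.9)
    idx = np.arange(block_size)
    R_block = rho ** np.abs(idx[:, None] - idx[None, :])
    ld_scores[b * block_size:(b + 1) * block_size] = (R_block**2).sum(axis=1)

# Simulate chi-square statistics from the LDSC expectation
# E[chi2_j | l_j] = N * h2 * l_j / M + 1 (no confounding term, for simplicity).
expected = n_gwas * h2_true * ld_scores / m + 1.0
chi2 = expected * rng.chisquare(1, m)   # crude noise model, for illustration only

# LD score regression: regress chi2_j on N * l_j / M; the slope estimates h2
# and the intercept absorbs confounding such as population stratification.
design = np.column_stack([np.ones(m), n_gwas * ld_scores / m])
(intercept, h2_hat), *_ = np.linalg.lstsq(design, chi2, rcond=None)
print(f"intercept = {intercept:.2f}, estimated h2 = {h2_hat:.3f} (true {h2_true})")
```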

## Non-uniform model and its extensions

### LDAK model and its extensions

| Method | Formula | Key definitions |
|---|---|---|
| LDAK | \(\mathbb{E}[h_j^2] \propto [f_j(1 - f_j)]^{1 + \alpha} \times w_j \times r_j\) | \(\beta_j \sim \mathcal{N}\left(0,\, r_j w_j \frac{\sigma_g^2}{W}\right), \quad W = \sum_j r_j w_j \left[2 f_j (1 - f_j)\right]^{1 + \alpha}\) |
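The sketch below shows how this LDAK-style parameterization redistributes expected heritability across SNPs as \(\alpha\) varies, by computing the normalized per-SNP shares \([2 f_j (1 - f_j)]^{1 + \alpha} w_j r_j / W\). The allele frequencies, weights \(w_j\), and info scores \(r_j\) are placeholder values; \(\alpha = -1\) corresponds to the GCTA-style assumption and \(\alpha = -0.25\) is the value commonly used with LDAK.

```python
import numpy as np

def ldak_h2_shares(freq, weights, r, alpha, sigma2_g=1.0):
    """Per-SNP expected heritability under an LDAK-style model:
    h2_j proportional to [2 f_j (1 - f_j)]^(1 + alpha) * w_j * r_j,
    normalised so the shares sum to sigma2_g (illustrative sketch)."""
    het = 2.0 * freq * (1.0 - freq)
    contrib = het ** (1.0 + alpha) * weights * r
    return sigma2_g * contrib / contrib.sum()

rng = np.random.default_rng(3)
m = 1_000
freq = rng.uniform(0.01, 0.5, m)     # allele frequencies f_j
weights = rng.uniform(0.2, 1.0, m)   # LD-based weights w_j (placeholder values)
r = np.ones(m)                       # genotype certainty / info scores r_j

# Compare the MAF dependence implied by different alpha values:
# with w = r = 1, alpha = -1 gives equal expected h2 per SNP (GCTA-style),
# while larger alpha shifts heritability toward common variants.
for alpha in (-1.0, -0.25, 0.0):
    shares = ldak_h2_shares(freq, weights, r, alpha)
    rare = shares[freq < 0.05].sum()
    print(f"alpha={alpha:+.2f}: h2 from MAF<0.05 SNPs = {rare:.3f}")
```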