Mathematical Statistics: Bayesian Linear Regression

Table of Contents

1. Some key concepts

1.1. PIP (Posterior Inclusion Probability)

1.2. FDR

1.3. Centered and scaled genotype matrix W

Ref: The additive genomic relationship matrix \(\mathbf{G}\) (VanRaden PM. 2008. J Dairy Sci. 91:4414-4423) is constructed using all genetic markers as follows: \(\mathbf{G}=\mathbf{W}\mathbf{W}^{\intercal}/m\), where \(\mathbf{W}\) is the centered and scaled genotype matrix and \(m\) is the total number of markers. Each column vector of \(\mathbf{W}\) is calculated as \(\mathbf{w}_i = (\mathbf{m}_i - 2p_i)/\sqrt{2p_i(1-p_i)}\), where \(p_i\) is the minor allele frequency of the \(i\)-th genetic marker and \(\mathbf{m}_i\) is the \(i\)-th column vector of the allele count matrix \(\mathbf{M}\), which contains the genotypes coded as 0, 1 or 2, counting the number of minor alleles.

How to scale and center genotype:

We assume the frequency of allele \(a\) is \(1 - p_i\) and the frequency of the other allele \(A\) is \(p_i\). By the Hardy-Weinberg principle, the genotypes \(aa\), \(Aa\) and \(AA\) (coded as 0, 1 and 2) follow the distribution:

Genotype    0                1                  2
Frequency   \((1-p_i)^2\)    \(2p_i(1-p_i)\)    \(p_i^2\)

The mean of the genotype is

\begin{aligned} E(\mathbf{m}_i) = 0 \cdot (1-p_i)^2 + 1 \cdot 2p_i(1-p_i) + 2 \cdot p_i^2 = 2p_i \end{aligned}

The variance of the genotype is

\begin{align} Var(\mathbf{m}_i) &= E(\mathbf{m}_i^2) - E(\mathbf{m}_i)^2 \\\ &= \big[ 0 \cdot (1-p_i)^2 + 1 \cdot 2p_i(1-p_i) + 4 \cdot p_i^2 \big] - \big[ 2p_i \big]^2 \\\ &= 2p_i(1-p_i) + 4p_i^2 - 4p_i^2 \\\ &= 2p_i(1-p_i) \end{align}

So after centering and scaling the genotypes, we get

\begin{align} \mathbf{w}_i = (\mathbf{m}_i - 2p_i)/\sqrt{2p_i(1-p_i)} \end{align}
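The construction above can be sketched numerically. A minimal sketch assuming NumPy; `center_scale_genotypes` and `grm` are hypothetical helper names, and \(p_i\) is estimated here from the column means of \(\mathbf{M}\) rather than taken as a known minor allele frequency:

```python
import numpy as np

def center_scale_genotypes(M):
    """Center and scale an allele-count matrix M (n x m, genotypes 0/1/2).

    Column i becomes w_i = (m_i - 2 p_i) / sqrt(2 p_i (1 - p_i)),
    with p_i estimated from the column mean of M.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0  # estimated allele frequency per marker
    return (M - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

def grm(W):
    """Additive genomic relationship matrix G = W W^T / m (VanRaden 2008)."""
    return W @ W.T / W.shape[1]
```

Each column of the resulting \(\mathbf{W}\) has mean zero, and under Hardy-Weinberg proportions its variance is approximately one.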

2. SBayesR

2.1. Model

\[\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e} \tag{1.1.1}\]

2.2. Assumption

\[\mathbf{e}|\sigma^2_e \sim \mathcal{N}(\mathbf{0},\mathbf{R}\sigma^2_e), \quad \mathbf{R}=\mathbf{I} \tag{1.2.1}\]

\[f(\sigma_e^2;\nu_e,S^2_e)=\frac{(S^2_e\frac{\nu_e}{2})^{\frac{\nu_e}{2}}}{\Gamma(\frac{\nu_e}{2})}\cdot \frac{\exp(-\frac{S^2_e}{\sigma_e^2}\frac{\nu_e}{2})}{(\sigma_e^2)^{1+\frac{\nu_e}{2}}}\tag{1.2.2}\]

\[\beta_j|\boldsymbol{\pi},\sigma^2_{\beta} \begin{cases} = 0, & \pi_1 \\\ \quad\vdots & \vdots \\\ \sim \mathcal{N}(0,\gamma_c\sigma^2_{\beta}), & \pi_c \\\ \quad\vdots & \vdots \\\ \sim \mathcal{N}(0,\gamma_C\sigma^2_{\beta}), & \pi_C = 1- \sum^{C-1}_{c=1} \pi_c \end{cases} \tag{1.2.3}\]

\[f(\sigma_{\beta}^2;\nu_{\beta},S^2_{\beta})=\frac{(S^2_{\beta}\frac{\nu_{\beta}}{2})^{\frac{\nu_{\beta}}{2}}}{\Gamma(\frac{\nu_{\beta}}{2})} \cdot \frac{\exp(-\frac{S^2_{\beta}}{\sigma_{\beta}^2}\frac{\nu_{\beta}}{2})}{(\sigma_{\beta}^2)^{1+\frac{\nu_{\beta}}{2}}}\tag{1.2.4}\]

\[\boldsymbol{\delta}|\boldsymbol{\pi}\sim \mathrm{Categorical}(C,\boldsymbol{\pi}); \quad f(\boldsymbol{\delta}|\boldsymbol{\pi})=\prod^p_{j=1}\prod^C_{c=1} \pi_c^{[\delta_j=c]}, \quad \text{where } [\delta_j=c]=\begin{cases} 1, & \delta_j=c \\\ 0, & \delta_j \neq c \end{cases} \tag{1.2.5}\]

\[\boldsymbol{\pi}|\boldsymbol{\alpha} \sim \mathrm{Dirichlet}(C,\boldsymbol{\alpha}),\ \boldsymbol{\alpha}=\mathbf{1}_C; \quad f(\boldsymbol{\pi}|\boldsymbol{\alpha})=\frac{1}{B(\boldsymbol{\alpha})}\prod^C_{c=1} \pi_c^{\alpha_c-1} \tag{1.2.6}\]

Here, the target parameter vector is \(\boldsymbol{\theta}=[\boldsymbol{\beta}^T \vdots \boldsymbol{\pi}^T \vdots \sigma_{\beta}^2 \vdots \sigma_e^2]^T\), and the other unknown parameters are \([\boldsymbol{\gamma}\vdots \boldsymbol{\delta}\vdots \boldsymbol{\alpha}]\). Currently, \(\boldsymbol{\gamma}\) and \(\boldsymbol{\alpha}\) are pre-specified.
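The prior structure in \((1.2.3)\) and \((1.2.5)\) can be illustrated by simulating effects from the mixture. A minimal sketch assuming NumPy; `sample_beta_prior` is a hypothetical name, and \(\boldsymbol{\gamma}=(0, 0.01, 0.1, 1)\) is used as an illustrative pre-specified scaling vector:

```python
import numpy as np

def sample_beta_prior(pi, gamma, sigma2_beta, size, rng):
    """Draw marker effects from the mixture prior (1.2.3):
    component c is picked with probability pi_c (1.2.5); component 1
    is a point mass at zero, component c > 1 is N(0, gamma_c * sigma2_beta)."""
    delta = rng.choice(len(pi), size=size, p=pi)
    sd = np.sqrt(np.asarray(gamma) * sigma2_beta)[delta]
    beta = rng.normal(0.0, sd)  # sd = 0 for the zero component
    return beta, delta

rng = np.random.default_rng(0)
beta, delta = sample_beta_prior([0.9, 0.05, 0.04, 0.01],
                                [0.0, 0.01, 0.1, 1.0], 1.0, 10_000, rng)
```

Most simulated effects are exactly zero; the remainder spread out according to the chosen \(\gamma_c\).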

2.3. Bayesian inference

2.3.1. Gibbs sampling scheme

The full conditional of \(\theta_i\) can be expressed as \[f(\theta_i|\boldsymbol{\theta}_{-i},\mathbf{y}) \propto f(\theta_i,\boldsymbol{\theta}_{-i},\mathbf{y})=f(\mathbf{y}|\boldsymbol{\theta})f(\theta_i)f(\boldsymbol{\theta}_{-i}),\tag{2.1.1}\] assuming the priors of \(\theta_i\) and \(\boldsymbol{\theta}_{-i}\) are independent.

2.3.1.1. Derive posterior update for \(\boldsymbol{\beta}\)

\[f(\mathbf{y}|\boldsymbol{\theta})=(2\pi \sigma_e^2)^{-n/2} \exp \bigg[-\frac{(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_e^2} \bigg]\tag{2.1.2}\]

\[f(\boldsymbol{\beta}|\boldsymbol{\delta}=c,\gamma_c\sigma_{\beta}^2)=\prod^{k_c}_{j=1}(2\pi\gamma_c\sigma_{\beta}^2)^{-1/2}\exp \bigg[ -\frac{\beta_j^2}{2\gamma_c\sigma^2_{\beta}}\bigg]\tag{2.1.3}\] Following \((2.1.1)\), and using formulas \((1.2.2)\), \((1.2.4)\), \((2.1.2)\) and \((2.1.3)\), the full conditional for \(\beta_j\) can be expressed as

\begin{align} \tag{2.1.4} f(\beta_j|\delta_j =c, \boldsymbol{\theta}_{-\beta_j},\mathbf{y}) &\propto \exp \bigg[-\frac{(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_e^2} \bigg] \exp \bigg[ -\frac{\beta_j^2}{2\gamma_c\sigma^2_{\beta}}\bigg] \\\ &\propto \exp \bigg\{-\frac{1}{2\sigma_e^2}\bigg[(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{y}-\mathbf{X}\boldsymbol{\beta}) +\frac{\beta_j^2\sigma_e^2}{\gamma_c\sigma^2_{\beta}}\bigg] \bigg\} \\\ &\propto \exp \bigg\{-\frac{1}{2\sigma_e^2}\bigg[(\mathbf{w}-\mathbf{x}_j\beta_j)^T(\mathbf{w}-\mathbf{x}_j\beta_j) +\frac{\beta_j^2\sigma_e^2}{\gamma_c\sigma^2_{\beta}}\bigg] \bigg\}, \quad \mathbf{w}=\mathbf{y}-\sum_{k\neq j}\mathbf{x}_k\beta_k \\\ &\propto \exp \bigg\{-\frac{1}{2\sigma_e^2}\bigg[\mathbf{w}^T\mathbf{w}-2\,\mathbf{x}^T_j \mathbf{w}\,\beta_j +\color{red}{\underbrace{\bigg(\mathbf{x}^T_j\mathbf{x}_j + \frac{\sigma_e^2}{\gamma_c\sigma^2_{\beta}}\bigg)}_{l_{jc}}}\, \beta_j^2\bigg] \bigg\} \\\ &\propto \exp \bigg\{-\frac{1}{2\sigma_e^2}\bigg[\mathbf{w}^T\mathbf{w}+ l_{jc} \bigg(\beta_j^2-2\,\color{green}{\underbrace{\frac{\mathbf{x}^T_j \mathbf{w}}{l_{jc}}}_{\hat{\beta}_j}}\,\beta_j+ \hat{\beta}_j^2\bigg) - l_{jc}\hat{\beta}_j^2 \bigg] \bigg\} \\\ &\propto \exp \bigg\{-\frac{1}{2\sigma_e^2}\bigg[\mathbf{w}^T\mathbf{w}+ l_{jc}(\beta_j-\hat{\beta}_j)^2 -l_{jc}\hat{\beta}_j^2 \bigg] \bigg\} \\\ &\propto \exp \bigg[ -\frac{1}{2}\cdot \frac{(\beta_j -\hat{\beta}_j)^2}{\sigma_e^2/l_{jc}} \bigg] \end{align}

That is, \(\beta_j|\delta_j=c,\boldsymbol{\theta}_{-\beta_j},\mathbf{y} \sim \mathcal{N}(\hat{\beta}_j,\ \sigma_e^2/l_{jc})\).
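The last line of \((2.1.4)\) is the kernel of \(\mathcal{N}(\hat{\beta}_j,\ \sigma_e^2/l_{jc})\), so the Gibbs update for a nonzero \(\beta_j\) can be sketched as follows. A minimal sketch assuming NumPy; `sample_beta_j` is a hypothetical name, and \(\mathbf{w}\) denotes the residual with marker \(j\) excluded:

```python
import numpy as np

def sample_beta_j(x_j, w, gamma_c, sigma2_beta, sigma2_e, rng):
    """Single-site Gibbs draw for beta_j given delta_j = c (c > 1), per (2.1.4).

    x_j: genotype column for marker j
    w:   residual y - sum_{k != j} x_k beta_k
    Returns a draw from N(beta_hat_j, sigma2_e / l_jc)."""
    l_jc = x_j @ x_j + sigma2_e / (gamma_c * sigma2_beta)
    beta_hat = (x_j @ w) / l_jc
    return rng.normal(beta_hat, np.sqrt(sigma2_e / l_jc))
```

In a full sampler, \(\mathbf{w}\) is maintained incrementally by adding \(\mathbf{x}_j\beta_j^{\text{old}}\) back to the residual before each update and subtracting \(\mathbf{x}_j\beta_j^{\text{new}}\) afterwards.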
2.3.1.2. Derive posterior update for \(\boldsymbol{\pi}\)

In terms of \((2.1.1)\), based on formulas \((1.2.5)\) and \((1.2.6)\), the full conditional for \(\boldsymbol{\pi}\) can be expressed as \[f(\boldsymbol{\pi}|\boldsymbol{\delta},\boldsymbol{\alpha}) \propto f(\boldsymbol{\delta}|\boldsymbol{\pi}) f(\boldsymbol{\pi}|\boldsymbol{\alpha})=\prod^p_{j=1}\prod^C_{c=1} \pi_c^{[\delta_j=c]}\prod^C_{c=1}\pi_c^{\alpha_c-1}=\prod^C_{c=1} \pi_c^{k_c+\alpha_c-1}, \quad k_c=\sum^p_{j=1}[\delta_j=c], \tag{2.1.5}\] which is the kernel of a \(\mathrm{Dirichlet}(k_1+\alpha_1,\ldots,k_C+\alpha_C)\) distribution.
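Since the full conditional of \(\boldsymbol{\pi}\) is a Dirichlet with the component counts added to \(\boldsymbol{\alpha}\), the update is a single Dirichlet draw. A minimal sketch assuming NumPy; `sample_pi` is a hypothetical name:

```python
import numpy as np

def sample_pi(delta, C, alpha, rng):
    """Gibbs update for pi per (2.1.5): pi | delta, alpha ~ Dirichlet(k + alpha),
    where k_c = #{j : delta_j = c} counts markers in component c."""
    k = np.bincount(delta, minlength=C)
    return rng.dirichlet(k + np.asarray(alpha, dtype=float))
```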

2.3.1.3. Derive posterior update for \(\sigma_{\beta}^2\)

In terms of \((2.1.1)\), based on formulas \((1.2.4)\) and \((2.1.3)\), the full conditional for \(\sigma_{\beta}^2\) can be expressed as

\begin{align} \tag{2.1.6} f(\sigma_{\beta}^2|\boldsymbol{\beta},\boldsymbol{\delta},\nu_{\beta},S^2_{\beta}) &\propto f(\boldsymbol{\beta}|\boldsymbol{\delta},\sigma_{\beta}^2)f(\sigma_{\beta}^2;\nu_{\beta},S^2_{\beta}) \\\ &\propto f(\sigma_{\beta}^2;\nu_{\beta},S^2_{\beta}) \cdot \prod^q_{j=1}(2\pi\gamma_{\delta_j}\sigma_{\beta}^2)^{-1/2}\exp \bigg[ -\frac{\beta_j^2}{2\gamma_{\delta_j}\sigma^2_{\beta}}\bigg] \\\ &\propto \frac{(S^2_{\beta}\frac{\nu_{\beta}}{2})^{\frac{\nu_{\beta}}{2}}}{\Gamma(\frac{\nu_{\beta}}{2})}\cdot \frac{\exp(-\frac{S^2_{\beta}}{\sigma_{\beta}^2}\frac{\nu_{\beta}}{2})}{(\sigma_{\beta}^2)^{1+\frac{\nu_{\beta}}{2}}} \cdot \prod^q_{j=1}(2\pi\gamma_{\delta_j})^{-1/2}\cdot (\sigma_{\beta}^2)^{-q/2}\exp \bigg[ -\frac{1}{2\sigma_{\beta}^2} \sum^q_{j=1} \frac{\beta_j^2}{\gamma_{\delta_j}} \bigg] \\\ &\propto \frac{\exp(-\frac{1}{2\sigma_{\beta}^2}\nu_{\beta}S^2_{\beta})}{(\sigma_{\beta}^2)^{1+\frac{\nu_{\beta}}{2}}} \cdot (\sigma_{\beta}^2)^{-q/2}\exp \bigg[ -\frac{1}{2\sigma_{\beta}^2} \sum^q_{j=1} \frac{\beta_j^2}{\gamma_{\delta_j}} \bigg] \\\ &\propto \frac{\exp \bigg[ -\frac{1}{2\sigma_{\beta}^2}\cdot \frac{ (\color{green}{ \nu_{\beta}+q}) (\color{red}{ \nu_{\beta}S^2_{\beta} +\sum^q_{j=1} \frac{\beta_j^2}{\gamma_{\delta_j}} }) }{\color{green}{ \nu_{\beta}+q}} \bigg] }{(\sigma_{\beta}^2)^{1 +\frac{ \color{green}{ \nu_{\beta}+q} }{2}}} \\\ &\propto \frac{\exp(-\frac{1}{2\sigma_{\beta}^2} \color{green}{ \tilde{\nu}_{\beta}} \color{red}{ \tilde{S}^2_{\beta}} )}{(\sigma_{\beta}^2)^{1+\frac{ \color{green}{ \tilde{\nu}_{\beta}} }{2}}} \end{align}

Here \(q\) is the number of markers with nonzero effects, \(\tilde{\nu}_{\beta}=\nu_{\beta}+q\) and \(\tilde{S}^2_{\beta}=(\nu_{\beta}S^2_{\beta}+\sum^q_{j=1}\beta_j^2/\gamma_{\delta_j})/\tilde{\nu}_{\beta}\); this is the kernel of a scaled inverse chi-squared distribution.
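A scaled inverse chi-squared variate can be drawn as \(\tilde{\nu}_{\beta}\tilde{S}^2_{\beta}/\chi^2_{\tilde{\nu}_{\beta}}\). A minimal sketch assuming NumPy; `sample_sigma2_beta` is a hypothetical name, and component 1 (index 0) is taken as the point mass at zero:

```python
import numpy as np

def sample_sigma2_beta(beta, gamma, delta, nu_beta, S2_beta, rng):
    """Gibbs draw for sigma2_beta per (2.1.6).

    With q nonzero effects, the full conditional is scaled inverse chi-squared
    with nu_tilde = nu_beta + q and
    nu_tilde * S2_tilde = nu_beta * S2_beta + sum_j beta_j^2 / gamma_{delta_j}."""
    nz = delta > 0  # component 1 (index 0) is the point mass at zero
    q = int(nz.sum())
    ss = nu_beta * S2_beta + np.sum(beta[nz] ** 2 / np.asarray(gamma)[delta[nz]])
    return ss / rng.chisquare(nu_beta + q)
```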
2.3.1.4. Derive posterior update for \(\sigma_e^2\)
2.3.1.5. Joint sampling of \(\delta_j\) and \(\beta_j\)

3. LDPred

3.1. Posterior of \(\pi_i\) conditioned on marginal effect \(\tilde \beta\)

3.1.1. Assumption:

The “data” consist of the univariate linear regression estimates, i.e., \[\color{red}{\tilde{\beta}_j|\beta_j \sim \mathcal{N}(\beta_j,N^{-1})}. \tag{3.1}\]

3.1.2. Conclusion:

\[E(\beta_j | \tilde \beta_j)= \bigg( \frac{1}{1+ \frac{Mp}{h^2N}}\bigg) \bar p_j \tilde \beta_j \tag{3.2}\] where \(\bar p_j = P(\beta_j \sim \mathcal{N}(\cdot,\cdot)|\tilde \beta_j)\) is the posterior probability that \(\beta_j\) is drawn from the normal (causal) component.

Proof:

Let \(\sigma^2_c= \gamma_c\sigma^2_{\beta}\)

\begin{align} \tag{3.3} f(\tilde \beta_j|p) &= \int_{-\infty}^{+\infty} f(\tilde \beta_j , \beta_j|p)\,\mathrm{d}\beta_j = \int_{-\infty}^{+\infty} f(\tilde \beta_j | \beta_j,p) f(\beta_j |p)\, \mathrm{d}\beta_j \\\ &= \int_{-\infty}^{+\infty} (2 \pi N^{-1})^{-\frac{1}{2}} \exp \bigg[-\frac{1}{2}\frac{(\tilde \beta_j - \beta_j)^2}{N^{-1}}\bigg] \cdot \Bigg\{ \sum^C_{c=2} \bigg[ p_c \cdot (2 \pi \sigma_c^2)^{-\frac{1}{2}} \exp \bigg(-\frac{1}{2}\frac{\beta_j^2}{\sigma_c^2}\bigg) \bigg]+ p_1 \cdot \delta_{\beta_j} \Bigg\} \mathrm{d}\beta_j \\\ &=\int_{-\infty}^{+\infty} \sum^C_{c=2} \bigg[ p_c (2 \pi N^{-1})^{-\frac{1}{2}} \exp \bigg[-\frac{1}{2}\frac{(\tilde \beta_j - \beta_j)^2}{N^{-1}}\bigg] (2 \pi \sigma_c^2)^{-\frac{1}{2}} \exp \bigg(-\frac{1}{2}\frac{\beta_j^2}{\sigma_c^2}\bigg) \bigg] \mathrm{d}\beta_j + \color{red}{p_1 \cdot \int_{-\infty}^{+\infty} (2 \pi N^{-1})^{-\frac{1}{2}} \exp \bigg[-\frac{1}{2}\frac{(\tilde \beta_j - \beta_j)^2}{N^{-1}}\bigg] \delta_{\beta_j} \mathrm{d}\beta_j} \\\ &= \sum^C_{c=2} \bigg[ p_c \cdot \bigg(2 \pi \frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde \beta^2_j\bigg) \bigg]+ \color{red}{p_1 \cdot (2 \pi N^{-1})^{-\frac{1}{2}} \exp\bigg(-\frac{1}{2}\frac{\tilde \beta_j^2}{N^{-1}}\bigg)} \\\ &= (2 \pi)^{-\frac{1}{2}} \Bigg\{ \sum^C_{c=2} \bigg[ p_c \bigg(\frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde \beta^2_j\bigg) \bigg]+ \color{red}{p_1 \cdot (N^{-1})^{-\frac{1}{2}} \exp \bigg(-\frac{1}{2}\frac{\tilde \beta_j^2}{N^{-1}}\bigg)} \Bigg\} \end{align}

\begin{align} \tag{3.4} \int_{-\infty}^{+\infty} \beta_j f(\tilde \beta_j | \beta_j) f(\beta_j)\,\mathrm{d} \beta_j &= \int_{-\infty}^{+\infty} \beta_j \cdot (2 \pi N^{-1})^{-\frac{1}{2}} \exp \bigg[-\frac{1}{2}\frac{(\tilde \beta_j - \beta_j)^2}{N^{-1}}\bigg] \cdot \Bigg\{ \sum^C_{c=2} \bigg[ p_c \cdot (2 \pi \sigma_c^2)^{-\frac{1}{2}} \exp \bigg(-\frac{1}{2}\frac{\beta_j^2}{\sigma_c^2}\bigg) \bigg] + p_1 \cdot \delta_{\beta_j} \Bigg\} \mathrm{d}\beta_j \\\ &= \sum^C_{c=2} \Bigg[ p_c \cdot \bigg(2 \pi \frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot \frac{N}{N\sigma_c^2+1} \tilde \beta^2_j\bigg) \int_{-\infty}^{+\infty} \beta_j \cdot \bigg(2 \pi \frac{\sigma_c^2}{N\sigma_c^2+1}\bigg)^{-\frac{1}{2}} \exp\bigg[ -\frac{1}{2} \cdot \frac{N\sigma_c^2 +1}{\sigma_c^2}\bigg(\beta_j - \frac{N\sigma_c^2}{N\sigma_c^2 +1} \tilde \beta_j\bigg)^2 \bigg] \mathrm{d}\beta_j \Bigg] \\\ &= \sum^C_{c=2} \Bigg[ p_c \cdot \bigg(2 \pi \frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot \frac{N}{N\sigma_c^2+1} \tilde \beta^2_j\bigg) \cdot E\bigg[\beta_j;\ \beta_j| \tilde \beta_j \sim \mathcal{N}\bigg(\frac{N \sigma_c^2}{N \sigma_c^2 +1}\tilde \beta_j, \frac{\sigma_c^2}{N \sigma_c^2 +1}\bigg)\bigg] \Bigg] \\\ &= \sum^C_{c=2} \Bigg[ p_c \cdot \bigg(2 \pi \frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde\beta^2_j\bigg) \cdot \frac{N \sigma_c^2}{N \sigma_c^2 +1} \tilde \beta_j \Bigg] \\\ &= \sum^C_{c=2} \Bigg[ p_c \cdot (2 \pi)^{-\frac{1}{2}} \bigg(\frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde\beta^2_j\bigg) \cdot \frac{N \sigma_c^2}{N \sigma_c^2 +1} \tilde \beta_j \Bigg] \end{align}

The point-mass component contributes nothing here because \(\beta_j \delta_{\beta_j}\) integrates to zero.

\begin{align} \tag{3.5} E(\beta_j | \tilde \beta_j) &= \frac{\int_{-\infty}^{+\infty} \beta_j f(\tilde\beta_j | \beta_j) f(\beta_j)\,\mathrm{d}\beta_j} {f(\tilde \beta_j)} \\\ &= \frac{ \sum^C_{c=2} \bigg[ p_c \bigg(\frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde\beta^2_j\bigg) \cdot \frac{N \sigma_c^2}{N \sigma_c^2 +1} \tilde \beta_j \bigg] } { \sum^C_{c=2} \bigg[ p_c \bigg(\frac{N\sigma_c^2+1}{N}\bigg)^{-\frac{1}{2}}\exp\bigg(-\frac{1}{2}\cdot\frac{N}{N\sigma_c^2+1} \tilde \beta^2_j\bigg) \bigg]+ \color{red}{p_1 \cdot (N^{-1})^{-\frac{1}{2}} \exp \bigg(-\frac{1}{2}\frac{\tilde \beta_j^2}{N^{-1}}\bigg)} } \end{align}

Alternatively, for the two-component case with effect-size variance \(\sigma^2\) (so that \(\sigma^2 = h^2/(Mp)\) recovers \((3.2)\)), we can rewrite \(E(\beta_j | \tilde \beta_j)\) as follows:

\begin{align} \tag{3.6} E(\beta_j | \tilde \beta_j) = \bigg ( \frac{1}{1 + \frac{1}{\sigma^2N}} \bigg) \bar p_j \tilde \beta_j \end{align}

where

\begin{align} \tag{3.7} \bar{p}_j &= \frac{\frac{p}{\sqrt{\sigma^2 + \frac{1}{N}}} \exp \Bigg[-\frac{1}{2}\bigg(\frac{\tilde \beta_j^2}{\sigma^2 + \frac{1}{N}}\bigg) \Bigg]} {\frac{p}{\sqrt{\sigma^2 + \frac{1}{N}}} \exp \Bigg[-\frac{1}{2}\bigg(\frac{\tilde \beta_j^2}{\sigma^2 + \frac{1}{N}}\bigg) \Bigg] + \frac{1-p}{\frac{1}{\sqrt{N}}} \exp \bigg( - \frac{1}{2}N \tilde \beta_j^2 \bigg) } \cdot \frac{\sqrt{2\pi}}{\sqrt{2\pi}} \\\ &= \frac{f \big(\tilde \beta_j \,\big|\, \beta_j \sim \mathcal{N}(\cdot,\cdot)\big)\, P\big(\beta_j \sim \mathcal{N}(\cdot,\cdot)\big)}{f(\tilde \beta_j)} = P \big(\beta_j \sim \mathcal{N}(\cdot, \cdot) \,\big|\, \tilde \beta_j \big) \end{align}
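For the two-component (point-normal) case, \((3.6)\) and \((3.7)\) can be evaluated directly. A minimal sketch assuming NumPy; `ldpred_posterior_mean` is a hypothetical name:

```python
import numpy as np

def ldpred_posterior_mean(beta_tilde, sigma2, p, N):
    """E(beta_j | beta_tilde_j) under the point-normal prior, per (3.6)-(3.7):
    beta_j = 0 with prob 1 - p, beta_j ~ N(0, sigma2) with prob p,
    and beta_tilde_j | beta_j ~ N(beta_j, 1/N)."""
    shrink = 1.0 / (1.0 + 1.0 / (sigma2 * N))
    # p_bar: posterior probability that beta_j comes from the normal component
    num = p / np.sqrt(sigma2 + 1.0 / N) * np.exp(-0.5 * beta_tilde**2 / (sigma2 + 1.0 / N))
    den = num + (1.0 - p) * np.sqrt(N) * np.exp(-0.5 * N * beta_tilde**2)
    return shrink * (num / den) * beta_tilde
```

When \(p = 1\) the multiplier reduces to the shrinkage factor alone (the infinitesimal model); smaller \(p\) shrinks small \(\tilde\beta_j\) further toward zero.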

3.2. Gibbs sampling in new model

\[\beta_j|\mathbf{b},\boldsymbol{\beta},\mathrm{else} \sim \mathcal{N}\bigg(\frac{r_j^{\star}}{C_j^{\star}},\frac{\sigma_{e_j}^{2\star}}{C_j^{\star}} \bigg) = \mathcal{N}\bigg(\frac{ r_j^{\star} / \sigma_{e_j}^{2\star} }{ C_j^{\star}/ \sigma_{e_j}^{2\star} }, \frac{1}{C_j^{\star} /\sigma_{e_j}^{2\star} } \bigg) \]

where \(\sigma_{e_j}^{2\star} = (\frac{n_j}{m}s_j^2 + \frac{m^0_j}{m}) \sigma_g^2 + \sigma_e^2\). Here \(n_j\) is the per-SNP sample size, \(m\) is the total number of markers, \(m_j^0\) is the number of SNPs not in population LD with SNP \(j\), \(s_j^2 = \sum_{k=1}^{m_j} \frac{(1-\rho_{jk}^2)^2}{n_j}\), and \(\rho_{jk} \approx \tilde \beta_{jk}\).

\[\mathbb{P}(\delta_j=c|\boldsymbol{\theta}_{-\delta_j},\mathbf{y}) = \alpha \times f(\mathbf{y}|\delta_j=c,\boldsymbol{\theta}_{-\delta_j})\,\pi_c, \] where \(\alpha\) is the normalizing constant over \(c=1,\ldots,C\).

Author: ShouyeLiu

Created: 2023-06-30 Fri 23:06