Concepts#
Some key concepts#
PIP (Posterior Inclusion Probability)#
The PIP for a variable \(X_i\) is the proportion of MCMC samples in which \(X_i\) was included in the model. Mathematically, the PIP for \(X_i\) can be calculated as
\[\mathrm{PIP}(X_i) = \frac{N_i}{N}\]
where \(N\) is the number of MCMC iterations and \(N_i\) is the number of iterations in which \(X_i\) was included.
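The ratio \(N_i / N\) can be sketched in a few lines of NumPy. This is only an illustration: the function name `posterior_inclusion_probability`, the layout of the indicator array (one row per MCMC iteration, one column per variable), and the toy data are assumptions, not part of any particular package.

```python
import numpy as np

def posterior_inclusion_probability(inclusion):
    """Return the PIP of each variable.

    `inclusion` is assumed to be an (N, k) array of 0/1 indicators:
    row = MCMC iteration, column = variable X_i.
    """
    inclusion = np.asarray(inclusion, dtype=float)
    # Column mean = (iterations including X_i) / (total iterations) = N_i / N
    return inclusion.mean(axis=0)

# Toy chain with N = 4 iterations and 3 variables (made-up data).
samples = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
])
pip = posterior_inclusion_probability(samples)
print(pip)
```

In practice, burn-in iterations would normally be discarded before taking the column means.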
Centered and scaled genotype matrix W#
The additive genomic relationship matrix \(\mathbf{G}\) (VanRaden PM. 2008. J Dairy Sci. 91:4414-4423) is constructed from all genetic markers as \(\mathbf{G}=\mathbf{W}\mathbf{W}^{\intercal}/m\), where \(\mathbf{W}\) is the centered and scaled genotype matrix and \(m\) is the total number of markers. Each column vector of \(\mathbf{W}\) is calculated as \(\mathbf{w}_i = (\mathbf{m}_i - 2p_i)/\sqrt{2p_i(1-p_i)}\), where \(p_i\) is the minor allele frequency of the \(i\)-th genetic marker and \(\mathbf{m}_i\) is the \(i\)-th column vector of the allele count matrix \(\mathbf{M}\), whose genotypes are coded as 0, 1 or 2 according to the number of copies of the minor allele.
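A minimal NumPy sketch of this construction is given here for illustration. The function names `center_and_scale` and `genomic_relationship_matrix` are hypothetical, the toy matrix is made up, and \(p_i\) is assumed to be estimated from the sample as half the column mean of \(\mathbf{M}\). Monomorphic markers (\(p_i = 0\) or \(1\)) would cause a division by zero and are assumed to have been filtered out beforehand.

```python
import numpy as np

def center_and_scale(M):
    """Build W from an (n individuals x m markers) allele count matrix M (0/1/2)."""
    M = np.asarray(M, dtype=float)
    # Assumption: allele frequency p_i is estimated from the data itself.
    p = M.mean(axis=0) / 2.0
    # w_i = (m_i - 2 p_i) / sqrt(2 p_i (1 - p_i)), applied column by column
    return (M - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

def genomic_relationship_matrix(M):
    """G = W W^T / m, with m the number of markers."""
    W = center_and_scale(M)
    m = W.shape[1]
    return W @ W.T / m

# Toy data: 4 individuals, 3 markers (made-up genotypes).
M = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
])
W = center_and_scale(M)
G = genomic_relationship_matrix(M)
print(G.shape)
```

Because \(p_i\) is estimated from the same sample, each column of `W` has mean zero by construction.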
How to center and scale the genotypes:
We assume that allele \(a\) has frequency \(1 - p_i\) and that the other allele \(A\) has frequency \(p_i\). According to the Hardy-Weinberg principle, the genotypes \(aa\), \(Aa\) and \(AA\) (coded as 0, 1 and 2) follow this distribution:
Marker | 0 | 1 | 2 |
---|---|---|---|
Frequency | \((1-p_i)^2\) | \(2p_i(1-p_i)\) | \(p_i^2\) |
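The mean of the genotype is
\[E(m_i) = 0 \cdot (1-p_i)^2 + 1 \cdot 2p_i(1-p_i) + 2 \cdot p_i^2 = 2p_i\]
The variance of the genotype is
\[\mathrm{Var}(m_i) = E(m_i^2) - [E(m_i)]^2 = 2p_i(1-p_i) + 4p_i^2 - (2p_i)^2 = 2p_i(1-p_i)\]
So after centering and scaling the genotype, we get
\[\mathbf{w}_i = \frac{\mathbf{m}_i - 2p_i}{\sqrt{2p_i(1-p_i)}}\]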
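As a quick numerical check of the Hardy-Weinberg mean \(2p_i\) and variance \(2p_i(1-p_i)\), the moments can be computed directly from the genotype frequencies in the table. The value `p = 0.3` is an arbitrary choice for illustration, and the variable names are assumptions.

```python
import numpy as np

p = 0.3  # arbitrary allele frequency chosen for this check

codes = np.array([0.0, 1.0, 2.0])  # genotypes aa, Aa, AA
freqs = np.array([(1 - p) ** 2, 2 * p * (1 - p), p ** 2])  # Hardy-Weinberg frequencies

# Moments of the raw genotype code
mean = np.sum(codes * freqs)
var = np.sum(codes ** 2 * freqs) - mean ** 2

# Moments after centering by 2p and scaling by sqrt(2p(1-p))
standardized = (codes - 2 * p) / np.sqrt(2 * p * (1 - p))
std_mean = np.sum(standardized * freqs)
std_var = np.sum(standardized ** 2 * freqs)

print(mean, var, std_mean, std_var)
```

Up to floating-point error, the raw genotype has mean \(2p\) and variance \(2p(1-p)\), and the standardized genotype has mean 0 and variance 1.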