# Logistic regression
## An outline
Random component: \(Y_i\) follows a \(\mathrm{Bernoulli}(\pi_i)\) distribution with mean \(\mathbb{E}[Y_i] = \pi_i\) (see Wikipedia).
For the \(i\)-th individual,
\[
Pr(Y_i = y_i|\mathbf{X}\boldsymbol{\beta}) =
\begin{cases}
\pi_i, & y_i = 1 \\
1-\pi_i, & y_i = 0
\end{cases}
\]
or, equivalently,
\[
Pr(Y_i = y_i|\mathbf{X}\boldsymbol{\beta}) = \pi^{y_i}_i(1-\pi_i)^{(1-y_i)}
\]
where \(\boldsymbol{\pi} = (\pi_1, \dots, \pi_n)^\top\) is the vector of response probabilities.
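As a quick numerical illustration of this pmf (not part of the original notes), here is a minimal NumPy sketch; the function name `bernoulli_pmf` and the example probability are only illustrative.

```python
import numpy as np

def bernoulli_pmf(y, pi):
    """Pr(Y = y) = pi^y * (1 - pi)^(1 - y) for y in {0, 1}."""
    y = np.asarray(y)
    pi = np.asarray(pi)
    return pi**y * (1.0 - pi)**(1 - y)

# Example with success probability 0.7
print(bernoulli_pmf(1, 0.7))  # 0.7
print(bernoulli_pmf(0, 0.7))  # approx. 0.3 (= 1 - 0.7)
```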
Systematic component: the covariates \(\mathbf{x}_1,\mathbf{x}_2,\cdots,\mathbf{x}_p\) produce a linear predictor \(\boldsymbol{\eta}\) given by
\[\boldsymbol{\eta} = \sum_{j=1}^p \mathbf{x}_j \beta_j = \mathbf{X}\boldsymbol{\beta}\]
Link function: a function \(g(\cdot)\) that connects the mean of each response to its linear predictor,
\[g(\pi_i) = \eta_i = \sum_{j=1}^p \mathbf{x}_{ij} \beta_j\]
A commonly used link function is the logit (log-odds): \(g(\pi) = \log\{\pi/ (1-\pi)\} = \log(\mathrm{odds})\).
Therefore,
\[
\begin{align}
\log\left(\frac{\pi_i}{1-\pi_i}\right) &= \sum_{j=1}^p \mathbf{x}_{ij} \beta_j \\
\Longleftrightarrow \quad
\pi_i &= \frac{e^{\sum_{j=1}^p \mathbf{x}_{ij}\beta_j}}{1+e^{\sum_{j=1}^p \mathbf{x}_{ij}\beta_j}}
\end{align}
\]
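To make the link and its inverse concrete, the sketch below (an illustration, not from the original notes) builds a linear predictor from an assumed design matrix `X` and coefficient vector `beta`, maps it to probabilities with the inverse logit, and checks that applying the logit recovers \(\boldsymbol{\eta}\), as the equivalence above states.

```python
import numpy as np

def logit(pi):
    """g(pi) = log(pi / (1 - pi)), the log-odds."""
    return np.log(pi / (1.0 - pi))

def inv_logit(eta):
    """Inverse link: pi = e^eta / (1 + e^eta)."""
    return np.exp(eta) / (1.0 + np.exp(eta))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))          # illustrative design matrix (n=5, p=3)
beta = np.array([0.5, -1.0, 2.0])    # illustrative coefficients
eta = X @ beta                       # linear predictor eta = X beta
pi = inv_logit(eta)                  # response probabilities

# logit(pi) recovers the linear predictor eta
print(np.allclose(logit(pi), eta))   # True
```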
## Derivative
Based on \(Pr(Y_i|\mathbf{X}\boldsymbol{\beta}) = \pi^{y_i}_i(1-\pi_i)^{(1-y_i)}\), and writing \(\theta\) for the parameter vector \(\boldsymbol{\beta}\), the likelihood is
\[
\begin{align}
L(\theta) = \prod_{i=1}^n Pr(Y_i|\mathbf{X}\boldsymbol{\beta})
= \prod_{i=1}^n \pi^{y_i}_i(1-\pi_i)^{(1-y_i)}
\end{align}
\]
Its log-likelihood is
\[
\begin{align}
\ell(\theta) & = \log \bigg \{
\prod_{i=1}^n \pi^{y_i}_i(1-\pi_i)^{(1-y_i)} \bigg \} \\
& = \sum^{n}_{i=1} \bigg \{ y_i \log(\pi_i) + (1-y_i)\log(1-\pi_i) \bigg\} \\
& = \sum^{n}_{i=1} \bigg \{\log(1-\pi_i) + y_i \log\Big(\frac{\pi_i}{1-\pi_i}\Big) \bigg\}
\end{align}
\]
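A small numerical sketch of this identity (assumptions: the simulated `X`, `y`, and `beta_true` are made up for illustration): the function evaluates \(\ell(\theta)\) in the summed form and compares it with the log of the likelihood product.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """ell(theta) = sum_i { y_i log(pi_i) + (1 - y_i) log(1 - pi_i) }."""
    eta = X @ beta
    pi = np.exp(eta) / (1.0 + np.exp(eta))
    return np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # illustrative design matrix
beta_true = np.array([0.5, -1.0, 2.0])        # illustrative coefficients
pi = np.exp(X @ beta_true) / (1.0 + np.exp(X @ beta_true))
y = rng.binomial(1, pi)                       # simulated 0/1 responses

# The log of the likelihood product equals the summed form above
L = np.prod(pi**y * (1 - pi)**(1 - y))
print(np.isclose(np.log(L), log_likelihood(beta_true, X, y)))  # True
```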
The derivative is
\[
\begin{align}
\frac{\partial\ell(\theta)}{\partial\beta_j}
& = \sum^{n}_{i=1} \frac{\partial\ell(\theta)}{\partial\pi_i} \frac{\partial\pi_i}{\partial\beta_j} \\
& = \sum^{n}_{i=1} \bigg \{
y_i\mathbf{x}_{ij}
- \frac{e^{\sum_{j=1}^p \mathbf{x}_{ij}\beta_j}}{1+e^{\sum_{j=1}^p \mathbf{x}_{ij}\beta_j}}
\mathbf{x}_{ij}
\bigg\} \\
& = \sum^{n}_{i=1} (y_i - \pi_i)\,\mathbf{x}_{ij}
\end{align}
\]
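As a sanity check (an illustration under assumed data, not part of the derivation), the sketch below evaluates this gradient in vector form as \(\mathbf{X}^\top(\mathbf{y}-\boldsymbol{\pi})\), compares it with a finite-difference approximation of the log-likelihood, and runs a plain gradient-ascent loop; the simulated data, step size, and iteration count are arbitrary choices.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """ell(theta) = sum_i { y_i eta_i - log(1 + e^{eta_i}) }, the last form above."""
    eta = X @ beta
    return np.sum(y * eta - np.log1p(np.exp(eta)))

def gradient(beta, X, y):
    """d ell / d beta_j = sum_i (y_i - pi_i) x_{ij}, i.e. X^T (y - pi)."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - pi)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # illustrative design matrix
beta_true = np.array([0.5, -1.0, 2.0])             # illustrative coefficients
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

# 1) Finite-difference check of the analytic gradient at an arbitrary point
beta0 = np.zeros(3)
eps = 1e-6
fd = np.array([
    (log_likelihood(beta0 + eps * np.eye(3)[j], X, y)
     - log_likelihood(beta0 - eps * np.eye(3)[j], X, y)) / (2 * eps)
    for j in range(3)
])
print(np.allclose(fd, gradient(beta0, X, y)))      # True (up to rounding)

# 2) Plain gradient ascent on ell(theta) using this gradient
beta = np.zeros(3)
for _ in range(300):
    beta += 0.01 * gradient(beta, X, y)
print(beta)                                        # moves toward the MLE, roughly near beta_true
```

Setting this gradient to zero gives the score equations \(\mathbf{X}^\top(\mathbf{y}-\boldsymbol{\pi}) = \mathbf{0}\), which is what any maximum-likelihood fitting routine for logistic regression solves.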