Cox regression

Basic concepts

In survival analysis, we assume that the survival time \(T\) is a continuous random variable with probability density function \(f(t)\). The cumulative distribution function, i.e. the probability that the event has occurred by time \(t\), is

\[ \begin{align} F(t) = Pr(T \leq t) = \int^t_{- \infty} f(u)du \end{align} \]

Survival data are usually analyzed through the survival function and the hazard function.

Survival function:

\[ \begin{align} \label{survivalFun} S(t) = Pr(T > t) = 1 - F(t) \end{align} \]

Hazard function:

\[ \begin{align} \label{hazardFun} h(t) = \lim_{\Delta t \to 0}\frac{Pr(t + \Delta t > T \geq t | T \geq t)}{\Delta t} \end{align} \]

Combining the survival and hazard functions, we get

\[ \begin{align} h(t) &= \lim_{\Delta t \to 0}\frac{Pr(t + \Delta t > T \geq t)}{\Delta t} \times \frac{1}{Pr(T \geq t)} \\ &= \lim_{\Delta t \to 0}\frac{Pr(T < t + \Delta t) - Pr(T < t)}{\Delta t } \times \frac{1}{Pr(T \geq t)} \\ &= \frac{f(t)}{S(t)} = \frac{d[1- S(t)]}{dt} \times \frac{1}{S(t)} = - \frac{d[log(S(t))]}{dt} \end{align} \]
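The identity \(h(t) = f(t)/S(t) = -d[\log S(t)]/dt\) can be checked numerically. The sketch below is only an illustration: it assumes a Weibull distribution with arbitrarily chosen shape and scale (neither comes from the text above) and compares the ratio form of the hazard with a finite-difference approximation of the derivative form.

```python
import math

# Hypothetical Weibull parameters, chosen only for illustration.
SHAPE = 1.5
SCALE = 2.0


def survival(t):
    """S(t) = Pr(T > t) for the assumed Weibull distribution."""
    return math.exp(-((t / SCALE) ** SHAPE))


def density(t):
    """f(t), the probability density of the assumed Weibull distribution."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1) * survival(t)


def hazard_from_ratio(t):
    """h(t) computed as f(t) / S(t)."""
    return density(t) / survival(t)


def hazard_from_log_survival(t, eps=1e-6):
    """h(t) computed as -d log S(t) / dt, using a central difference."""
    diff = math.log(survival(t + eps)) - math.log(survival(t - eps))
    return -diff / (2 * eps)


if __name__ == "__main__":
    # The two columns should agree up to finite-difference error.
    for t in (0.5, 1.0, 3.0):
        print(t, hazard_from_ratio(t), hazard_from_log_survival(t))
```

Both forms should agree up to the finite-difference error, whichever continuous distribution is substituted for the Weibull.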

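The Cox proportional hazards model assumes the link function \(\ln [h(t,\beta)] = \ln [h_0(t)] + x^T\beta\), that is, \(h(t,\beta) = h_0(t)\exp(x^T\beta)\). Let \(H(t) = \int^t_0 h(u)du\) denote the cumulative hazard function. Integrating \(h(t) = -d[\log S(t)]/dt\) with \(S(0) = 1\) gives the relationship among the survival function \(S(t)\), the cumulative hazard function \(H(t)\) and the survival effect \(\beta\):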

\[ \begin{align} S(t) &= \exp[-H(t)] \\ &= \exp\left[-\exp(x^T \beta)\int^t_0 h_0(u)du \right] \\ &= \exp[-\exp(x^T\beta) \times H_0(t)] \end{align} \]

where \(h_0(t)\) is the baseline hazard function, \(H_0(t) = \int^t_0 h_0(u)du\) is the baseline cumulative hazard function, \(x\) is the vector of influencing factors (covariates), and \(\beta\) is the corresponding vector of coefficients.
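As a minimal sketch of how the survival function depends on the covariates, the example below evaluates \(S(t) = \exp[-\exp(x^T\beta) H_0(t)]\) under an assumed constant baseline hazard, so that \(H_0(t)\) is linear in \(t\). The baseline rate, the coefficient values and the function names are hypothetical and serve only to illustrate the formula.

```python
import math


def baseline_cumulative_hazard(t, rate=0.1):
    """H_0(t) for an assumed constant baseline hazard h_0(u) = rate."""
    return rate * t


def cox_survival(t, x, beta, rate=0.1):
    """S(t) = exp(-exp(x^T beta) * H_0(t)) for covariates x and effects beta."""
    linear_predictor = sum(xi * bi for xi, bi in zip(x, beta))
    return math.exp(-math.exp(linear_predictor) * baseline_cumulative_hazard(t, rate))


# Hypothetical coefficients for two covariates.
beta = [0.7, -0.3]
x_reference = [0.0, 0.0]  # all covariates at zero: the baseline survival curve
x_exposed = [1.0, 0.0]    # first covariate raised by one unit

if __name__ == "__main__":
    for t in (1.0, 5.0, 10.0):
        print(t, cox_survival(t, x_reference, beta), cox_survival(t, x_exposed, beta))
```

Because the covariates enter only through the factor \(\exp(x^T\beta)\), the survival curve for any \(x\) equals the baseline survival curve raised to the power \(\exp(x^T\beta)\). In this sketch a positive coefficient therefore lowers survival at every time point.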

Maximum Likelihood Estimation for Cox Proportional Hazards Model