A.2 Asymptotic theory
A.2.1 Partial likelihood
Here is a very brief summary of the asymptotics of the partial likelihood. Once defined, it turns out that it may be treated as an ordinary likelihood function. The setup is as follows.
Let \(t_{(1)}, t_{(2)}, \ldots, t_{(k)}\) be the ordered observed event times and let \(R_i = R(t_{(i)})\) be the risk set at \(t_{(i)}, \; i = 1, \ldots, k\). At \(t_{(i)}\), condition on the composition of \(R_i\) and on the fact that exactly one event occurred there (tied event times require a correction).
Then the contribution to the partial likelihood from \(t_{(i)}\) is \[\begin{multline*} L_i(\boldsymbol{\beta}) = P(\mbox{No. $m_i$ dies} \mid \mbox{one event occurs}, R_i) \\ = \frac{h_0(t_{(i)}) \exp(\boldsymbol{\beta} \mathbf{x}_{m_i})} {\sum_{\ell \in R_i}h_0(t_{(i)})\exp(\boldsymbol{\beta} \mathbf{x}_\ell)} = \frac{\exp(\boldsymbol{\beta} \mathbf{x}_{m_i})} {\sum_{\ell \in R_i}\exp(\boldsymbol{\beta} \mathbf{x}_\ell)}, \end{multline*}\] where \(m_i\) is the index of the individual who experiences the event at \(t_{(i)}\). The full partial likelihood is \[\begin{equation*} L(\boldsymbol{\beta}) = \prod_{i=1}^k L_i(\boldsymbol{\beta}) = \prod_{i=1}^k \frac{ \exp(\boldsymbol{\beta} \mathbf{x}_{m_i})} {\sum_{\ell \in R_i}\exp(\boldsymbol{\beta} \mathbf{x}_\ell)}. \end{equation*}\] This is where the doubt about the partial likelihood comes in: the product of these conditional probabilities does not itself have an interpretation as a conditional probability. Nevertheless, it turns out that one may proceed as if the expression really were a likelihood function. The log partial likelihood becomes \[\begin{equation}\label{eq:logplA} \log\big(L(\boldsymbol{\beta})\big) = \sum_{i=1}^k \left\{\boldsymbol{\beta} \mathbf{x}_{m_i} - \log\left(\sum_{\ell \in R_i} \exp(\boldsymbol{\beta} \mathbf{x}_\ell)\right)\right\}, \end{equation}\] and the components of the score vector are \[\begin{equation}\label{eq:scoreA} \frac{\partial}{\partial \beta_j} \log L(\boldsymbol{\beta}) = \sum_{i=1}^k x_{m_i j} - \sum_{i=1}^k \frac{\sum_{\ell \in R_i} x_{\ell j} \exp(\boldsymbol{\beta} \mathbf{x}_\ell)} {\sum_{\ell \in R_i} \exp(\boldsymbol{\beta} \mathbf{x}_\ell)}, \quad j = 1, \ldots, s. \end{equation}\] The maximum partial likelihood (MPL) estimator of \(\boldsymbol{\beta}\), \(\hat{\boldsymbol{\beta}}\), is found by setting \eqref{eq:scoreA} equal to zero and solving for \(\boldsymbol{\beta}\).
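As a minimal sketch (assuming no tied event times, and with hypothetical variable names `time`, `status`, and `x`), the log partial likelihood \eqref{eq:logplA} can be coded directly in R and maximized numerically:

```r
## Log partial likelihood of equation \eqref{eq:logplA}; no ties assumed.
logpl <- function(beta, time, status, x) {
    x <- as.matrix(x)
    eta <- drop(x %*% beta)              # linear predictors beta * x
    sum(sapply(which(status == 1), function(i) {
        risk <- time >= time[i]          # the risk set R_i at t_(i)
        eta[i] - log(sum(exp(eta[risk])))
    }))
}

## Toy data (hypothetical), with no tied event times:
time <- c(6, 7, 10, 15, 19, 25)
status <- c(1, 0, 1, 1, 0, 1)
x <- c(0, 2, 1, 1, 0, 2)

fit <- optim(0, function(b) -logpl(b, time, status, x),
             method = "BFGS", hessian = TRUE)
fit$par   # the MPL estimate of beta
```

The estimate can be checked against a standard routine such as `coxph()` in the survival package.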
For inference, we need the inverse of minus the Hessian (the matrix of second partial derivatives of the log partial likelihood), evaluated at \(\hat{\boldsymbol{\beta}}\); this gives the estimated covariance matrix. The expectation of minus the Hessian is called the information matrix, and the observed information matrix is \[\begin{equation*} \hat{I}(\hat{\boldsymbol{\beta}})_{j,m} = -\frac{\partial^2 \log L(\boldsymbol{\beta})} {\partial \beta_j \partial \beta_m}\bigg|_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}}, \end{equation*}\] and asymptotic theory says that, approximately, \[\begin{equation*} \hat{\boldsymbol{\beta}} \sim N\big(\boldsymbol{\beta}, \hat{I}^{-1}(\hat{\boldsymbol{\beta}})\big). \end{equation*}\] That is, \(\hat{\boldsymbol{\beta}}\) is asymptotically unbiased and normally distributed with the given covariance matrix (or the limit of it). Further, \(\hat{\boldsymbol{\beta}}\) is a consistent estimator of \(\boldsymbol{\beta}\). These results are used for hypothesis testing, confidence intervals, and variable selection.
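Continuing the sketch above, the inverse observed information is what `vcov()` returns for a model fitted with `coxph()` from the survival package (`coxreg()` in eha behaves similarly); a Wald-type confidence interval follows directly. The variable names are the hypothetical ones from the previous chunk.

```r
library(survival)

cfit <- coxph(Surv(time, status) ~ x)  # same toy data as above
vcov(cfit)                             # inverse observed information
## Wald-type 95% confidence interval for beta:
coef(cfit) + c(-1, 1) * qnorm(0.975) * sqrt(diag(vcov(cfit)))
## The numerical Hessian from optim() gives essentially the same
## standard error, since -logpl was minimized:
sqrt(1 / fit$hessian)
```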
Note that these are only asymptotic results, i.e., useful in medium to large samples. In small samples, bootstrapping is a possibility; this option is available in the R package eha.
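A hand-rolled nonparametric bootstrap (a sketch, not necessarily how eha implements it) can look as follows; here the lung data from the survival package stand in for a real application.

```r
library(survival)

set.seed(101)
B <- 200                            # number of bootstrap replicates
boot_est <- replicate(B, {
    ## Resample individuals with replacement and refit the model:
    d <- lung[sample(nrow(lung), replace = TRUE), ]
    coef(coxph(Surv(time, status) ~ age, data = d))
})
sd(boot_est)                        # bootstrap standard error of beta-hat
```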
Here a warning is in order: tests based on standard errors (Wald tests) may be highly unreliable, as in all non-linear regression. A better alternative is the likelihood ratio test.
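A sketch of the comparison, again with the lung data: the likelihood ratio statistic is twice the difference in maximized log partial likelihood between two nested models, referred to a chi-squared distribution.

```r
library(survival)

fit0 <- coxph(Surv(time, status) ~ 1, data = lung)    # null model
fit1 <- coxph(Surv(time, status) ~ age, data = lung)  # adds one covariate

## Likelihood ratio test: chi-squared with 1 degree of freedom.
lrt <- 2 * (as.numeric(logLik(fit1)) - as.numeric(logLik(fit0)))
pchisq(lrt, df = 1, lower.tail = FALSE)

## Compare with the Wald p value:
summary(fit1)$coefficients[, "Pr(>|z|)"]
```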