3.3 Proportional Hazards Regression Models
The definition in Section 3.1 is valid for models in continuous time. Figure 3.6 shows the relationships between the cumulative hazard functions, the density functions, and the survival functions when the hazard functions are proportional. Note that the cumulative hazard functions are then proportional by implication, with the same proportionality constant (\(\psi = 2\) in this case). For the density and survival functions, on the other hand, proportionality does not hold; it is in fact theoretically impossible, except in the trivial case where the proportionality constant is unity.
FIGURE 3.6: The effect of proportional hazards on the density and survival functions.
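These relationships are easy to verify numerically. The following sketch (not part of the book's code) assumes an exponential baseline hazard with rate \(0.5\) and uses \(\psi = 2\), the value in Figure 3.6; it shows that the cumulative hazards stay in a constant ratio while the survival functions do not.

```python
import math

# Illustrative choice (an assumption, not from the book): exponential
# baseline hazard h0(t) = 0.5, and a second group with h1(t) = psi * h0(t).
lam, psi = 0.5, 2.0

def H0(t):
    """Baseline cumulative hazard: integral of h0 over (0, t]."""
    return lam * t

def H1(t):
    """Cumulative hazard in group 1: psi times the baseline."""
    return psi * lam * t

def S(H):
    """Survival function from a cumulative hazard value: S = exp(-H)."""
    return math.exp(-H)

for t in (0.5, 1.0, 2.0):
    # The cumulative hazards are proportional with the same constant psi ...
    print(t, H1(t) / H0(t))        # always 2.0
    # ... but the survival functions are not proportional: their ratio
    # exp(-(psi - 1) * lam * t) shrinks as t grows.  Instead, S1 = S0**psi.
    print(t, S(H1(t)) / S(H0(t)))
```

The relation \(S_1(t) = S_0(t)^{\psi}\), which the last comment alludes to, follows directly from \(S(t) = \exp\{-H(t)\}\) and \(H_1(t) = \psi H_0(t)\).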
3.3.1 Two Groups
The definition of proportionality, given in equation (3.1), can equivalently be written as
\[\begin{equation} h_x(t) = \psi^x h_0(t), \quad t > 0, \; x = 0, 1, \; \psi > 0. \tag{3.3} \end{equation}\]
It is easy to see that this “trick” is equivalent to (3.1): When \(x = 0\) it simply says that \(h_0(t) = h_0(t)\), and when \(x = 1\) it says that \(h_1(t) = \psi h_0(t)\). Since \(\psi > 0\), we can calculate \(\beta = \log(\psi)\), and rewrite (3.3) as
\[\begin{equation*} h_x(t) = e^{\beta x} h_0(t), \quad t > 0; \; x = 0, 1; \; -\infty < \beta < \infty, \end{equation*}\] or with a slight change in notation,
\[\begin{equation} h(t; x) = e^{\beta x} h_0(t), \quad t > 0; \; x = 0, 1; \; -\infty < \beta < \infty. \tag{3.4} \end{equation}\]
The sole purpose of this rewriting is to pave the way for the introduction of Cox’s regression model (Cox 1972), which in its elementary form is a proportional hazards model. In fact, we can already interpret equation (3.4) as a Cox regression model with an explanatory variable \(x\) and a corresponding regression coefficient \(\beta\) (to be estimated from data). The covariate \(x\) is still only a dichotomous variate, but we will now show how to generalize this to a situation with explanatory variables of any form. The first step is to go from the two-sample situation to a \(k\)-sample one.
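Equation (3.4) can be sketched in a few lines of code. The baseline hazard and the value of \(\psi\) below are illustrative assumptions (\(\psi = 2\) matches Figure 3.6); the point is only the reparameterization \(\beta = \log(\psi)\).

```python
import math

psi = 2.0               # illustrative proportionality constant
beta = math.log(psi)    # the corresponding regression coefficient

def h0(t):
    """An arbitrary baseline hazard, chosen only for illustration."""
    return 0.1 * t

def h(t, x):
    """Equation (3.4): h(t; x) = exp(beta * x) * h0(t), with x in {0, 1}."""
    return math.exp(beta * x) * h0(t)

print(h(3.0, 0))   # x = 0: the baseline hazard itself, 0.3
print(h(3.0, 1))   # x = 1: psi times the baseline, 0.6
```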
3.3.2 Many Groups
Can we generalize to \((k + 1)\) groups, \(k \ge 2\)? Yes, by expanding the procedure in the previous subsection:
\[\begin{eqnarray*} h_0(t) &\sim& \mbox{group 0} \\ h_1(t) &\sim& \mbox{group 1} \\ \cdots & & \cdots \\ h_k(t) &\sim& \mbox{group k} \end{eqnarray*}\]
The underlying model is \(h_j(t) = \psi_j h_0(t), \quad t \ge 0\), \(j = 1, 2, \ldots, k\). That is, with \((k+1)\) groups, we need \(k\) proportionality constants \(\psi_1, \ldots, \psi_k\) in order to define proportional hazards. Note that in this formulation (there are others), one group is singled out as a reference group: no proportionality constant is attached to it, and all relations are expressed relative to it. It essentially does not matter which group is chosen as the reference; this choice does not change the model itself, only its representation.
With \((k + 1)\) groups, we need \(k\) indicators. Let
\[\begin{equation*} \mathbf{x} = (x_{1}, x_{2}, \ldots, x_{k}). \end{equation*}\] Then
\[\begin{eqnarray*} \mathbf{x} = (0, 0, \ldots, 0) & \Rightarrow & \mbox{group 0} \\ \mathbf{x} = (1, 0, \ldots, 0) & \Rightarrow & \mbox{group 1} \\ \mathbf{x} = (0, 1, \ldots, 0) & \Rightarrow & \mbox{group 2} \\ \cdots & & \cdots \\ \mathbf{x} = (0, 0, \ldots, 1) & \Rightarrow & \mbox{group k} \end{eqnarray*}\]
and
\[\begin{equation*} h(t; \mathbf{x}) = h_0(t) \prod_{\ell = 1}^k \psi_{\ell}^{x_{\ell}} = \left\{ \begin{array}{ll} h_0(t), & \mathbf{x} = (0, 0, \ldots, 0) \\ h_0(t) \psi_j, & x_{j} = 1, \quad j = 1, \ldots k \end{array}\right. \end{equation*}\] With \(\psi_j = e^{\beta_j}, \; j = 1, \ldots, k\), we get
\[\begin{equation} h(t; \mathbf{x}) = h_0(t) e^{x_{1}\beta_1 + x_{2} \beta_2 + \cdots + x_{k} \beta_k} = h_0(t) e^{\mathbf{x} \boldsymbol{\beta}}, \tag{3.5} \end{equation}\]
where \(\boldsymbol{\beta} = (\beta_1, \beta_2, \ldots, \beta_k)\).
3.3.3 The General Proportional Hazards Regression Model
We may now generalize (3.5) by letting the components of \(\mathbf{x}_i\) take any value. Let data and model take the following form:
Data:
\[\begin{equation*} (t_{i0}, t_i, d_i, \mathbf{x}_i), \quad i = 1, \ldots, n, \end{equation*}\]
where \(t_{i0}\) is the left truncation time point (if \(t_{i0} = 0\) for all \(i\), this variable may be omitted), \(t_i\) is the end time point, \(d_i\) is the “event indicator” (\(1\) or TRUE if event, else \(0\) or FALSE), and \(\mathbf{x}_i\) is a vector of explanatory variables.
Model:
\[\begin{equation} h(t; \mathbf{x}_i) = h_0(t) e^{\mathbf{x}_i \boldsymbol{\beta}}, \quad t > 0. \tag{3.6} \end{equation}\]
This is a regression model where the response variable is \((t_0, t, d)\) (we will call it a survival object) and the explanatory variable is \(\mathbf{x}\), possibly (often) vector valued.
In (3.6) there are two components to estimate, the regression coefficients \(\boldsymbol{\beta}\), and the baseline hazard function \(h_0(t), \; t > 0\). For the former task, the partial likelihood (Cox 1975) is used. See Appendix A for a brief summary.
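To make the partial likelihood concrete, here is a minimal sketch (not the book's code) for a tiny data set with one covariate, no tied event times, and no left truncation. For each event time, the contribution is the relative risk of the individual who failed, divided by the summed relative risks over the risk set; censored records contribute only through the risk sets.

```python
import math

# Hypothetical records (t_i, d_i, x_i): end time, event indicator, covariate.
data = [(1.0, 1, 1.0), (2.0, 1, 0.0), (4.0, 0, 1.0),
        (5.0, 1, 1.0), (6.0, 1, 0.0)]

def log_partial_likelihood(beta, data):
    """Cox's log partial likelihood for untied data, no left truncation."""
    ll = 0.0
    for t_i, d_i, x_i in data:
        if not d_i:
            continue  # censored: enters only via the risk sets below
        # Risk set at t_i: everyone still under observation at that time.
        risk = [x for (t, d, x) in data if t >= t_i]
        ll += beta * x_i - math.log(sum(math.exp(beta * x) for x in risk))
    return ll

# A crude grid search for the maximizing beta (in practice, Newton-Raphson
# on the score equations is used).
beta_hat = max((b / 100 for b in range(-300, 301)),
               key=lambda b: log_partial_likelihood(b, data))
print(round(beta_hat, 2))   # 0.56
```

Note that \(h_0(t)\) cancels out of every term, which is precisely why the partial likelihood lets us estimate \(\boldsymbol{\beta}\) without specifying the baseline hazard.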
References
Cox, D. R. 1972. “Regression Models and Life Tables.” Journal of the Royal Statistical Society Series B (with Discussion) 34: 187–220.
Cox, D. R. 1975. “Partial Likelihood.” Biometrika 62: 269–76.