3.2 The Log-Rank Test
The log-rank test is a \(k\)-sample test of equality of survival functions. We first look at the two-sample case, that is, \(k = 2\).
Suppose that we have the small data set illustrated in
Figure 3.3. There are two samples, the letters (A, B,
C, D, E) and the numbers (1, 2, 3, 4, 5).
FIGURE 3.3: Two-sample data, the letters (dashed) and the numbers (solid). Circles denote censored observations, plusses events.
The data in Figure 3.3 can also be presented in tabular form; see Table 3.1.
| group | time | event |
|---|---|---|
| numbers | 4.0 | TRUE |
| numbers | 2.0 | FALSE |
| numbers | 6.0 | TRUE |
| numbers | 1.0 | TRUE |
| numbers | 3.5 | FALSE |
| letters | 5.0 | TRUE |
| letters | 3.0 | TRUE |
| letters | 6.0 | FALSE |
| letters | 1.0 | TRUE |
| letters | 2.5 | FALSE |
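
To follow the calculations in R, the data in Table 3.1 can be entered by hand. This is a minimal sketch; the object name `toy` is our own choice and does not come from any package.

```r
# Table 3.1 entered by hand; 'toy' is a hypothetical object name.
toy <- data.frame(
  group = rep(c("numbers", "letters"), each = 5),
  time  = c(4, 2, 6, 1, 3.5,   5, 3, 6, 1, 2.5),
  event = c(TRUE, FALSE, TRUE, TRUE, FALSE,
            TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Distinct failure times, in increasing order
fail_times <- sort(unique(toy$time[toy$event]))
fail_times
```

There are six events in all, but two of them share the failure time 1, which leaves five distinct failure times.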
We are interested in investigating whether letters and numbers
have the same survival chances or not. Therefore, the hypothesis
\[\begin{equation*} H_0: \text{No difference in survival between numbers and letters} \end{equation*}\] is formulated. In order to test \(H_0\), we make five tables, one for each observed event time. The first of these, relating to failure time \(t_{(1)} = 1\), is shown in Table 3.2.
| | Deaths | Survivals | Total |
|---|---|---|---|
| numbers | 1 | 4 | 5 |
| letters | 1 | 4 | 5 |
| Total | 2 | 8 | 10 |
Let us look at the table at failure time \(t_{(1)} = 1\), i.e., Table 3.2,
from the viewpoint of the numbers.
- The observed number of deaths among numbers: \(1\).
- The expected number of deaths among numbers: \(2 \times 5 / 10 = 1\).
The expected number is calculated under \(H_0\), i.e., as if there is
no difference between letters and numbers regarding mortality. It is further
assumed that the two margins (Total) are given (fixed).
Then, given that there are two deaths in total and that five of the ten
observations belong to the group numbers,
the expected number of deaths is calculated as above.
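In symbols (the notation here is ours, introduced only for illustration): let \(d\) be the total number of deaths at the failure time, \(n\) the size of the risk set, and \(n_1\) the number of individuals at risk in the group numbers. With both margins fixed, the number of deaths among numbers follows a hypergeometric distribution under \(H_0\), with mean and variance
\[\begin{equation*}
E = \frac{d \, n_1}{n}, \qquad V = \frac{d \, (n - d) \, n_1 \, (n - n_1)}{n^2 \, (n - 1)}.
\end{equation*}\]
For the table at \(t_{(1)} = 1\) this gives \(E = 2 \times 5 / 10 = 1\) and \(V = 2 \times 8 \times 5 \times 5 / (10^2 \times 9) \approx 0.44\).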
This procedure is repeated for each of the five tables, and the results are summarized in Table 3.3.
| | Observed | Expected | Difference | Variance |
|---|---|---|---|---|
| t(1) | 1 | 1.0 | 0.0 | 0.44 |
| t(2) | 0 | 0.5 | -0.5 | 0.25 |
| t(3) | 1 | 0.5 | 0.5 | 0.25 |
| t(4) | 0 | 0.3 | -0.3 | 0.22 |
| t(5) | 1 | 0.5 | 0.5 | 0.25 |
| Sum | 3 | 2.8 | 0.2 | 1.41 |
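
The entries of Table 3.3 can be recomputed with a few lines of base R. The sketch below assumes the hypergeometric variance for each \(2 \times 2\) table and treats an observation censored at a failure time as still at risk at that time; all object and function names are our own.

```r
# Toy data from Table 3.1 (variable names are our own).
time  <- c(4, 2, 6, 1, 3.5,   5, 3, 6, 1, 2.5)
event <- c(TRUE, FALSE, TRUE, TRUE, FALSE,   TRUE, TRUE, FALSE, TRUE, FALSE)
group <- rep(c("numbers", "letters"), each = 5)

fail_times <- sort(unique(time[event]))

# One row of the summary table, from the 2 x 2 table at failure time tj.
one_time <- function(tj) {
  at_risk <- time >= tj                  # under observation at tj
  dead    <- event & time == tj          # events occurring at tj
  n  <- sum(at_risk)                     # size of the risk set
  n1 <- sum(at_risk & group == "numbers")
  d  <- sum(dead)                        # total number of deaths at tj
  o1 <- sum(dead & group == "numbers")   # observed deaths among numbers
  e1 <- d * n1 / n                       # expected deaths under H0
  v  <- if (n > 1) d * (n - d) * n1 * (n - n1) / (n^2 * (n - 1)) else 0
  c(Observed = o1, Expected = e1, Difference = o1 - e1, Variance = v)
}

tab <- t(sapply(fail_times, one_time))
rownames(tab) <- paste0("t(", seq_along(fail_times), ")")
tab <- rbind(tab, Sum = colSums(tab))
round(tab, 2)
```

The unrounded sums are \(17/6 \approx 2.83\) for the expected number and \(17/12 \approx 1.42\) for the variance, so small differences from the printed table are due to rounding only.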
Finally, the observed test statistic \(T\) is calculated as
\[\begin{equation*}
T = \frac{0.2^2}{1.41} \approx 0.028
\end{equation*}\]
Under the null hypothesis, this is an observed value from a
\(\chi^2(1)\)
distribution, and \(H_0\) should be rejected for large values of
\(T\). At the 5% level of significance, the critical value for
\(T\) is 3.84, far above our observed value of 0.028. The
conclusion is
therefore that there is no (statistically significant) difference in
survival chances between letters and numbers.
Note, however, that this result depends on
asymptotic (large sample)
properties, and
in this toy example, with only ten observations, these properties cannot be relied upon.
For more detail about the underlying theory, see Appendix A.
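The critical value and the corresponding \(p\)-value are easily obtained in base R. The sketch below simply plugs in the observed value 0.028 from above; the variable names are our own.

```r
# Critical value of the chi-square distribution with 1 df at the 5% level
crit <- qchisq(0.95, df = 1)

# p-value for the observed test statistic
T_obs <- 0.028
p_val <- pchisq(T_obs, df = 1, lower.tail = FALSE)

round(c(critical = crit, p.value = p_val), 3)
```

The \(p\)-value is roughly 0.87, nowhere near any conventional level of significance.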
In R, the log-rank test is performed by the
coxph function in the
package survival (there are other options).
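One of the other options is the function survdiff, also in survival, which carries out the log-rank test directly. As a sketch, we apply it to the toy data of Table 3.1 (the object names `toy` and `lr` are our own). Because survdiff works with unrounded expected numbers and variances, its statistic comes out slightly smaller than the value 0.028 obtained above from the rounded entries of Table 3.3; the conclusion is the same.

```r
library(survival)

# Toy data from Table 3.1; 'toy' is a hypothetical object name.
toy <- data.frame(
  group = rep(c("numbers", "letters"), each = 5),
  time  = c(4, 2, 6, 1, 3.5,   5, 3, 6, 1, 2.5),
  event = c(TRUE, FALSE, TRUE, TRUE, FALSE,
            TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Two-sample log-rank test
lr <- survdiff(Surv(time, event) ~ group, data = toy)
lr

# The chi-square statistic on 1 df (about 0.02)
lr$chisq
```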
Let us now look at a real data example, the old age mortality data set
oldmort in eha. See Table 3.4 for a sample of five records with selected columns.
| id | enter | exit | event | sex | civ |
|---|---|---|---|---|---|
| 793001208 | 66.498 | 67.988 | 0 | male | married |
| 793001208 | 67.988 | 72.820 | 0 | male | married |
| 793001208 | 72.820 | 75.542 | 1 | male | widow |
| 793001209 | 66.446 | 76.568 | 1 | female | married |
| 793001210 | 66.446 | 67.936 | 0 | female | married |
We are interested in comparing male and female mortality at ages 60–85 with a log-rank test, and for that purpose we run a Cox regression analysis:
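The call itself can be read off the output below, but the construction of the data frame om is not shown. The following is therefore only a sketch: we assume that om is oldmort restricted to the age interval 60–85 with the function age.window in eha, which need not be exactly how the original data frame was built.

```r
library(survival)  # Surv() and coxph()
library(eha)       # the data set oldmort and age.window()

# ASSUMPTION: 'om' is oldmort cut to the age window 60-85;
# the text does not show how 'om' was created.
om <- age.window(oldmort, c(60, 85))

# Cox regression with sex as the only covariate
fit <- coxph(Surv(enter, exit, event) ~ sex, data = om)
```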
The result is given by summary(fit):
    Call:
    coxph(formula = Surv(enter, exit, event) ~ sex, data = om)

      n= 6456, number of events= 1823

                  coef exp(coef) se(coef)      z Pr(>|z|)
    sexfemale -0.20635   0.81354  0.04718 -4.374 1.22e-05

              exp(coef) exp(-coef) lower .95 upper .95
    sexfemale    0.8135      1.229    0.7417    0.8924

    Concordance= 0.532 (se = 0.007 )
    Likelihood ratio test= 18.95 on 1 df, p=1e-05
    Wald test            = 19.13 on 1 df, p=1e-05
    Score (logrank) test = 19.2 on 1 df, p=1e-05
We get a lot of information here, more than we actually need. We have in fact performed
a Cox regression slightly ahead of schedule! The result of the
log-rank test is displayed on the last line of output. The \(p\)-value is
\(1.179 \times 10^{-5}\) (printed as 1e-05 in the output), a very small number. Thus, there is a highly significant difference in mortality between men and women.
But how large is the difference? The answer is found at exp(coef) = 0.8135, which tells us that the female risk of dying is about 81% of the male risk, at each age between 60 and 85.
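The quantities in the printed summary are linked by simple arithmetic, which a few lines of base R make explicit. The numbers below are copied from the output above; the variable names are our own.

```r
# Values copied from the printed summary
coef_f <- -0.20635   # estimated log hazard ratio, female vs. male
se_f   <-  0.04718   # its standard error

hr   <- exp(coef_f)                                    # hazard ratio
ci   <- exp(coef_f + c(-1, 1) * qnorm(0.975) * se_f)   # 95% confidence limits
wald <- (coef_f / se_f)^2                              # Wald statistic, 1 df

round(c(hr = hr, lower = ci[1], upper = ci[2], wald = wald), 4)
```

Up to rounding, this reproduces exp(coef), the limits lower .95 and upper .95, and the Wald test statistic.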
Remember that this result depends on the proportional hazards assumption. We can graphically check it as follows.
    sf <- survfit(Surv(enter, exit, event) ~ strata(sex),
                  data = om, start.time = 60)
    plot(sf, xlab = "Age", fun = "cumhaz")
Note that the grouping factor (sex) is given through the
function strata in the formula. The result is shown in Figure 3.4.
FIGURE 3.4: Old age mortality, women vs. men, cumulative hazards.
The proportionality assumption seems to be a good description from 60 to 85–90 years of age, but it seems more doubtful in the very high ages. One reason for this may be that the high-age estimates are based on few observations (most of the individuals in the sample died earlier), so random fluctuations have a large impact in the high ages.
3.2.1 Several samples
The result for the two-sample case is easily extended to the \(k\)-sample
case. Instead of one \(2 \times 2\) table per observed event time we get one
\(k\times 2\) table per observed event time and we have to calculate expected
and observed numbers of events for \((k-1)\) groups at each failure time. The
resulting test statistic will have \((k-1)\) degrees of freedom and still be
approximately \(\chi^2\) distributed. This is illustrated with the same data
set, oldmort, as above, but with the covariate civ,
which is a factor with three levels (unmarried, married, widow), instead of sex.
Furthermore, the investigation is limited to male mortality.
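In matrix form (in our own notation, as a sketch rather than the formal treatment): let \(\mathbf{U}\) be the vector of summed differences between observed and expected numbers of events for \(k - 1\) of the groups, and let \(\mathbf{V}\) be the corresponding \((k-1) \times (k-1)\) covariance matrix, obtained by summing the covariance matrices from the individual tables. The test statistic is then
\[\begin{equation*}
T = \mathbf{U}^{\top} \mathbf{V}^{-1} \mathbf{U},
\end{equation*}\]
which for \(k = 2\) reduces to the squared summed difference divided by the summed variance, as in the two-sample case. The result for the three levels of civ is as follows.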
    Call:
    coxph(formula = Surv(enter, exit, event) ~ civ, data = om[om$sex ==
        "male", ])

      n= 2872, number of events= 811

                  coef exp(coef) se(coef)      z Pr(>|z|)
    civmarried -0.5164    0.5967   0.1440 -3.587 0.000335
    civwidow   -0.2636    0.7683   0.1496 -1.762 0.078110

               exp(coef) exp(-coef) lower .95 upper .95
    civmarried    0.5967      1.676     0.450    0.7912
    civwidow      0.7683      1.302     0.573    1.0301

    Concordance= 0.536 (se = 0.009 )
    Likelihood ratio test= 18.63 on 2 df, p=9e-05
    Wald test            = 19.63 on 2 df, p=5e-05
    Score (logrank) test = 19.86 on 2 df, p=5e-05
The number of degrees of freedom for the score test
is now 2, equal to the number
of levels in civ
minus one. Being unmarried seems to have a great impact on
old age mortality. It is, however, recommended to check the proportionality
assumption graphically; see Figure 3.5.
FIGURE 3.5: Old age male mortality by civil status, cumulative hazards.
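There is nothing that indicates non-proportionality in this case
either. Furthermore, the unmarried have higher estimated mortality than both the married
and widows, although only the difference from the married is significant at the 5% level (\(p = 0.0003\), against \(p = 0.078\) for widows).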
We do not go deeper into this matter here, mainly because the logrank test generally is a special case of Cox regression, which will be described in detail later in this chapter.