8.3 The shared frailty model

Frailty models work best when there is a natural grouping of the data, so that observations from the same group are dependent, while two individual survival times from different groups can be regarded as independent. Such a model may be described as \[\begin{equation} h_i(t; \mathbf{x}) = h_{i0}(t) e^{\boldsymbol{\beta}\mathbf{x}}, \quad i = 1, \ldots, s;\; t > 0, \tag{8.2} \end{equation}\] which simply is a stratified Cox regression model. By assuming
\[\begin{equation} h_{i0}(t) = Z_i h_0(t), \quad i = 1, \ldots s;\; t > 0, \tag{8.3} \end{equation}\] the traditional multivariate frailty model emerges. Here it is assumed that \(Z_1, \ldots, Z_s\) are independent and identically distributed (iid), usually with a lognormal distribution. From (8.2) and (8.3) we get, with \(U_i = \log(Z_i)\), \[\begin{equation*} h_i(t; \mathbf{x}) = h_{0}(t) e^{\boldsymbol{\beta}\mathbf{x} + U_i}, \quad i = 1, \ldots, s;\; t > 0. \end{equation*}\] In this formulation, \(U_1, \ldots, U_s\) are iid normal with mean zero and unknown variance \(\sigma^2\). Another popular choice of distribution for the \(Z\):s is the gamma distribution.

In R, the package coxme (Therneau 2011) fits frailty models. We look at the fertility data set in the R package eha.

library(eha)
head(fert)

##   id parity age year next.ivl event prev.ivl     ses parish
## 1  1      0  24 1825    0.411     1       NA  farmer    SKL
## 2  1      1  25 1826   22.348     0    0.411  farmer    SKL
## 3  2      0  18 1821    0.304     1       NA unknown    SKL
## 4  2      1  19 1821    1.837     1    0.304 unknown    SKL
## 5  2      2  21 1823    2.546     1    1.837 unknown    SKL
## 6  2      3  23 1826    2.541     1    2.546 unknown    SKL

It seems natural to assume that the lengths of birth intervals vary with mother, so we try a frailty model with id as the grouping variable. Also notice that the first interval for each woman is measured from marriage (only married women are included in this data set) to first birth, so we will start by removing them. They are characterized by parity being 0.

fe <- fert[fert$parity != 0, ]
library(coxme)
fit <- coxme(Surv(next.ivl, event) ~ age + ses + parity + 
             (1 | id), data = fe)
summary(fit, digits = 3)

## Cox mixed-effects model fit by maximum likelihood
##   Data: fe
##   events, n = 8458, 10312
##   Iterations= 15 65 
##                     NULL Integrated    Fitted
## Log-likelihood -71308.31  -69905.17 -68211.87
## 
##                   Chisq   df p  AIC   BIC
## Integrated loglik  2806    6 0 2794  2752
##  Penalized loglik  6193 1212 0 3770 -4764
## 
## Model:  Surv(next.ivl, event) ~ age + ses + parity + (1 | id) 
## Fixed coefficients
##               coef exp(coef) se(coef)      z       p
## age        -0.0796     0.923  0.00417 -19.11 0.00000
## sesunknown -0.0779     0.925  0.06067  -1.28 0.20000
## sesupper    0.1009     1.106  0.15799   0.64 0.52000
## seslower   -0.1703     0.843  0.04967  -3.43 0.00061
## parity     -0.1141     0.892  0.01014 -11.26 0.00000
## 
## Random effects
##  Group Variable  Std Dev Variance
##  id    Intercept 0.752   0.565

The estimates of the fixed effects have the same interpretation as in ordinary Cox regression. The question is if the results point to the significance of including frailty terms? In the last line of the output we get the estimate of the frailty variance, \(\sigma^2 = 0.565\), but no \(p\)-value for the test of the null hypothesis \(H_0: \sigma = 0\). One explanation to this is that ordinary asymptotic theory does not hold for parameter values at the boundary of the parameter space.

One way to get a feeling for the impact of the frailty effect is to fit the same model but without frailty, i.e., the term (1 | id).

fit0 <- coxreg(Surv(next.ivl, event) ~ age + ses + parity, 
               data = fe)
summary(fit0)

## Covariate           Mean       Coef     Rel.Risk   S.E.    LR p
## age                33.088    -0.080     0.923     0.003    0.000 
## ses                                                        0.000 
##           farmer    0.491     0         1 (reference)
##          unknown    0.186    -0.085     0.919     0.030
##            upper    0.018     0.147     1.158     0.083
##            lower    0.305    -0.083     0.920     0.026
## parity              4.242     0.013     1.013     0.007    0.068 
## 
## Events                    8458 
## Total time at risk         29806 
## Max. log. likelihood      -70391 
## LR test statistic         1834.39 
## Degrees of freedom        5 
## Overall p-value           0

We can compare the two “max log likelihoods”, in the frailty model the “Integrated” value -69905, and in the fixed effects case -70391. The difference is so large (486) that we safely can reject the hypothesis that the frailty model is not needed. As an “informal” test, you could take twice that difference and treat it as as a \(\chi^2\) statistic with 1 degree of freedom, calculate a \(p\)-value and take as real \(p\)-value one half of that (all this because ordinary asymptotic theory does not hold for parameter values on the boundary of the parameter space!). This gives an approximation of the true \(p\)-value that is not too bad.

As a final example, let us look at old age mortality in the R package eha. This example also shows a dangerous situation that is too easy to overlook. It has nothing to do with frailty, but with a problem caused by missing data.

Example 8.1 (Old age mortality for siblings.) Take a look at the variables in oldmort:

head(oldmort)

##          id  enter   exit event birthdate m.id f.id    sex       civ  ses.50
## 1 765000603 94.510 95.813  TRUE  1765.490   NA   NA female     widow unknown
## 2 765000669 94.266 95.756  TRUE  1765.734   NA   NA female unmarried unknown
## 3 768000648 91.093 91.947  TRUE  1768.907   NA   NA female     widow unknown
## 4 770000562 89.009 89.593  TRUE  1770.991   NA   NA female     widow unknown
## 5 770000707 89.998 90.211  TRUE  1770.002   NA   NA female     widow  middle
## 6 771000617 88.429 89.762  TRUE  1771.571   NA   NA female     widow unknown
##   birthplace imr.birth   region
## 1     remote  22.20000    rural
## 2     parish  17.71845 industry
## 3     parish  12.70903    rural
## 4     parish  16.90544 industry
## 5     region  11.97183    rural
## 6     parish  13.08594    rural

The variable m.id is mother’s id. This means that siblings will have the same value on that variable, and we can check whether we find a ``sibling effect’’ in the sense that siblings tend to have a similar risk of dying.

fit <- coxme(Surv(enter, exit, event) ~ sex + civ + (1 | m.id), 
             data = oldmort)
fit

## Cox mixed-effects model fit by maximum likelihood
##   Data: oldmort
##   events, n = 888, 3340 (3155 observations deleted due to missingness)
##   Iterations= 16 84 
##                     NULL Integrated    Fitted
## Log-likelihood -5702.662   -5693.43 -5646.252
## 
##                    Chisq    df          p   AIC     BIC
## Integrated loglik  18.46  4.00 1.0012e-03 10.46   -8.69
##  Penalized loglik 112.82 49.23 6.7155e-07 14.36 -221.39
## 
## Model:  Surv(enter, exit, event) ~ sex + civ + (1 | m.id) 
## Fixed coefficients
##                  coef exp(coef)   se(coef)     z       p
## sexfemale  -0.2026325 0.8165783 0.07190559 -2.82 0.00480
## civmarried -0.4653648 0.6279060 0.12144548 -3.83 0.00013
## civwidow   -0.3305040 0.7185615 0.11964284 -2.76 0.00570
## 
## Random effects
##  Group Variable  Std Dev    Variance  
##  m.id  Intercept 0.23719205 0.05626007

Now, compare with the corresponding fixed effects model:

fit0 <- coxreg(Surv(enter, exit, event) ~ sex + civ, 
               data = oldmort)
summary(fit0)

## Covariate           Mean       Coef     Rel.Risk   S.E.    LR p
## sex                                                        0.000 
##             male    0.406     0         1 (reference)
##           female    0.594    -0.243     0.784     0.047
## civ                                                        0.000 
##        unmarried    0.080     0         1 (reference)
##          married    0.530    -0.397     0.672     0.081
##            widow    0.390    -0.261     0.770     0.079
## 
## Events                    1971 
## Total time at risk         37824 
## Max. log. likelihood      -13558 
## LR test statistic         41.23 
## Degrees of freedom        3 
## Overall p-value           5.83367e-09

Note that we now got a very much smaller value of the maximized log likelihood, -13558 compared to -5693! Something is wrong, and the big problem is that the two analyzes were performed on different data sets. How is that possible, we used oldmort on both occasions? The variable m.id has a lot of missing values, almost 50 per cent are missing (NA), and the standard treatment of NA:s in R is to simply remove each record that contains an NA on any of the variables in the analysis. So, in the first case, the frailty model, a lot of records are removed before analysis, but not in the second. To be able to compare the models we must remove all records with m.id = NA from the second analysis.

olm <- oldmort[!is.na(oldmort$m.id), ]
fit0 <- coxreg(Surv(enter, exit, event) ~ sex + civ, 
               data = olm)
summary(fit0)

## Covariate           Mean       Coef     Rel.Risk   S.E.    LR p
## sex                                                        0.006 
##             male    0.418     0         1 (reference)
##           female    0.582    -0.196     0.822     0.070
## civ                                                        0.001 
##        unmarried    0.076     0         1 (reference)
##          married    0.555    -0.443     0.642     0.118
##            widow    0.369    -0.310     0.733     0.116
## 
## Events                    888 
## Total time at risk         19855 
## Max. log. likelihood      -5693.8 
## LR test statistic         17.71 
## Degrees of freedom        3 
## Overall p-value           0.000504597

This is another story! We now got very similar values of the maximized log likelihoods, -5693.81 compared to -5693.43! The conclusion is that in this case, there is no frailty effect whatsoever.

One lesson to learn from this example is that you have to be very cautious when a data set contains missing values. Some functions, like drop1, give a warning when a situation like this is detected, but especially when comparisons are made in more than on step, it is too easy to forget the dangers.

Also note the warning that is printed in the results of coxme: 3163 observations deleted due to missingness. This is a warning that should be taken seriously. \(\Box\)

References

Therneau, T. 2011. Coxme: Mixed Effects Cox Models. http://CRAN.R-project.org/package=coxme.