8.3 The shared frailty model
Frailty models work best when there is a natural grouping of the data, so
that observations from the same group are dependent, while two individual
survival times from different groups can be regarded as independent. Such a
model may be described as
\[\begin{equation}
h_i(t; \mathbf{x}) = h_{i0}(t) e^{\boldsymbol{\beta}\mathbf{x}}, \quad i = 1, \ldots, s;\; t > 0,
\tag{8.2}
\end{equation}\]
which simply is a stratified Cox regression model. By
assuming
\[\begin{equation}
h_{i0}(t) = Z_i h_0(t), \quad i = 1, \ldots s;\; t > 0,
\tag{8.3}
\end{equation}\]
the traditional multivariate frailty model emerges. Here it is assumed that
\(Z_1, \ldots, Z_s\) are independent and identically distributed
(iid), usually with a lognormal distribution. From (8.2)
and (8.3) we get, with \(U_i = \log(Z_i)\),
\[\begin{equation*}
h_i(t; \mathbf{x}) = h_{0}(t) e^{\boldsymbol{\beta}\mathbf{x} + U_i}, \quad i = 1, \ldots, s;\; t > 0.
\end{equation*}\]
In this formulation, \(U_1, \ldots, U_s\) are iid normal with mean
zero and unknown variance \(\sigma^2\).
Another popular choice of distribution for the \(Z\):s is the gamma
distribution.
In R, the package coxme (Therneau 2011) fits frailty models.
We look at the fertility data set in the R package eha.
## id parity age year next.ivl event prev.ivl ses parish
## 1 1 0 24 1825 0.411 1 NA farmer SKL
## 2 1 1 25 1826 22.348 0 0.411 farmer SKL
## 3 2 0 18 1821 0.304 1 NA unknown SKL
## 4 2 1 19 1821 1.837 1 0.304 unknown SKL
## 5 2 2 21 1823 2.546 1 1.837 unknown SKL
## 6 2 3 23 1826 2.541 1 2.546 unknown SKL
It seems natural to assume that the lengths of birth intervals vary with
mother, so we try a frailty model with id as the grouping
variable. Also notice that the first interval for each woman is measured
from marriage (only married women are included in this data set) to first
birth, so we will start by removing them. They are characterized by
parity being 0.
fe <- fert[fert$parity != 0, ]
library(coxme)
fit <- coxme(Surv(next.ivl, event) ~ age + ses + parity +
(1 | id), data = fe)
summary(fit, digits = 3)## Cox mixed-effects model fit by maximum likelihood
## Data: fe
## events, n = 8458, 10312
## Iterations= 15 65
## NULL Integrated Fitted
## Log-likelihood -71308.31 -69905.17 -68211.87
##
## Chisq df p AIC BIC
## Integrated loglik 2806 6 0 2794 2752
## Penalized loglik 6193 1212 0 3770 -4764
##
## Model: Surv(next.ivl, event) ~ age + ses + parity + (1 | id)
## Fixed coefficients
## coef exp(coef) se(coef) z p
## age -0.0796 0.923 0.00417 -19.11 0.00000
## sesunknown -0.0779 0.925 0.06067 -1.28 0.20000
## sesupper 0.1009 1.106 0.15799 0.64 0.52000
## seslower -0.1703 0.843 0.04967 -3.43 0.00061
## parity -0.1141 0.892 0.01014 -11.26 0.00000
##
## Random effects
## Group Variable Std Dev Variance
## id Intercept 0.752 0.565
The estimates of the fixed effects have the same interpretation as in ordinary Cox regression. The question is if the results point to the significance of including frailty terms? In the last line of the output we get the estimate of the frailty variance, \(\sigma^2 = 0.565\), but no \(p\)-value for the test of the null hypothesis \(H_0: \sigma = 0\). One explanation to this is that ordinary asymptotic theory does not hold for parameter values at the boundary of the parameter space.
One way to get a feeling for the impact of the frailty effect is to fit the
same model but without frailty, i.e., the term (1 | id).
## Covariate Mean Coef Rel.Risk S.E. LR p
## age 33.088 -0.080 0.923 0.003 0.000
## ses 0.000
## farmer 0.491 0 1 (reference)
## unknown 0.186 -0.085 0.919 0.030
## upper 0.018 0.147 1.158 0.083
## lower 0.305 -0.083 0.920 0.026
## parity 4.242 0.013 1.013 0.007 0.068
##
## Events 8458
## Total time at risk 29806
## Max. log. likelihood -70391
## LR test statistic 1834.39
## Degrees of freedom 5
## Overall p-value 0
We can compare the two “max log likelihoods”, in the frailty model the “Integrated” value -69905, and in the fixed effects case -70391. The difference is so large (486) that we safely can reject the hypothesis that the frailty model is not needed. As an “informal” test, you could take twice that difference and treat it as as a \(\chi^2\) statistic with 1 degree of freedom, calculate a \(p\)-value and take as real \(p\)-value one half of that (all this because ordinary asymptotic theory does not hold for parameter values on the boundary of the parameter space!). This gives an approximation of the true \(p\)-value that is not too bad.
As a final example, let us look at old age mortality in the R package
eha. This example also shows a dangerous situation that is too easy
to overlook. It has nothing to do with frailty, but with a problem caused
by missing data.
oldmort:
## id enter exit event birthdate m.id f.id sex civ ses.50
## 1 765000603 94.510 95.813 TRUE 1765.490 NA NA female widow unknown
## 2 765000669 94.266 95.756 TRUE 1765.734 NA NA female unmarried unknown
## 3 768000648 91.093 91.947 TRUE 1768.907 NA NA female widow unknown
## 4 770000562 89.009 89.593 TRUE 1770.991 NA NA female widow unknown
## 5 770000707 89.998 90.211 TRUE 1770.002 NA NA female widow middle
## 6 771000617 88.429 89.762 TRUE 1771.571 NA NA female widow unknown
## birthplace imr.birth region
## 1 remote 22.20000 rural
## 2 parish 17.71845 industry
## 3 parish 12.70903 rural
## 4 parish 16.90544 industry
## 5 region 11.97183 rural
## 6 parish 13.08594 rural
The variable m.id is mother’s id. This means that siblings
will have the same value on that variable, and we can check whether we find
a ``sibling effect’’ in the sense that siblings tend to have a similar risk
of dying.
## Cox mixed-effects model fit by maximum likelihood
## Data: oldmort
## events, n = 888, 3340 (3155 observations deleted due to missingness)
## Iterations= 16 84
## NULL Integrated Fitted
## Log-likelihood -5702.662 -5693.43 -5646.252
##
## Chisq df p AIC BIC
## Integrated loglik 18.46 4.00 1.0012e-03 10.46 -8.69
## Penalized loglik 112.82 49.23 6.7155e-07 14.36 -221.39
##
## Model: Surv(enter, exit, event) ~ sex + civ + (1 | m.id)
## Fixed coefficients
## coef exp(coef) se(coef) z p
## sexfemale -0.2026325 0.8165783 0.07190559 -2.82 0.00480
## civmarried -0.4653648 0.6279060 0.12144548 -3.83 0.00013
## civwidow -0.3305040 0.7185615 0.11964284 -2.76 0.00570
##
## Random effects
## Group Variable Std Dev Variance
## m.id Intercept 0.23719205 0.05626007
Now, compare with the corresponding fixed effects model:
## Covariate Mean Coef Rel.Risk S.E. LR p
## sex 0.000
## male 0.406 0 1 (reference)
## female 0.594 -0.243 0.784 0.047
## civ 0.000
## unmarried 0.080 0 1 (reference)
## married 0.530 -0.397 0.672 0.081
## widow 0.390 -0.261 0.770 0.079
##
## Events 1971
## Total time at risk 37824
## Max. log. likelihood -13558
## LR test statistic 41.23
## Degrees of freedom 3
## Overall p-value 5.83367e-09
Note that we now got a very much smaller value of the maximized log likelihood,
-13558 compared to
-5693! Something is wrong, and the big problem is
that the two analyzes
were performed on different data sets. How is that possible, we used
oldmort on both occasions? The variable m.id has a lot of missing
values, almost 50 per cent are missing (NA), and the standard treatment of NA:s in
R is to simply remove each record that contains an NA on any of the
variables in the analysis. So, in the first case, the frailty model, a lot
of records are removed before analysis, but not in the second. To be able
to compare the models we must remove all records with m.id = NA from
the second analysis.
olm <- oldmort[!is.na(oldmort$m.id), ]
fit0 <- coxreg(Surv(enter, exit, event) ~ sex + civ,
data = olm)
summary(fit0)## Covariate Mean Coef Rel.Risk S.E. LR p
## sex 0.006
## male 0.418 0 1 (reference)
## female 0.582 -0.196 0.822 0.070
## civ 0.001
## unmarried 0.076 0 1 (reference)
## married 0.555 -0.443 0.642 0.118
## widow 0.369 -0.310 0.733 0.116
##
## Events 888
## Total time at risk 19855
## Max. log. likelihood -5693.8
## LR test statistic 17.71
## Degrees of freedom 3
## Overall p-value 0.000504597
This is another story! We now got very similar values of the maximized log likelihoods, -5693.81 compared to -5693.43! The conclusion is that in this case, there is no frailty effect whatsoever.
One lesson to learn from this example is that you have to be very cautious
when a data set contains missing values. Some functions, like drop1,
give a warning when a situation like this is detected, but especially when
comparisons are made in more than on step, it is too easy to forget the
dangers.
Also note the warning that is printed in the results of
coxme:
3163 observations deleted due to missingness. This is a warning that
should be taken seriously.
\(\Box\)
References
Therneau, T. 2011. Coxme: Mixed Effects Cox Models. http://CRAN.R-project.org/package=coxme.