10.4 Regression

It is also possible to include covariates in the estimating procedure. \[\begin{equation*} P_k(t) = 1 - \exp\left\{-\Gamma_k(t)\exp\left(X\beta^{(k)}\right)\right\}, \quad t > 0, \quad k = 1, 2, 3. \end{equation*}\] These equations are estimated separately. In R, this can be done with the package cmprsk (Fine and Gray 1999; Gray 2011).

Example 10.1 (Mortality and emigration)

People born in Skellefteå in northern Sweden around the year 1800 are followed over time from 15 years of age to out-migration or death. Right censoring occurs for those who are still present on January 1, 1870. At exit the cause is noted, see Figure 10.2.

Mortality and emigration, Skellefteå, 1800--1870.

FIGURE 10.2: Mortality and emigration, Skellefteå, 1800–1870.

The data at hand looks like this:

##         Pnr      Mpnr      Fpnr birthdate enter     exit event parity  ab
## 1 801000758 775000541 771000493  1801.023    15 39.47241     2      3 leg
## 2 801000759 758000309 753000242  1801.028    15 50.00000     0      4 leg
## 3 801000760 770000458 765000452  1801.031    15 50.00000     0     10 leg
## 4 801000763 777000494 775000404  1801.039    15 26.10868     2      3 leg
## 5 801000768 780000572 774000453  1801.061    15 16.43292     2      2 leg
## 6 801000770 774000442 776000421  1801.072    15 48.59237     1      2 leg

A summary:

##    birthdate        enter         exit           event            parity      
##  Min.   :1801   Min.   :15   Min.   :15.00   Min.   :0.0000   Min.   : 1.000  
##  1st Qu.:1835   1st Qu.:15   1st Qu.:20.47   1st Qu.:0.0000   1st Qu.: 2.000  
##  Median :1856   Median :15   Median :28.34   Median :0.0000   Median : 3.000  
##  Mean   :1852   Mean   :15   Mean   :31.82   Mean   :0.7138   Mean   : 3.916  
##  3rd Qu.:1872   3rd Qu.:15   3rd Qu.:44.82   3rd Qu.:2.0000   3rd Qu.: 6.000  
##  Max.   :1886   Max.   :15   Max.   :50.00   Max.   :4.0000   Max.   :19.000  
##      ab       
##  leg  :16423  
##  illeg:  850  
##               
##               
##               
## 
## 
##     0     1     2     3     4 
## 10551  2102  4097    59   464

The variable event is coded as shown in Table 10.1.

TABLE 10.1: Competing risks in Skellefteå
Code Meaning Frequency
0 Censored 10543
1 Dead 2102
3 Moved to other parish in Sweden 4097
4 Moved to other Scandinavian country 59
5 Moved outside tha Scandinavian countries 464

The code for the function comp is displayed at the end of this chapter. It is not necessary to understand before reading on. It may be useful as a template for those who wants to analyze competing risks.

Cause-specific exit probabilities; mortality and migration from Skellefteå.

FIGURE 10.3: Cause-specific exit probabilities; mortality and migration from Skellefteå.

And with covariates and cmprsk; the syntax to get it done is a little bit nonstandard. Here moving to other parish in Sweden is analysed (failcode = 3).

library(cmprsk)
xx <- model.matrix(~ -1 + parity, zz)
fit <- crr(zz$exit, zz$event, xx, failcode = 3)
fit
## convergence:  TRUE 
## coefficients:
##  parity 
## 0.03927 
## standard errors:
## [1] 0.04489
## two-sided p-values:
## parity 
##   0.38

Note that this is not Cox regression (but close!). The corresponding Cox regression is

fit <- coxreg(Surv(enter, exit, event == 3) ~ parity, data = zz)
summary(fit)
Covariate           Mean       Coef     Rel.Risk   S.E.    LR p
parity              3.890     0.045     1.046     0.048    0.363 

Events                    59 
Total time at risk        290569 
Max. log. likelihood      -548.83 
LR test statistic         0.83 
Degrees of freedom        1 
Overall p-value           0.363362

References

Fine, J. P., and R. J. Gray. 1999. “A Proportional Hazards Model for the Subdistribution of a Competing Risk.” Journal of the American Statistical Association 94: 496–509.

Gray, Bob. 2011. Cmprsk: Subdistribution Analysis of Competing Risks. http://CRAN.R-project.org/package=cmprsk.