5.1 Time-varying covariates

In R, the packages eha and survival can handle only one special kind of time-varying covariate: the piecewise constant function. How this is done is best described by an example.

Example 5.1 (Civil status)

The covariate (factor) civil status (called \(civst\) in the R data frame) is an explanatory variable in a mortality study, and it changes value from 0 (unmarried) to 1 (married) at marriage. How should this be coded in the data file? The solution is to create two records for each individual who marries, each with a fixed value of \(civst\):

  1. Original record: \((t_0, t, d, x(s), t_0 < s \le t)\), married at time \(T\), \(t_0 < T < t\):

\[ \text{civst}(s) = \left\{\begin{array}{ll} \text{unmarried} , & s < T \\ \text{married}, & s \ge T \end{array}\right. \]

  2. First new record: \((t_0, T, 0, \text{unmarried})\), always censored.

  3. Second new record: \((T, t, d, \text{married})\).

With \(T = 30\), the data file will contain two records like those shown in Table 5.1.

TABLE 5.1: The coding of a time-varying covariate.
id enter exit event civst
23 0 30 0 unmarried
23 30 80 1 married

In this way, a time-varying covariate can always be handled by utilizing left truncation and right censoring. See also Figure 5.1 for an illustration.
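
The split described above can be sketched in base R. This is only a minimal illustration, not the way eha or survival do it internally: the wide-format data frame `wide`, its column `marriage` (age at marriage, `NA` if unknown or never married) and the helper `split_at_marriage` are hypothetical names introduced here.

```r
# Hypothetical wide-format data: one row per individual.
# 'marriage' is the age at marriage (NA if never married during follow-up).
wide <- data.frame(
  id       = c(23, 24),
  enter    = c(0, 0),
  exit     = c(80, 65),
  event    = c(1, 0),
  marriage = c(30, NA)
)

# Hypothetical helper: split each record at the age of marriage.
split_at_marriage <- function(dat) {
  # Individuals whose civil status changes strictly inside (enter, exit)
  changes <- !is.na(dat$marriage) &
    dat$marriage > dat$enter & dat$marriage < dat$exit

  # First new record: unmarried spell, always censored at marriage
  before <- dat[changes, ]
  before$exit  <- before$marriage
  before$event <- rep(0, nrow(before))
  before$civst <- rep("unmarried", nrow(before))

  # Second new record: married spell, left truncated at marriage
  after <- dat[changes, ]
  after$enter <- after$marriage
  after$civst <- rep("married", nrow(after))

  # Individuals with no change during follow-up keep their single record
  same <- dat[!changes, ]
  same$civst <- ifelse(!is.na(same$marriage) & same$marriage <= same$enter,
                       "married", "unmarried")

  out <- rbind(before, after, same)
  out <- out[order(out$id, out$enter),
             c("id", "enter", "exit", "event", "civst")]
  rownames(out) <- NULL
  out$civst <- factor(out$civst, levels = c("unmarried", "married"))
  out
}

long <- split_at_marriage(wide)
print(long)
```

For individual 23 the result has the same two rows as Table 5.1; individual 24, who never marries, keeps a single unmarried record.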

FIGURE 5.1: A time-varying covariate (unmarried or married). Right censored and left truncated at age 30 (at marriage).

Note that this situation is formally equivalent to one with two individuals, one unmarried and one married: the first is right censored at exactly the time when the second enters the study (left truncation). \(\Box\)
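
Because each spell is treated as a record of its own, a Cox regression only needs the three-argument form `Surv(enter, exit, event)`. The following is a hedged sketch on simulated data, using `coxph` from the survival package; the sample size, baseline hazard `h0`, log hazard ratio `beta`, and the distribution of marriage ages are all assumptions made for illustration and have nothing to do with any real study.

```r
library(survival)

set.seed(1)
n     <- 300
h0    <- 0.02                    # assumed baseline hazard (unmarried)
beta  <- -0.5                    # assumed log hazard ratio, married vs unmarried
marry <- runif(n, 20, 40)        # hypothetical ages at marriage
e     <- rexp(n)                 # unit exponential draws

# Invert the piecewise-linear cumulative hazard to get ages at death
death <- ifelse(e < h0 * marry,
                e / h0,
                marry + (e - h0 * marry) / (h0 * exp(beta)))
exit  <- pmin(death, 80)         # administrative censoring at age 80
event <- as.numeric(death <= 80)
mar   <- marry < exit            # TRUE if marriage occurs during follow-up

# Unmarried spells: censored at marriage for those who marry
unmarried <- data.frame(id = seq_len(n), enter = 0,
                        exit  = ifelse(mar, marry, exit),
                        event = ifelse(mar, 0, event),
                        civst = "unmarried")

# Married spells: left truncated at marriage
married <- data.frame(id = which(mar), enter = marry[mar],
                      exit = exit[mar], event = event[mar],
                      civst = "married")

long <- rbind(unmarried, married)
long$civst <- factor(long$civst, levels = c("unmarried", "married"))

# Each row enters the risk set at 'enter' and leaves it at 'exit'
fit <- coxph(Surv(enter, exit, event) ~ civst, data = long)
print(coef(fit))
```

The variable `id` does not appear in the model formula, which reflects the equivalence noted above: the two records of a person who marries contribute to the partial likelihood exactly as two separate individuals would.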

A word of caution: Marriage status may be interpreted as an internal covariate, i.e., the change of marriage status is individual and may depend on health status. For instance, maybe only healthy persons get married. If so, a vital condition in Cox regression is violated: the risk of dying may only be modeled by conditions from the strict history of each individual. Generally, the use of time-dependent covariates is dangerous, and one should always consider whether reverse causality may be at work when allowing for them.