

survexp(survival5)                           R Documentation

_C_o_m_p_u_t_e _E_x_p_e_c_t_e_d _S_u_r_v_i_v_a_l

_D_e_s_c_r_i_p_t_i_o_n_:

     Returns either the expected survival of a cohort of
     subjects, or the individual expected survival for each
     subject.

_U_s_a_g_e_:

     survexp(formula, data, weights, subset, na.action, times, cohort=T,
             conditional=F, ratetable=survexp.us, scale=1, npoints,
             se.fit=<<see below>>, model=F, x=F, y=F)

_A_r_g_u_m_e_n_t_s_:

 formula: formula object.  The response variable is a vector
          of follow-up times and is optional.  The predictors
          consist of optional grouping variables separated by
          the `+' operator (as in `survfit'), along with a
          `ratetable' term.  The `ratetable' term matches each
          subject to his/her expected cohort.

    data: data frame in which to interpret the variables
          named in the `formula', `subset' and `weights'
          arguments.

 weights: case weights.

  subset: expression indicating a subset of the rows of
          `data' to be used in the fit.

na.action: function to filter missing data.  This is applied
          to the model frame after `subset' has been
          applied.  Default is `options()$na.action'.  A
          possible value for `na.action' is `na.omit', which
          deletes observations that contain one or more
          missing values.

   times: vector of follow-up times at which the resulting
          survival curve is evaluated.  If absent, the
          result will be reported for each unique value of
          the vector of follow-up times supplied in
          `formula'.

  cohort: logical value: if `FALSE', each subject is treated
          as a subgroup of size 1.  The default is `TRUE'.

conditional: logical value: if `TRUE', the follow-up times
          supplied in `formula' are death times and
          conditional expected survival is computed.  If
          `FALSE', the follow-up times are potential
          censoring times.  If follow-up times are missing
          in `formula', this argument is ignored.

ratetable: a table of event rates, such as
          `survexp.uswhite', or a fitted Cox model.

   scale: numeric value to scale the results.  If
          `ratetable' is in units/day, `scale = 365.25'
          causes the output to be reported in years.

 npoints: number of points at which to calculate
          intermediate results, evenly spaced over the range
          of the follow-up times.  The usual (exact)
          calculation is done at each unique follow-up time.
          For very large data sets, specifying `npoints' can
          reduce the amount of memory and computation
          required.  For a prediction from a Cox model,
          `npoints' is ignored.

  se.fit: compute the standard error of the predicted
          survival.  The default is to compute this whenever
          the routine can, which at this time is only for
          the Ederer method and a Cox model as the rate
          table.

model, x, y: flags to control what is returned.  If any of
          these is true, then the model frame, the model
          matrix, and/or the vector of response times will
          be returned as components of the final result,
          with the same names as the flag arguments.
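
     As a rough picture of what `scale' and `npoints' describe,
     the base R sketch below converts day-based times to years
     and builds an evenly spaced grid over the range of some
     follow-up times.  The follow-up values and variable names
     are hypothetical, and the grid is only one plausible reading
     of "evenly spaced over the range"; it is not taken from the
     internals of `survexp'.

```r
# Hypothetical follow-up times in days (not from any real data set).
futime_days <- c(120, 365.25, 800, 1461, 2000)

# With a rate table in units/day, scale = 365.25 reports times in years.
futime_years <- futime_days / 365.25

# npoints: an evenly spaced grid over the range of the follow-up times
# (assumed here to run from the smallest to the largest time).
npoints <- 4
grid <- seq(min(futime_days), max(futime_days), length.out = npoints)

print(futime_years)
print(grid)
```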

_D_e_t_a_i_l_s_:

     Individual expected survival is usually used in models
     or testing.  For instance, assume that birth date, entry
     date into the study, sex and actual survival time are
     all known for a group of subjects.  The
     `survexp.uswhite' population tables contain expected
     death rates based on calendar year, sex and age.  Then

          haz <- -log(survexp(death.time ~ ratetable(sex=sex,
                      year=entry.dt, age=(entry.dt-birth.dt)), cohort=F))

     gives for each subject the total hazard experienced up
     to their observed death time or censoring time.  This
     cumulative hazard can be used as a rescaled time value
     in models:

          glm(status ~ 1 + offset(log(haz)), family=poisson)
          glm(status ~ x + offset(log(haz)), family=poisson)

     In the first model, a test for intercept=0 is the
     one-sample log-rank test of whether the observed group of
     subjects has equivalent survival to the baseline
     population.  The second model tests for an effect of
     variable `x' after adjustment for age and sex.
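
     The offset model above can be tried without a rate table.
     The sketch below is a minimal base R illustration under
     stated assumptions: a constant population hazard
     (`pop_rate', hypothetical) stands in for the rate table,
     the subjects are simulated, and `expected_surv' plays the
     role of the per-subject output of `survexp' with
     `cohort=F'.  It is not the package's own computation.

```r
# Sketch only: a constant population hazard stands in for a real rate table.
set.seed(1)
n          <- 200
pop_rate   <- 1e-4                          # hypothetical population hazard per day
death_time <- rexp(n, rate = 2 * pop_rate)  # simulated subjects with doubled hazard
cens_time  <- 3650                          # administrative censoring, about 10 years
futime     <- pmin(death_time, cens_time)
status     <- as.integer(death_time <= cens_time)

# Individual expected survival at each subject's own follow-up time,
# standing in for survexp(..., cohort=F).
expected_surv <- exp(-pop_rate * futime)
haz <- -log(expected_surv)                  # expected cumulative hazard per subject

# Intercept-only Poisson model with log(haz) as the offset.
fit <- glm(status ~ 1 + offset(log(haz)), family = poisson)
smr <- exp(coef(fit)[["(Intercept)"]])      # observed / expected deaths

print(c(observed = sum(status), expected = sum(haz), ratio = smr))
```

     In an intercept-only fit of this form, the exponentiated
     intercept equals the ratio of observed deaths to the summed
     expected hazard, so an intercept of 0 corresponds to a
     group whose mortality matches the reference population.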

     Cohort survival is used to produce an overall survival
     curve.  This is then added to the Kaplan-Meier plot of
     the study group for visual comparison between these
     subjects and the population at large.  There are three
     common methods of computing cohort survival.  In the
     "exact method" of Ederer the cohort is not censored;
     this corresponds to having no response variable in the
     formula.  Hakulinen recommends censoring the cohort at
     the anticipated censoring time of each patient, and
     Verheul recommends censoring the cohort at the actual
     observation time of each patient.  The last of these is
     the conditional method.  These are obtained by using
     the respective time values as the follow-up time or
     response in the formula.
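
     For the uncensored (Ederer) case, the cohort curve can be
     pictured as the plain average of the individual expected
     survival curves.  The sketch below shows that averaging in
     base R, assuming hypothetical constant daily hazards for
     three subjects in place of rate-table lookups; the
     Hakulinen and conditional methods, which censor the cohort,
     are not covered.

```r
# Hypothetical constant daily hazards for three subjects (illustrative values).
rates <- c(5e-5, 1e-4, 2e-4)
times <- c(0, 365.25, 730.5, 1826.25)   # evaluation times in days

# Rows are times, columns are subjects: S_i(t) = exp(-rate_i * t).
indiv <- outer(times, rates, function(t, r) exp(-r * t))

# Ederer-style cohort curve: average over subjects, with no censoring.
cohort_surv <- rowMeans(indiv)

print(round(cohort_surv, 4))
```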

_V_a_l_u_e_:

     If `cohort=T', an object of class `survexp'; otherwise a
     vector of per-subject expected survival values.  The
     former contains the number of subjects at risk and the
     expected survival for the cohort at each requested
     time.

_R_e_f_e_r_e_n_c_e_s_:

     G. Berry.  The analysis of mortality by the subject-years
     method.  Biometrics 1983, 39:173-84.

     F. Ederer, L. Axtell, and S. Cutler.  The relative survival
     rate: a statistical methodology.  Natl Cancer Inst Monogr
     1961, 6:101-21.

     T. Hakulinen.  Cancer survival corrected for heterogeneity
     in patient withdrawal.  Biometrics 1982, 38:933.

     H. Verheul, E. Dekker, P. Bossuyt, A. Moulijn, and
     A. Dunning.  Background mortality in clinical survival
     studies.  Lancet 1993, 341:872-5.

_E_x_a_m_p_l_e_s_:

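     # Load the population rate tables.
     data(ratetables)
     # Ederer expected survival (no response in the formula); sex, entry.dt,
     # birth.dt, futime and status are assumed to exist already.
     efit <- survexp( ~ ratetable(sex=sex, year=entry.dt, age=entry.dt-birth.dt))
     plot(survfit(Surv(futime, status) ~1))
     lines(efit)   # overlay the expected curve on the Kaplan-Meier plot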

