

coxph(survival5)                             R Documentation

_F_i_t _P_r_o_p_o_r_t_i_o_n_a_l _H_a_z_a_r_d_s _R_e_g_r_e_s_s_i_o_n _M_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n_:

     Fits a Cox proportional hazards regression model.
     Time-dependent variables, time-dependent strata, multiple
     events per subject, and other extensions are incorporated
     using the counting process formulation of Andersen and
     Gill.

_U_s_a_g_e_:

     coxph(formula, data=sys.parent(), subset,
            na.action, weights, eps=0.0001, init,
            iter.max=10, method=c("efron","breslow","exact"),
            singular.ok=T, robust,
            model=F, x=F, y=T)

_A_r_g_u_m_e_n_t_s_:

 formula: a formula object, with the response on the left of
          a `~' operator, and the terms on the right.  The
          response must be a survival object as returned by
          the `Surv' function.

    data: a data.frame in which to interpret the variables
          named in the `formula', or in the `subset' and the
          `weights' argument.

  subset: expression saying that only a subset of the rows
          of the data should be used in the fit.

na.action: a missing-data filter function, applied to the
          model.frame, after any subset argument has been
          used.  Default is `options()$na.action'.

 weights: case weights.

     eps: convergence threshold.  Iteration will continue
          until the relative change in the log-likelihood is
          less than eps.  Default is .0001.

    init: vector of initial values of the iteration.
          Default initial value is zero for all variables.

iter.max: maximum number of iterations to perform.  Default
          is 10.

  method: a character string specifying the method for tie
          handling.  If there are no tied death times all
          the methods are equivalent.  Nearly all Cox
          regression programs use the Breslow method by
          default, but not this one.  The Efron
          approximation is used as the default here, as it
          is much more accurate when dealing with tied death
          times, and is as efficient computationally.  The
          exact method computes the exact partial
          likelihood, which is equivalent to a conditional
          logistic model; with a large number of ties its
          computation time will be excessive.

singular.ok: logical value indicating how to handle
          collinearity in the model matrix.  If `TRUE', the
          program will automatically skip over columns of
          the X matrix that are linear combinations of
          earlier columns.  In this case the coefficients
          for such columns will be NA, and the variance
          matrix will contain zeros.  For ancillary
          calculations, such as the linear predictor, the
          missing coefficients are treated as zeros.

  robust: if `TRUE', a robust variance estimate is returned.
          Default is `TRUE' if the model includes a
          `cluster()' term, `FALSE' otherwise.

model, x, y: flags to control what is returned.  If these
          are true, then the model frame, the model matrix,
          and/or the response is returned as components of
          the fitted model, with the same names as the flag
          arguments.
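
     As a minimal sketch of the `method' argument, the following
     fits the same model with the Efron and Breslow approximations
     on a small data set containing one pair of tied death times.
     The data frame `ties.df' is made up for illustration and is
     not part of the package; current releases of the survival
     package also accept this argument under the name `ties'.

```r
library(survival)

# Illustrative data (an assumption, not from the package):
# two deaths are tied at time 2
ties.df <- data.frame(time   = c(4, 1, 1, 2, 2, 3),
                      status = c(1, 1, 0, 1, 1, 0),
                      x      = c(0, 1, 1, 1, 0, 0))

fit.efron   <- coxph(Surv(time, status) ~ x, data = ties.df,
                     method = "efron")
fit.breslow <- coxph(Surv(time, status) ~ x, data = ties.df,
                     method = "breslow")

# With ties present the two approximations give different estimates
coefs <- c(efron   = unname(coef(fit.efron)),
           breslow = unname(coef(fit.breslow)))
print(coefs)
```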

_D_e_t_a_i_l_s_:

     The proportional hazards model is usually expressed in
     terms of a single survival time value for each person,
     with possible censoring.  Andersen and Gill
     reformulated the same problem as a counting process; as
     time marches onward we observe the events for a
     subject, rather like watching a Geiger counter.  The
     data for a subject are presented as multiple rows or
     "observations", each of which applies to an interval of
     observation (start, stop].
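
     To illustrate the (start, stop] layout, the sketch below
     encodes one hypothetical subject whose covariate `x' changes
     at time 4 and who has an event at time 10.  The subject, the
     column names, and the times are assumptions made for
     illustration only.

```r
library(survival)

# One hypothetical subject, split into two observation intervals:
#   (0, 4]  : x = 0, no event at the end of the interval
#   (4, 10] : x = 1, event at time 10
subject <- data.frame(id    = c(1, 1),
                      start = c(0, 4),
                      stop  = c(4, 10),
                      event = c(0, 1),
                      x     = c(0, 1))

# A three-argument call to Surv() builds a counting-process response
resp <- Surv(subject$start, subject$stop, subject$event)
print(resp)
```

     Each row contributes to the risk set only for times inside
     its own interval, which is how a time-dependent covariate
     enters the fit.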

_V_a_l_u_e_:

     an object of class `"coxph"'. See `coxph.object' for
     details.

_S_i_d_e _E_f_f_e_c_t_s_:

     Depending on the call, the `predict', `residuals', and
     `survfit' routines may need to reconstruct the x matrix
     created by `coxph'.  Differences in the environment,
     such as which data frames are attached or the value of
     `options()$contrasts', may cause this computation to
     fail or, worse, to be incorrect.  See the survival
     overview document for details.

_S_P_E_C_I_A_L _T_E_R_M_S_:

     There are two special terms that may be used in the
     model equation.  A `strata' term identifies a
     stratified Cox model; separate baseline hazard
     functions are fit for each stratum.  The `cluster' term
     is used to compute a robust variance for the model.
     The term `+ cluster(id)', where `id == unique(id)', is
     equivalent to specifying the `robust=T' argument, and
     produces an approximate jackknife estimate of the
     variance.  If the `id' variable is not unique, but
     instead identifies clusters of correlated observations,
     then the variance estimate is based on a grouped
     jackknife.
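
     The two special terms might be used as in the sketch below.
     It assumes a current release of the survival package, which
     ships the `lung' data set; neither the data set nor the
     choice of `inst' as the cluster variable comes from this
     page.

```r
library(survival)

# Stratified model: a separate baseline hazard for each level of sex
fit.strata <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)
print(fit.strata)

# Grouped jackknife variance: observations from the same institution
# (inst) are treated as one cluster
fit.cluster <- coxph(Surv(time, status) ~ age + cluster(inst), data = lung)

# Robust and model-based standard errors for age, side by side
se <- sqrt(c(robust = fit.cluster$var[1, 1],
             naive  = fit.cluster$naive.var[1, 1]))
print(se)
```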

_C_O_N_V_E_R_G_E_N_C_E_:

     In certain data cases the actual MLE of a coefficient
     is infinity, e.g., a dichotomous variable where one of
     the groups has no events.  When this happens the
     associated coefficient grows at a steady pace and a
     race condition will exist in the fitting routine:
     either the log likelihood converges, the information
     matrix becomes effectively singular, an argument to exp
     becomes too large for the computer hardware, or the
     maximum number of iterations is exceeded.  The routine
     attempts to detect when this has happened, not always
     successfully.
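
     The situation described above can be reproduced with a small
     made-up data set in which the group `g = 0' has no events.
     The data frame `sep' and the warning-collecting wrapper are
     illustrative assumptions; the exact wording of any warning
     depends on the package version.

```r
library(survival)

# Illustrative data: all three events fall in group g = 1
sep <- data.frame(time   = 1:6,
                  status = c(1, 1, 1, 0, 0, 0),
                  g      = c(1, 1, 1, 0, 0, 0))

# Collect any warnings instead of letting them print immediately
msgs <- character(0)
fit <- withCallingHandlers(
  coxph(Surv(time, status) ~ g, data = sep),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  })

# The reported coefficient is large but finite: the iteration stopped
# while the estimate was still growing
print(coef(fit))
print(msgs)
```

     A very large coefficient together with a very large standard
     error is the usual sign that the true estimate is infinite.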

_P_E_N_A_L_I_S_E_D _R_E_G_R_E_S_S_I_O_N_:

     `coxph' can now maximise a penalised partial likelihood
     with arbitrary user-defined penalty.  Supplied penalty
     functions include ridge regression (ridge), smoothing
     splines (pspline), and frailty models (frailty).
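
     A minimal sketch of two of the supplied penalty functions
     follows.  It assumes a current release of the survival
     package, which ships the `ovarian' and `lung' data sets; the
     choice of covariates and of `theta' and `df' is for
     illustration only.

```r
library(survival)

# Ridge penalty on two covariates, with penalty weight theta = 1
fit.ridge <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta = 1),
                   data = ovarian)
print(fit.ridge)

# Penalised smoothing spline for age with 3 degrees of freedom
fit.spline <- coxph(Surv(time, status) ~ ph.ecog + pspline(age, df = 3),
                    data = lung)
print(fit.spline)
```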

_R_e_f_e_r_e_n_c_e_s_:

     P. Andersen and R. Gill. "Cox's regression model for
     counting processes, a large sample study", Annals of
     Statistics, 10:1100-1120, 1982.

     T. Therneau, P. Grambsch, and T. Fleming. "Martingale
     based residuals for survival models", Biometrika, March
     1990.

_E_x_a_m_p_l_e_s_:

     # Create the simplest test data set
     #
     test1 <- list(time=  c(4, 3,1,1,2,2,3),
                   status=c(1,NA,1,0,1,1,0),
                   x=     c(0, 2,1,1,1,0,0),
                   sex=   c(0, 0,0,0,1,1,1))
     # Stratified model; the observation with a missing status
     # is removed by the default na.action
     coxph( Surv(time, status) ~ x + strata(sex), test1)

     #
     # Create a simple data set for a time-dependent model
     #
     test2 <- list(start=c(1, 2, 5, 2, 1, 7, 3, 4, 8, 8),
                   stop =c(2, 3, 6, 7, 8, 9, 9, 9,14,17),
                   event=c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0),
                   x    =c(1, 0, 0, 1, 0, 1, 1, 1, 0, 0) )

     summary( coxph( Surv(start, stop, event) ~ x, test2))

