

step {base}                                  R Documentation

_C_h_o_o_s_e _a _m_o_d_e_l _b_y _A_I_C _i_n _a _S_t_e_p_w_i_s_e _A_l_g_o_r_i_t_h_m

_D_e_s_c_r_i_p_t_i_o_n_:

     Select a formula-based model by AIC.

_U_s_a_g_e_:

     step(object, scope, scale=0, direction=c("both", "backward", "forward"),
             trace=1, keep=NULL, steps=1000, k=2, ...)

_A_r_g_u_m_e_n_t_s_:

  object: an object representing a model of an appropriate
          class.  This is used as the initial model in the
          stepwise search.

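   scope: defines the range of models examined in the stepwise
          search.  This should be either a single formula or
          a list containing components `upper' and `lower',
          both formulae.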

   scale: used in the definition of the AIC statistic for
          selecting the models, currently only for `lm',
          `aov' and `glm' models.  The default value, 0,
          indicates that the scale should be estimated.

direction: the mode of stepwise search, one of `"both"',
          `"backward"' or `"forward"', with a default of
          `"both"'.  If the `scope' argument is missing, the
          default for `direction' is `"backward"'.

   trace: if positive, information is printed during the
          running of `step'.

    keep: a filter function whose input is a fitted model
          object and the associated `AIC' statistic, and
          whose output is arbitrary.  Typically `keep' will
          select a subset of the components of the object
          and return them.  The default is not to keep
          anything.

   steps: the maximum number of steps to be considered.  The
          default is 1000 (essentially as many as required).
          It is typically used to stop the process early.

       k: the multiple of the number of degrees of freedom
          used for the penalty.  Only `k = 2' gives the
          genuine AIC; `k = log(n)', with `n' the number of
          observations, is sometimes referred to as BIC or
          SBC.

     ...: any additional arguments to `extractAIC'.

_D_e_t_a_i_l_s_:

     `step' calls `add1' and `drop1' repeatedly, so it works
     for any class of model for which those functions work;
     that in turn requires a valid method for `extractAIC'.
     When the additive constant can be chosen so that AIC is
     equal to Mallows' Cp, this is done and the tables are
     labelled appropriately.

     There is a potential problem in using `glm' fits with a
     variable `scale', as in that case the deviance is not
     simply related to the maximized log-likelihood.  The
     function `extractAIC.glm' makes the appropriate
     adjustment for a `gaussian' family, but may need to be
     amended for other cases.  (The `binomial' and `poisson'
     families have fixed `scale' by default and do not
     correspond to a particular maximum-likelihood problem
     for variable `scale'.)

_V_a_l_u_e_:

     The stepwise-selected model is returned, with up to two
     additional components.  There is an `"anova"' component
     corresponding to the steps taken in the search, as well
     as a `"keep"' component if the `keep=' argument was
     supplied in the call.  The `"Resid. Dev"' column of the
     analysis of deviance table refers to a constant minus
     twice the maximized log likelihood: it will be a
     deviance only in cases where a saturated model is
     well-defined (thus excluding `lm', `aov' and `survreg'
     fits, for example).

_N_o_t_e_:

     This function differs considerably from the function in
     S, which uses a number of approximations and does not
     compute the correct AIC.

_A_u_t_h_o_r_(_s_)_:

     B.D. Ripley

_S_e_e _A_l_s_o_:

     `add1', `drop1'

_E_x_a_m_p_l_e_s_:

     ## `example(lm)' creates the fitted model `lm.D9'
     example(lm)
     step(lm.D9)

     ## Backward elimination from the full model on `swiss'
     data(swiss)
     summary(lm1 <- lm(Fertility ~ ., data = swiss))
     slm1 <- step(lm1)
     summary(slm1)
     slm1$anova    # the steps taken in the search
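
     ## A sketch of the BIC-style penalty described under `k'.
     ## Assumption: `n' is taken as the number of rows of
     ## `swiss', which is reasonable here as no rows are
     ## dropped in fitting.  `lm.full' and `slm.bic' are names
     ## made up for this sketch.
     data(swiss)
     lm.full <- lm(Fertility ~ ., data = swiss)
     slm.bic <- step(lm.full, k = log(nrow(swiss)), trace = 0)
     formula(slm.bic)
     slm.bic$anova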

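     ## A sketch of a forward search using `scope' and `keep'.
     ## Assumptions: `scope' accepts a list with components
     ## `lower' and `upper'; `lm.null', `keep.aic' and
     ## `slm.fwd' are names made up for this sketch.
     data(swiss)
     lm.null <- lm(Fertility ~ 1, data = swiss)
     ## Record the AIC and residual degrees of freedom of each
     ## model visited
     keep.aic <- function(fit, aic) {
         list(AIC = aic, df.resid = fit$df.residual)
     }
     slm.fwd <- step(lm.null,
                     scope = list(lower = ~ 1,
                                  upper = ~ Agriculture + Examination +
                                      Education + Catholic +
                                      Infant.Mortality),
                     direction = "forward", trace = 0,
                     keep = keep.aic)
     slm.fwd$anova
     slm.fwd$keep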
