

lqs {lqs}                                    R Documentation

_R_e_s_i_s_t_a_n_t _R_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n_:

     Fit a regression to the `good' points in the dataset,
     thereby achieving a regression estimator with a high
     breakdown point.  `lmsreg' and `ltsreg' are
     compatibility wrappers.

_U_s_a_g_e_:

     lqs(x, ...)
     lqs.formula(formula, data = NULL, ...,
                 method = c("lts", "lqs", "lms", "S", "model.frame"),
                 subset, na.action = na.fail, model = TRUE,
                 x = FALSE, y = FALSE, contrasts = NULL)
     lqs.default(x, y, intercept, method = c("lts", "lqs", "lms", "S"),
                 quantile, control = lqs.control(...), k0 = 1.548, seed, ...)
     lmsreg(...)
     ltsreg(...)

_A_r_g_u_m_e_n_t_s_:

 formula: a formula of the form `y ~ x1 + x2 + ...'.

    data: data frame from which variables specified in
          `formula' are preferentially to be taken.

  subset: An index vector specifying the cases to be used in
          fitting. (NOTE: If given, this argument must be
          named exactly.)

na.action: A function to specify the action to be taken if
          `NA's are found. The default action is for the
          procedure to fail. An alternative is `na.omit',
          which leads to omission of cases with missing
          values on any required variable.  (NOTE: If given,
          this argument must be named exactly.)

       x: a matrix or data frame containing the explanatory
          variables.

       y: the response: a vector of length the number of
          rows of `x'.

intercept: should the model include an intercept?

  method: the method to be used. `model.frame' returns the
          model frame: for the others see the `Details'
          section. Using `lmsreg' or `ltsreg' forces `"lms"'
          and `"lts"' respectively.

quantile: the quantile to be used: see `Details'. This is
          overridden if `method = "lms"'.

 control: additional control items: see `Details'.

    seed: the seed to be used for random sampling: see
          `.Random.seed'. The current value of
          `.Random.seed' will be preserved if it is set.

     ...: arguments to be passed to `lqs.default' or
          `lqs.control'.

_D_e_t_a_i_l_s_:

     Suppose there are `n' data points and `p' regressors,
     including any intercept.

     The first three methods minimize some function of the
     sorted squared residuals. For methods `"lqs"' and
     `"lms"' it is the `quantile'-th smallest squared
     residual, and for `"lts"' it is the sum of the
     `quantile' smallest squared residuals. `"lqs"' and
     `"lms"' differ in the defaults for `quantile', which
     are `floor((n+p+1)/2)' and `floor((n+1)/2)'
     respectively.  For `"lts"' the default is
     `floor(n/2) + floor((p+1)/2)'.
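
     As an illustrative sketch (the values of `n' and `p'
     below are hypothetical, chosen to match a dataset of
     21 cases with 4 coefficients), the defaults can be
     computed directly:

          n <- 21; p <- 4                 # hypothetical: 21 cases, 4 coefficients
          floor((n + p + 1)/2)            # "lqs" default quantile: 13
          floor((n + 1)/2)                # "lms" default quantile: 11
          floor(n/2) + floor((p + 1)/2)   # "lts" default quantile: 12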

     The `"S"' estimation method solves for the scale `s'
     such that the average of a function chi of the
     residuals divided by `s' is equal to a given constant.

     The `control' argument is a list with components:

     `psamp': the size of each sample. Defaults to `p'.

     `nsamp': the number of samples or `"best"' or `"exact"'
     or `"sample"'. If `"sample"' the number chosen is
     `min(5*p, 3000)', taken from Rousseeuw and Hubert
     (1997).  If `"best"' exhaustive enumeration is done up
     to 5000 samples: if `"exact"' exhaustive enumeration
     will be attempted however many samples are needed.

     `adjust': should the intercept be optimized for each
     sample?
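
     Control items can also be supplied directly through
     `...', which is passed on to `lqs.control'. A minimal
     sketch (the argument values are illustrative only):

          ## exhaustive enumeration up to 5000 samples,
          ## optimizing the intercept for each sample
          lqs(stack.loss ~ ., data = stackloss,
              nsamp = "best", adjust = TRUE)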

_V_a_l_u_e_:

     An object of class `"lqs"'.

_N_o_t_e_:

     There seems to be no reason other than historical to
     use the `lms' and `lqs' options.  LMS estimation is of
     low efficiency (converging at rate n^{-1/3}) whereas
     LTS has the same asymptotic efficiency as an M
     estimator with trimming at the quartiles (Marazzi,
     1993, p.201).  LQS and LTS have the same maximal
     breakdown value of `(floor((n-p)/2) + 1)/n', attained
     if `floor((n+p)/2) <= quantile <= floor((n+p+1)/2)'.
     The only drawback mentioned of LTS is greater
     computation, as a sort was thought to be required
     (Marazzi, 1993, p.201), but this is not true as a
     partial sort can be used (and is used in this
     implementation).
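
     For example (again with hypothetical `n' and `p'), the
     maximal breakdown value and the range of `quantile'
     attaining it can be computed as:

          n <- 21; p <- 4
          (floor((n - p)/2) + 1)/n   # maximal breakdown value: 9/21
          floor((n + p)/2)           # smallest attaining quantile: 12
          floor((n + p + 1)/2)       # largest attaining quantile: 13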

     Adjusting the intercept for each trial fit does need
     the residuals to be sorted, and may add significant
     extra computation if `n' is large and `p' small.

     Opinions differ over the choice of `psamp'. Rousseeuw
     and Hubert (1997) only consider `p'; Marazzi (1993)
     recommends `p+1' and suggests that more samples are
     better than adjustment for a given computational
     limit.

     The computations are exact for a model with just an
     intercept and adjustment, and for LQS for a model with
     an intercept plus one regressor and exhaustive search
     with adjustment. For all other cases the minimization
     is only known to be approximate.

_A_u_t_h_o_r_(_s_)_:

     B.D. Ripley

_R_e_f_e_r_e_n_c_e_s_:

     P. J. Rousseeuw and A. M. Leroy (1987) Robust
     Regression and Outlier Detection.  Wiley.

     A. Marazzi (1993) Algorithms, Routines and S Functions
     for Robust Statistics.  Wadsworth and Brooks/Cole.

     P. Rousseeuw and M. Hubert (1997) Recent developments
     in PROGRESS. In L1-Statistical Procedures and Related
     Topics ed Y. Dodge, IMS Lecture Notes volume 31, pp.
     201-214.

_S_e_e _A_l_s_o_:

     `predict.lqs'

_E_x_a_m_p_l_e_s_:

     data(stackloss)
     set.seed(123)
     lqs(stack.loss ~ ., data = stackloss)
     lqs(stack.loss ~ ., data = stackloss, method = "S", nsamp = "exact")

