

dpill(KernSmooth)                            R Documentation

_S_e_l_e_c_t _a _B_a_n_d_w_i_d_t_h _f_o_r _L_o_c_a_l _L_i_n_e_a_r _R_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n_:

     Use direct plug-in methodology to select the bandwidth
     of a local linear Gaussian kernel regression estimate,
     as described by Ruppert, Sheather and Wand (1995).

_U_s_a_g_e_:

     dpill(x, y, blockmax=5, divisor=20, trim=0.01, proptrun=0.05,
           gridsize=401, range.x=<<see below>>, truncate=TRUE)

_A_r_g_u_m_e_n_t_s_:

       x: vector of x data.  Missing values are not
          accepted.

       y: vector of y data.  This must be the same length as
          `x', and missing values are not accepted.

blockmax: the maximum number of blocks of the data for
          construction of an initial parametric estimate.

 divisor: the value that the sample size is divided by to
          determine a lower limit on the number of blocks of
          the data for construction of an initial parametric
          estimate.

    trim: the proportion of the sample trimmed from each end
          in the `x' direction before application of the
          plug-in methodology.

proptrun: the proportion of the range of `x' at each end
          truncated in the functional estimates.

gridsize: number of equally-spaced grid points over which
          the function is to be estimated.

 range.x: vector containing the minimum and maximum values
          of `x' at which to compute the estimate.  For
          density estimation the default is the minimum and
          maximum data values with 5% of the range added to
          each end.  For regression estimation (the case
          that applies to `dpill') the default is the
          minimum and maximum data values.

truncate: logical flag: if `TRUE', data with `x' values
          outside the range specified by `range.x' are
          ignored.
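
     The effect of `trim' can be illustrated with a small
     sketch.  The helper `trim_ends' below is hypothetical and
     not part of KernSmooth; it assumes that trimming drops
     floor(trim * n) observations from each end after ordering
     by `x', which is one plausible reading of the argument
     description rather than a statement of the package's
     internals.

```r
# Hypothetical helper (not part of KernSmooth): drop a proportion `trim`
# of the observations from each end of the data, ordered by x.
# Assumption: floor(trim * n) points are removed per end.
trim_ends <- function(x, y, trim = 0.01) {
  stopifnot(length(x) == length(y), !anyNA(x), !anyNA(y))
  n <- length(x)
  k <- floor(trim * n)          # points to drop at each end
  ord <- order(x)               # indices that sort x increasingly
  keep <- ord[(k + 1):(n - k)]  # middle part of the ordered sample
  list(x = x[keep], y = y[keep])
}

set.seed(1)
x <- rnorm(100)
y <- x^2 + rnorm(100, sd = 0.2)
trimmed <- trim_ends(x, y, trim = 0.05)
length(trimmed$x)   # 90: five points removed from each end
range(trimmed$x)    # lies inside range(x)
```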

_V_a_l_u_e_:

     the selected bandwidth.

_D_e_t_a_i_l_s_:

     The function uses the direct plug-in approach, in which
     unknown functionals that appear in expressions for the
     asymptotically optimal bandwidths are replaced by
     kernel estimates.  The kernel is the standard normal
     density.  Least squares quartic fits over blocks of
     data are used to obtain an initial estimate.  Mallows'
     Cp is used to select the number of blocks.
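
     The block-selection step can be sketched in base R.  The
     helpers `blocked_rss' and `choose_blocks' are hypothetical
     and not part of KernSmooth; the form of Cp and the cap on
     the number of blocks follow one reading of Ruppert,
     Sheather and Wand (1995) and of the `blockmax' and
     `divisor' arguments, and are assumptions rather than a
     copy of the package's internals.

```r
# Hypothetical helper: total residual sum of squares from separate quartic
# least-squares fits over N contiguous, roughly equal-sized blocks of the
# data ordered by x.
blocked_rss <- function(x, y, N) {
  ord <- order(x)
  d <- data.frame(x = x[ord], y = y[ord])
  block <- ceiling(seq_len(nrow(d)) * N / nrow(d))   # block labels 1..N
  sum(sapply(split(d, block), function(b) {
    sum(resid(lm(y ~ poly(x, 4), data = b))^2)
  }))
}

# Hypothetical helper: pick the number of blocks minimising Mallows' Cp.
# Assumed form: Cp(N) = RSS(N) / sigma2 - (n - 10 * N), where sigma2 is
# estimated from the fit with Nmax blocks (5 coefficients per block).
choose_blocks <- function(x, y, Nmax) {
  n <- length(x)
  N <- seq_len(Nmax)
  rss <- sapply(N, function(k) blocked_rss(x, y, k))
  sigma2 <- rss[Nmax] / (n - 5 * Nmax)
  cp <- rss / sigma2 - (n - 10 * N)
  N[which.min(cp)]
}

set.seed(1)
x <- runif(200)
y <- sin(4 * pi * x) + rnorm(200, sd = 0.3)
# Assumed cap: at most 5 blocks (blockmax), about 20 points or more per
# block (divisor), and never fewer than one block.
Nmax <- max(min(floor(length(x) / 20), 5), 1)
choose_blocks(x, y, Nmax)
```

     The quartic fits here serve only to pick a block count;
     the sketch does not reproduce the later plug-in stages
     that turn the initial estimate into a bandwidth.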

_W_A_R_N_I_N_G_:

     If there are severe irregularities (e.g. outliers,
     sparse regions) in the `x' values then the local
     polynomial smooths required for the bandwidth selection
     algorithm may become degenerate and the function will
     crash.  Outliers in the `y' direction may degrade the
     quality of the selected bandwidth.
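
     One way to guard a script against such failures is to
     wrap the call.  The wrapper `safe_bandwidth' below is a
     hypothetical sketch, not part of KernSmooth; it assumes
     that a failure shows up either as an R error or as a
     non-finite or non-positive return value, which may not
     cover every way the selector can misbehave.

```r
# Hypothetical wrapper (not part of KernSmooth): run a bandwidth selector
# and return `fallback` if it raises an error or yields an unusable value.
safe_bandwidth <- function(x, y, selector, fallback = NA_real_, ...) {
  h <- tryCatch(selector(x, y, ...), error = function(e) fallback)
  if (length(h) != 1 || !is.finite(h) || h <= 0) fallback else h
}

# Usage with dpill, guarded in case KernSmooth is not installed.
if (requireNamespace("KernSmooth", quietly = TRUE)) {
  set.seed(1)
  x <- runif(200)
  y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)
  h <- safe_bandwidth(x, y, KernSmooth::dpill)
  if (is.na(h)) message("dpill failed; choose a bandwidth by other means")
}
```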

_R_e_f_e_r_e_n_c_e_s_:

     Ruppert, D., Sheather, S. J. and Wand, M. P. (1995).
     An effective bandwidth selector for local least squares
     regression.  Journal of the American Statistical
     Association, 90, 1257-1270.

     Wand, M. P. and Jones, M. C. (1995).  Kernel Smoothing.
     Chapman and Hall, London.

_S_e_e _A_l_s_o_:

     `ksmooth', `locpoly'.

_E_x_a_m_p_l_e_s_:

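     library(KernSmooth)
     # Old Faithful geyser data; the dataset ships with MASS
     data(geyser, package = "MASS")
     x <- geyser$duration
     y <- geyser$waiting
     plot(x, y)
     # select the bandwidth, then fit a local linear smooth with it
     h <- dpill(x, y)
     fit <- locpoly(x, y, bandwidth = h)
     lines(fit)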

