

ssanova(gss)                                 R Documentation

_F_i_t_t_i_n_g _S_m_o_o_t_h_i_n_g _S_p_l_i_n_e _A_N_O_V_A _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n_:

     `ssanova' fits smoothing spline ANOVA models with cubic
     spline, linear spline, or thin-plate spline marginals.
     The symbolic model specification via `formula' follows
     the same rules as in `lm'.

_U_s_a_g_e_:

     ssanova(formula, type="cubic", data=list(), weights, subset,
             offset, na.action=na.omit, partial=NULL, method="v",
             varht=1, prec=1e-7, maxiter=30, ext=.05, order=2)

_A_r_g_u_m_e_n_t_s_:

 formula: a symbolic description of the model to be fit.

    type: the type of marginals to be used.  Currently
          supported are `type="cubic"' for cubic spline
          marginals, `type="linear"' for linear spline
          marginals, and `type="tp"' for thin-plate spline
          marginals.

    data: an optional data frame containing the variables in
          the model.

 weights: an optional vector of weights to be used in the
          fitting process.

  subset: an optional vector specifying a subset of
          observations to be used in the fitting process.

  offset: an optional offset term with known parameter 1.

na.action: a function which indicates what should happen
          when the data contain NAs.

 partial: optional extra fixed effect terms in partial
          spline models.

  method: the method for smoothing parameter selection.
          Supported are `method="v"' for GCV, `method="m"'
          for type-II ML, and `method="u"' for Mallows' CL.

   varht: an external variance estimate needed for
          `method="u"'.  It is ignored when `method="v"' or
          `method="m"' is specified.

    prec: the precision in the fit required to stop the
          iteration for multiple smoothing parameter
          selection.  It is ignored when only one smoothing
          parameter is involved.

 maxiter: the maximum number of iterations allowed for
          multiple smoothing parameter selection.  It is
          ignored when only one smoothing parameter is
          involved.

     ext: for cubic spline and linear spline marginals, this
          option specifies how far to extend the domain
          beyond the minimum and the maximum, as a fraction
          of the range.  The default `ext=.05' specifies
          marginal domains whose lengths are 110 percent of
          their respective ranges.  Prediction outside of
          the domain will result in an error.  It is ignored
          if `type="tp"' is specified.

   order: for thin-plate spline marginals, this option
          specifies the order of the marginal penalties.  It
          is ignored if `type="cubic"' or `type="linear"' is
          specified.
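
     As an illustration of the `ext' argument, the sketch
     below computes the domain implied by a given `ext',
     assuming the extension is applied symmetrically at both
     ends as described above.  The helper `ext_domain' is
     hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of gss): the marginal domain implied by `ext`,
# assuming the range is extended by ext * range at each end.
ext_domain <- function(x, ext = 0.05) {
  rng <- range(x, na.rm = TRUE)
  rng + c(-1, 1) * ext * diff(rng)
}

x <- c(2, 4, 7, 12)
dom <- ext_domain(x)
dom                          # 1.5 12.5
diff(dom) / diff(range(x))   # 1.1, i.e. 110 percent of the range
```

     Points requested from `predict' would need to fall
     inside such a domain for cubic and linear marginals.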

_D_e_t_a_i_l_s_:

     `ssanova' and the affiliated `methods' provide a front
     end to RKPACK, a collection of RATFOR routines for
     structural multivariate nonparametric regression via
     the penalized least squares method.  The algorithms
     implemented in RKPACK are of order O(n^3) in execution
     time and O(n^2) in memory requirement, where n is the
     number of observations.  The constants multiplying
     these orders vary with the complexity of the model to
     be fit.
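
     To get a feel for these orders, the following sketch
     tabulates the size of a single dense n-by-n matrix of
     doubles and the relative O(n^3) cost as n grows.  It
     is only a back-of-the-envelope aid: the helper
     `dense_matrix_mb' is hypothetical, and the number of
     such matrices RKPACK actually holds is not stated
     here.

```r
# Hypothetical helper: size in megabytes of one dense n x n matrix of doubles
# (8 bytes per entry).  This only illustrates O(n^2) growth in memory.
dense_matrix_mb <- function(n) n^2 * 8 / 1024^2

n <- c(500, 1000, 2000, 4000)
scaling <- data.frame(
  n             = n,
  matrix_mb     = round(dense_matrix_mb(n), 1),
  relative_time = (n / n[1])^3   # O(n^3) growth relative to n = 500
)
scaling
```

     Doubling n multiplies the memory by 4 and the execution
     time by roughly 8.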

     The model specification via `formula' is intuitive.
     For example, `y~x1*x2' yields a model of the form

        y = c + f_{1}(x1) + f_{2}(x2) + f_{12}(x1,x2) + e

     with the terms denoted by `"1"', `"x1"', `"x2"', and
     `"x1:x2"'.  The side conditions make these terms
     uniquely defined.  In the current implementation, f_{1}
     and f_{12} integrate to 0 on the `x1' domain for cubic
     spline and linear spline marginals, and add to 0 over
     the `x1' (marginal) sampling points for thin-plate
     spline marginals.
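
     The role of the side conditions can be seen in a
     discrete analogue.  The sketch below decomposes a
     function tabulated on a grid into a constant, two main
     effects, and an interaction, using sum-to-zero
     conditions over the grid points.  It uses base R only
     and is not what `ssanova' computes internally; the test
     function and grid are arbitrary choices.

```r
# A function of two variables tabulated on a grid (rows index x1, columns x2).
x1 <- seq(0, 1, length.out = 5)
x2 <- seq(0, 1, length.out = 4)
f  <- outer(x1, x2, function(a, b) 1 + sin(2 * pi * a) + b^2 + a * b)

c0  <- mean(f)                       # constant term "1"
f1  <- rowMeans(f) - c0              # main effect "x1"
f2  <- colMeans(f) - c0              # main effect "x2"
f12 <- f - c0 - outer(f1, f2, "+")   # interaction "x1:x2"

# Side conditions: each term adds to 0 over the grid points of its variables.
sum(f1)
sum(f2)
colSums(f12)   # sums over the x1 grid points
rowSums(f12)   # sums over the x2 grid points

# The four terms add back to the original function.
all.equal(f, c0 + outer(f1, f2, "+") + f12)
```

     Without the sum-to-zero conditions, a constant could be
     moved freely between the terms, and the decomposition
     would not be unique.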

     The penalized least squares problem is equivalent to a
     certain empirical Bayes model or a mixed effect model,
     and the model terms themselves are generally sums of
     finer terms of two types, the unpenalized fixed effects
     and the penalized random effects.  Attached to every
     random effect there is a smoothing parameter, and the
     model complexity is largely determined by the number of
     smoothing parameters.
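
     For the case of a single smoothing parameter, the idea
     of penalized least squares with GCV selection can be
     sketched in base R.  The example below swaps in a
     discrete second-difference penalty (a Whittaker-type
     smoother) in place of the spline penalties, so it is
     an illustration of the principle and not the RKPACK
     algorithm.  The grid of candidate values `lambdas' and
     the simulated data are arbitrary assumptions.

```r
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
y <- 5 + 3 * sin(2 * pi * x) + rnorm(n)

# Second-order difference matrix; linear trends lie in its null space
# and are therefore left unpenalized.
D <- diff(diag(n), differences = 2)
P <- crossprod(D)

# Smoother matrix A(lambda) = (I + lambda * P)^{-1}, which minimizes
# sum((y - f)^2) + lambda * sum((D %*% f)^2) over f.
smoother <- function(lambda) solve(diag(n) + lambda * P)

# GCV score: (RSS / n) / (tr(I - A) / n)^2.
gcv <- function(lambda) {
  A   <- smoother(lambda)
  res <- y - A %*% y
  (sum(res^2) / n) / ((n - sum(diag(A))) / n)^2
}

lambdas <- 10^seq(-2, 6, length.out = 41)
scores  <- sapply(lambdas, gcv)
best    <- lambdas[which.min(scores)]

A_best <- smoother(best)
fit    <- drop(A_best %*% y)   # fitted values at the selected lambda
edf    <- sum(diag(A_best))    # effective degrees of freedom, tr(A)
```

     With several smoothing parameters, a score of this kind
     has to be minimized iteratively, which is where `prec'
     and `maxiter' come into play.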

     The method `predict' can be used to evaluate the sum of
     selected or all model terms at arbitrary points within
     the domain, along with standard errors derived from a
     certain Bayesian calculation.  The method `summary' has
     a flag to request diagnostics for the practical
     identifiability and significance of the model terms.

_V_a_l_u_e_:

     `ssanova' returns a list object of class `"ssanova"'.

     The method `summary' is used to obtain summaries of the
     fits.  The method `predict' can be used to evaluate the
     fits at arbitrary points, along with the standard
     errors to be used in Bayesian confidence intervals.
     The methods `residuals' and `fitted.values' extract the
     residuals and the fitted values from the fits.

_N_o_t_e_:

     The independent variables appearing in `formula' can be
     multivariate themselves.  In particular,
     `ssanova(y~x,"tp",order=order)' can be used to fit
     ordinary thin-plate splines in any dimension, of any
     order permissible, and with standard errors available
     for Bayesian confidence intervals.  Note that thin-
     plate splines reduce to polynomial splines in one
     dimension.

     For univariate marginals, the additive models using
     `type="cubic"' and `type="tp"' yield identical fits
     through different internal constructions.  For example,
     `ssanova(y~x1+x2,"cubic")' and `ssanova(y~x1+x2,"tp")'
     yield the same fit.  The same is not true for models
     with interactions, however.

     Mathematically, the domain (through `ext' for
     `type="cubic"') or the order (through `order' for
     `type="tp"') could be specified individually for each
     of the variables.  No such option is provided in our
     implementation, however, as it would be more a source
     of confusion than of practical utility.

_A_u_t_h_o_r_(_s_)_:

     Chong Gu, chong@stat.purdue.edu

_R_e_f_e_r_e_n_c_e_s_:

     Wahba, G. (1990), Spline Models for Observational Data,
     Philadelphia: SIAM.

_S_e_e _A_l_s_o_:

     `predict.ssanova' for predictions and `summary.ssanova'
     for summaries.

_E_x_a_m_p_l_e_s_:

     ## Fit a cubic smoothing spline
     x <- runif(100)
     y <- 5 + 3*sin(2*pi*x) + rnorm(length(x))
     cubic.fit <- ssanova(y~x,method="m")

     ## The same fit through a different internal construction
     tp.fit <- ssanova(y~x,"tp",method="m")

     ## Obtain estimates and standard errors on a grid
     new <- data.frame(x=seq(min(x),max(x),len=50))
     est <- predict(cubic.fit,new,se=TRUE)

     ## Plot the fit and the Bayesian confidence intervals
     plot(x,y,col=1); lines(new$x,est$fit,col=2)
     lines(new$x,est$fit+1.96*est$se,col=3)
     lines(new$x,est$fit-1.96*est$se,col=3)

