

tree(tree)                                   R Documentation

_F_i_t _a _C_l_a_s_s_i_f_i_c_a_t_i_o_n _o_r _R_e_g_r_e_s_s_i_o_n _T_r_e_e

_D_e_s_c_r_i_p_t_i_o_n_:

     A tree is grown by binary recursive partitioning using
     the response in the specified formula and choosing
     splits from the terms of the right-hand-side. Numeric
     variables and ordered factors are divided into `X < a'
     and `X > a'; the levels of an unordered factor are
     divided into two non-empty groups. The split which
     maximizes the reduction in impurity is chosen, the data
     set is split, and the process is repeated. Splitting
     continues until the terminal nodes are too small or too
     few to be split.
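
     The search for a single split on one numeric variable can be
     sketched in base R as follows. This is an illustration only, not
     the code used by `tree': the helper `best_split' is hypothetical,
     impurity is taken to be the residual sum of squares of a numeric
     response, and candidate cut points are assumed to lie midway
     between adjacent distinct values of the predictor.

```r
# Hypothetical helper (not part of the tree package): find the cut point `a`
# on a numeric predictor x that most reduces the sum of squares of y.
best_split <- function(x, y) {
  ss <- function(v) sum((v - mean(v))^2)      # impurity of one node
  ux <- sort(unique(x))
  if (length(ux) < 2) return(NULL)            # nothing to split on
  # Assumption: candidate cuts are midpoints between distinct x values
  cuts <- (head(ux, -1) + tail(ux, -1)) / 2
  gain <- sapply(cuts, function(a) {
    left  <- y[x < a]                         # cases sent to `X < a'
    right <- y[x > a]                         # cases sent to `X > a'
    ss(y) - (ss(left) + ss(right))            # reduction in impurity
  })
  list(cut = cuts[which.max(gain)], reduction = max(gain))
}

x <- c(1, 2, 3, 10, 11, 12)
y <- c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8)
s <- best_split(x, y)
s$cut
s$reduction
```

     Recursive partitioning would repeat this search over every
     variable in each new node until the stopping rules apply.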

     Factor predictor variables can have up to 32 levels.
     This limit is imposed for ease of labelling, but the
     practical limit is much lower: using such a factor in a
     classification tree whose response has three or more
     levels involves a search over 2^(k-1)-1 groupings for
     k levels.
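
     To see how quickly that search grows, the count 2^(k-1)-1 can be
     tabulated directly. The function name `n_groupings' is
     hypothetical and used only for this illustration.

```r
# Number of ways to divide k factor levels into two non-empty groups
n_groupings <- function(k) 2^(k - 1) - 1

k <- c(2, 3, 5, 10, 20, 32)
data.frame(levels = k, groupings = n_groupings(k))
```

     At the 32-level limit the count is 2^31 - 1, which is over two
     billion candidate groupings for a single split.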

_U_s_a_g_e_:

     tree(formula=formula(data), data=sys.parent(), weights, subset,
      na.action=na.pass, control=tree.control(nobs, ...),
      method="recursive.partition", split=c("deviance", "gini"),
      model=NULL, x=F, y=T, wts=T, ...)

_A_r_g_u_m_e_n_t_s_:

 formula: A formula expression. The left-hand-side
          (response) should be either a numerical vector,
          in which case a regression tree is fitted, or a
          factor, in which case a classification tree is
          produced. The right-hand-side should be a series
          of numeric, factor or ordered variables separated
          by `+'; there should be no interaction terms.
          Both `.' and `-' are allowed, and regression
          trees can have `offset' terms.

    data: A data frame in which to preferentially interpret
          `formula', `weights' and `subset'.

 weights: Vector of non-negative observational weights;
          fractional weights are allowed.

  subset: An expression specifying the subset of cases to be
          used.

na.action: A function to filter missing data from the model
          frame. The default is `na.pass' (to do nothing) as
          `tree' handles missing values (by dropping them
          down the tree as far as possible).

 control: A list as returned by `tree.control'.

  method: Character string giving the method to use. The
          default is `"recursive.partition"'; the only
          other useful value is `"model.frame"'.

   split: Splitting criterion to use: `"deviance"' (the
          default) or `"gini"'.

   model: If this argument is itself a model frame, then the
          `formula' and `data' arguments are ignored, and
          `model' is used to define the model.

       x: If TRUE, the matrix of variables for each case is
          returned.

       y: If TRUE, the response variable is returned.

     wts: If TRUE, the weights are returned.

     ...: Additional arguments that are passed to
          `tree.control'. Normally used for `mincut',
          `minsize' or `mindev'.
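
     As an illustration of the two names accepted by `split', the
     textbook node impurities they are named after can be computed in
     base R. This is a sketch under assumptions: the helpers `gini' and
     `node_deviance' are hypothetical, and whether `tree' evaluates its
     criteria in exactly this form is not stated on this page.

```r
# Hypothetical helpers: impurity of one node, given the class labels in it.
gini <- function(cls) {
  p <- as.vector(table(cls)) / length(cls)   # class proportions
  1 - sum(p^2)                               # Gini index
}
node_deviance <- function(cls) {
  n <- as.vector(table(cls))
  n <- n[n > 0]                              # treat 0 * log(0) as 0
  -2 * sum(n * log(n / sum(n)))              # multinomial deviance
}

mixed <- factor(c("a", "a", "b", "b"))
pure  <- factor(c("a", "a", "a", "a"), levels = c("a", "b"))

c(gini = gini(mixed), deviance = node_deviance(mixed))
c(gini = gini(pure),  deviance = node_deviance(pure))
```

     Both measures are zero for a pure node and largest when the
     classes are evenly mixed, so a split that reduces either one
     produces more homogeneous child nodes.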

_A_u_t_h_o_r_(_s_)_:

     B.D. Ripley

_R_e_f_e_r_e_n_c_e_s_:

     Breiman, L., Friedman, J.H., Olshen, R.A. and Stone,
     C.J. (1984). Classification and Regression Trees.
     Wadsworth.

     Ripley, B.D. (1996). Pattern Recognition and Neural
     Networks. Cambridge University Press, Cambridge.

_S_e_e _A_l_s_o_:

     `tree.control', `prune.tree', `predict.tree',
     `snip.tree'

_E_x_a_m_p_l_e_s_:

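     library(tree)
     library(MASS)   # provides the cpus data set
     data(cpus)
     # Regression tree: the response is numeric
     cpus.ltr <- tree(log10(perf) ~ syct+mmin+mmax+cach+chmin+chmax, cpus)
     cpus.ltr
     summary(cpus.ltr)
     plot(cpus.ltr)
     text(cpus.ltr)

     # Classification tree: the response is a factor
     data(iris)
     ir.tr <- tree(Species ~ ., iris)
     ir.tr
     summary(ir.tr)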

