formula                 package:base                 R Documentation

_M_o_d_e_l _F_o_r_m_u_l_a_e

_D_e_s_c_r_i_p_t_i_o_n:

     The generic function `formula' and its specific methods provide a
     way of extracting formulae which have been included in other
     objects.

     `as.formula' is almost identical, additionally preserving
     attributes when `object' already inherits from `"formula"'.
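
     As a minimal sketch of extraction, the code below fits a model
     and recovers its formula. The data frame `d' and its values are
     made up for illustration; the sketch assumes `lm' has a `formula'
     method.

```r
# Made-up data, used only to have something to fit.
d <- data.frame(y = c(1.1, 1.9, 3.2, 3.9), x = 1:4)
fit <- lm(y ~ x, data = d)

# Extract the formula stored in the fitted model object.
fo <- formula(fit)
print(fo)

# as.formula() also accepts a character string describing the model.
fs <- as.formula("y ~ x")
print(class(fs))
```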

_U_s_a_g_e:

     y ~ model
     formula(object)
     formula.default(anything)
     formula.formula(formula.obj)
     formula.terms(terms.obj)
     formula.data.frame(df)
     as.formula(object)
     I(name)

_D_e_t_a_i_l_s:

     The models fit by, e.g., the `lm' and `glm' functions are
     specified in a compact symbolic form. The `~' operator is basic in
     the formation of such models. An expression of the form `y ~
     model' is interpreted as a specification that the response `y' is
     modelled by a linear predictor specified symbolically by `model'.
     Such a model consists of a series of terms separated by `+'
     operators. The terms themselves consist of variable and factor
     names separated by `:' operators. Such a term is interpreted as
     the interaction of all the variables and factors appearing in the
     term.
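
     The way a formula breaks down into terms can be inspected without
     any data, as in the sketch below. The variable names `y', `a' and
     `b' are hypothetical; `terms' is the function listed under See
     Also.

```r
# A formula with two main effects and their interaction.
fo <- y ~ a + b + a:b

# The "term.labels" attribute lists the terms separated by `+'.
tl <- attr(terms(fo), "term.labels")
print(tl)   # "a" "b" "a:b"
```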

     In addition to `+' and `:', a number of other operators are useful
     in model formulae.  The `*' operator denotes factor crossing:
     `a*b' is interpreted as `a+b+a:b'.  The `^' operator indicates
     crossing to the specified degree.  For example `(a+b+c)^2' is
     identical to `(a+b+c)*(a+b+c)' which in turn expands to a formula
     containing the main effects for `a', `b' and `c' together with
     their second-order interactions. The `%in%' operator indicates
     that the terms on its left are nested within those on the right. 
     For example `a+b%in%a' expands to the formula `a+a:b'.  The `-'
     operator removes the specified terms, so that `(a+b+c)^2 - a:b' is
     identical to `a + b + c + b:c + a:c'. It can also be used to
     remove the intercept term: `y~x - 1' is a line through the
     origin. A model with no intercept can also be specified as
     `y~x + 0' or `y~0 + x'.
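
     The expansions described above can be checked with `terms', as in
     this sketch. The helper `term_labels' and the variable names are
     hypothetical and introduced only for illustration.

```r
# Helper (hypothetical name): the expanded term labels of a formula.
term_labels <- function(f) attr(terms(f), "term.labels")

crossed <- term_labels(y ~ a*b)                  # "a" "b" "a:b"
squared <- term_labels(y ~ (a + b + c)^2)        # "a" "b" "c" "a:b" "a:c" "b:c"
removed <- term_labels(y ~ (a + b + c)^2 - a:b)  # "a" "b" "c" "a:c" "b:c"
print(crossed); print(squared); print(removed)

# The "intercept" attribute is 1 when an intercept is present, else 0.
has_int <- sapply(list(y ~ x, y ~ x - 1, y ~ x + 0, y ~ 0 + x),
                  function(f) attr(terms(f), "intercept"))
print(has_int)   # 1 0 0 0
```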

     While formulae usually involve just variable and factor names,
     they can also involve arithmetic expressions. The formula `log(y)
     ~ a + log(x)' is quite legal. When such arithmetic expressions
     involve operators which are also used symbolically in model
     formulae, there can be confusion between arithmetic and symbolic
     operator use.

     To avoid this confusion, the function `I()' can be used to bracket
     those portions of a model formula where the operators are used in
     their arithmetic sense.  For example, in the formula `y ~ a +
     I(b+c)', the term `b+c' is to be interpreted as the sum of `b' and
     `c'.
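
     The difference between symbolic and arithmetic use of an operator
     shows up in the term labels, as this sketch suggests. The
     variable names are hypothetical.

```r
# Without I(), `+' separates terms and `^' means crossing, so b^2 is just b.
symbolic_sum <- attr(terms(y ~ a + b + c), "term.labels")     # "a" "b" "c"
symbolic_pow <- attr(terms(y ~ a + b^2), "term.labels")       # "a" "b"

# With I(), the bracketed expression is kept as a single arithmetic term.
arith_sum <- attr(terms(y ~ a + I(b + c)), "term.labels")     # "a" "I(b + c)"
arith_pow <- attr(terms(y ~ a + I(b^2)), "term.labels")       # "a" "I(b^2)"

print(symbolic_sum); print(symbolic_pow)
print(arith_sum); print(arith_pow)
```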

_V_a_l_u_e:

     All the functions above except `I' produce an object of class
     `formula' which contains a symbolic model formula. `I' returns
     its argument marked with class `"AsIs"'.

_S_e_e _A_l_s_o:

     `lm', `glm', `terms'.

_E_x_a_m_p_l_e_s:

     class(fo <- y ~ x1*x2) # "formula"
     fo
     typeof(fo)  # R internal: "language"
     terms(fo)

     ## Create a formula for a model with a large number of variables:
     xnam <- paste("x", 1:25, sep="")
     (fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+"))))

