

formula {base}                               R Documentation

_M_o_d_e_l _F_o_r_m_u_l_a_e

_D_e_s_c_r_i_p_t_i_o_n_:

     The generic function `formula' and its specific methods
     provide a way of extracting formulae which have been
     included in other objects.

     `as.formula' is almost identical, additionally
     preserving attributes when `object' already inherits
     from `"formula"'.
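
     As a minimal sketch of the extraction described above, the
     following fits a small linear model and recovers its
     formula.  The data frame `d' and its columns are
     hypothetical and serve only as an illustration.

```r
# Hypothetical toy data; the variable names are illustrative only.
set.seed(1)
d <- data.frame(x = 1:10, y = 2 * (1:10) + rnorm(10))

fit <- lm(y ~ x, data = d)

# formula() dispatches on the class of its argument and
# returns the formula stored in the fitted model.
f1 <- formula(fit)
f1

# as.formula() also accepts a character string and converts it.
f2 <- as.formula("y ~ x")
class(f2)
```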

_U_s_a_g_e_:

     y ~ model
     formula(object)
     formula.default(anything)
     formula.formula(formula.obj)
     formula.terms(terms.obj)
     formula.data.frame(df)
     as.formula(object)
     I(name)

_D_e_t_a_i_l_s_:

     The models fit by, e.g., the `lm' and `glm' functions
     are specified in a compact symbolic form.  The `~'
     operator is basic in the formation of such models.  An
     expression of the form `y~model' is interpreted as a
     specification that the response `y' is modelled by a
     linear predictor specified symbolically by `model'.
     Such a model consists of a series of terms separated by
     `+' operators.  The terms themselves consist of
     variable and factor names separated by `:' operators.
     Such a term is interpreted as the interaction of all
     the variables and factors appearing in the term.
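
     One way to inspect how a formula is split into terms is
     to pass it to `terms' and read the `"term.labels"'
     attribute.  The sketch below uses hypothetical variable
     names; no data are needed to parse a formula.

```r
# Hypothetical variable names; the formula is parsed symbolically.
fo <- y ~ a + b + a:b

# terms() parses the formula; "term.labels" lists the terms
# of the linear predictor, including the a:b interaction.
labels <- attr(terms(fo), "term.labels")
labels

# The response is stored as the left-hand side of the formula.
fo[[2]]
```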

     In addition to `+' and `:', a number of other operators
     are useful in model formulae.  The `*' operator denotes
     factor crossing: `a*b' is interpreted as `a+b+a:b'.  The
     `^' operator indicates crossing to the specified
     degree.  For example `(a+b+c)^2' is identical to
     `(a+b+c)*(a+b+c)' which in turn expands to a formula
     containing the main effects for `a', `b' and `c'
     together with their second-order interactions.  The
     `%in%' operator indicates that the terms on its left
     are nested within those on the right.  For example
     `a+b%in%a' expands to the formula `a+a:b'.  The `-'
     operator removes the specified terms, so that
     `(a+b+c)^2-a:b' is identical to `a+b+c+b:c+a:c'.  It can
     also be used to remove the intercept term: `y~x-1' is a
     line through the origin.  A model with no intercept can
     also be specified as `y~x+0'.
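
     The expansions above can be checked interactively.  The
     helper `expand_terms' below is a hypothetical convenience
     wrapper, not part of R; it simply returns the term labels
     that `terms' computes.  The order and labelling of the
     printed terms may differ from the expansions written in
     the text.

```r
# Hypothetical helper: list the expanded terms of a formula.
expand_terms <- function(f) attr(terms(f), "term.labels")

crossed <- expand_terms(y ~ a * b)                # crossing
degree2 <- expand_terms(y ~ (a + b + c)^2)        # crossing to degree 2
nested  <- expand_terms(y ~ a + b %in% a)         # nesting
removed <- expand_terms(y ~ (a + b + c)^2 - a:b)  # term removal

crossed
degree2
nested
removed

# Both forms drop the intercept; the "intercept" attribute is 0.
int_minus <- attr(terms(y ~ x - 1), "intercept")
int_zero  <- attr(terms(y ~ x + 0), "intercept")
c(int_minus, int_zero)
```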

     While formulae usually involve just variable and factor
     names, they can also involve arithmetic expressions.
     The formula `log(y)~a+log(x)' is quite legal.  When
     such arithmetic expressions involve operators which are
     also used symbolically in model formulae, there can be
     confusion between arithmetic and symbolic operator use.

     To avoid this confusion, the function `I()' can be used
     to bracket those portions of a model formula where the
     operators are used in their arithmetic sense.  For
     example, in the formula `y~a+I(b+c)', the term `b+c' is
     to be interpreted as the sum of `b' and `c'.
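
     The difference can be seen by comparing the terms of the
     two formulae and by building a model matrix.  This is a
     sketch only; the data frame `d' and its values are
     hypothetical.

```r
# With I(), b + c is a single arithmetic term.
with_I <- attr(terms(y ~ a + I(b + c)), "term.labels")
# Without I(), `+' separates b and c into two terms.
without_I <- attr(terms(y ~ a + b + c), "term.labels")
with_I
without_I

# Hypothetical data to show that the I() column holds the sum.
d <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
mm <- model.matrix(~ a + I(b + c), data = d)
mm
```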

_V_a_l_u_e_:

     `formula', its methods and `as.formula' produce an
     object of class `"formula"' which contains a symbolic
     model formula.  `I' returns its argument marked with
     class `"AsIs"'.

_S_e_e _A_l_s_o_:

     `lm', `glm', `terms'.

_E_x_a_m_p_l_e_s_:

     class(fo <- y ~ x1*x2) # "formula"
     fo
     typeof(fo)  # R internal : "language"
     terms(fo)

     ## Create a formula for a model with a large number of variables:
     xnam <- paste("x", 1:25, sep="")
     (fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+"))))

