factor                 package:base                 R Documentation

_F_a_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     The function `factor' encodes a vector as a factor (the terms
     `category' and `enumerated type' are also used for factors).  If
     `ordered' is `TRUE', the factor levels are assumed to be ordered.
     For compatibility with S there is also a function `ordered'.

     `is.factor', `is.ordered', `as.factor' and `as.ordered' are the
     membership and coercion functions for these classes.

_U_s_a_g_e:

     factor(x, levels = sort(unique(x), na.last = TRUE), labels = levels,
            exclude = NA, ordered = is.ordered(x))
     ordered(x, ...)

     is.factor(x)
     is.ordered(x)

     as.factor(x)
     as.ordered(x)

_A_r_g_u_m_e_n_t_s:

       x: a vector of data, usually taking a small number of distinct
          values

  levels: an optional vector of the values that `x' might have taken.
          The default is the set of values taken by `x', sorted into
          increasing order.

  labels: either an optional vector of labels for the levels (in the
          same order as `levels' after removing those in `exclude'), or
          a character string of length 1.

 exclude: a vector of values to be excluded when forming the set of
          levels. This should be of the same type as `x', and will be
          coerced if necessary.

 ordered: logical flag to determine if the levels should be regarded as
          ordered (in the order given).

     ...: (in `ordered(.)'): any of the above, apart from `ordered'
          itself.

_D_e_t_a_i_l_s:

     The type of the vector `x' is not restricted.

     Ordered factors differ from factors only in their class, but
     methods and the model-fitting functions treat the two classes
     quite differently.
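
     As a minimal sketch of this distinction (the vector `sizes' and
     its level names are illustrative and not taken from this page),
     the two objects below share the same codes and levels and differ
     only in class, yet only the ordered one is compared with `<':

```r
# Illustrative data; the names are assumptions made for this sketch.
sizes <- c("small", "large", "medium", "small")
lev   <- c("small", "medium", "large")

f <- factor(sizes, levels = lev)                  # plain factor
o <- factor(sizes, levels = lev, ordered = TRUE)  # ordered factor

class(f)                                 # "factor"
class(o)                                 # "ordered" "factor"
identical(as.integer(f), as.integer(o))  # same underlying codes
o[1] < o[2]                              # "small" comes before "large"
```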

     The encoding of the vector happens as follows. First all the
     values in `exclude' are removed from `levels'. If `x[i]' equals
     `levels[j]', then the `i'-th element of the result is `j'.  If no
     match is found for `x[i]' in `levels', then the `i'-th element of
     the result is set to `NA'.
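
     The matching rule above can be sketched as follows.  The data are
     illustrative, and `as.integer' is used here only as one way to
     display the codes:

```r
# Illustrative input; "d" is deliberately absent from the level set.
x <- c("b", "a", "c", "a", "d")
f <- factor(x, levels = c("a", "b", "c"))

levels(f)      # "a" "b" "c"
as.integer(f)  # 2 1 3 1 NA: position of each x[i] in levels, NA if unmatched
```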

     Normally the `levels' used as an attribute of the result are the
     reduced set of levels after removing those in `exclude', but this
     can be altered by supplying `labels'. `labels' should be either a
     set of new labels for the levels, or a single character string,
     in which case the levels are that character string with a
     sequence number appended.
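
     A short sketch of both forms of `labels' (the values and label
     names are illustrative):

```r
x <- c(3, 1, 2, 1)

# One label per level, given in the same order as `levels`.
f1 <- factor(x, levels = 1:3, labels = c("low", "mid", "high"))
levels(f1)        # "low" "mid" "high"
as.character(f1)  # "high" "low" "mid" "low"

# A single string is used as a prefix with a sequence number appended.
f2 <- factor(x, labels = "grp")
levels(f2)        # "grp1" "grp2" "grp3"
```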

     `factor(x)' applied to a factor is a no-operation unless there are
     unused levels: in that case, a factor with the reduced level set
     is returned. If `exclude' is used it should also be a factor with
     the same level set as `x' or a set of codes for the levels to be
     excluded.
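
     For instance, subsetting a factor keeps the full level set, and a
     further call to `factor' reduces it.  This is a sketch with
     illustrative data:

```r
f   <- factor(c("a", "b", "c", "d"))
sub <- f[c(2, 4)]        # subsetting keeps all four levels

levels(sub)              # "a" "b" "c" "d"
levels(factor(sub))      # "b" "d": unused levels are dropped
as.integer(factor(sub))  # 1 2: codes are renumbered against the new levels
```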

     The codes of a factor may contain `NA'. For a numeric `x', set
     `exclude=NULL' to make `NA' an extra level (`"NA"'), by default
     the last level.
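
     A minimal sketch of the effect of `exclude=NULL' on a numeric
     vector (illustrative data).  How the extra level is stored and
     printed may differ between versions of R, so the sketch only
     inspects the number of levels and the codes:

```r
x <- c(1, 2, NA, 2)

f <- factor(x)                  # default: NA is excluded from the levels
nlevels(f)                      # 2
is.na(f)                        # FALSE FALSE TRUE FALSE

g <- factor(x, exclude = NULL)  # NA becomes an extra, last level
nlevels(g)                      # 3
as.integer(g)                   # 1 2 3 2: the NA entry now has its own code
```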

_V_a_l_u_e:

     `factor' returns an object of class `"factor"' which has a set of
     integer codes the same length as `x', with a `"levels"' attribute
     of mode `character'.  If `ordered' is true (or `ordered' is used)
     the result has class `c("ordered", "factor")'.

     `is.factor' returns `TRUE' or `FALSE' depending on whether its
     argument is a factor or not.  Correspondingly, `is.ordered'
     returns `TRUE' when its argument is ordered and `FALSE' otherwise.

     `as.factor' coerces its argument to a factor. It is an abbreviated
     form of `factor'.

     `as.ordered(x)' returns `x' if this is ordered, and `ordered(x)'
     otherwise.
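
     The membership and coercion functions can be exercised as in the
     following sketch (the vector `v' is illustrative):

```r
v <- c("lo", "hi", "lo")
is.factor(v)                 # FALSE: a plain character vector

f <- as.factor(v)
is.factor(f)                 # TRUE
is.ordered(f)                # FALSE

o <- as.ordered(f)
is.ordered(o)                # TRUE
identical(as.ordered(o), o)  # TRUE: an ordered factor is returned unchanged
```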

_W_a_r_n_i_n_g:

     The interpretation of a factor depends on both the codes and the
     `"levels"' attribute. Be careful to compare only factors with the
     same set of levels (in the same order).  In particular,
     `as.numeric' applied to a factor returns the internal codes rather
     than the original values, which is rarely meaningful, and this
     may happen by implicit coercion.
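
     The pitfall can be sketched with illustrative numeric data.  The
     two conversions shown at the end go through the levels; they are
     offered as common idioms, not as something this page prescribes:

```r
x <- c(10, 30, 20, 30)
f <- factor(x)

as.numeric(f)                # 1 3 2 3: the codes, not the original values
as.numeric(as.character(f))  # 10 30 20 30: convert via the level labels
as.numeric(levels(f))[f]     # 10 30 20 30: same result, indexing the levels
```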

     The levels of a factor are by default sorted, but the sort order
     may well depend on the locale at the time of creation, and should
     not be assumed to be ASCII.
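
     One way to avoid depending on the locale is to supply `levels'
     explicitly, as in this sketch (the data and the chosen order are
     illustrative):

```r
x <- c("b", "A", "a", "B")

f_default <- factor(x)  # level order follows the locale's collation
f_fixed   <- factor(x, levels = c("A", "B", "a", "b"))

levels(f_fixed)         # "A" "B" "a" "b", whatever the locale
```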

_S_e_e _A_l_s_o:

     `gl' for construction of ``balanced'' factors and `C' for factors
     with specified contrasts. `levels' and `nlevels' for accessing the
     levels, and `codes' to get integer codes.

_E_x_a_m_p_l_e_s:

     ff <- factor(substring("statistics", 1:10, 1:10), levels=letters)
     ff
     codes(ff)
     factor(ff)  # drops the levels that do not occur
     factor(factor(letters[7:10])[2:3])  # exercise indexing and reduction
     factor(letters[1:20], labels = "letter")  # levels "letter1", "letter2", ...

     class(ordered(4:1))  # "ordered", inheriting from "factor"

