

factor {base}                                R Documentation

_F_a_c_t_o_r_s

_D_e_s_c_r_i_p_t_i_o_n_:

     The function `factor' is used to encode a vector as a
     factor (the names category and enumerated type are also
     used for factors).  If `ordered' is `TRUE', the factor
     levels are assumed to be ordered.  For compatibility
     with S there is also a function `ordered'.

     `is.factor', `is.ordered', `as.factor' and `as.ordered'
     are the membership and coercion functions for these
     classes.

_U_s_a_g_e_:

     factor(x, levels = sort(unique(x), na.last = TRUE), labels,
            exclude = NA, ordered = FALSE)
     ordered(x, ...)

     is.factor(x)
     is.ordered(x)

     as.factor(x)
     as.ordered(x)

_A_r_g_u_m_e_n_t_s_:

       x: a vector of data, usually taking a small number of
          distinct values

  levels: an optional vector of the values that `x' might
          have taken. The default is the set of values taken
          by `x', sorted into increasing order.

  labels: either an optional vector of labels for the levels
          (in the same order as `levels' after removing
          those in `exclude'), or a character string of
          length 1.

 exclude: a vector of values to be excluded when forming the
          set of levels. This should be of the same type as
          `x', and will be coerced if necessary.

 ordered: logical flag to determine if the levels should be
          regarded as ordered (in the order given).

     ...: (in `ordered(.)'): any of the above, apart from
          `ordered' itself.

_D_e_t_a_i_l_s_:

     The type of the vector `x' is not restricted.

     Ordered factors differ from factors only in their
     class, but methods and the model-fitting functions
     treat the two classes quite differently.

     The encoding of the vector happens as follows. First
     all the values in `exclude' are removed from `levels'.
     If `x[i]' equals `levels[j]', then the `i'-th element
     of the result is `j'.  If no match is found for `x[i]'
     in `levels', then the `i'-th element of the result is
     set to `NA'.
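
     The encoding rule above can be sketched as follows.
     This is a minimal illustration with made-up data, not
     a description of the internal implementation; the
     names `x', `lev' and `f' are chosen for the example
     only.  The integer codes are compared with the result
     of `match', which finds the position of each element
     of `x' in `lev'.

```r
# Hypothetical data: "c" is not among the supplied levels.
x   <- c("b", "a", "c", "a")
lev <- c("a", "b")

f <- factor(x, levels = lev)

as.integer(f)   # position of each x[i] in lev; NA where there is no match
match(x, lev)   # the same positions, computed directly
levels(f)       # the levels attribute, stored as character strings
```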

     Normally the `levels' used as an attribute of the
     result are the reduced set of levels after removing
     those in `exclude', but this can be altered by supply-
     ing `labels'. This should either be a set of new labels
     for the levels, or a character string, in which case
     the levels are that character string with a sequence
     number appended.
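
     As an illustration of the two forms of `labels', the
     sketch below uses a small made-up numeric vector `y'.
     In the first call a full set of labels is supplied, one
     per level in sorted order; in the second a single
     string is supplied and sequence numbers are appended.

```r
# Hypothetical data with three distinct values (sorted levels: 1, 2, 3).
y <- c(3, 1, 3, 2)

# One label per level, in the same order as the sorted levels.
f1 <- factor(y, labels = c("low", "mid", "high"))
levels(f1)         # "low" "mid" "high"
as.character(f1)   # "high" "low" "high" "mid"

# A single string: the levels become that string plus a sequence number.
f2 <- factor(y, labels = "grp")
levels(f2)         # "grp1" "grp2" "grp3"
```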

     `factor(x)' applied to a factor is a no-operation
     unless there are unused levels: in that case, a factor
     with the reduced level set is returned. If `exclude' is
     used it should also be a factor with the same level set
     as `x' or a set of codes for the levels to be excluded.
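
     A short sketch of the reduction of unused levels,
     using made-up data.  Subsetting a factor keeps its
     full level set; passing the result through `factor'
     again keeps only the levels that still occur.

```r
f <- factor(c("a", "b", "c"))

# Subsetting keeps all three levels, although "c" no longer occurs.
g <- f[1:2]
levels(g)          # "a" "b" "c"

# Re-applying factor() drops the unused level.
h <- factor(g)
levels(h)          # "a" "b"
as.character(h)    # the values themselves are unchanged
```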

     The codes of a factor may contain `NA'. For a numeric
     `x', set `exclude=NULL' to make `NA' an extra level
     (`"NA"'), by default the last level.
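
     The effect of `exclude' on missing values can be
     sketched with a made-up numeric vector `z'.  How the
     extra level is printed and stored may differ between
     versions of R, so the example only inspects the number
     of levels and the integer codes.

```r
z <- c(1, NA, 2)

# Default: NA is excluded from the levels, so its code is NA.
f_default <- factor(z)
nlevels(f_default)     # 2
is.na(f_default)       # FALSE TRUE FALSE

# exclude = NULL: NA becomes an extra level, placed last.
f_na <- factor(z, exclude = NULL)
nlevels(f_na)          # 3
as.integer(f_na)       # 1 3 2
```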

_V_a_l_u_e_:

     `factor' returns an object of class `"factor"' which
     has a set of numeric codes the length of `x' with a
     `"levels"' attribute of mode `character'.  If `ordered'
     is true (or `ordered' is used) the result has class
     `c("ordered", "factor")'.

     `is.factor' returns `TRUE' or `FALSE' depending on
     whether its argument is of type factor or not.  Corre-
     spondingly, `is.ordered' returns `TRUE' when its argu-
     ment is ordered and `FALSE' otherwise.

     `as.factor' coerces its argument to a factor.  It is an
     abbreviated form of `factor'.

     `as.ordered(x)' returns `x' if this is ordered, and
     `ordered(x)' otherwise.
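
     The return values described above can be checked with
     a small sketch.  The data and the names `sizes' and
     `plain' are invented for the example; the comparison
     with `<' relies on the ordering of the supplied
     levels.

```r
# An ordered factor with an explicit level order.
sizes <- ordered(c("small", "large", "medium"),
                 levels = c("small", "medium", "large"))

class(sizes)           # "ordered" "factor"
is.factor(sizes)       # TRUE
is.ordered(sizes)      # TRUE
sizes[1] < sizes[2]    # TRUE: "small" precedes "large" in the level order

# A plain factor is not ordered until it is coerced.
plain <- factor(c("small", "large"))
is.ordered(plain)              # FALSE
is.ordered(as.ordered(plain))  # TRUE
```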

_W_a_r_n_i_n_g_:

     The interpretation of a factor depends on both the
     codes and the `"levels"' attribute. Be careful to
     compare only factors with the same set of levels (in
     the same order).  In particular, `as.numeric' applied
     to a factor returns the internal codes rather than the
     original values, which is rarely meaningful, and this
     may happen by implicit coercion.
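
     The pitfall with `as.numeric' can be seen in a short
     sketch using made-up data.  Converting through
     `as.character' first is one common way to recover the
     original numbers; it is shown here as an illustration,
     not as the only approach.

```r
# Numbers stored as a factor (for example, after reading text data).
f <- factor(c("10", "20", "10"))

as.numeric(f)                  # 1 2 1: the codes, not the values
as.numeric(as.character(f))    # 10 20 10: the values behind the labels
```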

     The levels of a factor are by default sorted, but the
     sort order may well depend on the locale at the time of
     creation, and should not be assumed to be ASCII.

_S_e_e _A_l_s_o_:

     `gl' for construction of ``balanced'' factors and `C'
     for factors with specified contrasts.  `levels' and
     `nlevels' for accessing the levels,  and `codes' to get
     integer codes.

_E_x_a_m_p_l_e_s_:

     ff <- factor(substring("statistics", 1:10, 1:10), levels=letters)
     ff
     codes(ff)
     factor(ff)  # drops the levels that do not occur
     factor(factor(letters[7:10])[2:3])  # exercise indexing and reduction
     factor(letters[1:20], labels="letter")  # levels "letter1", ..., "letter20"

     class(ordered(4:1))  # "ordered", inheriting from "factor"

