tapply                 package:base                 R Documentation

_A_p_p_l_y _a _F_u_n_c_t_i_o_n _O_v_e_r _a ``_R_a_g_g_e_d'' _A_r_r_a_y

_D_e_s_c_r_i_p_t_i_o_n:

     Apply a function to each cell of a ragged array, that is, to each
     (non-empty) group of values given by a unique combination of the
     levels of certain factors.

_U_s_a_g_e:

     tapply(X, INDEX, FUN = NULL, simplify = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

       X: an atomic object, typically a vector.

   INDEX: a list of factors, each of the same length as `X'.

     FUN: the function to be applied.  In the case of functions like
          `+', `%*%', etc., the function name must be quoted.  If `FUN'
          is `NULL', `tapply' returns an index vector that can be used
          to subscript the multi-way array `tapply' normally produces.

simplify: If `FALSE', `tapply' always returns an array of mode
          `"list"'.  If `TRUE' (the default), then if `FUN' always
          returns a scalar, `tapply' returns an array with the mode of
          the scalar.

     ...: optional arguments to `FUN'.
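
     The `FUN = NULL' case described above can be sketched as follows.
     This is only an illustration: the data and the names `x', `g',
     `idx', `sums' and `per_elem' are hypothetical and not part of
     `tapply' itself.

```r
# Hypothetical data: four values in two groups.
x <- c(10, 20, 30, 40)
g <- factor(c("a", "b", "a", "b"))

idx  <- tapply(x, g)        # FUN = NULL: one array index per element of x
sums <- tapply(x, g, sum)   # the array that tapply normally produces

# Subscripting the array with idx maps each element to its group's total.
per_elem <- sums[idx]
```

     Here `idx' holds, for each element of `x', the position of its
     cell in `sums', so `sums[idx]' has one entry per element of `x'.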

_V_a_l_u_e:

     When `FUN' is present, `tapply' calls `FUN' for each cell that has
     any data in it.  If `FUN' returns a single atomic value for each
     cell (e.g., `mean' or `var') and `simplify' is `TRUE', `tapply'
     returns a multi-way array containing the values.  The array has
     the same number of dimensions as `INDEX' has components; the
     extent of each dimension is the number of levels (`nlevels()') of
     the corresponding component of `INDEX'.
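
     As a small sketch of the dimension rule, assuming two made-up
     factors `f1' (three levels) and `f2' (two levels), the result
     should be a 3 x 2 array:

```r
# Hypothetical data: six values cross-classified by two factors.
x  <- 1:6
f1 <- factor(c("a", "a", "b", "b", "c", "c"))
f2 <- factor(c("u", "v", "u", "v", "u", "v"))

res <- tapply(x, list(f1, f2), sum)

# One dimension per component of INDEX, one slot per level.
dim(res)
c(nlevels(f1), nlevels(f2))
```

     Both of the last two expressions are expected to give the same
     pair of extents, and each cell can be addressed by its level
     names, e.g. `res["b", "v"]'.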

     Note that, contrary to S, `tapply' with `simplify = TRUE' always
     returns an array, possibly 1-dimensional.
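
     A minimal check of the 1-dimensional case, using made-up data and
     a single grouping factor (the name `res1' is illustrative only):

```r
# Hypothetical data: one grouping factor with two levels.
res1 <- tapply(c(1, 2, 3, 4), factor(c("x", "y", "x", "y")), mean)

is.array(res1)       # the result is an array, not a plain vector
length(dim(res1))    # with a single dimension
```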

     If `FUN' does not return a single atomic value, `tapply' returns
     an array of mode `list' whose components are the values of the
     individual calls to `FUN', i.e., the result is a list with a `dim'
     attribute.
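
     The list-mode result can be sketched with a function that returns
     more than one value per cell, and with `simplify = FALSE'.  The
     data and the names `x', `g', `r' and `r2' are hypothetical:

```r
# Hypothetical data: four values in two groups.
x <- c(5, 1, 9, 3)
g <- factor(c("a", "a", "b", "b"))

# range() returns two values per cell, so the result cannot be simplified.
r <- tapply(x, g, range)
mode(r)       # "list"
dim(r)        # a list with a dim attribute
r[["a"]]      # the value of the call to range() for group "a"

# simplify = FALSE gives a list-mode array even for scalar results.
r2 <- tapply(x, g, sum, simplify = FALSE)
```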

_S_e_e _A_l_s_o:

     The convenience function `aggregate' (which uses `tapply');
     `apply'; `lapply' and its variant `sapply'.

_E_x_a_m_p_l_e_s:

     ## 5 draws from a binomial with size 32 and success probability 0.4
     groups <- as.factor(rbinom(n = 5, size = 32, prob = 0.4))
     tapply(groups, groups, length) #- is almost the same as
     table(groups)

     data(warpbreaks)
     ## contingency table from data.frame : array with named dimnames
     tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
     tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

     n <- 17; fac <- factor(rep(1:3, length.out = n), levels = 1:5)
     table(fac)
     tapply(1:n, fac, sum)
     tapply(1:n, fac, sum, simplify = FALSE)
     tapply(1:n, fac, range)
     tapply(1:n, fac, quantile)

     ind <- list(c(1, 2, 2), c("A", "A", "B"))
     table(ind)
     tapply(1:3, ind) #-> the subscript vector (FUN = NULL)
     tapply(1:3, ind, sum)

