

emclust(mclust)                              R Documentation

_B_I_C _f_r_o_m _h_i_e_r_a_r_c_h_i_c_a_l _c_l_u_s_t_e_r_i_n_g _f_o_l_l_o_w_e_d _b_y _E_M _f_o_r _s_e_v_e_r_a_l
_p_a_r_a_m_e_t_e_r_i_z_e_d _G_a_u_s_s_i_a_n _m_i_x_t_u_r_e _m_o_d_e_l_s_.

_U_s_a_g_e_:

     emclust(data, nclus, modelid, k, equal=F, noise, Vinv)

_A_r_g_u_m_e_n_t_s_:

    data: matrix of observations.

   nclus: An integer vector specifying the numbers of clus-
          ters for which the BIC is to be calculated.
          Default: 1:9 without noise; 0:9 with noise.

 modelid: A vector of character strings indicating the models
          to be fitted.  The allowed values for `modelid' and
          their interpretations are as follows:

            `"EI"'  : uniform spherical
            `"VI"'  : spherical
            `"EEE"' : uniform variance
            `"VVV"' : unconstrained variance
            `"EEV"' : uniform shape and volume
            `"VEV"' : uniform shape

          The default is to fit all of the models.
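
          The descriptions above refer to the volume and shape
          of the cluster covariance matrices. As a minimal
          sketch, assuming the eigenvalue decomposition
          Sigma = lambda * D A t(D) commonly used in model-based
          clustering (volume lambda, shape A, orientation D),
          the base R code below splits a covariance matrix into
          these parts. The helper `decompose_cov' is
          hypothetical and is not part of `emclust'.

```r
# Hypothetical helper (not part of mclust): split a covariance matrix
# into volume, shape and orientation, Sigma = lambda * D %*% A %*% t(D).
decompose_cov <- function(Sigma) {
  e <- eigen(Sigma, symmetric = TRUE)
  d <- nrow(Sigma)
  lambda <- prod(e$values)^(1 / d)   # volume factor: det(Sigma)^(1/d)
  list(volume = lambda,
       shape = e$values / lambda,    # diagonal of A; its product is 1
       orientation = e$vectors)      # columns of D (orthogonal)
}

Sigma <- cov(as.matrix(iris[, 1:4]))
parts <- decompose_cov(Sigma)

# Rebuild Sigma from the three components to check the decomposition.
rebuilt <- parts$volume *
  parts$orientation %*% diag(parts$shape) %*% t(parts$orientation)
```

          Under this reading, a "uniform shape" model would
          share `shape' across clusters while letting the other
          components vary.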

       k: If `k' is specified, the initial hierarchical
          clustering phase uses a sample of size `k' drawn from
          the data. The default is to use the entire data set.
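
          How `emclust' draws its sample is not stated here. As
          an illustration only, assuming a simple random sample
          without replacement, a subsample of size `k' could be
          formed in base R as follows (the value of `k' and the
          name `x_sub' are hypothetical).

```r
set.seed(1)                        # for a reproducible sample
x <- as.matrix(iris[, 1:4])
k <- 50                            # hypothetical sample size
idx <- sample(nrow(x), k)          # k distinct row indices
x_sub <- x[idx, , drop = FALSE]    # rows a hierarchical phase would see
```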

   equal: Logical variable indicating whether or not the
          mixing proportions are equal in the model. The
          default is to assume they are unequal.

   noise: A logical vector of length equal to the number of
          observations in the data, whose elements give an
          initial estimate of which observations are noise
          (indicated by `T'). By default, `emclust' fits
          Gaussian mixture models in which it is assumed there
          is no noise. If `noise' is specified, `emclust' fits
          a Gaussian mixture with a Poisson term for noise in
          the EM phase.
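
          This page does not say how the initial noise estimate
          should be obtained. One possible sketch, assuming a
          simple outlier heuristic based on Mahalanobis distance
          (an assumption made here for illustration, not a
          method prescribed by `emclust'), builds a logical
          vector of the required length:

```r
x <- as.matrix(iris[, 1:4])

# Squared Mahalanobis distance of each observation from the overall mean.
d2 <- mahalanobis(x, colMeans(x), cov(x))

# Heuristic cutoff (an assumption): flag points beyond the 97.5% quantile
# of a chi-squared distribution with ncol(x) degrees of freedom.
noise <- d2 > qchisq(0.975, df = ncol(x))   # TRUE marks suspected noise
```

          The resulting `noise' vector has one element per row
          of the data, as the argument requires.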

    Vinv: An estimate of the inverse hypervolume of the data
          region (needed only if `noise' is specified).
          Default: determined by the function `hypvol'.
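
          What `hypvol' computes exactly is not described here.
          As a rough sketch, assuming the data region is
          approximated by the axis-aligned bounding box of the
          observations, an inverse hypervolume could be
          estimated in base R like this:

```r
x <- as.matrix(iris[, 1:4])

# Side length of the bounding box along each variable.
sides <- apply(x, 2, function(col) diff(range(col)))

volume <- prod(sides)   # hypervolume of the bounding box (an assumption)
Vinv <- 1 / volume      # value that could be supplied as `Vinv'
```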

_V_a_l_u_e_:

     Bayesian Information Criterion for the six mixture mod-
     els and specified numbers of clusters. Auxiliary infor-
     mation returned as attributes.
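
     To make the quantity concrete, the sketch below computes a
     BIC by hand for the simplest case: one cluster with an
     unconstrained covariance matrix. It assumes the convention
     BIC = 2 log L - p log n, under which larger values indicate
     a better model; the sign convention used by `emclust' is
     not stated on this page, and the names `npar' and `bic' are
     illustrative.

```r
x <- as.matrix(iris[, 1:4])
n <- nrow(x)
d <- ncol(x)

mu <- colMeans(x)
Sigma <- cov(x) * (n - 1) / n   # maximum-likelihood covariance estimate

# Maximized log-likelihood of a single multivariate Gaussian.
loglik <- -0.5 * n * (d * log(2 * pi) + log(det(Sigma)) + d)

npar <- d + d * (d + 1) / 2     # free parameters: mean plus covariance
bic <- 2 * loglik - npar * log(n)   # assumed convention, see text above
```

     For mixtures with several clusters, the log-likelihood would
     come from the EM fit and `npar' would depend on the chosen
     model.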

_D_E_S_C_R_I_P_T_I_O_N_:

     Bayesian Information Criterion for various models and
     numbers of clusters computed from hierarchical cluster-
     ing followed by EM for several parameterizations of
     Gaussian mixture models possibly with Poisson noise.

_N_O_T_E_:

     The hierarchical clustering phase uses the uncon-
     strained model.  The reciprocal condition estimate
     returned as an attribute ranges in value between 0 and
     1. The closer this estimate is to zero, the more likely
     it is that the corresponding EM result (and BIC) are
     contaminated by roundoff error.
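
     The behaviour of a reciprocal condition estimate can be
     seen with the base R function `rcond'. The sketch below
     assumes the estimate refers to a covariance matrix (this
     page does not say which matrix `emclust' uses) and compares
     well-conditioned data with nearly collinear data.

```r
set.seed(1)
a <- rnorm(100)

x_ok  <- cbind(a, rnorm(100))             # two unrelated variables
x_bad <- cbind(a, a + 1e-6 * rnorm(100))  # second column nearly copies the first

rc_ok  <- rcond(cov(x_ok))    # well away from zero
rc_bad <- rcond(cov(x_bad))   # very close to zero: results are suspect
```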

_R_e_f_e_r_e_n_c_e_s_:

     C. Fraley and A. E. Raftery, How many clusters? Which
     clustering method?  Answers via model-based cluster
     analysis. Technical Report No. 329, Dept. of Statis-
     tics, U. of Washington (February 1998).

     R. Kass and A. E. Raftery, Bayes Factors. Journal of
     the American Statistical Association90:773-795 (1995).

_S_e_e _A_l_s_o_:

     `summary.emclust', `emclust1', `mhtree', `me'

_E_x_a_m_p_l_e_s_:

     data(iris)
     bicvals _ emclust(iris[,1:4], nclus=1:3, modelid=c("VVV","EEV","VEV"))

