

emclust1(mclust)                             R Documentation

_B_I_C _f_r_o_m _h_i_e_r_a_r_c_h_i_c_a_l _c_l_u_s_t_e_r_i_n_g _f_o_l_l_o_w_e_d _b_y _E_M _f_o_r _a _p_a_r_a_m_-
_e_t_e_r_i_z_e_d _G_a_u_s_s_i_a_n _m_i_x_t_u_r_e _m_o_d_e_l_.

_U_s_a_g_e_:

     emclust1(data, nclus, modelid, equal=F, noise, Vinv)

_A_r_g_u_m_e_n_t_s_:

    data: matrix of observations.

   nclus: An integer vector specifying the numbers of
          clusters for which the BIC is to be calculated.
          Default: 1:9 without noise; 0:9 with noise.

 modelid: A character string or a vector of two character
          strings specifying the model(s) to be used in the
          hierarchical clustering and EM phases of the BIC
          calculations. The allowed values of `modelid' and
          their interpretations are as follows:
          `"EI"'  : uniform spherical,
          `"VI"'  : spherical,
          `"EEE"' : uniform variance,
          `"VVV"' : unconstrained variance,
          `"EEV"' : uniform shape and volume,
          `"VEV"' : uniform shape.
          Default: `c("VVV","VVV")' (unconstrained variance
          for both phases).

       k: If `k' is specified, the hierarchical clustering
          phase is run on a sample of size `k' drawn from
          the data. The default is to use the entire data
          set.

   equal: Logical variable indicating whether or not the
          mixing proportions are equal in the model. The
          default is to assume they are unequal.

   noise: A logical vector of length equal to the number of
          observations in the data, whose elements indicate
          an initial estimate of noise (indicated by `T') in
          the data. By default, `emclust1' fits Gaussian
          mixture models in which it is assumed there is no
          noise. If `noise' is specified, `emclust1' will
          fit a Gaussian mixture with a Poisson term for
          noise in the EM phase.

    Vinv: An estimate of the inverse hypervolume of the data
          region (needed only if `noise' is specified).
          Default: determined by the function `hypvol'.
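
     The `noise' and `Vinv' arguments could be prepared along the
     following lines. This is a minimal sketch in base R, not the
     package's own procedure: the distance-based rule for flagging
     noise and the bounding-box volume are assumptions made for
     illustration, and `hypvol' may compute the hypervolume
     differently.

```r
# Numeric matrix of observations (the four measurement columns of iris).
x <- as.matrix(iris[, 1:4])

# Initial noise guess (assumed rule): flag the observations whose squared
# Mahalanobis distance from the overall mean exceeds the 95% quantile.
d2 <- mahalanobis(x, colMeans(x), cov(x))
noise <- d2 > quantile(d2, 0.95)

# Inverse hypervolume of the axis-aligned bounding box of the data
# (an assumed stand-in for the default computed by hypvol).
side.lengths <- apply(x, 2, function(col) diff(range(col)))
Vinv <- 1 / prod(side.lengths)
```

     The resulting `noise' vector has one logical element per row of
     the data, as the argument description requires.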

_V_a_l_u_e_:

     Bayesian Information Criterion for the selected mixture
     model and the specified numbers of clusters. Auxiliary
     information is returned as attributes.

_D_E_S_C_R_I_P_T_I_O_N_:

     Bayesian Information Criterion for various numbers of
     clusters, computed from hierarchical clustering followed
     by EM for a selected parameterization of Gaussian mixture
     models, possibly with Poisson noise.
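
     As an illustration of the quantity being computed, the sketch
     below evaluates a BIC for the simplest case, a single Gaussian
     component with unconstrained covariance, using base R only. It
     assumes the convention BIC = 2 * loglik - npar * log(n), under
     which larger values are preferred; the value `emclust1' reports
     for one cluster should be comparable but is not guaranteed to
     match.

```r
x <- as.matrix(iris[, 1:4])
n <- nrow(x)
d <- ncol(x)

# Maximum likelihood estimates for a single Gaussian component.
mu <- colMeans(x)
Sigma <- cov(x) * (n - 1) / n   # ML estimate divides by n, not n - 1

# Maximized log-likelihood; at the ML estimate the quadratic forms
# (x - mu)' Sigma^-1 (x - mu) sum to n * d over all observations.
logdet <- as.numeric(determinant(Sigma)$modulus)
loglik <- -0.5 * n * (d * log(2 * pi) + logdet + d)

# Free parameters: d means plus d * (d + 1) / 2 covariance entries.
npar <- d + d * (d + 1) / 2

# BIC under the assumed convention (larger is better).
bic <- 2 * loglik - npar * log(n)
```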

_N_O_T_E_:

     The reciprocal condition estimate returned as an
     attribute ranges in value between 0 and 1. The closer
     this estimate is to zero, the more likely it is that
     the corresponding EM result (and BIC) are contaminated
     by roundoff error.
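
     The behaviour of a reciprocal condition estimate can be seen
     with base R's `rcond'. The sketch below is only an illustration
     on hand-built covariance matrices; how `emclust1' computes its
     own estimate is not shown here.

```r
# A well-conditioned covariance matrix (the identity): the estimate
# sits at the upper end of the 0 to 1 range.
good <- diag(2)
rc.good <- rcond(good)

# A nearly singular covariance matrix (two almost perfectly correlated
# variables): the estimate is close to 0, so results that depend on
# its inverse are sensitive to roundoff error.
bad <- matrix(c(1, 1 - 1e-12, 1 - 1e-12, 1), nrow = 2)
rc.bad <- rcond(bad)
```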

_R_e_f_e_r_e_n_c_e_s_:

     C. Fraley and A. E. Raftery, How many clusters? Which
     clustering method? Answers via model-based cluster
     analysis. Technical Report No. 329, Dept. of Statistics,
     U. of Washington (February 1998).

     R. Kass and A. E. Raftery, Bayes Factors. Journal of
     the American Statistical Association 90:773-795 (1995).

_S_e_e _A_l_s_o_:

     `summary.emclust1', `emclust', `mhtree', `me'

_E_x_a_m_p_l_e_s_:

     data(iris)
     emclust1(iris[,1:4], nclus=2:3, modelid = c("VVV","EEV"))

