This dataset consists of salaries (in hundreds of dollars) for university faculty (all ranks) at 1,161 institutions in the United States for the academic year 1993-1994. It is one of several datasets available at the StatLib archive.

There is no a priori reason to prefer any particular model for these data, but the apparent bimodality of the histogram suggests that a binary mixture of two Normal (Gaussian) distributions might be acceptable. This is, indeed, the case, although the Kolmogorov-Smirnov (K-S) statistic for the maximum-likelihood (ML) model [= 0.017317] falls at the 87th percentile of a parametric bootstrap (with 1,000 bootstrap samples) for this model and sample size. While this value is "acceptable," it is somewhat borderline and suggests that, were the sample much bigger, the fit would likely be marginal or even unacceptable.
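The parametric-bootstrap check described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used for the reported figures: it assumes B and D are standard deviations, substitutes a crude EM fit for the ML fit, refits the model to every bootstrap sample, and runs on synthetic data with hypothetical parameter values because the salary data are not reproduced here. All function names are invented for the example.

```python
import math
import random

SQRT2PI = math.sqrt(2.0 * math.pi)

def norm_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * SQRT2PI)

def norm_cdf(x, mu, sd):
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def mix_cdf(x, p, a, b, c, d):
    """CDF of p Normal(a,b) + (1 - p) Normal(c,d); b, d assumed to be std devs."""
    return p * norm_cdf(x, a, b) + (1.0 - p) * norm_cdf(x, c, d)

def sample_mix(n, params, rng):
    """Draw n variates from the mixture."""
    p, a, b, c, d = params
    return [rng.gauss(a, b) if rng.random() < p else rng.gauss(c, d)
            for _ in range(n)]

def fit_em(data, iters=40):
    """Crude EM fit of a two-component mixture (a stand-in for the ML fit)."""
    xs = sorted(data)
    n, half = len(xs), len(xs) // 2
    a = sum(xs[:half]) / half            # start: mean of the lower half
    c = sum(xs[half:]) / (n - half)      # start: mean of the upper half
    mean = sum(xs) / n
    sd0 = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    p, b, d = 0.5, sd0, sd0
    floor = 1e-3 * sd0                   # stops a component collapsing to a point
    for _ in range(iters):
        # E-step: responsibility of the first component for each point
        r = []
        for x in xs:
            w1 = p * norm_pdf(x, a, b)
            w2 = (1.0 - p) * norm_pdf(x, c, d)
            r.append(w1 / (w1 + w2) if w1 + w2 > 0.0 else 0.5)
        # M-step: weighted means, standard deviations, and mixing weight
        n1 = sum(r)
        n2 = n - n1
        if n1 < 1e-9 or n2 < 1e-9:
            break
        a = sum(ri * x for ri, x in zip(r, xs)) / n1
        c = sum((1.0 - ri) * x for ri, x in zip(r, xs)) / n2
        b = max(floor, math.sqrt(sum(ri * (x - a) ** 2 for ri, x in zip(r, xs)) / n1))
        d = max(floor, math.sqrt(sum((1.0 - ri) * (x - c) ** 2 for ri, x in zip(r, xs)) / n2))
        p = n1 / n
    return (p, a, b, c, d)

def ks_stat(data, params):
    """Largest gap between the empirical CDF and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    dmax = 0.0
    for i, x in enumerate(xs):
        f = mix_cdf(x, *params)
        dmax = max(dmax, (i + 1) / n - f, f - i / n)
    return dmax

def bootstrap_percentile(data, n_boot=1000, seed=0):
    """Percentile of the observed K-S statistic among bootstrap K-S statistics."""
    rng = random.Random(seed)
    fitted = fit_em(data)
    observed = ks_stat(data, fitted)
    boot = []
    for _ in range(n_boot):
        fake = sample_mix(len(data), fitted, rng)    # simulate from the fitted model
        boot.append(ks_stat(fake, fit_em(fake)))     # refit, then measure the gap
    return observed, 100.0 * sum(s < observed for s in boot) / n_boot

# Synthetic stand-in data; the parameter values are hypothetical.
demo = sample_mix(300, (0.6, 400.0, 60.0, 600.0, 90.0), random.Random(1))
ks_obs, pct = bootstrap_percentile(demo, n_boot=100)  # small n_boot to keep it quick
print(f"K-S = {ks_obs:.4f}, bootstrap percentile = {pct:.0f}")
```

A high percentile means the observed K-S statistic is larger than most of those produced by data that really do come from the fitted model, which is why a value such as the 87th percentile reads as acceptable but borderline.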

The usual interpretation of a binary mixture is that it is an undifferentiated composite of two populations. If that is a valid interpretation here, then it would be up to the analyst to identify and characterize the two populations.


Normal(A,B)&Normal(C,D) = p Normal(A,B) + (1 − p) Normal(C,D)

where p (0 ≤ p ≤ 1) is the mixing weight of the first component.
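As a minimal sketch of this mixture density, the following assumes that A and C are the component means and that B and D are standard deviations (not variances). The parameter values are hypothetical, chosen only to give two visible modes, and are not the fitted values for the salary data.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, p, a, b, c, d):
    """Density of p Normal(a,b) + (1 - p) Normal(c,d)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p * normal_pdf(x, a, b) + (1.0 - p) * normal_pdf(x, c, d)

# Hypothetical parameter values (p, A, B, C, D), for illustration only.
params = (0.6, 400.0, 50.0, 600.0, 80.0)

# Trapezoidal check that the mixture density integrates to about 1
# over a range wide enough to cover both components.
lo, hi, steps = 0.0, 1200.0, 12000
h = (hi - lo) / steps
area = h * (sum(mixture_pdf(lo + i * h, *params) for i in range(1, steps))
            + 0.5 * (mixture_pdf(lo, *params) + mixture_pdf(hi, *params)))
print(round(area, 6))
```

Because the two weights sum to one, the mixture is itself a proper density; setting p to 0 or 1 recovers a single Normal component.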