; This example illustrates the ubiquity of random variables.
; At the same time that I was finalizing this Examples folder,
; I was reading a novel from our local public library. Like
; most novels, the text was right-justified, in this case,
; with hyphens at the end of many lines. The number of hyphens
; per page, Y, is a random variable.
; There were N = 18 lines per page, on average, that were long enough
; to have a hyphen. If you make the reasonable assumption that
; the probability of a hyphen at the end of any line, p, is a constant
; and independent of other lines, then Y should be ~Binomial(p, 18).
; Is it?
; A classic rule-of-thumb says that if N > 49 and Np < 5, then the
; event in question qualifies as a "rare" event. If so, then the
; distribution might be ~Poisson(Np). The presence of a hyphen would
; not seem to qualify as rare. Still, there is more than one reason why
; a variable might be ~Poisson. Why not try it?
; Here is the observed distribution of hyphens per page:
@0 20
@1 77
@2 93
@3 83
@4 48
@5 32
@6 20
@7 3
@8 2
Comments are prefixed with a semicolon. The discrete data are grouped: each bin value is prefixed by @ and followed by the bin frequency. The density function shown is a Poisson distribution. Like all examples shown here, this model was found to be "acceptable" (probability > 0.1), based on the worse of two measured goodness-of-fit metrics: maximum likelihood (ML), which was also the user-selected optimization criterion, and chi-squared.
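To make the file format and the Poisson fit concrete, here is a minimal Python sketch. It is not the Regress+ implementation. The helper names (`parse_grouped`, `poisson_pmf`) are hypothetical, the parser assumes comments occupy whole lines, and pooling the last bin as "8 or more" is an assumption made so that the expected frequencies sum to the number of pages. The sketch parses the grouped data, takes the sample mean as the ML estimate of the Poisson parameter, and computes a Pearson chi-squared statistic.

```python
import math

# The example file, abbreviated: semicolon comments, then "@value frequency" pairs.
RAW = """\
; Here is the observed distribution of hyphens per page:
@0 20
@1 77
@2 93
@3 83
@4 48
@5 32
@6 20
@7 3
@8 2
"""


def parse_grouped(text):
    """Return {bin value: frequency}, skipping lines that start with ';'."""
    tokens = []
    for line in text.splitlines():
        if line.lstrip().startswith(";"):
            continue  # whole-line comment (an assumption about the format)
        tokens.extend(line.split())
    counts = {}
    for value_tok, freq_tok in zip(tokens[0::2], tokens[1::2]):
        if not value_tok.startswith("@"):
            raise ValueError(f"expected '@value', got {value_tok!r}")
        counts[int(value_tok[1:])] = int(freq_tok)
    return counts


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


counts = parse_grouped(RAW)
n_pages = sum(counts.values())

# For a Poisson model, the ML estimate of the mean is the sample mean.
lam_hat = sum(k * f for k, f in counts.items()) / n_pages

# Pearson chi-squared, with the last bin pooled as "max_k or more".
max_k = max(counts)
chi2 = 0.0
for k in range(max_k):
    expected = n_pages * poisson_pmf(k, lam_hat)
    chi2 += (counts.get(k, 0) - expected) ** 2 / expected
tail_prob = 1.0 - sum(poisson_pmf(k, lam_hat) for k in range(max_k))
expected_tail = n_pages * tail_prob
chi2 += (counts[max_k] - expected_tail) ** 2 / expected_tail

print(f"pages = {n_pages}, lambda_hat = {lam_hat:.4f}, chi2 = {chi2:.3f}")
```

For these data the sketch finds 378 pages and a mean of about 2.70 hyphens per page. Converting the chi-squared statistic to a probability requires a reference distribution or a bootstrap, which is left out here.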
The ML Binomial model, pictured below, is deemed to be "unacceptable" (probability < 0.05) but not "very unacceptable" (probability < 0.01), based on a parametric bootstrap with 1,000 bootstrap samples synthesized from the optimum (ML) model.
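The parametric bootstrap described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure Regress+ actually runs. It assumes the goodness-of-fit statistic is the mean negative log-likelihood of each sample under its own refitted Binomial model, and that the probability is the fraction of bootstrap samples fitting at least as badly as the observed data. All function names are hypothetical.

```python
import math
import random

N_LINES = 18  # lines per page long enough to carry a hyphen (from the example)
COUNTS = {0: 20, 1: 77, 2: 93, 3: 83, 4: 48, 5: 32, 6: 20, 7: 3, 8: 2}


def binom_logpmf(k, n, p):
    """Log-probability of k successes in n trials with success probability p."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1.0 - p)


def fit_p(counts, n):
    """ML estimate of p for a Binomial with known n: sample mean divided by n."""
    total = sum(counts.values())
    mean = sum(k * f for k, f in counts.items()) / total
    # Clamp away from 0 and 1 so the log-likelihood stays finite.
    return min(max(mean / n, 1e-12), 1.0 - 1e-12)


def neg_mean_loglik(counts, n, p):
    """Mean negative log-likelihood per observation; larger means a worse fit."""
    total = sum(counts.values())
    return -sum(f * binom_logpmf(k, n, p) for k, f in counts.items()) / total


def bootstrap_p_value(counts, n, n_boot=1000, seed=0):
    """Fraction of synthetic samples that fit at least as badly as the data."""
    rng = random.Random(seed)
    size = sum(counts.values())
    p_hat = fit_p(counts, n)
    observed = neg_mean_loglik(counts, n, p_hat)

    support = list(range(n + 1))
    weights = [math.exp(binom_logpmf(k, n, p_hat)) for k in support]

    worse = 0
    for _ in range(n_boot):
        # Synthesize one sample from the fitted model, then refit it.
        sample = rng.choices(support, weights=weights, k=size)
        boot = {}
        for y in sample:
            boot[y] = boot.get(y, 0) + 1
        if neg_mean_loglik(boot, n, fit_p(boot, n)) >= observed:
            worse += 1
    return worse / n_boot


p_value = bootstrap_p_value(COUNTS, N_LINES)
print(f"p_hat = {fit_p(COUNTS, N_LINES):.4f}, bootstrap probability = {p_value:.3f}")
```

Refitting each synthetic sample before scoring it matters: it keeps the bootstrap statistic comparable to the observed one, which was also evaluated at its own optimum. The result depends on the chosen statistic and seed, so it need not reproduce the probability reported by the package.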
This dataset is one of the examples included in the Regress+ software package.