hihi, all - I thought about this same problem some years ago, and will try to recreate where I ended up.

The original problem was to estimate the size of a picture archive on the web from repetitions in a sequence of (presumed independent) experiments, each of which extracted (presumably uniformly at random) and presented one picture. The question was when to stop looking at the set, so I wanted a reasonable estimate of the total size N of the set.

Clearly, until there is a repeated element, there is no maximum likelihood estimate for the size N, since larger sets are more likely to produce n distinct selections, for any fixed N > n > 1. What was interesting to me is that as soon as there is one repeat, a maximum likelihood estimate for N can be made, and it turns out to be quadratic in the number n of selections made up to and including the repeated one (the expression is something like n*n/3, but the specific formula was slightly different for different parities of n, or maybe for different remainders of n (mod 3), I forget).

However, the likelihood curve is VERY flat near that maximum, so the confidence interval estimate is very wide. I expected that more selections with more repeats should help get a sharper estimate for N. I verified experimentally that more repeats do make the peak narrower, but I could not quantify the improvement enough to get an analytic expression.

I tried to find the problem statement in some published application, and the closest I could get was the population estimation problem in ecological sampling (capture-recapture).

more later,
cal

Chris Landauer
Aerospace Integration Science Center
The Aerospace Corporation
cal@aero.org
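
PS - since the likelihood is easy to write down, here is a minimal Python sketch (a reconstruction under the stated assumptions, not the original code; n = 40 is just an example value). With uniform sampling with replacement, seeing d distinct pictures in n draws has likelihood proportional to N*(N-1)*...*(N-d+1) / N^n, and the single-repeat case above is just d = n-1:

    import math

    def log_likelihood(N, n, d):
        # log-likelihood (up to an N-independent constant) of archive size N,
        # given d distinct pictures seen in n uniform draws with replacement:
        # L(N) proportional to N*(N-1)*...*(N-d+1) / N^n
        if N < d:
            return float("-inf")
        return sum(math.log(N - i) for i in range(d)) - n * math.log(N)

    def mle(n, d, upper):
        # brute-force integer search for the maximum-likelihood N
        return max(range(d, upper + 1), key=lambda N: log_likelihood(N, n, d))

    n = 40          # first repeat observed on the 40th selection (example value)
    d = n - 1       # so the first n-1 selections were distinct
    Nhat = mle(n, d, 10 * n * n)
    print("MLE for N:", Nhat)

    # how flat is the peak? compare the likelihood at half and double the MLE
    peak = log_likelihood(Nhat, n, d)
    for N in (Nhat // 2, 2 * Nhat):
        print("L(%d)/L(%d) = %.3f" % (N, Nhat, math.exp(log_likelihood(N, n, d) - peak)))

The printed ratios stay uncomfortably close to 1, which is the flatness I remembered; rerunning with a longer draw sequence (larger n, hence more repeats for the same N) pulls them down, which is the narrowing I saw experimentally but never managed to quantify.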