3 Aug 2004, 2:32 p.m.
Thomas Colthurst wrote:
This appears to be a variant of the coupon collector's problem, as discussed in http://www.math.uci.edu/~mfinkels/COUPON.PDF . They give a maximum likelihood estimate of N as the smallest integer j >= C_k satisfying
\[ \frac{j+1}{j+1-C_k} \left( \frac{j}{j+1} \right)^k < 1 \]
where C_k is the number of distinct things you saw in your k samples.
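For anyone who wants to try the criterion numerically, here is a minimal sketch that searches for the smallest such j. It assumes the k samples are drawn uniformly with replacement; the function name `mle_population_size` is made up for illustration and does not come from the paper. The inequality is cross-multiplied so that everything stays in exact integer arithmetic. Note that when C_k = k (every sample distinct) the inequality is never satisfied, so the sketch raises an error instead of looping forever.

```python
def mle_population_size(k: int, c_k: int) -> int:
    """Smallest integer j >= c_k with (j+1)/(j+1-c_k) * (j/(j+1))**k < 1.

    k   -- number of samples drawn (assumed uniform, with replacement)
    c_k -- number of distinct items seen among those k samples
    """
    if not 1 <= c_k <= k:
        raise ValueError("need 1 <= c_k <= k")
    if c_k == k:
        # With no repeats the inequality never holds for any j,
        # so there is no finite answer to return.
        raise ValueError("no finite estimate when every sample is distinct")

    j = c_k
    # Cross-multiplied form of the inequality; both sides are exact integers
    # and j + 1 - c_k is always positive because j >= c_k.
    while (j + 1) * j**k >= (j + 1 - c_k) * (j + 1) ** k:
        j += 1
    return j


if __name__ == "__main__":
    # 10 samples, 5 distinct values seen
    print(mle_population_size(10, 5))
```

With few distinct values relative to the number of samples, the estimate stays at C_k itself; as C_k approaches k, the estimate grows quickly.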
Thanks for the pointer; I'll read through the paper. Both in their maximum likelihood answer above, and in the paper in general, they seem to use only the total number of distinct things seen, and not the distribution of multiplicities with which you saw them. I wonder whether allowing the use of that extra data helps at all.
--Michael Kleber