While we're at it, the "trimmed mean" is another way to combine the robustness of the median with the acuity of the mean. The k-trimmed mean throws out the k most extreme data points, k/2 from each end (if I remember correctly, so k should probably be even), then takes the mean of the rest. The pure mean is then equivalent to the 0-trimmed mean, and the pure median is equivalent to the (n-1)-trimmed mean when n is odd, or the (n-2)-trimmed mean when n is even (where n is the number of data points in the set).
Here's an interesting question: suppose we have data X_1, ..., X_n drawn from a Gaussian distribution with unknown mean mu and known variance 1. We wish to estimate mu with a guess muhat. Virtually everyone uses the sample mean of the dataset as an estimate of mu, but note that mu is also the *median* of the distribution. Under what circumstances would we be justified in preferring the sample median of the data to estimate mu? Since the sample average is a sufficient statistic, the answer might be never, but I'm not sure. Might it be the case that the sample median is preferable if we are using L1 loss, i.e., seeking to minimize E_mu |mu - muhat|?
Here is another question about the median: is there a median that makes sense in two or more dimensions? Suppose (X,Y) ~ f(x,y) where f(x,y) is the continuous joint pdf of the random variables X and Y. Is there a reasonable quantity to call the median?
-Joshua
On 9/29/05, Mike Speciner <speciner@ll.mit.edu> wrote:
So, if y(x) is the histogram, the median is the m such that
integral(x<m) y(x) = integral(x>m) y(x)
while the mean is the m such that
integral(x<m) |x-m|*y(x) = integral(x>m) |x-m|*y(x)
This suggests a whole family of averages (using various functions of (x-m) for the weighting), though what use they might have escapes me.
--ms
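The balance condition above can be tried out numerically. What follows is only a rough sketch, not anything from the thread: it assumes the histogram is a list of (left, right, height) bars, uses the weight |x-m|^p, and solves for m by bisection. The names side_weights and balance_point are made up for illustration. With p=0 it gives the median, with p=1 the mean; for other p the balance point is a stationary point of the integral of |x-m|^(p+1) * y(x), so these are the familiar L^(p+1) "centers".

```python
def side_weights(bars, m, p):
    """Integrals of |x-m|**p * y(x) over x<m and over x>m.

    bars is a list of (left, right, height) rectangles (an assumed
    representation of the histogram y(x)); closed forms are used per bar.
    """
    q = p + 1
    left = right = 0.0
    for a, b, h in bars:
        if a < m:   # part of the bar lying to the left of m
            left += h * ((m - a) ** q - (m - min(b, m)) ** q) / q
        if b > m:   # part of the bar lying to the right of m
            right += h * ((b - m) ** q - (max(a, m) - m) ** q) / q
    return left, right


def balance_point(bars, p, tol=1e-12):
    """Bisect for the m at which the two weighted sides are equal."""
    lo = min(a for a, _, _ in bars)
    hi = max(b for _, b, _ in bars)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        left, right = side_weights(bars, mid, p)
        if left < right:
            lo = mid    # too little weight on the left: move m right
        else:
            hi = mid
    return (lo + hi) / 2


# The two-rectangle histogram described further down in this thread.
bars = [(0.5, 1.5, 2.0), (1.5, 2.5, 1.0)]
print(round(balance_point(bars, 0), 6))  # 1.25      (median)
print(round(balance_point(bars, 1), 6))  # 1.333333  (mean)
```

Bisection works here because the left-side integral only grows and the right-side integral only shrinks as m moves right.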
David Gale wrote:
Jim, what is the Propp median if there are m zeros and m fives (and zero everything else)?
Dan, if you're going to bring in averages at all then why not go all the way and use THE average? But maybe the CDC was using some sort of hybrid like the one you suggest.
D
At 09:14 PM 9/28/2005, you wrote:
The picture was supposed to show a rectangle of width 1 and height 2 whose bottom is centered at x=1, and to the right of it, a rectangle of width 1 and height 1 whose bottom is centered at x=2.
The base of the first rectangle goes from x=1/2 to x=3/2, and the base of the second rectangle goes from x=3/2 to x=5/2.
The total area under the histogram is (1)(2)+(1)(1) = 3.
The area to the left of the line x=5/4 is (5/4-1/2)(2) = 3/2, which is half of the total area. So x=5/4 is the "median".
Jim
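The area computation in the message above can be checked mechanically. This is a minimal sketch, assuming the histogram is given as non-overlapping (left, right, height) bars sorted left to right; histogram_median is a made-up name, and exact fractions are used so the answer comes out as 5/4 rather than a float.

```python
from fractions import Fraction


def histogram_median(bars):
    """Return the x at which the area to the left is half the total area.

    bars: non-overlapping (left, right, height) rectangles, sorted by left
    edge (an assumed representation). If a zero-height gap straddles the
    halfway point, the left edge of that gap is returned.
    """
    total = sum((r - l) * h for l, r, h in bars)
    target = total / 2
    acc = 0
    for l, r, h in bars:
        area = (r - l) * h
        if h > 0 and acc + area >= target:
            # The halfway point falls inside this bar.
            return l + (target - acc) / h
        acc += area
    raise ValueError("histogram has zero total area")


# Height 2 on [1/2, 3/2], height 1 on [3/2, 5/2], as in the message above.
bars = [
    (Fraction(1, 2), Fraction(3, 2), 2),
    (Fraction(3, 2), Fraction(5, 2), 1),
]
print(histogram_median(bars))  # 5/4
```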
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun