[math-fun] Scientific Method, Experiments and Causality
I'm not an expert in statistical analysis, and I'm having a hard time reconciling all of the features of the modern scientific method. In particular, the usual process goes something like the following:

1. A scientist observes some phenomena and detects some correlations between observations of type A and observations of type B.
2. The scientist *hypothesizes* some causality among the observations.
3. The scientist designs some *experiment* to try to determine causality.
4. But since the scientist already has preconceived notions about the causality, he/she is not the appropriate person to *perform* the experiment; better to *double blind* the study and have someone *completely ignorant of the experimental design* perform the experiment on subjects (in the case of animate subjects) who are also *completely ignorant of the experimental design*.
5. The data from the experiment can be analyzed by yet another party who is *completely ignorant of the experimental design*, so that his/her biases cannot affect the analysis.

In a perfect, causal world, such a proper experiment should show causality if and only if the causality exists. In particular, a "proper" experiment should have N large enough that the probabilities of false positives and false negatives are unbelievably small.

----- Here's my problem:

Scientists have been accused of *fitting to the facts* -- i.e., coming up with hypotheses *after the experiment* that match the experimental results. Furthermore, some have recommended that all such "a posteriori" papers be firmly rejected as scientific fraud.

My question is: "how can our universe possibly tell whether the hypothesis was suggested before or after the experiment?"

In a classically causal universe, the timing of the hypothesis and the timing of the experiment should make no difference, because the mental state of the scientist can't possibly affect the results of the experiment.

If an experiment is indeed performed completely blind by disinterested third parties, why should anyone care how or *when* the hypothesis was obtained?
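A rough numerical sketch of the sample-size remark above ("N large enough that the probabilities of false positives and false negatives are unbelievably small"), assuming a two-sided, two-proportion z-test; the effect size (40% vs. 50%) and the 0.1% error rates are only illustrative choices, not anything specified in the post:

from math import sqrt, ceil
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.001, beta=0.001):
    """Approximate N per arm for a two-sided, two-proportion z-test."""
    z_a = norm.ppf(1 - alpha / 2)          # controls false positives
    z_b = norm.ppf(1 - beta)               # controls false negatives
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Detecting a 40% vs. 50% response rate with both error rates held to 0.1%:
print(n_per_group(0.4, 0.5))               # roughly 2000 subjects per arm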
Are the fraud suggestions a result of the increasing awareness that one shouldn't report common statistical "evidence" for a post-hoc theory? There's a growing hatred of the p-value in soft-science publications, since it's meaningless or fraudulent to report one for a hypothesis generated after an experiment. I recall an anecdote about a psychology student asking a mathematician to calculate the probability of a result after it had happened.

On the other hand, I don't know of any conditions where simply hypothesizing after an experiment would cause problems -- in some fields it's almost required to present an (often post-hoc) theory along with the data. (E.g., chemists grow a novel sample, physicists measure some of its properties, they design or select a microscopic model that matches the data, and then they publish.)

On Sun, Jun 24, 2018 at 2:01 PM Henry Baker <hbaker1@pipeline.com> wrote:
There's nothing wrong with hypothesizing after the experiment; that's exploratory. What's wrong is to apply a statistical analysis based on an experimental design that assumes the hypothesis and confidence bounds were fixed before the experiment. This is called p-hacking: searching post hoc for some result with a p-value < 0.20 so it will meet the (very low) standard for publication in some journals.

Brent

On 6/24/2018 12:12 PM, James Davis wrote:
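A minimal simulation of the p-hacking effect described above, assuming 100 independent null experiments (two-sample t-tests with no true effect) and a 0.05 threshold; both numbers are illustrative:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def one_null_experiment(n=30):
    # Two groups drawn from the same distribution: there is no real effect.
    a = rng.normal(size=n)
    b = rng.normal(size=n)
    return stats.ttest_ind(a, b).pvalue

trials, hits = 1000, 0
for _ in range(trials):
    pvals = [one_null_experiment() for _ in range(100)]
    if min(pvals) < 0.05:                  # report only the "best" experiment
        hits += 1

print("fraction of trials with at least one p < 0.05:", hits / trials)
# Expect about 1 - 0.95**100 ~ 0.994: cherry-picking makes false positives routine.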
There are two reasons your concerns are mitigated.

First, you assume that causality is to be demonstrated experimentally purely by statistical correlation. That's common in, for example, medical trials (where double-blind tests are standard). But in the hard sciences it is more likely done by direct intervention in a well-controlled and isolated system, so that only one variable is changed at a time. No statistical analysis is needed.

Second, where statistical analysis is needed, most researchers consider Bayesian analysis to be more relevant than Neyman-Pearson experimental designs, and Bayesian analysis is independent of when or by whom a hypothesis is formed. But a Bayesian analysis requires quantifying prior probabilities. If the experimenter reports that he tested 20 different possible causes but just happened to assign a prior of 0.5, instead of a uniform 0.05, to the one that seemed to work, his bias would be obvious. There are also post-hoc correction factors for frequentist statistics, collectively called Bonferroni corrections. Failure to use them is a sign of poor, if not fraudulent, analysis.

Of course personal bias can distort any scheme (cf. Blondlot and N-rays, which also illustrates my first point); that is why a detailed description of the experiment is required, to allow replication by independent experimenters. That's why the Large Hadron Collider has two independent detectors, ATLAS and CMS, designed by different groups and operating on different principles, studying the same physics.

Brent

On 6/24/2018 10:59 AM, Henry Baker wrote:
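A minimal sketch of the Bonferroni correction mentioned above; the 20-test setup echoes the priors example, and the particular p-values are illustrative only:

def bonferroni(pvalues, alpha=0.05):
    # Compare each p-value to alpha/m (equivalently, multiply each p-value by m),
    # so the chance of even one false positive across all m tests stays below alpha.
    m = len(pvalues)
    adjusted = [min(1.0, p * m) for p in pvalues]
    return adjusted, [p_adj < alpha for p_adj in adjusted]

# 20 candidate causes, one of which looks good at the naive 0.05 level:
pvals = [0.03] + [0.4] * 19
adjusted, rejected = bonferroni(pvals)
print(adjusted[0], rejected[0])            # 0.6, False: not significant once corrected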
https://plus.google.com/+DanPiponi/posts/dcGDyMgDtJ9

Researcher: I've just completed this amazing experiment that's given me results significant at the p<0.01 level. Can I publish it in your journal?
Journal: Did you pre-register?
R: No? Why do I need to do that?
J: For all I know you carried out 100 experiments and just picked the best result. In that case you'd likely get p<0.01 just by chance.
R: But I didn't do that.
J: I don't know that. My readers don't know that. So you need to pre-register any experiment with us first. If you pre-register 100 experiments but only write up one, my readers will know you've just been trawling for significant results.
R: Oh, OK. I have another great experiment called A coming up and I'll pre-register that.
J: Can I help you with anything else?
R: Well, we've developed this new piece of hardware to automate experiments, and we have a string of 100 statistically independent experiments B1-B100 that we want to run on it.
J: Sure. Just register 100 experiments on our web site.
R: No way. If experiment A shows a significant result, but I then register 100 more experiments, people will think that A is just the result of trawling.
J: Well, them's the rules. Register or go unpublished.
R: OK, I have an idea. I'll batch B1-B100 together as one experiment. If the individual experiments have p-values p1, ..., p100, I'll compute a "meta" p-value q = min(p1/0.01, p2/0.01, ..., p100/0.01). For small enough x, P(q<x) is around x, so I'll treat q like an ordinary p-value. If it's significant, I'll write a paper about the individual underlying pi that made it significant.
J: Um... well, this is a bit irregular, but I have to admit it follows the letter of the rules, so I'll allow that.
R: But even this is going to dilute experiment A by a factor of two. I really care about A, whereas B1-B100 are highly speculative. Your journal policy is going to tend to freeze speculative research.
J: I'm sorry you feel that way.
R: I have another idea. I'll batch everything together. If p0 is the significance level of experiment A, let me construct another meta p-value q = min(p0/0.9, p1/0.001, p2/0.001, ..., p100/0.001). I'll pre-register just one meta-experiment based on this single q-value. If I get q<0.01, we know that something significant happened even though I performed 101 experiments. And now experiments B1-B100 only have a small dilution effect on A.
J: Um... I guess I have to accept that.
R: In fact, why don't you just give me a budget of 1 unit? I'll share out this unit between any experiments I like, as I see fit: I'll choose weights wi with w1+...+wn=1, and then incrementally pre-register the individual parts of my one meta-experiment. For each individual experiment, instead of the usual p-value pi, I'll use the modified p-value pi/wi to judge significance. [1]
J: OK.
R: Even better, why don't you just give me some "currency" to spend as modifiers to my p-values? I'll pre-register all of my experiments, but with this proviso: for each one, you publish the amount I spent on it and I'll divide the p-value by what I spent on it. So even though I appear to have many experiments, people can see which ones really were just me trawling and which ones are significant.
J: This will mean rewriting the policy, but it seems like a good scheme. You can do as many experiments as you like, even using combinatorial methods to trawl millions of possibilities. As long as you pay some of your budget for each experiment, and don't go over your budget, we can easily judge the significance of your work. And you get to choose which experiments you think are more important.

Question: Is this a reasonable policy for a journal?

[1] Suppose w1+...+wn=1 with each wi>0, define q = min_i(pi/wi), and suppose each pi is uniform on [0,1]. Then
P(q<x) = P(p1/w1<x or ... or pn/wn<x) = 1 - (1-P(p1<w1 x))...(1-P(pn<wn x)),
which is approximately w1 x + ... + wn x = x for small x. So we can treat q like an ordinary p-value for small enough x.

On Sun, Jun 24, 2018 at 11:59 AM, Henry Baker <hbaker1@pipeline.com> wrote:
-- Mike Stay - metaweta@gmail.com http://www.math.ucr.edu/~mike http://reperiendi.wordpress.com
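A quick numerical check of footnote [1], using the weighting from the dialogue (0.9 for A, 0.001 for each of B1-B100); the simulation parameters are otherwise arbitrary:

import numpy as np

rng = np.random.default_rng(1)
w = np.array([0.9] + [0.001] * 100)        # weights sum to 1, as in the dialogue
trials = 100_000
p = rng.uniform(size=(trials, w.size))     # every experiment is null: p uniform on [0,1]
q = (p / w).min(axis=1)                    # the meta p-value q = min_i(p_i / w_i)

for x in (0.01, 0.05):
    print(x, (q < x).mean())               # empirical P(q < x) comes out close to, and below, x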
It's cryptographically possible to prevent p-hacking, provided the experimenter does not have access to the (unencrypted) data on which the experiment is being performed. Specifically:

(a) You 'pre-register' the experiment not with a journal, but with a fully-homomorphic-encrypted virtual machine.
(b) If your experiment has a probability threshold of p, then you need to pay p bitcoins to an invalid address (to 'destroy' them), where the address is a function of the hash of the algorithm you intend to run.
(c) The VM takes the proof of payment and the algorithm, runs it on the encrypted data, and returns a (digitally signed by the VM) certificate saying either 'true' or 'false'.

-- APG.
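The homomorphic-encryption VM and the bitcoin payment are beyond a short sketch, but the commit-to-the-algorithm idea in (a)-(b) can be illustrated with a plain hash; the helper functions and field names below are hypothetical:

import hashlib, json, time

def preregister(analysis_source: str, p_threshold: float) -> dict:
    # Commit to the exact analysis code before seeing any data.
    digest = hashlib.sha256(analysis_source.encode()).hexdigest()
    return {"algorithm_hash": digest,      # the burn address would be derived from this
            "p_threshold": p_threshold,    # determines the payment to be destroyed
            "registered_at": time.time()}

def verify(analysis_source: str, registration: dict) -> bool:
    # Anyone can later check that the analysis actually run matches the commitment.
    return hashlib.sha256(analysis_source.encode()).hexdigest() == registration["algorithm_hash"]

code = "def analysis(data): ..."           # stand-in for the real analysis script
reg = preregister(code, p_threshold=0.01)
print(json.dumps(reg, indent=2))
print(verify(code, reg))                   # True; any change to the code breaks the commitment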
It's a reasonable policy for a journal if they are limited to p-values as the only possible criterion for publication, they restrict the amount of p-currency, and the scheme is transparent to their readers. I'm assuming the amount spent on each experiment is stated in the pre-registration.

Brent

On 6/24/2018 1:23 PM, Mike Stay wrote:
Some examples of the kind of results that seem statistically significant, which you can find if you don't have to specify what you're looking for in advance but can just trawl a huge pile of data in search of correlations:

http://www.tylervigen.com/spurious-correlations

Or look at http://journals.sagepub.com/doi/full/10.1177/0956797611417632#_i1, a study that shows that listening to "When I'm Sixty-Four" makes you younger. Not makes you feel younger, or look or act younger; it actually makes you younger! The people who listened to the song were younger afterwards than the people who didn't.

And this study follows currently accepted practices, so there are published studies no more likely to be true than this one. See http://journals.sagepub.com/doi/full/10.1177/0956797611417632#_i1 for a discussion of more stringent requirements journals could use, but don't, that would prevent stuff like this from getting through.

Andy

On Sun, Jun 24, 2018 at 2:01 PM Henry Baker <hbaker1@pipeline.com> wrote:
-- Andy.Latto@pobox.com
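A small sketch of the trawling effect behind spurious correlations like those Andy links to, on synthetic data; the 200 random 12-point series are an arbitrary illustrative choice:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_series, n_points = 200, 12
data = rng.normal(size=(n_series, n_points))   # independent noise: no real relationships

corr = np.corrcoef(data)                       # all pairwise correlations between the series
np.fill_diagonal(corr, 0.0)                    # ignore each series' correlation with itself
i, j = np.unravel_index(np.abs(corr).argmax(), corr.shape)
r, p = stats.pearsonr(data[i], data[j])        # naive p-value for the "winning" pair

print(f"best of {n_series * (n_series - 1) // 2} pairs: r = {r:.2f}, naive p = {p:.1e}")
# Typically |r| ends up near 0.9 with p around 1e-4, even though every series is pure noise.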
Of course the study that shows that listening to "When I'm Sixty-Four" makes you younger is one that shouldn't require statistical correlation analysis at all. Whenever you see some statistical correlation used to imply causation, the first thing you should ask yourself is, "Could this have been tested directly?" There's no point trying to clean up statistical inference if none is needed in the first place.

Brent

On 6/24/2018 9:40 PM, Andy Latto wrote:
participants (6)
- Adam P. Goucher
- Andy Latto
- Brent Meeker
- Henry Baker
- James Davis
- Mike Stay