6. When the Detectives Disagree: Choosing Your Method

17:13 Nia: So, Eli, we have these three main "detective tools" now. We have the blunt hammer of Bonferroni, the sequential scalpel of Holm, and the pragmatic "risk manager" of Benjamini-Hochberg. How do we actually choose? Because I could see myself "p-hacking" just by picking the method that gives me the result I want.
17:33 Eli: And that is the biggest trap! You can't just run all of them and pick the one that lets you publish your paper. That is just multiplicity all over again! You have to decide your correction method *before* you see the data.
17:44 Nia: "Pre-registration." I have heard that term. It sounds like a statistical pre-nup.
Eli: It really is! You are defining the rules of the relationship before things get messy. Our sources give us a really clear framework for this decision. It mostly comes down to two things: the "Cost of Being Wrong" and the "Phase of Research."
18:03 Nia: Okay, let’s break those down. If the cost of a false positive is super high—like we are talking about a drug that could have dangerous side effects—we go with the FWER group, right?
Eli: Exactly. If one single mistake is unacceptable, you use Holm. It is your best FWER tool, and it is what regulatory agencies like the FDA usually demand for primary endpoints in a clinical trial.
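The Holm step-down procedure Eli is describing can be sketched in a few lines. This is a minimal illustration with made-up p-values, not production code:

```python
# Holm step-down procedure: sort p-values, compare the k-th smallest
# to alpha / (m - k + 1), and stop at the first failure.

def holm(p_values, alpha=0.05):
    """Return booleans: True where the null is rejected at FWER level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Threshold shrinks as we move down the sorted list:
        # alpha/m for the smallest p, alpha/(m-1) for the next, and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: once one test fails, all larger p-values fail
    return reject

p = [0.001, 0.01, 0.02, 0.04, 0.30]  # hypothetical p-values
print(holm(p))  # only the smallest p-values survive the shrinking thresholds
```

Note that the first threshold, alpha/m, is exactly Bonferroni's; Holm only gets more generous from there, which is why it is never less powerful than Bonferroni.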
18:27 Nia: But what if we are in "Exploratory Mode"? Like, we are screening 5,000 genes to see which ones might be linked to a certain cancer.
18:35 Eli: Then you use BH. In that case, missing a real link (a Type II error) is actually more dangerous than finding a few false leads. You want to cast a wide net, and you can always validate those leads with a follow-up study later.
18:51 Nia: That makes sense. It is like the "Discovery Funnel." Use BH at the top of the funnel to find as much as possible, then use Holm or even Bonferroni at the bottom of the funnel when you are making the final, high-stakes decision.
19:05 Eli: You've hit on a great strategy. But there are some nuances. What if your tests are highly correlated? Like, you are testing a new education program and you are measuring "Math scores," "Logic scores," and "Arithmetic scores." Those are obviously going to move together.
19:22 Nia: I remember reading that Bonferroni gets really "harsh" there.
Eli: It does! Bonferroni guards against the worst case, so when your tests are strongly positively correlated it pays a full penalty for tests that are largely redundant, which makes it overly conservative. Under "positive dependence," BH and Holm are usually fine, but if the dependence structure gets really messy, with tests working against each other, there is an even tougher tool called the Benjamini-Yekutieli procedure.
19:45 Nia: Oh, another detective! What is Yekutieli’s specialty?
19:48 Eli: He handles "arbitrary dependence." It is a more conservative version of BH that stays valid even when your tests are working against each other in weird ways. It divides the FDR level by the harmonic sum 1 + 1/2 + ... + 1/m, which grows like the natural log of the number of tests. It is the "worst-case scenario" tool for FDR.
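That harmonic-sum penalty is easy to see numerically. The sketch below (illustrative, not a full BY implementation) compares the BY threshold for the top-ranked p-value against plain BH:

```python
import math

# Benjamini-Yekutieli (BY) shrinks every BH threshold by the factor
# c(m) = 1 + 1/2 + ... + 1/m, the harmonic sum, which grows like ln(m).

def by_threshold(rank, m, q=0.05):
    """BY rejection threshold for the p-value at a given rank (1-indexed)."""
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic sum, ~ ln(m) + 0.577
    return rank / m * q / c_m

m = 100
bh = 1 / m * 0.05          # BH threshold for the top-ranked p-value: 0.0005
by = by_threshold(1, m)    # BY threshold: the same line, divided by c(100) ~ 5.19
print(f"BH: {bh:.6f}  BY: {by:.6f}  (BY is ~{bh / by:.1f}x stricter)")
```

For 100 tests the worst-case insurance costs you roughly a factor of five in the threshold, which is why BY is reserved for genuinely hostile dependence structures.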
20:04 Nia: It is amazing how much thought has gone into this. It is like there is a specific tool for every flavor of uncertainty. But I noticed something in the sources about "Post-hoc tests" for ANOVA. That feels like a different category.
Eli: It is! ANOVA is when you are comparing more than two groups—like, does this fertilizer work better than the other three brands? If the ANOVA says "Yes, there is a difference," it doesn't tell you *which* groups are different.
20:38 Nia: So you start doing pairwise comparisons. Fertilizer A vs B, A vs C, B vs C... and suddenly, you are back in the multiplicity trap!
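The trap Nia describes grows fast. With k groups there are C(k, 2) pairwise comparisons, and under the simplifying assumption of independent tests at alpha = 0.05, the chance of at least one false positive is 1 - (1 - alpha)^m:

```python
from math import comb

# Back-of-envelope for the pairwise-comparison trap: number of tests
# explodes quadratically, and uncorrected familywise error follows.

alpha = 0.05
for k in (3, 5, 10):
    m = comb(k, 2)                 # number of pairwise comparisons
    fwer = 1 - (1 - alpha) ** m    # assumes independent tests (a simplification)
    print(f"{k} groups -> {m} tests -> FWER ~ {fwer:.0%}")
```

Ten groups means 45 comparisons and roughly a 90% chance of at least one spurious "significant" pair, which is exactly the multiplicity trap in miniature.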
Eli: Exactly! And for that specific case, we have specialized tools like Tukey's HSD—the "Honestly Significant Difference" test. It is designed specifically to control FWER across all pairwise comparisons. Or Dunnett's test, if you only care about comparing everything to a single "Control" group.
21:06 Nia: I love that there is an "honest" version. It implies that the uncorrected version is a bit of a liar.
21:17 Eli: In statistics, it usually is! If you run enough tests without these tools, the truth just gets buried under a mountain of coincidences.
21:25 Nia: So, to summarize the playbook: High stakes? Use Holm. Discovery? Use BH. Comparing groups? Use Tukey. And whatever you do, decide *before* you look at the p-values!
Eli: That is the golden rule. If you follow that, you are doing better science than a huge percentage of the papers that actually get published!