Let’s look at the Wikipedia definition first:
Berkson’s fallacy is a result in conditional probability and statistics which is often found to be counterintuitive, and hence a veridical paradox. It is a complicating factor arising in statistical tests of proportions. Specifically, it arises when there is an ascertainment bias inherent in a study design … The most common example of Berkson’s paradox is a false observation of a negative correlation between two positive traits, i.e., that members of a population which have some positive trait tend to lack a second.
The original example is developed using a hospital-based group of patients. The only thing to know is that diabetes is a risk factor for cholecystitis in the general population.
Any given hospital in-patient without diabetes must have another disease (otherwise he would not be there), for example cholecystitis. By definition this will be cholecystitis without diabetes, caused by some other risk factors (female, fat, forty…). So in this group of in-patients there may be a spurious negative association between cholecystitis and diabetes.
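The hospital argument can be checked with a small exact calculation. The following is only a sketch: every probability in it is a made-up illustration and not an estimate from any study, and the admission model (each disease present sends the patient to hospital independently with its own probability) is an assumption chosen for simplicity. It computes the odds ratio between diabetes and cholecystitis once in the whole population and once among in-patients only.

```python
from itertools import product

# All numbers are illustrative assumptions, not data from any study.
P_DIABETES = 0.05
P_CHOLE_GIVEN_DIABETES = 0.10     # diabetes doubles the risk of cholecystitis
P_CHOLE_GIVEN_NO_DIABETES = 0.05
P_OTHER = 0.10                    # some unrelated third disease
# Assumed probability that each disease, on its own, leads to admission.
ADMIT = {"diabetes": 0.3, "chole": 0.5, "other": 0.1}


def joint_tables():
    """Return 2x2 tables keyed by (diabetes, cholecystitis).

    `population` holds the joint probabilities in everyone;
    `hospital` holds the (unnormalised) mass of admitted people.
    """
    population, hospital = {}, {}
    for d, c, o in product((0, 1), repeat=3):
        p_d = P_DIABETES if d else 1 - P_DIABETES
        p_c1 = P_CHOLE_GIVEN_DIABETES if d else P_CHOLE_GIVEN_NO_DIABETES
        p_c = p_c1 if c else 1 - p_c1
        p_o = P_OTHER if o else 1 - P_OTHER
        p = p_d * p_c * p_o
        # A person stays home only if none of their diseases triggers admission.
        stay_home = (
            (1 - ADMIT["diabetes"]) ** d
            * (1 - ADMIT["chole"]) ** c
            * (1 - ADMIT["other"]) ** o
        )
        population[(d, c)] = population.get((d, c), 0.0) + p
        hospital[(d, c)] = hospital.get((d, c), 0.0) + p * (1 - stay_home)
    return population, hospital


def odds_ratio(table):
    """Odds ratio of a 2x2 table; unaffected by the table's overall scale."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


population, hospital = joint_tables()
print("odds ratio, general population:", round(odds_ratio(population), 3))
print("odds ratio, in-patients only:  ", round(odds_ratio(hospital), 3))
```

With these assumed numbers the odds ratio is above 1 in the general population, as it must be since diabetes was defined as a risk factor, but below 1 among in-patients. Nothing about the diseases has changed; only the selection into the sample has.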
My example here concerns families living on farms. Since around 1960 [Leynaert 2001] there has been the interesting observation that farming families have less allergy, an effect that I also found in 1989 and that is most likely a healthy farmer effect.
This selected farm population has a lower allergy prevalence, and of course their children will also have less allergy. All the negative correlations with endotoxin, the microbiome, etc. (which are interpreted as protection) could be caused by Berkson’s fallacy. The observation will even be replicated, because the same selection criteria are present in the replication sample.
Many more cognitive biases could also be involved: anchoring, availability cascade, confirmation and expectation bias and, of course, the law of the instrument.