A passion for precision

The title is borrowed from the opening lecture given by Theodor Hänsch on July 7, 2006 at the Euroscience Open Forum in Munich and from his Nobel lecture on December 8, 2005 [video]. Hänsch is one of the real heroes of science.

(essay written 1/10/2009)

With a never-ending stream of genetic association studies in allergy research, we are facing severe problems, as most of these studies are never reproduced 1. The Lancet editors already think that genetic association is “in danger of becoming a rather dirty word” 2, and even consensus recommendations have not been able to stem this flood 3, 4. Should we rethink the reasons and consequences 5, 6?

The Inevitable. The many contradictory results are frequently explained by genetic heterogeneity across populations. There are indeed good examples where populations differ in genetic terms. Lactase variants, for example, show a clear south-to-north gradient across Europe. Estimating the genetic risk in an additional population for replication, however, may have only limited value if population-specific risks are of such great importance, which may be particularly relevant for founder populations and for rare variants. A recent family study 7 found a rare mutation in filaggrin (FLG) associated with atopic dermatitis (AD). Every replication study will depend on sampling enough FLG carriers from the population as well as enough AD patients, and replication would fail in black-skinned AD patients (where FLG variants are absent). Non-replication therefore does not necessarily refute a true association, while on the other hand replication in a further sample does not distinguish a biologically relevant association from a chance finding.
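The dependence of replication on sampling enough carriers can be made concrete with a little arithmetic. The following is a minimal sketch only: the sample size, the carrier frequencies and the threshold of ten carriers are hypothetical numbers chosen for illustration and are not taken from the FLG study or any other cited work.

```python
from math import comb


def expected_carriers(n_subjects: int, carrier_freq: float) -> float:
    """Expected number of variant carriers in a sample of n_subjects."""
    return n_subjects * carrier_freq


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of sampling at least k carriers."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


# Hypothetical replication sample of 500 patients (assumption for illustration).
n = 500
# Hypothetical carrier frequencies: common, rare, and absent in the population.
for freq in (0.05, 0.005, 0.0):
    print(
        f"carrier frequency {freq}: "
        f"expected carriers = {expected_carriers(n, freq):.1f}, "
        f"P(at least 10 carriers) = {prob_at_least(10, n, freq):.3f}"
    )
```

With the same sample size, the chance of observing enough carriers to test anything drops sharply as the variant becomes rarer, and it is zero where the variant is absent. A failed replication in such a population says little about the original association.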

Another frequently invoked argument for non-replication is phenotypic heterogeneity in complex diseases. A lack of precision in how outcomes are defined and validated may explain non-replication, in particular when the phenotype of interest relies on a simple diagnosis only and ignores the underlying QTLs. But there may be less doubt about a positive association like that of FLG and AD when looking at linkage scores, functional properties of the truncated profilaggrin, the general skin physiology, other irritant effects and finally the itchy and dry skin in these patients. Many genetic associations, however, ignore the biological context (as well as all Bradford Hill criteria of causality) and have entered the (never) ending replication loop. Curiously, some associations, even those published in high-impact journals, do not receive any interest at all 8. Replication of genetic studies is not an art 9: it is science or no science at all.

The evitable. Even if we assume for a moment that nature leaves us a detectable genetic risk in complex diseases (one that is worthwhile to find), the field is plagued by studies that are poorly planned, sampled with bias, sloppily analysed and badly reported. Although accepted on a theoretical level 10, 11, these problems are tacitly ignored. While running a reference database of genetic associations over many years, we recognized an increasing number of studies lacking precision in reporting.

No doubt, errors occur while writing and typesetting manuscripts; however, it seems that careful writing and copy-editing of printed material have become unfashionable. As a consequence, a recent commentary 12 defined scientific misconduct in a new way, as a continuum ranging from honest errors to outright fraud. While the impact of fraud in the field of genetic association studies will be low (due to its low prevalence), the impact of inappropriate research conduct is high (due to its high prevalence). There are many ways in which genetic associations are distorted: ignoring previous work, insufficient reporting of methods, suppressing inconsistent data of one's own, poor adjustment for confounders, unreported multiple testing, and sloppy tables, wording, and references. Accurate reporting may be an ethical duty, as inaccurate reporting “is not merely a failure to satisfy a few highly critical readers. It not infrequently makes the data that are presented of little or no value.” (http://www.plos.org/cms/node/371). Unfortunately, the overloading of referees with ever-increasing review requests may be one reason that these errors are no longer detected. The number of publications doubled during the last decade while the number of journals remained basically the same.
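Of the distortions listed above, unreported multiple testing is the easiest to quantify. The sketch below assumes independent tests at a nominal significance level of 0.05, which is a simplification because genetic markers are often correlated; the numbers of tests are hypothetical. It shows the chance of at least one false positive across a family of tests, and the per-test threshold given by a Bonferroni correction.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise rate <= alpha."""
    return alpha / n_tests


# Hypothetical numbers of markers or subgroups tested (assumption for illustration).
for m in (1, 5, 20, 100):
    print(
        f"{m:>3} tests: P(at least one false positive) = "
        f"{familywise_error_rate(m):.3f}, "
        f"Bonferroni threshold = {bonferroni_threshold(m):.5f}"
    )
```

A paper that reports one "significant" marker without saying how many were tested leaves the reader unable to tell which of these rows applies.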

Decades ago, the bulk of scientific work was done by single researchers. The dusty folios in the libraries that survive from that time list just one author (at the end and not at the beginning of a paper) who had the ultimate responsibility. He would get all the credit for his achievements, or all the blame for the failure. When researching the obvious errors in some current association papers I have found an “impersonation” effect: the first author shifts his responsibility to a statistician, who claims that the data administrator had failed, while the data administrator does not feel responsible for genotyping, and so on: “Success has many fathers while failure is an orphan”. It is unacceptable that even papers with two dozen errors are never retracted 13.

A solution. The field probably requires a greater investment of resources and more careful attention to detail. A way out could be joint MD/PhD programs where investigators learn to deal with the clinical protocol, the laboratory work, the data analysis, and the interpretation and validation of conclusions. It can be taught in the classroom (and by example) how to ask the right clinical question, to perform accurate laboratory work, to check the integrity of data, to perform an optimal analysis, to develop a proper interpretation and to adopt a good writing style 14. We need quality assurance courses that cover all critical steps and also teach scientific accuracy and research ethics.

Post-publication strategies could also be discussed 15. As already mentioned, most erroneous papers are neither corrected nor retracted, a bias described by Jim Giles as being “reluctant to have my name attached to negative comments” 16. The comment function in BMC journals, the respond button at BMJ and the eLetter function at PLOS journals may nevertheless be a first step towards a more open discussion of research failure.

It is timely to develop, implement and supervise good scientific practice for genetic association studies in allergy research as well. The hunt for impact has compromised the field in an unfortunate manner, to the point where the “public’s trust in science and scientists is deteriorating” 17. Most researchers agreed in a recent web survey that most published research findings are wrong 18. We need more passion for precision to win back trust in our science.

Cited References
1. Ioannidis JP, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG (2001) Replication validity of genetic association studies. Nat Genet 29:306-309
2. The Lancet (2003) In search of genetic precision. Lancet 361:357
3. Anonymous (2005) Framework for a fully powered risk engine. Nat Genet 37:1153
4. Cordell HJ, Clayton DG (2005) Genetic association studies. Lancet 366:1121-1131
5. Buchanan AV, Weiss KM, Fullerton SM (2006) Dissecting complex disease: the quest for the Philosopher’s Stone? Int J Epidemiol 35:562-571
6. Edwards JH (1999) Unifactorial models are not appropriate for multifactorial disease. BMJ 318:1353-1354
7. Smith FJ, Irvine AD, Terron-Kwiatkowski A, et al. (2006) Loss-of-function mutations in the gene encoding filaggrin cause ichthyosis vulgaris. Nat Genet 38:337-342
8. Zhang Y, Leaves NI, Anderson GG, et al. (2003) Positional cloning of a quantitative trait locus on chromosome 13q14 that influences immunoglobulin E levels and asthma. Nat Genet 34:181-186
9. Kabesch M (2009) The art of replication. Thorax 64:370-371
10. Spilker B (1991) Guide to Clinical Trials. Lippincott Williams and Wilkins:24ff
11. Skrabanek P, McCormick J (1990) Follies and Fallacies in Medicine. Prometheus, Buffalo
12. Nylenna M, Simonsen S (2006) Scientific misconduct: a new approach to prevention. Lancet 367:1882-1884
13. Kabesch M, Peters W, Carr D, Leupold W, Weiland SK, von Mutius E (2003) Association between polymorphisms in caspase recruitment domain containing protein 15 and allergy in two German populations. J Allergy Clin Immunol 111:813-817
14. Anonymous (2005) Good data need good writing. Nat Immunol 6:1061
15. Bracken MB (2005) Genomic epidemiology of complex diseases: The need for an electronic evidence-based approach to research synthesis. Am J Epidemiol 162:297-301
16. Giles J (2006) The trouble with replication. Nature 442:344-347
17. Neill US (2006) Stop misbehaving! J Clin Invest 116:1740-1741
18. Ioannidis JP (2005) Why most published research findings are false. PLoS Med 2:e124