Tag Archives: epidemiology

Forget about multiple regression analysis

When starting in epidemiology I had only high school math skills. Nevertheless, I could usually find all major associations with rather simple tables and plots. Then I learned about multiple regression analysis and used it in numerous research papers. However, I soon discovered that

The results are often somewhere between meaningless and quite damaging.


Leaflet.js – layer order, layer address and links

Leaflet is great for mapping in epidemiology: cutting and pasting a few lines gives quick results. Problems start, however, with any more advanced project. It’s a pain, as plugins overwrite functions and basic CSS layouts. Or layers do not allow clickable links (as event propagation is being blocked). Or GeoJSON data are rejected for whatever reason.
A showcase project that had been planned for 2 days took more than 1 week, as the documentation is frequently unclear, incomplete and often hard to understand without a (jsfiddle) example. Numerous Google searches helped, as did peeking into the source code, and other Stack Overflow posters have been very helpful.

Cause and effect in observational data: Magic, alchemy or just a new statistical tool?

Slashdot has a feature on that

Statisticians have long thought it impossible to tell cause and effect apart using observational data. The problem is to take two sets of measurements that are correlated, say X and Y, and to find out if X caused Y or Y caused X. That’s straightforward with a controlled experiment… But in the last couple of years, statisticians have developed a technique that can tease apart cause and effect from the observational data alone. It is based on the idea that any set of measurements always contain noise. However, the noise in the cause variable can influence the effect but not the other way round. So the noise in the effect dataset is always more complex than the noise in the cause dataset. … The results suggest that the additive noise model can tease apart cause and effect correctly in up to 80 per cent of the cases (provided there are no confounding factors or selection effects).

and JMLR has a more theoretical account

Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables.


Why participation rates are so low in German studies

Response rates are crucial for epidemiologists when it comes to representativeness and the generalization of conclusions. As response falls, not only do risk constellations change massively, disease frequencies are also estimated incorrectly. With low response there are usually more sick people in the study sample (because they have more time and perhaps also expect more from a study). What is then missing are the healthy participants in whom protective factors could be studied. With low response, women are usually overrepresented as well, and often also the unemployed and less educated groups. The individual factors have no direct relationship to each other, but in combination they distort study results beyond recognition. In the literature this is also known as selection bias. Unfortunately there are no good statistical methods to correct for selection bias. How could that work anyway? Data can hardly be extrapolated here. It is not possible to say precisely when such a selection bias sets in. With a 90% participation rate you are on the safe side, below 50% it becomes critical and at some point pointless. A 75% response is standard in epidemiology; market research settles for less. The quality standard in epidemiology is considerably higher, however, because of the relevance. In the preparation
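How differential response shifts an estimated prevalence can be shown with a little arithmetic. The following Python sketch is not part of the original post; the true prevalence and the response probabilities for sick and healthy people are invented numbers, chosen only to illustrate the mechanism described above (sick people respond more readily than healthy ones).

```python
def observed_prevalence(true_prev, resp_sick, resp_healthy):
    """Return (overall response rate, prevalence among responders).

    All inputs are hypothetical illustration values, not study data.
    """
    sick_responders = true_prev * resp_sick
    healthy_responders = (1.0 - true_prev) * resp_healthy
    response_rate = sick_responders + healthy_responders
    return response_rate, sick_responders / response_rate


if __name__ == "__main__":
    true_prev = 0.10  # assumed true disease prevalence in the population
    # Assumed scenarios: the sick always respond somewhat more often.
    scenarios = [(0.95, 0.90), (0.85, 0.74), (0.70, 0.45), (0.60, 0.25)]
    for resp_sick, resp_healthy in scenarios:
        rate, prev = observed_prevalence(true_prev, resp_sick, resp_healthy)
        print(f"response {rate:.0%}: observed prevalence {prev:.1%} "
              f"(true {true_prev:.0%})")
```

Under these assumed numbers the overestimate grows as the overall response falls. In a real study the two response probabilities are unknown, which is why the bias cannot simply be corrected afterwards.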

The best vitamin D paper in 2013

I probably have two candidates here. The first one is by the Cantorna group in October 2013 and provides for the first time a link between the gut microbiome and oral vitamin D exposure. We all thought that vitamin D has no influence on bacteria as they cannot utilize it. But that doesn’t seem to be true, as the composition of the microbiome may change.

Mice that cannot produce 1,25(OH)2D3 [Cyp27b1 (Cyp) knockout (KO)], VDR KO as well as their wild-type littermates were used. Cyp KO and VDR KO mice had more bacteria from the Bacteroidetes and Proteobacteria phyla and fewer bacteria from the Firmicutes and Deferribacteres phyla in the feces compared with wild-type. In particular, there were more beneficial bacteria, including the Lactobacillaceae and Lachnospiraceae families, in feces from Cyp KO and VDR KO mice than in feces from wild-type … Our data demonstrate that vitamin D regulates the gut microbiome and that 1,25(OH)2D3 or VDR deficiency results in dysbiosis, leading to greater susceptibility to injury in the gut.

So while I always thought that oral vitamin D supplementation may have a direct effect on the gut mucosal system, this paper opens a completely new avenue.

The seven deadly sins

So, plenty of time now for reading papers. My recommendation – the deadly sins of epidemiology by Grandjean.

What an individual is capable of may be measured by how far his understanding is from his willing. What a person can understand he must also be able to make himself will. Between understanding and willing lie the excuses and evasions (Kierkegaard).

ok, here we start:

pride – a form of self-delusion
envy and wrath – lead to ingratitude and failure to recognize other colleagues’ achievements
lust, greed, and gluttony – an obsession with seeking satisfaction and with frequent and limitless attention and recognition, including the highest academic titles and prizes
sloth – indifference to public health and to the welfare of others

Obviously there was a need to publish that kind of paper, yeah, yeah.

What is the best logistic model?

I have never heard a formal lecture answering this question, even after many years in epidemiology. The model should of course be parsimonious to avoid losing too many observations to missing values, but the decision to keep or drop a variable seems largely subjective. It was therefore quite helpful to find an online lecture that exemplifies a sound approach – check out unc.edu/courses/2006spring. I already used anova to compare models (at least since my move from SAS to R), while AIC is something that I am now adding to my toolbox.
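The two tools mentioned above, a likelihood-ratio comparison of nested models and AIC, can be sketched in a few lines. This Python example is only an illustration and not the lecture's or the post's code. The 2x2 counts are invented, and the shortcut of reading fitted risks straight from the table is an assumption that holds only for a logistic model with a single binary covariate.

```python
import math


def binomial_loglik(cases, total):
    """Maximised log-likelihood of one group whose fitted risk is cases/total."""
    p = cases / total
    ll = 0.0
    if cases > 0:
        ll += cases * math.log(p)
    if total - cases > 0:
        ll += (total - cases) * math.log(1.0 - p)
    return ll


def aic(loglik, n_params):
    """Akaike information criterion, 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * loglik


def compare_models(cases_exp, n_exp, cases_unexp, n_unexp):
    # Null model: intercept only, one common risk for everybody.
    ll_null = binomial_loglik(cases_exp + cases_unexp, n_exp + n_unexp)
    # Full model: intercept + binary exposure. Its fitted risks equal the
    # observed risks per exposure group, so no iterative fitting is needed.
    ll_full = (binomial_loglik(cases_exp, n_exp)
               + binomial_loglik(cases_unexp, n_unexp))
    lr_stat = 2.0 * (ll_full - ll_null)
    # Chi-square tail probability for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))
    return {"aic_null": aic(ll_null, 1), "aic_full": aic(ll_full, 2),
            "lr_stat": lr_stat, "p_value": p_value}


if __name__ == "__main__":
    # Hypothetical data: 30/100 cases among exposed, 15/100 among unexposed.
    for name, value in compare_models(30, 100, 15, 100).items():
        print(f"{name}: {value:.4f}")
```

With more covariates the likelihoods have to come from a fitted model, but the comparison step stays the same: prefer the lower AIC, and keep the extra variable only if the likelihood-ratio test supports it.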

Forget about genes III

Here is another opinion from a widely read German magazine (Spiegel online, 25 May 2009, Jörg Blech, Wahrsager im Labor) about our overly high expectations of genetics:

While some are calling for even more and even larger comparative studies, skeptics like Goldstein consider this a pure waste of time: “If you have examined 30,000 patients with type 2 diabetes, then I think it is pointless to increase the number to 60,000 or 100,000.”

Goodbye to the “common variant hypothesis”!