All posts by admin

Escaping from a swamp

The November AJHG has an excellent re-analysis of the dysbindin-schizophrenia association using a new methodology that surpasses all previous meta-analysis techniques. As the single-SNP association results from the previous 6 studies cannot be directly compared, the authors construct a European super-hap map from all tag SNPs in that region, place them in a phylogenetic tree, and finally map all single associations onto these haplotypes. Their Fig. 1B shows the main results; as the circles in Fig. 1B are somewhat confusing, I have redrawn their results, adding the haplotype frequencies and ordering the studies by year of publication.

pc010002-2.JPG

We may think of this as a triple-blind study: neither the patients, nor the PIs, nor we knew anything beforehand. The results are alarming. I do not understand how the Kirov set could have included all haplotypes and why the Schwab/Williams set is in opposition to the Straub/Bogaert/Funke set.
What could have gone wrong? The authors of the current re-analysis believe that population differences are an unlikely reason for the inconsistency, as the allele frequencies match between studies. The good news is that genotyping errors can largely be excluded.
Unfortunately the authors remain vague about why there is no common causal variant. Have there been different sampling schemes, different diagnostic thresholds, different environmental exposures in the previous studies? Is dysbindin a schizophrenia gene at all, or only on a certain genetic background? It seems possible that the studies of one branch are false positives. Or is the haplotype reconstruction in the re-analysis erroneous for whatever reason?
Von Münchhausen is well known for escaping from a swamp by pulling himself up by his own hair. I wish I could do that too.

Easter Eggs

In the Middle Ages messengers had tattoos under their scalp hair. Charles Dickens also described how women used knit and purl stitches to hide messages. Many software developers also insert messages or features in their code. The motivation may be to sign it or to put some artistic touch on it; you will find a lot of websites out there explaining the necessary keystrokes and web links.
I wonder whether other colleagues are also hiding initials, words or messages in scientific papers. Unfortunately, due to online submission, publishers will now recognize faked references. What about using steganography to mark pictures or PDFs?
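To make the steganography idea concrete, here is a minimal sketch of least-significant-bit (LSB) marking. It works on a plain sequence of 8-bit pixel values rather than a real image file; the function names `embed` and `extract` and the initials "XY" are made up for illustration, and reading or writing an actual picture or PDF would need an imaging library that is not shown here.

```python
def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide `message` in the lowest bit of the first len(message) * 8 pixel values."""
    # Unpack the message into single bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data is too small for this message")
    marked = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the pixel value, then set it to the message bit.
        marked[i] = (marked[i] & 0xFE) | bit
    return bytes(marked)


def extract(pixels: bytes, n_bytes: int) -> bytes:
    """Read back `n_bytes` bytes hidden by embed()."""
    message = bytearray()
    for k in range(n_bytes):
        value = 0
        for j in range(8):
            value = (value << 1) | (pixels[8 * k + j] & 1)
        message.append(value)
    return bytes(message)


# Hypothetical cover data: 256 grey values standing in for a picture.
cover = bytes(range(256))
marked = embed(cover, b"XY")
hidden = extract(marked, 2)
```

Each pixel value changes by at most 1, which is invisible to the eye. Note that such a mark would not survive lossy recompression (e.g. saving as JPEG), so this is only an illustration of the principle.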

Zeitgeist

It seems that the German word Zeitgeist is increasingly used in English texts too. When thinking again and again about science and scientists, I always come back to a famous essay by Karl Jaspers written in 1932 (he lost his professorship in Heidelberg in 1937; in 1938 he was forbidden to publish any more).

The title of the essay is “Die geistige Situation der Zeit”. The chapter “Wissenschaft” (science) is always a comfort to me when I despair about the inequity of the scientific world. Here is an excerpt:

The sciences still achieve extraordinary things today. The exact natural sciences have embarked on an exciting course of rapid advances in fundamental ideas and empirical results. A circle of researchers spread across the world is bound together by rational mutual understanding. One passes the ball to the other. This process finds an echo among the masses through the tangibility of the results. In the humanities, close attention to the subject matter has been refined to microscopic precision. An unprecedented wealth of documents and monuments has been brought before our eyes. Critical certainty has been attained.

The crisis of the sciences therefore does not really lie in the limits of what they can do, but in the awareness of their meaning. With the disintegration of a whole, the immeasurable extent of the knowable is now subject to the question of whether it is worth knowing. Where knowledge, without the whole of a worldview, is merely correct, it is valued at best for its technical usefulness. It sinks into the endlessness of what actually concerns no one.

So it is not the immanent development of the sciences alone that makes the crisis sufficiently comprehensible, but only the human being whom the scientific situation encounters. It is not science in itself, but he himself within it who is in crisis. The historical and sociological reason for this crisis lies in mass existence. The transformation of the free research of individuals into the enterprise of science has the consequence that everyone considers himself capable of taking part, provided only that he has intelligence and is diligent. A scientific plebeianism arises; people produce empty work by analogy in order to pass as researchers, make arbitrary observations, counts and descriptions, and pass them off as empirical science. The endless number of standpoints adopted, so that in more and more cases people no longer understand one another, is solely the consequence of everyone irresponsibly daring to state an opinion that he has tortured out of himself in order to count for something too. People have the audacity to “merely put up for discussion” whatever happens to occur to them. The vast quantity of printed rationality finally becomes, in some fields, a display of the chaotic swirling of the no longer truly understood remnants of once-living thought in the heads of mass men. When science thus becomes the function of thousands of interested parties, each belonging to a discipline as a profession, then, because of the properties of the average, the meaning of research … can also fall into confusion.

Number cruncher

In a recent post I described high-resolution SNP datasets that are available on the net. To work with these datasets you will probably need to upgrade your hardware and software. For data handling many people nowadays stick to commercial SQL databases that have plugins for PD software.
My recommendation is to save that money and store the data in a special format that may be more useful for these large datasets; details are in a technical report that I will upload later today. In the meantime you can already check some software tools for working with these large datasets. This is what I know so far:

  • David Duffy has recompiled his sibpair program |link
  • Geron(R) has something under development |link
  • Jochen Hampe and colleagues offer Genomizer |link
  • Franz Rüschendorf developed Alohomora |link
  • I remember SNPGWA, a development at Wake Forest University |no link yet
  • there will be an R/Bioconductor package by Rob Scharpf |no link yet
  • R library GenABEL by Yurii Aulchenko |link
  • R library SNPassoc by Juan González |link

Addendum

A technical report on how to work with large SNP datasets is now also available in my paper section. Alternatives to what I am suggesting in this paper have been set out by an anonymous reviewer:

For R users, if SQLite limits are reached, hdf5 (http://hdf.ncsa.uiuc.edu/HDF5/) may be one way forward for really huge table structures since there is an R interface already available. PostgreSQL column limit depends on data type with a maximum of 1600 for simple types. MySQL with the BerkeleyDB backend may be like SQLite with no obvious column count limit. Metakit is not mentioned – it is column oriented and probably also has “unlimited” columns as long as each database is < 1GB or so.
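One way to sidestep column limits altogether, whatever the database engine, is to avoid the wide "one column per SNP" layout and store one row per sample and SNP instead. The following is only a minimal sketch of that idea using Python's built-in sqlite3 module; it is not the format proposed in the technical report, and the table name, column names, sample IDs and rs numbers are all invented for illustration.

```python
import sqlite3

# Hypothetical toy data: (sample, SNP, dosage), where dosage is the number
# of minor alleles (0, 1 or 2) carried by that sample at that SNP.
calls = [
    ("S1", "rs0001", 0), ("S1", "rs0002", 1), ("S1", "rs0003", 2),
    ("S2", "rs0001", 1), ("S2", "rs0002", 1), ("S2", "rs0003", 0),
    ("S3", "rs0001", 2), ("S3", "rs0002", 0), ("S3", "rs0003", 1),
]

con = sqlite3.connect(":memory:")  # use a file name instead for real data
con.execute(
    """CREATE TABLE genotype (
           sample_id TEXT NOT NULL,
           snp_id    TEXT NOT NULL,
           dosage    INTEGER,          -- NULL marks a missing call
           PRIMARY KEY (sample_id, snp_id)
       )"""
)
con.executemany("INSERT INTO genotype VALUES (?, ?, ?)", calls)
# A second index makes per-SNP queries fast when the table gets large.
con.execute("CREATE INDEX idx_genotype_snp ON genotype (snp_id)")
con.commit()


def allele_frequency(con, snp_id):
    """Minor allele frequency of one SNP, ignoring missing calls."""
    total, n = con.execute(
        "SELECT SUM(dosage), COUNT(dosage) FROM genotype WHERE snp_id = ?",
        (snp_id,),
    ).fetchone()
    return total / (2 * n) if n else None


freq = allele_frequency(con, "rs0001")
```

The price of this layout is the number of rows (samples times SNPs), so for very large studies the on-disk size and the indexing strategy need some thought; adding a new SNP, however, never requires a schema change.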

What people search for

“Dissecting the complex genetic basis of mate choice” is the lengthy title of a lengthy text that tells us

males produce complex signals and displays that can consist of a combination of acoustic, visual, chemical and behavioural phenotypes…

The authors come from a school of integrative biology. I wonder why they have missed the excellent work in humans on HLA, fertility and mate choice.
Having said that, I would even suggest a radically different approach by looking at “What people search for” – hopefully I now also get hits on my blog for Paris Hilton, Renee Zellweger, Britney Spears, Heidi Klum, Pamela Anderson, Jessica Simpson and Jennifer Lopez ;-) Dissecting the complex genetic basis of mate choice shouldn't be as complicated as you may imagine from this Nature Reviews Genetics paper, yea, yea.

Better than the Delphi oracle

A new paper shows a nice workflow for predicting in vitro which drug will suppress a certain tumor. The authors simply link the phenotype of the cell line, “50% inhibitory concentration by drug X”, with its expression signature. The good news is that doing both in one vial (phenotyping and expression analysis) leads to excellent results.

genomicsignature.png
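The basic idea of linking a drug-response phenotype to an expression signature can be sketched in a few lines. This is not the authors' method (they use a more elaborate statistical model); it merely ranks genes by how strongly their expression correlates with the log IC50 across cell lines. All gene names and numbers below are invented toy values.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical log10 IC50 of drug X in five cell lines (higher = more resistant).
log_ic50 = [0.1, 0.5, 1.2, 1.9, 2.4]

# Hypothetical expression values of three genes in the same five cell lines.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],    # rises with resistance
    "GENE_B": [5.0, 4.1, 3.2, 1.9, 1.0],    # falls with resistance
    "GENE_C": [2.0, 2.1, 1.9, 2.0, 2.05],   # essentially unrelated
}

# Rank genes by the absolute correlation with drug response; the top of this
# list is a crude "signature" of sensitivity or resistance.
signature = sorted(
    ((gene, pearson(values, log_ic50)) for gene, values in expression.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for gene, r in signature:
    print(f"{gene}\t{r:+.2f}")
```

With thousands of genes and only a handful of cell lines, such a ranking will be full of chance correlations, so any real signature needs validation in independent samples.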

Is there any trick to do this system-wide as well, e.g. for the metabolism of a substance and its signalling pathway? Pharmacogenetics would greatly benefit from such an approach, nay, nay.

In the heat of the night

Sorry for the misleading title, but it is a nice idea to use heatmaps for conditional linkage (or SNP association) results as well. Seen at the Heidelberg meeting. Sorry also for showing a figure that is severely cropped and blurred to protect the authors' rights to their data, yea, yea.

pb250016.JPG

Helicopter epidemiology

At the very beginning of my career I was already told about the dangers of “armchair” epidemiology: researchers who only manage studies. There seems to be yet another extreme, called “helicopter epidemiology”, as pinpointed in the Lancet recently:

…fly into a remote location containing “interesting individuals”, collect descriptive data and biological specimens, fly out, process, and publish the information elsewhere…

Gene lists by automatic literature extraction

I just found on the HUM MOLGEN bulletin board a link to Fable, a new automated literature extraction system. Fable is pretty fast and can output gene lists. Sure, the screenshot below shows only those genes that are mentioned in the abstracts, but this is not so bad, as the most important genes will be placed there.
BTW, the number of reviews on asthma genetics has fallen to less than 50% since the Asthma Gene Database was closed. Maybe this new service will help to re-establish the former output of reviews ;-) yea, yea.

fable.png

Science at work

I need to send my copy of the Altman book back to the library. It is a really excellent book, very informative and easy to read. Altman does not even shy away from critical remarks (page 12 of the prologue), where he describes scientists as follows:

Scientists are human. They have their jealousies. They gossip. They spread rumours. They exaggerate. Sometimes they treat hearsay as fact.

Me too, yea, yea.

IL4 cluster revisited

I have been interested in 5q31 and the IL4 cluster since I met David Marsh in the lobby of a hotel in Heidelberg around 1993. David was one of the founding fathers of asthma genetics, and I remember how vividly he told me that he had a forthcoming Science paper on the IL4 cluster and IgE. The cluster is still one of the best allergy regions, and signalling through IL4 and IL13 now attracts more interest than the work of any of his competitors.
Nature Genetics now has an update on the 3-dimensional resolution of the genomic region. It is not crystallographic work, as might be expected, but a nice study of the chromatin structure that leads to a coordinated expression of these cytokines. SATB1 (special AT-rich sequence binding protein 1) is thought to anchor specialized sequences, allowing DNA loops to interact. I wonder whether there might even be a direct physical interaction of the IL4 and IL13 promoters, and whether there is any SNP influencing that interaction. David (who died of brain cancer in 1998) would have really liked this work. Yea, yea.

il4cluster.png

The first methylome available

-moblog- Having spent this weekend in Heidelberg at a meeting of the German NGFN project, I had the opportunity to listen to an excellent talk by Stephan Beck, who works at the Wellcome Trust Sanger Institute.

Epigenetics is the connecting link between the rather fixed genome and the variable transcriptome. To start with the end of the talk: Beck predicts highly parallel SNP, expression and methylation arrays for the near future. Although the first methylome was published just 4 weeks ago by the Arabidopsis community (as with RNAi, the plant people are again at the forefront), there is still a long way to go to a first human methylation map.

The latest information may be retrieved from www.epigenome.org, www.epigenome-noe.net, www.epitron.eu, www.heroic-ip.eu and the German National Methylome Project on chromosome 21 (please google for the link). The methylome is largely a European initiative; the two US epigenome projects do not have any website so far. The network site has some introductory texts; Beck was also referring to a 2006 PLOS paper by Akhtar.

Currently 4 human chromosomes covering 873 genes are being worked on (hopefully I captured this correctly, as this was a very dense talk). In tests of 12 different tissues, 70% of the genes examined so far are either clearly methylated or not methylated. Sperm stands out from all other tissues, which is not unexpected. Tissues originating from the same developmental background have similar methylation patterns, also not unexpected. A preliminary analysis of expression patterns shows that if the 5 prime end is methylated, expression is suppressed, also not unexpected.

Fascinating: the colon cells, which certainly interact closely with the environment, show NEITHER age-specific NOR sex-specific differences. Fascinating too: the most frequently methylated regions are ECRs (evolutionarily conserved regions), for whatever reason. Promoter methylation dips around the transcription start sites; from the plots I would say plus or minus 2000 bp. Methylation also seems to be conserved between mouse and human tissues, while methylation status seems stable over time.

Current bisulfite sequencing is still laborious, expensive and slow, while immunoprecipitation using MeDIP is becoming an alternative. The Sanger people also did a study using NimbleGen(R) 50-mers; Ensembl and UCSC will soon have these data for display. Finally, methylation appears in blocks. Construction of tagMVPs (your guess is correct, these are tags for methylation variant profiles) is straightforward, and the estimated 40 million CpG sites will probably be covered by fewer than 10 percent tagMVPs. Haplo-epi-types are now called hepitypes, yea, yea.

pb250021.JPG

Addendum

Methyl Primer Express® Software is a free software package that simplifies and automates the primer design process in methylation experiments. The bisulfite kit is not free ;-)

Addendum

A new textbook and a nice preview

Hotel Dieu

The Hotel Dieu in Paris was one of the first pediatric hospitals in the world (see my photo of the hospital entrance). I recall from the detailed history of allergic diseases by Schadewaldt that at the beginning of the last century it was difficult to present a case of Bostock's hay fever to the students.
The disease was so rare that it took more than a week to find a child with the typical symptoms. Yea, yea.

p8250074_shiftn.png

Do you know …

… why the language of the internet is English and not French? It is an interesting hypothesis that the yellow fever which decimated Napoleon's troops in Santo Domingo was a crucial factor in the decision to sell Louisiana in 1803. Although German was also a language of science around 1900, it certainly became discredited by two world wars. Yea. Yea.