Another layer of complexity in gene regulation

Yesterday evening I attended an excellent presentation by Nikolaus Rajewsky about microRNAs, small noncoding RNAs that are thought to have a role in posttranscriptional regulation. (Nikolaus moved 3 months ago from New York to succeed Jens Reich at the MDC in Berlin.) Basically, he talked about his recent “l(ou)sy” paper and the “SNP” paper after giving a rather detailed history of the development of the field. It started in 1950 with Jacob and Monod, continued in 1960 with Britten and Davidson and in 1970 with Haywood (who even quit science after being disappointed), and finally reached 1990, when the Ambros and Ruvkun labs discovered nematode microRNAs. Current research is mainly done in the Tuschl, Bartel, Cohen, Lander and Rajewsky labs, which produce the bulk of the 800 or so papers published in 2006.
Approximately 30% of genes are influenced by microRNAs; the total number of microRNA sites (~22,000) is under heavy debate, as is the number of human microRNAs (328); each microRNA regulates ~200 genes. Unfortunately there is still no high-throughput technique to detect targets. Free energy alone does not give a good prediction either, and even mismatches in the 5′ end of the mRNA are possible (individual predictions can be obtained at PicTar, which uses a hidden Markov model).
If I understood correctly, miRNAs are the feedback mechanism at the RNA level (with transcription factors at the DNA level). He mentioned 3 classes known so far in humans: oncomiRNAs, miRNA 375 (myotrophin), and miRNA 122 acting on cholesterol (quite interesting, as it was described recently in the NEJM). The experimental knockdown of a liver-specific mouse microRNA shows ~300 up- and ~300 downregulated genes. Approximately 50% of the upregulated genes have one miRNA nucleus, while the downregulated ones have even fewer binding sites than average. There is no overrepresented GO category among the upregulated genes, but cholesterol is highly significant among the downregulated genes, whatever that means. The action of miRNAs seems to be heavily context dependent, giving us many more questions than answers. Yea, yea.

Epidemiology in wartime

What was the best paper in 2006? I am voting for a Lancet paper by Gilbert Burnham, Riyadh Lafta, Shannon Doocy and Les Roberts. Between May and July 2006, they did a national cross-sectional cluster sample survey of mortality in Iraq. Data from 1849 households were gathered; 1474 births and 629 deaths were reported. The authors estimate that, as of July 2006, there had been 654 965 excess Iraqi deaths as a consequence of the war.
If you can’t imagine what it means to work in war regions, you may read the biography of Robert Capa (1913–1954), who worked as a photographer in many wars and died in the First Indochina War. He took wonderful photos together with his girlfriend Gerda Taro, one of the first women photographers, who was killed in a tank accident during the Spanish Civil War at the age of 26. I will pay for the flowers if you visit her grave at Père Lachaise in Paris.

p1000279-1.JPG p1000280-1.JPG

Addendum

As expected, the study raised criticism: doi:10.1126/science.316.5823.355a

Aleph (Codex Sinaiticus) online in 2009

So far we could admire the wonderful Gutenberg Bible (1454) in Göttingen

gutenberg.png

Current students of theology seem to have much better tools… The exciting news is that Codex Sinaiticus (dating back to 350?) is currently being digitized; the New Testament and half of the Old Testament will be available in 2 years or so.
In 1844 Konstantin von Tischendorf discovered the Codex in a wastepaper basket at Saint Catherine's Monastery at Mount Sinai (I also visited the monastery some years ago but did not find anything useful in the basket). He was allowed to take 43 of the 129 sheets to Leipzig. On another visit he discovered even more sheets, which were donated to Tsar Alexander. In 1933 the USSR sold 347 sheets to the British Museum in London; 6 sheets are still in St. Petersburg. In 1975 another 38 pages were found, which are still at Saint Catherine's. At the moment the British Library in London, the Universitätsbibliothek Leipzig, the Russian National Library in St. Petersburg and Saint Catherine's are working together on a digital edition of the manuscript, including the use of hyperspectral imaging to uncover erased or faded text. This is quite important, as Codex Sinaiticus (together with Codex Vaticanus) has heavily influenced our textus receptus.

img001810.jpg

Retraction

Working in a field where hundreds of papers are published every year and none is ever retracted, I highly appreciate a letter in Science.

…D1 dopamine receptor (D1R)-stimulated intracellular Ca2+ release was attributed to a direct interaction with calcyon … the ability of calcyon and D1Rs to co-immunoprecipitate when co-expressed in cells as reported presumably stems from the association of both proteins with clathrin-coated vesicles…thus, the isolation of the calcyon clone in a Y2H screen with D1Rs appears to have been adventitious…

In my opinion this comment advances science and gives the authors much higher credibility than any further paper. Yea, yea.

A new analysis method for blood doping

I was deeply disappointed this summer when I heard that Jan Ullrich would not participate in the Tour de France 2006 (although there are many more athletes that I am watching – I wish him all the best for next year). Later that year I heard a presentation in Bern about blood banking – how cells struggle to survive after leaving the body – and of course we did our first gene expression experiments back in 2002.
So here is my idea of how to identify autologous blood transfusion: blood separated from the body will develop a unique RNA expression pattern that can be measured by conventional cDNA chips. Identifying this pattern – possibly only 10 upregulated RNAs – in the blood of an athlete could indicate autologous blood transfusion.
I guess that there will be only a minor chance to re-identify this pattern after retransfusion into the body, as the blood is diluted around 1:30 and the RNA is immediately degraded.
However, some retransfused cells will probably maintain their death struggle program for some time, leaving a good chance to profile them even after a couple of days, whether or not they have visited a freezer. Wikipedia is correct:

In the case of detecting blood transfusions, a test for detecting homologous blood transfusions (from a donor to a doping athlete) has been in use since 2000. The test method is based on a technique known as fluorescent-activated cell sorting. By examining markers on the surface of blood cells, the method can determine whether blood from more than one person is present in an athlete’s circulation.

At present there is no accepted way of detecting autologous transfusions (using the athlete’s own RBCs) but research is in progress and the World Anti-Doping Agency (WADA) has promised that a test will eventually be introduced. The test method and its introduction date are to be kept secret in order to avoid tipping off doping athletes.

A potential example application may be found in the literature – no need to keep this idea secret as it will be nearly impossible to modify a particular gene expression pattern of a particular cell type.
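The dilution argument above can be put into rough numbers. The following is only a back-of-the-envelope sketch, assuming that "1:30" means one part retransfused blood in 30 parts of total blood, that a bulk measurement simply averages the two populations, and that no RNA is degraded; the function name and the fold-change values are hypothetical.

```python
def observed_fold_change(stored_fold, stored_fraction=1 / 30):
    """Fold change seen in bulk blood after mixing stored and native cells.

    stored_fold     -- fold change of a transcript in stored blood (hypothetical value)
    stored_fraction -- share of retransfused blood in the circulation (assumed 1 in 30)
    Native blood is taken as the baseline with a fold change of 1.
    """
    if not 0 <= stored_fraction <= 1:
        raise ValueError("stored_fraction must be between 0 and 1")
    return (1 - stored_fraction) * 1.0 + stored_fraction * stored_fold


if __name__ == "__main__":
    for fold in (2, 10, 100):
        print(fold, round(observed_fold_change(fold), 2))
```

Under these assumptions a 10-fold signal in the stored blood shrinks to about 1.3-fold in the bulk sample, which is why profiling the surviving retransfused cells themselves looks more promising than measuring whole blood.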

Addendum 5/2/2010

Finally, the WADA recognizes the value of gene signatures in a new Science editorial.

All roads to NLM

This is not just an addendum to my previous post free-for-all or to number-cruncher: the 12 Dec NIH press release links to a new and exciting database

NIH Launches dbGaP, a Database of Genome Wide Association Studies
The National Library of Medicine (NLM), part of the National Institutes of Health (NIH), announces the introduction of dbGaP, a new database designed to archive and distribute data from genome wide association (GWA) studies. GWA studies explore the association between specific genes (genotype information) and observable traits, such as blood pressure and weight, or the presence or absence of a disease or condition (phenotype information).

Addendum

29-5-07 dbGaP suffers from some broken links but content improves!

dbgap.png

Christmas present – your digital book copy

Maybe you don't want to wait until Google Scholar has it; maybe you are interested in higher quality: here is the web address digiwubu.gdz-cms.de at the Niedersächsische Staats- und Universitätsbibliothek Göttingen, where I once studied theology.
There should be no problem scanning any book published before 1900; however, you can also ask for books published later than that date. The cost is 0.25 € per page plus 5 € for handling and shipping a CD.
The “Google Books Library Project” is currently scanning more than 15 million volumes at Harvard, Stanford and Oxford – the German scan factories are in Göttingen and Munich. Yea, yea.

Exodus of science from Germany after 1933

A book that crossed my desk only very recently is about the exodus of science from Berlin after 1933. As a child I never understood the second commandment, when God said to Moses that

Ex 20:5–6 I am a jealous God, punishing the children for the sin of the fathers to the third and fourth generation of those who hate me, but showing love to a thousand {generations} of those who love me and keep my commandments.

I always thought the idea of collective guilt to be unfair. Nevertheless, when reading this book (published already in 1994 by Walter de Gruyter), we gain a deeper understanding of how science is affected for many generations by the displacement of its most prominent scientists.
Particularly important in this book is the first chapter, by Hubenstorf and Walther, which highlights the situation in Berlin. Following World War I, Berlin had become the indisputable center of most scientific disciplines in the German-speaking territory. In some disciplines Leipzig, Vienna or Munich may have been competitors; economics had been strong in Kiel, mathematics and physics in Göttingen, history in Marburg. For most scientists, however, Berlin had been the highly desired “endpoint” of their career. The Friedrich-Wilhelms University had been the largest university, but there were many more science organizations, like the Technical University Charlottenburg, the Deutsche Hochschule für Politik, the Preußische Akademie der Wissenschaften and the Kaiser-Wilhelm-Gesellschaft (which is covered in more detail in another excellent chapter).
Medicine was hit hardest during the Nazi period because it had a large number of Jewish scientists. The resulting repercussions in the realm of science are described at different levels. Starting with a typology of the transformation of scientific institutions, the establishment of new disciplines and the establishment of a military science sector, the authors give many historical details about the ways of scientific publishing and the organization of displaced scientists.
The cancer research department of the medical faculty fired 12 or 13 scientists; the hygiene institute dismissed 8 of 12 scientists, including Erwin Chargaff, later famous for his work on the base composition of DNA. Hospital Lankwitz fired all physicians, Neukölln 67%, Friedrichshain 62% and Moabit 56%.
It is a terrible story – you can read how the editor of the Deutsche Medizinische Wochenschrift, Paul Osswald Wolff, was replaced by a Nazi supporter. The publisher Karger even moved from Berlin to Basel (where it still resides today).
As any good science is strongly connected to teaching, it may be understood that breaking this tradition has led to a punishing of the children for the sin of the fathers to the third and fourth generation. Science politicians may even recognize the flip side when spending money on science: they will be blessed by thousands of generations.

p1000238.JPG

R parallel computing

Following several unsuccessful attempts to implement a parallel computing platform for R statistical software, I am showing here my current approach that is largely influenced by a recent paper on cluster programming in c’t 6/06 by Oliver Lau (sorry, no online version). My primary interest is with the R library snow (or snow-ft) that offers the function clusterApplyLB. This function is all I need for my R programs.
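For readers without an R cluster at hand, the load-balancing idea behind clusterApplyLB (a free worker takes the next task instead of receiving a fixed share up front) can be illustrated outside R. The following is only a Python analogue using the standard library, not snow itself; apply_load_balanced and simulate_job are hypothetical names, and threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def apply_load_balanced(func, tasks, workers=4):
    """Run func on every task; idle workers pull the next task from a shared queue.

    Results are returned in the order of `tasks`, regardless of which
    worker finished first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, tasks))


def simulate_job(seconds):
    """Hypothetical stand-in for one unit of statistical work."""
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    # Uneven job lengths: a slow job on one worker does not hold up the queue,
    # because the other worker keeps taking the short jobs.
    durations = [0.05, 0.01, 0.01, 0.01, 0.05, 0.01]
    print(apply_load_balanced(simulate_job, durations, workers=2))
```

Threads are enough to show the scheduling pattern; for CPU-bound work in Python one would probably swap in ProcessPoolExecutor, which has the same map interface.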
Now it gets more complicated: library(snow) depends on library(Rmpi). Hao Yu has an excellent description at www.stats.uwo.ca/faculty/yu/Rmpi of how to set up the MPI layer with MPICH2. I am currently experimenting with DeinoMPI, a closely related high-performance Windows interface. According to its developer David Ashton, it has the following advantages:

First, DeinoMPI does not require MPI applications to be started by mpiexec in order to call MPI_Comm_spawn so you could load Rmpi from the Rgui.exe without having to bother with calling mpiexec. Second, DeinoMPI loads the user profile when starting applications so if you query the user’s temporary directory you will get the user specific path and not the Windows system temp directory. Third, DeinoMPI handles arguments with spaces correctly if you quote them so you can pass environment variables with spaces in them. Fourth, DeinoMPI allows you to use the MPI Info object to pass extra options to MPI_Comm_spawn like drive mappings. So you could create an MPI_Info object and set wdir=z:\ and map=z:\\server\share. Then pass this info object in with the MPI_Comm_spawn command and you could map a network drive and launch an executable from this drive.

So far the Rmpi package is compiled for MPICH2 (not DeinoMPI) so it won’t run with only DeinoMPI installed but there is a good chance that this will change in the near future.
Further useful references are in the R newsletter 2003, p21 cran.r-project.org/doc/Rnews and a paper in the UW Biostatistics Working Paper Series on “Simple Parallel Statistical Computing in R” by Anthony Rossini and Luke Tierney.
BTW, haplotypes of the HapMap project were computed both on a 110-node cluster provided by Peter Donnelly’s Mathematical Genetics Group www.stats.ox.ac.uk based at the Oxford Centre for Gene Function and on a 128-node compute cluster provided by the Oxford e-Science Centre e-science.ox.ac.uk as part of the National Grid Service [to be cont’d…].

mpi1.png

3D LD

While waiting for genome-wide SNP data to be re-partitioned into LD blocks I found this page with some neat programming tricks. It is part of the dissertation of Ben Fry / MIT about computational information design. Page 74 ff has a history of redesigning the widely used Haploview program.

The design of these diagrams was first developed manually to work out the details, but in interest of seeing them implemented, it was clear that HaploView needed to be modified directly in order to demonstrate the improvements in practice. Images of the redesigned version are seen on this page and the page following. The redesigned version was eventually used as the base for a subsequence ‘version 2.0’ of the program, which has since been released to the public and is distributed as one of the analysis tools for the HapMap [www.hapmap.org] project.
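The quantities such LD diagrams usually display are pairwise measures like D′ and r². As a reminder of how they are defined, here is a minimal sketch for two biallelic SNPs, assuming phased haplotype counts are already available (real tools have to estimate these from genotypes); the function name and the counts are hypothetical, and this is not Haploview's code.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) for two biallelic SNPs from phased haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes given")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")
    d = p_AB - p_A * p_B          # raw disequilibrium coefficient D
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / denom
    return d_prime, r_squared


if __name__ == "__main__":
    # Only three of the four possible haplotypes occur in this toy sample.
    print(pairwise_ld(40, 10, 0, 50))
```

In the toy sample one haplotype is missing, so D′ is at its maximum while r² stays well below 1; this difference between the two measures is one reason block boundaries depend on which measure is plotted.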

3d-ld.png

How can we know?

A recent paper in Nature reported

Tissue samples were obtained from one of the following sources: Asterand, Pathlore, Tissue Transformation Technologies, Northwest Andrology, National Disease Research Interchange and Biocat. Only anonymized samples were used, and ethical approval was obtained for the study from Ärztekammer Berlin and the Cambridge Local Research Ethics Committee. […] Human primary cells were obtained from Cascade Biologics, Cell Applications, Analytical Biological Services, Cambrex Bio Science and the Deutsches Institut für Zell- und Gewebeersatz.

How did “Ärztekammer Berlin” or “Cambridge LREC” evaluate ethical performance of these companies? Or did anonymity automatically guarantee ethical research? Or is it just a formal requirement to mention ethics? Or …?

Open culture podcasts

As a frequent traveller I like podcasts. Here is a quick link to Open Culture, which has a huge university podcast collection including many foreign language selections (Boston College, Bowdoin College, Collège de France, Duke University Law School, Harvard University, Haverford College – Classic Texts, Johns Hopkins, Northwestern University, Ohio State, Princeton University, Stanford University, Swarthmore College, University of California (the best collection), The University of Chicago, The University of Glasgow, The University of Pennsylvania, The University of Virginia, The University of Wisconsin-Madison, Vanderbilt University, Yale University and Ecole normale supérieure). If you don't like proprietary formats you need to find the good and the bad apples.

Evolution in fast motion

Nature Genetics has an advance online publication about comparative genome sequencing of E. coli, where 13 de novo mutations in 5 strains were monitored over 44 d (or ~660 generations). It is a great study – not only because the author list includes one of my previous coauthors – but for giving a first insight into how a mutation develops and how its allele frequency becomes fixed. Unfortunately, there is no flowchart and the methods are somewhat vague about what has been sequenced (or resequenced) in which strain at what time. In other words, who are the winners? Did they manage that by their own strength or with a little help from some friends? Why does the allele frequency always rise to 100%, and what about some discrepancy between allele frequency and fitness? We will hopefully see more of these studies, yea, yea.
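One textbook reason why a beneficial allele should head towards 100% is plain deterministic selection in a haploid, asexual population. The sketch below is not the analysis of the paper; the starting frequency p0 and the selection coefficient s are hypothetical, and drift, new mutations and competition between lineages are ignored.

```python
def allele_trajectory(p0, s, generations):
    """Frequency of a beneficial allele under deterministic haploid selection.

    Each generation the mutant has relative fitness 1 + s, so
    p' = p * (1 + s) / (1 + p * s).
    """
    if not 0 <= p0 <= 1:
        raise ValueError("p0 must be a frequency between 0 and 1")
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p = p * (1 + s) / (1 + p * s)
        freqs.append(p)
    return freqs


if __name__ == "__main__":
    # ~660 generations as in the 44-day experiment; p0 and s are made-up values.
    traj = allele_trajectory(p0=1e-6, s=0.05, generations=660)
    for gen in (0, 100, 200, 300, 660):
        print(gen, traj[gen])
```

In this model the frequency can only approach 1, never fall back, so any discrepancy between allele frequency and measured fitness would have to come from processes the sketch leaves out.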

Dr. med. Sigmund Rascher, KL Dachau

On my way to work every morning I pass the former Nazi concentration camp/Konzentrationslager (KL) in Dachau East. It's a monument of inhumanity and the deepest point in the history of “science”. A large number of prisoners were abused by SS doctors for medical experiments; an unknown number of prisoners suffered agonizing deaths in the course of atmospheric pressure, hypothermia, malaria and other experiments.

photo, 11 Dec 06
p1000219.JPG

Having a longstanding interest in history (and having even published on the 50th anniversary of the Nuremberg trials), I was very interested in a new book by Siegfried Bär, one of the outstanding German science writers: “Der Untergang des Hauses Rascher”, a history of the life of Dr. Sigmund Rascher, anthroposophic scholar, medical student, DFG scholar, minion of Heinrich Himmler, air pressure and hypothermia researcher at KL Dachau and finally prisoner, who died by being shot in the neck.

Dr. Bär spent several years researching the life of this mass murderer. He contacted relatives of Rascher, looked at family photos, talked to people who knew Rascher and went to archives. This is a unique document showing the avidity of a researcher for recognition by scientific colleagues. Other books from my own library that I recommend:

pb300004.JPG

Pathway to nowhere

I have loved pathway diagrams ever since I mounted the famous Boehringer Biochemical Pathways chart in my bachelor flat. Since complex disease genetics deals with many disease genes, integration into a pathway context becomes critical. There are many attempts to extract this information from the literature and many companies that offer highly curated information (Biomax, Ariadne Genomics, Genomatix to name a few). Academia relies mainly on KEGG, the Kyoto Encyclopedia of Genes and Genomes, or Biocarta. Last week another pathway server appeared that is curated by the NCI and Nature magazine. Let’s have a look – I am currently working on Affymetrix 500K SNP annotation, yea, yea.