Tag Archives: bioinformatics

The Correlation Mania

A collection of material on bioinformatics / big data / deep learning / AI

Also relevant: the CCC talk by Nadja Geisler and Benjamin Hättasch on 28 December 2019

Deep learning has gone from a dead end to the ultimate solution for all machine learning problems. Yet whether the solutions make sense, and how good they are, seems to be increasingly swallowed up by buzzword bingo.
Does it make sense to keep throwing deep learning at every problem? How good are these approaches really? What could happen if we carry on like this? And can these approaches help us live more sustainably? Or do they only fuel the warming of the planet further?

Add to that the gigantic energy consumption of the computing power involved.

What it leads to: nothing but meaningless correlations

https://www.technologyreview.com

Hundreds of AI tools have been built to catch covid. None of them helped.

How to map a SNP or CpG site to the proximal gene in R

There are many ways to do this. Here are my favorite methods:

SNP <- c("rs123")
# Install biomaRt from GitHub via BiocManager
BiocManager::install('grimbough/biomaRt')
library(biomaRt)
library(plyr)
# GRCh37 archive marts: one for variants, one for genes
grch37.snp <- useMart(biomart="ENSEMBL_MART_SNP", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_snp")
grch37 <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
# Look up the SNP position and the Ensembl gene IDs annotated to it
t1 <- getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id", "chr_name", "chrom_start", "chrom_end"), filters = "snp_filter", values = SNP, mart = grch37.snp)
# Rename the column so both tables share the same join key
names(t1)[names(t1)=="ensembl_gene_stable_id"] <- "ensembl_gene_id"
# Fetch gene name, coordinates and description for the gene IDs found in t1
t2 <- getBM(attributes = c("ensembl_gene_id", "external_gene_name", "start_position", "end_position", "description"), filters = "ensembl_gene_id", values = t1$ensembl_gene_id, mart = grch37)
# Left join on the gene ID, taking the first match per row
join(t1, t2, type="left", by="ensembl_gene_id", match="first")

and

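BiocManager::install('FDb.InfiniumMethylation.hg19')
library(FDb.InfiniumMethylation.hg19)
CpG <- c("cg00920043")
# Load the 450k probe annotation (hg19 coordinates)
hm450 <- get450k()
# Select the probe(s) of interest by probe ID
probes <- hm450[CpG]
# Report the nearest transcription start site for each probe
getNearestTSS(probes)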
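
If the SNP lookup returns no gene ID, for example because the variant lies between genes, one fallback is to pick the closest gene by coordinate. Here is a minimal base R sketch of that idea, assuming you already have a gene table with chromosome, start and end columns. The function `nearest_gene` and the toy table `genes` are hypothetical names invented for illustration, not part of biomaRt, and the sketch ignores strand and TSS position.

```r
# Toy gene table; gene names and coordinates are made up for illustration.
genes <- data.frame(
  gene  = c("geneA", "geneB", "geneC"),
  chr   = c("1", "1", "2"),
  start = c(1000, 5000, 1000),
  end   = c(2000, 6000, 3000),
  stringsAsFactors = FALSE
)

# Return the name of the gene on the same chromosome that is closest to pos.
# The distance is 0 when pos lies inside a gene, otherwise the gap to the
# nearer gene boundary. Returns NA if the chromosome has no genes in the table.
nearest_gene <- function(chr, pos, genes) {
  cand <- genes[genes$chr == chr, ]
  if (nrow(cand) == 0) return(NA_character_)
  d <- pmax(cand$start - pos, pos - cand$end, 0)
  cand$gene[which.min(d)]
}

nearest_gene("1", 2500, genes)  # position between geneA and geneB
nearest_gene("1", 5500, genes)  # position inside geneB
```

With real data, the `start_position` and `end_position` columns from the gene query above could play the role of `start` and `end`.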

Science is an emergent system too

From Edge / NY Times

We often try to understand problems by taking apart and studying their constituent parts. But emergent problems can’t be understood this way. Emergent systems are ones in which many different elements interact. The pattern of interaction then produces a new element that is greater than the sum of the parts, which then exercises a top-down influence on the constituent elements.

SNP batch annotation of GWAs

Genowatch (paper|website) does a good job of annotating large SNP sets that would otherwise take many hours to map to gene positions, biological functions and pathways.

GPS for biological pathways

After running a dual-core CPU for two weeks, I have a list here of all transcripts that are associated with the “ORMDL3” SNP gene cluster. Making sense of this list is a difficult task, even with dozens of dedicated websites.
To get an overview of what is available I would start …